The Large Hadron Collider (LHC) was recently in the news when experiments there gave strong evidence for the existence of the famed Higgs boson, a.k.a. the “God Particle”, a subatomic particle that everyone agrees is very important, even if they can’t explain exactly why.
Aside from its enormous cost, the LHC was previously in the news as the target of an accusation that it could be the doom of the human race, the Earth, and indeed the whole Universe. The critics claimed that the high-energy collisions in the collider might cause the formation of black holes, strangelets, or worse; since it would be impossible to say for certain that such a doomsday scenario would not occur, we should not take the risk at all.
At the time, spokesmen for CERN (the organization that built and operates the LHC) could not bring themselves to say that there was absolutely no risk, only that it was “nonzero but negligible” (unfortunately I can’t find the original article that I read, so this is a paraphrase). The word “nonzero” would appear to concede the very point that critics were trying to make, but I suppose what they meant was “so improbable that it is practically zero”. It is still theoretically possible, just like it is theoretically possible that a cup of water will spontaneously boil, or that the molecules of air in a room would just by chance all happen to crowd into the left side of the room, even though for all practical purposes we will never actually see this happen.
Although uncertainty and probability are concepts that we have to deal with on a daily basis (“is it likely to rain?” “are you going to take out the trash?” – “maybe”), our everyday language doesn’t really have a good way to express degrees of belief. To return to the Higgs boson, more reputable news sources were careful to report that it was not found with absolute certainty, but to a “five sigma” degree, i.e. the observed signal was five standard deviations beyond what background processes alone would produce, which works out to odds of about 1 in 3.5 million of seeing so strong a signal by chance alone. CERN’s director-general Rolf Heuer neatly sums up the tension between our commonsense notions and the scientist’s linguistic caution:
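To see where the “1 in 3.5 million” figure comes from, we can compute the one-sided upper tail of the standard normal distribution at five standard deviations (a minimal sketch using only Python’s standard library; particle physicists conventionally quote the one-sided tail):

```python
import math

# One-sided upper-tail probability of a standard normal at 5 sigma:
# P(Z > 5) = erfc(5 / sqrt(2)) / 2
p = math.erfc(5 / math.sqrt(2)) / 2

print(p)      # ~2.87e-7
print(1 / p)  # ~3.5 million
```

The tail probability comes out to about 2.87 × 10⁻⁷, i.e. roughly 1 chance in 3.5 million.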
‘As a lay man I say we have it. As a scientist I have to say “probably”.’ (via Significance)
Unfortunately there isn’t much consensus on how to translate quantitative probability into common speech. A probabilistic or statistical intuition is difficult to acquire, even though it’s a valuable reasoning tool. Most of us simply aren’t trained to think in this way, which is why the Prosecutor’s Fallacy and the Monty Hall Problem continue to befuddle generations of students and professionals. And yet no field of science today can do without the tools of statistics, especially in the era of Big Data.
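The Monty Hall Problem is a nice illustration of how unreliable untrained intuition can be: switching doors wins two times out of three, which a short simulation (a sketch, standard library only) demonstrates more persuasively than most verbal arguments:

```python
import random

def monty_hall(switch, trials=100_000):
    """Simulate the Monty Hall game; return the fraction of wins."""
    wins = 0
    for _ in range(trials):
        doors = [0, 1, 2]
        prize = random.choice(doors)
        pick = random.choice(doors)
        # The host opens a door that is neither the pick nor the prize.
        opened = random.choice([d for d in doors if d not in (pick, prize)])
        if switch:
            # Switch to the one remaining unopened door.
            pick = next(d for d in doors if d not in (pick, opened))
        wins += (pick == prize)
    return wins / trials

print(monty_hall(switch=True))   # ~0.667
print(monty_hall(switch=False))  # ~0.333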
One solution is to have an agreement about what the words “likely” or “unlikely” actually mean. Climate science is heavily dependent on statistical modeling to make predictions about future trends from past observations, and so any conclusions that result have to be qualified in terms of their likelihood, based on the available data. This makes it a prime target for attack from “doubt merchants” who deliberately sow uncertainty for political or financial ends, e.g. climate skeptics and the tobacco lobby (the deliberate manufacture of doubt has even acquired its own field of study, with a wonderful recently coined name: agnotology).
The Intergovernmental Panel on Climate Change (IPCC) has a Guidance Note for its report-writers, offering them a calibrated scale of terminology for speaking about likelihood, which I reproduce below.
| Term | Likelihood of the outcome |
| --- | --- |
| Virtually certain | 99-100% probability |
| Extremely likely | 95-100% probability |
| Very likely | 90-100% probability |
| Likely | 66-100% probability |
| About as likely as not | 33-66% probability |
| Unlikely | 0-33% probability |
| Very unlikely | 0-10% probability |
| Extremely unlikely | 0-5% probability |
| Exceptionally unlikely | 0-1% probability |
It’s clear that a scale like this can’t be used for everything in life. We can be comfortable saying that “it is extremely unlikely that global climate change of the past 50 years can be explained without external forcing”, if the models tell us that the probability is less than 5%. However, we would not dare to say that “it is extremely unlikely that this beauty product will cause skin cancer” if the probability of getting skin cancer from using it is, say, 4%. As the saying goes, context matters.
A deeper problem with the IPCC’s Guidance Note is how it papers over the statistical meanings of “likelihood”, “probability”, and “degree of belief”. In everyday speech we use these terms interchangeably, but in the context of statistics they have specific and different meanings. Even among statisticians, there has been deep philosophical disagreement over what randomness and chance actually mean.
“Likelihood”, for example, refers to the probability of a given set of data being observed if the theoretical model constructed to explain them is actually true. This is distinct from the probability of the theoretical model given that set of data, which is called the “posterior probability”. Both are valid ways of quantifying our uncertainty about how well a theoretical model fits the data, and indeed both kinds of values are reported in various fields of science. What’s important is that the readers know which one they’re dealing with, and what the prior assumptions are, so that they can make informed decisions.
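To make the distinction concrete, here is a minimal coin-flipping sketch (hypothetical numbers, Python standard library only): the likelihood P(data | model) of observing 8 heads in 10 flips under two competing models of the coin, and the posterior P(model | data) that Bayes’ theorem gives once we also state a prior belief over the models:

```python
from math import comb

def likelihood(p_heads, heads=8, flips=10):
    """P(data | model): probability of the observed flips if the coin's
    true heads-probability is p_heads (binomial likelihood)."""
    return comb(flips, heads) * p_heads**heads * (1 - p_heads)**(flips - heads)

# Two candidate models: a fair coin and a biased coin.
models = {"fair (p=0.5)": 0.5, "biased (p=0.8)": 0.8}
prior = {"fair (p=0.5)": 0.9, "biased (p=0.8)": 0.1}  # assumed prior belief

# Posterior P(model | data) via Bayes' theorem.
evidence = sum(likelihood(p) * prior[m] for m, p in models.items())
posterior = {m: likelihood(p) * prior[m] / evidence for m, p in models.items()}

for m, p in models.items():
    print(m, likelihood(p), posterior[m])
```

The data are about seven times more likely under the biased-coin model, yet with the 0.9/0.1 prior the posterior still slightly favors the fair coin: which number a report quotes, and what prior it assumes, genuinely matters.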
In the end, there’s probably no single way to present a nuanced probabilistic argument in simple language without losing much of its accuracy and shades of meaning. The fact of the matter is that if we want to understand the morose soul-searching that many social and medical sciences are going through, or the long-standing grumbles about unthinking use of p-values in the scientific literature, we have to know something about the statistics behind them, and accept that it will always be an uphill struggle to keep our probabilistic intuitions in good working order.