Michael E. Mann and four others published the peer-reviewed paper “The Likelihood of Recent Record Warmth” in Nature: Scientific Reports (DOI: 10.1038/srep19831). I shall call the authors of this paper “Mann” for ease. Mann concludes (emphasis original):
We find that individual record years and the observed runs of record-setting temperatures were extremely unlikely to have occurred in the absence of human-caused climate change, though not nearly as unlikely as press reports have suggested. These same record temperatures were, by contrast, quite likely to have occurred in the presence of anthropogenic climate forcing.
This is confused and, in part, in error, as I show below. I am anxious people understand that Mann’s errors are in no way unique or rare; indeed, they are banal and ubiquitous. I therefore hope this article serves as a primer in how not to analyze time series.

First Error

Suppose you want to guess the height of the next person you meet when going outdoors. This value is uncertain, and so we can use probability to quantify our uncertainty in its value. Suppose as a crude approximation we used a normal distribution (it’s crude because we can only measure height to positive, finite, discrete levels and the normal allows numbers on the real line). The normal is characterized by two parameters, a location and spread. Next suppose God Himself told us that the values of these parameters were 5’5″ and 1’4″. We are thus as certain as possible in the value of these parameters. But are we as certain in the height of the next person? Can we, for instance, claim there is a 100% chance the next person will be, no matter what, 5’5″?

Obviously not. All we can say are things like this: “Given our model and God’s word on the value of the parameters, we are about 90% sure the next person’s height will be between 3’3″ and 7’7″.” (Don’t forget children are persons, too. The strange upper range is odd because the normal is, as advertised, crude. But it does not matter which model is used: my central argument remains.)
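The 90% figure can be checked with a minimal sketch, assuming the normal model and the two parameter values given above. The variable and function names are illustrative only.

```python
from statistics import NormalDist

# Parameters from the text: location 5'5" and spread 1'4", converted to inches.
LOCATION_IN = 5 * 12 + 5   # 65 inches
SPREAD_IN = 1 * 12 + 4     # 16 inches

height = NormalDist(mu=LOCATION_IN, sigma=SPREAD_IN)

# Central 90% interval: 5% of the probability is left in each tail.
low_in = height.inv_cdf(0.05)
high_in = height.inv_cdf(0.95)


def feet_inches(total_inches: float) -> str:
    """Format a length in inches as feet and whole inches."""
    rounded = round(total_inches)
    return f"{rounded // 12}'{rounded % 12}\""


print(feet_inches(low_in), "to", feet_inches(high_in))  # 3'3" to 7'7"
```

Even with the parameters known exactly, the interval for the next height spans more than four feet.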

What kind of mistake would it be to claim that the next person will be for certain 5’5″? Whatever name you give this, it is the first error which pervades Mann’s paper.

The temperature values (anomalies) they use are presented as if they are certain, when in fact they are the estimates of a parameter of some probability model. Nobody knows that the temperature anomaly was precisely -0.10 in 1920 (or whatever value was claimed). Since this anomaly was the result of a probability model, to say we know it precisely is just like saying we know the exact height will be certainly 5’5″. Therefore, every temperature (or anomaly) that is used by Mann must, but does not, come equipped with a measure of its uncertainty.

We want the predictive uncertainty, as in the height example, and not the parametric uncertainty, which would only show the plus-or-minus in the model’s parameter value for temperature. In the height example, we didn’t have any uncertainty in the parameter because we received the value from on High. But if God only told us the central parameter was 5’5″ +/- 3″, then the uncertainty we have in the next height must widen—and by a lot—to take this extra uncertainty into account. The same is true for temperatures/anomalies.

Therefore, every graph and calculation in Mann’s paper which uses the temperatures/anomalies as if they were certain is wrong. In Mann’s favor, absolutely everybody makes the same error as they do. This, however, is no excuse. An error repeated does not become a truth.

Nevertheless, I, like Mann and everybody else, will assume that this magnificent, non-ignorable, and thesis-destroying error does not exist. I will treat the temperatures/anomalies as if they are certain. This trick does not fix the other errors, which I will now show.

Second Error

You are in Las Vegas watching a craps game. On the come out, a player throws a “snake eyes” (a pair of ones). Given what we know about dice (they have six sides, one of which must show, etc.) the probability of snake eyes is 1/36. The next player (because the first crapped out) opens also with snake eyes. The probability of this is also 1/36.

Now what, given what we know of dice, is the probability of two snake eyes in a row? Well, this is 1/36 * 1/36 = 1/1296. This is a small number, about 0.0008. Because it is less than the magic number in statistics, does that mean the casino is cheating and causing the dice to come up snake eyes? Or can “chance” explain this?
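The arithmetic can be reproduced with a short sketch that conditions only on the information named above: two six-sided dice, each face weighted equally. The variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All 36 equally weighted outcomes for two six-sided dice.
outcomes = list(product(range(1, 7), repeat=2))

# Snake eyes is the single outcome (1, 1).
p_snake_eyes = Fraction(sum(1 for o in outcomes if o == (1, 1)), len(outcomes))

# Two opening throws in a row, treated as independent given this information.
p_two_in_a_row = p_snake_eyes ** 2

print(p_snake_eyes)            # 1/36
print(p_two_in_a_row)          # 1/1296
print(float(p_two_in_a_row))   # roughly 0.00077
```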

First notice that in each throw, some things caused each total, i.e. various physical forces caused the dice to land the way they did. The players at the table did not know these causes. But a physicist might: he might measure the gravitational field, the spin (in three dimensions) of the dice as they left the players’ hands, the momentum given the dice by the throwers, the elasticity of the table, the friction of the tablecloth, and so forth. If the physicist could measure these forces, he would be able to predict what the dice would do. The better he knows the forces, the better he could predict. If he knew the forces precisely he could predict the outcome with certainty. (This is why Vegas bans contrivances to measure forces/causes.)

From this it follows that “chance” did not cause the dice totals. Chance is not a physical force, and since it has no ontological being, it cannot be an efficient cause. Chance is thus a product of our ignorance of forces. Chance, then, is a synonym for probability. And probability is not a cause.

This means it is improper to ask, as most do ask, “What is the chance of snake eyes?” There is no single chance: the question has no proper answer. Why? Because the chance calculated depends on the information assumed. The bare question “What is the chance” does not tell us what information to assume, therefore it cannot be answered.

To the player, who knows only the possible totals of the dice, the chance is 1/36. To the physicist who measured all the causes, it is 1. To a second physicist who could only measure partial causes, the chance would be north of 1/36, but south of 1, depending on how the measurements were probative of the dice total. And so forth.
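As a rough sketch of how the same question gets different answers under different information, the snippet below counts outcomes consistent with three states of knowledge. The "partial" case (a physicist who somehow knows the first die will show 1) is a hypothetical stand-in for partial measurement, not something taken from the example above.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))


def prob_snake_eyes(consistent_with_info) -> Fraction:
    """Probability of (1, 1) among the outcomes the assumed information allows."""
    allowed = [o for o in outcomes if consistent_with_info(o)]
    hits = [o for o in allowed if o == (1, 1)]
    return Fraction(len(hits), len(allowed))


# Player: knows only that each die shows one of six faces.
p_player = prob_snake_eyes(lambda o: True)

# Hypothetical partial knowledge: the first die is known to land on 1.
p_partial = prob_snake_eyes(lambda o: o[0] == 1)

# Full knowledge of the causes: only the outcome that occurs remains possible.
p_full = prob_snake_eyes(lambda o: o == (1, 1))

print(p_player, p_partial, p_full)  # 1/36 1/6 1
```

The event is the same in all three cases. Only the assumed information changes.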

We have two players in a row shooting snake eyes. And we have calculated, from the players’ perspective, i.e. using their knowledge, the chance of this occurring. But we could have also asked, “Given only our knowledge of dice totals etc., what are the chances of seeing two snake eyes in a row in a sequence of N tosses?” N can be 2, 3, 100, 1000, any number we like. Because N can vary, the chance calculated will vary. That leads to the natural question: what is the right N to use for the Vegas example?
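To see how much the answer moves with N, here is a minimal sketch that computes the probability of at least one pair of consecutive snake eyes in N throws. It assumes independent throws with probability 1/36 each, and the function name is illustrative.

```python
from fractions import Fraction


def prob_run_of_two(n_tosses: int, p: Fraction = Fraction(1, 36)) -> Fraction:
    """P(at least one pair of consecutive snake eyes in n_tosses throws)."""
    q = 1 - p
    # Probability that no pair has occurred yet, split by the latest throw.
    no_run_last_miss = Fraction(1)  # before any throw, behave as if the last was a miss
    no_run_last_hit = Fraction(0)
    for _ in range(n_tosses):
        no_run_last_miss, no_run_last_hit = (
            (no_run_last_miss + no_run_last_hit) * q,  # this throw is not snake eyes
            no_run_last_miss * p,                      # snake eyes after a miss
        )
    return 1 - (no_run_last_miss + no_run_last_hit)


for n in (2, 3, 100, 1000):
    print(n, float(prob_run_of_two(n)))
```

For N = 2 this returns exactly 1/1296, and the value grows as N grows, which is why the choice of N matters.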

The answer is: there is no right N. The N picked depends on the situation we want to consider. It depends on decisions somebody makes. What might these decisions be? Anything. To the craps player who only has $20 to risk, N will be small. To the casino, it will be large. And so on.

Why is this important? Because the length of some sequence we happen to observe is not inherently of interest in and of itself. Whatever N is, it is still the case that some thing or things caused the values of the sequence. The probabilities we calculate cannot eliminate cause. Therefore, we have to be extremely cautious in interpreting the chance of any sequence, because (a) the probabilities we calculate depend on the sequence’s length and the length of interest depends on decisions somebody makes, and (b) in no case does cause somehow disappear the larger or smaller N is.

The second error Mann makes, and an error which is duplicated far and wide, is to assume that probability has any bearing on cause. We want to know what caused the temperatures/anomalies to take the values they did. Probability is of no help in this. Yet Mann assumes that because the probability of a sequence calculated conditional on one set of information is different from the probability of the same sequence calculated conditional on another set of information, the only possible cause of the sequence (or of part of it) is global warming. This is the fallacy of the false dichotomy. The magnitude and nature of this error are discussed next.

The fallacy of the false dichotomy in the dice example is now plain. Because the probability of the observed N = 2 sequence of snake eyes was low given the information only about dice totals, it does not follow that therefore the casino cheated. Notice that, assuming the casino did cheat, the probability of two snake eyes is high (or even 1, assuming the casino had perfect control). We cannot compare these two probabilities, 0.0008 and 1, and conclude that “chance” could not have been a cause, therefore cheating must have.

And the same is true in temperature/anomaly sequences, as we shall now see.

Third Error

Put all this another way: suppose we have a temperature/anomaly series of length N of which a physicist knows the cause of every value. What, given the physicist’s knowledge, is the chance of this sequence? It is 1. Why? Because it is no different than the dice throws: if we know the cause, we can predict with certainty. But what if we don’t know the cause? That is an excellent question.

What is the probability of a temperature/anomaly sequence where we do not know the cause? Answer: there is none. Why? Because all probability is conditional on the knowledge assumed; if we do not assume anything, no probability can be calculated. Obviously, the sequence happened, therefore it was caused. But absent knowledge of cause, and not assuming anything else like we did arbitrarily in the height example or as was natural in the case of dice totals, we must remain silent on probability.

Suppose we assume, arbitrarily, only that anomalies can only take the values -1 to 1 in increments of 0.01. That makes 201 possible anomalies. Given only this information, what is the probability the next anomaly takes the value, say, 0? It is 1/201. Suppose in fact we observe the next anomaly to be 0, and further suppose the anomaly after that is also 0. What are the chances of two 0s in a row? In a sequence of N = 2, and given only our arbitrary assumption, it is 1/201 * 1/201 = 1/40401. This is also less than the magic number. Is it thus the case that Nature “cheated” and made two 0s in a row?
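The counting can be checked with a small sketch under the same arbitrary assumption: every value on the grid is given equal weight. Values are held as integer hundredths to avoid floating-point drift, and the names are illustrative.

```python
from fractions import Fraction

# Assumed grid from the text: -1.00 to 1.00 in steps of 0.01, stored as hundredths.
anomalies_hundredths = range(-100, 101)
n_values = len(anomalies_hundredths)

# Equal weight on every allowed value, given only this information.
p_zero = Fraction(1, n_values)

# Two 0s in a row in a sequence of N = 2.
p_two_zeros = p_zero ** 2

print(n_values)       # 201
print(p_zero)         # 1/201
print(p_two_zeros)    # 1/40401
```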

Well, yes, in the sense that Nature causes all anomalies (and assuming, as is true, we are part of Nature). But this answer doesn’t quite capture the gist of the question. Before we come to that, assume, also arbitrarily, a different set of information, say that the uncertainty in the temperatures/anomalies is represented by a more complex probability model (our first arbitrary assumption was also a probability model). Let this more complex probability model be an autoregressive moving-average, or ARMA, model. Now this model has certain parameters, but assume we know what these are.

Given this ARMA, what is the probability of two 0s in a row? It will be some number. It is not of the least importance what this number is. Why? For the same reason the 1/40401 was of no interest. And it’s the same reason any probability calculated from any probability model is of no interest to answer questions of cause.
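To make "some number" concrete, here is a rough sketch that uses a Gaussian AR(1) model as a simplified stand-in for an ARMA model. The coefficient and noise level are hypothetical values picked for illustration and are not Mann's. Conditioning the second step on the first value being exactly 0 is a further approximation.

```python
from statistics import NormalDist

# Hypothetical parameters, chosen only for illustration (not taken from Mann).
PHI = 0.6          # assumed autoregressive coefficient
NOISE_SD = 0.1     # assumed standard deviation of the noise term
HALF_BIN = 0.005   # a recorded 0.00 means the value fell within 0.005 of zero


def prob_in_zero_bin(dist: NormalDist) -> float:
    """Probability that a value drawn from dist is recorded as 0.00."""
    return dist.cdf(HALF_BIN) - dist.cdf(-HALF_BIN)


# Stationary distribution of x_t = PHI * x_{t-1} + noise.
stationary = NormalDist(0.0, NOISE_SD / (1 - PHI**2) ** 0.5)

# Distribution of the next value, given the current one is taken to be 0.
next_given_zero = NormalDist(PHI * 0.0, NOISE_SD)

p_two_zeros_ar1 = prob_in_zero_bin(stationary) * prob_in_zero_bin(next_given_zero)
p_two_zeros_uniform = (1 / 201) ** 2

print(p_two_zeros_ar1, p_two_zeros_uniform)
```

The two numbers differ because the assumed models differ. Neither says anything about what caused the two 0s.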

Look at it this way. All probability models are silent on cause. And cause is what we want to know. But if we can’t know cause, and don’t forget we’re assuming we don’t know the cause of our temperature/anomaly sequence, we can at least quantify our uncertainty in a sequence conditional on some probability model. But since we’re assuming the probability model, the probabilities it spits out are the probabilities it spits out. They do not and cannot prove the goodness or badness of the model assumption. And they cannot be used to claim some thing other than “chance” is the one and only cause: that’s the fallacy of the false dichotomy. If we assume the model we have is good, for whatever reason, then whatever the probability of the sequence it gives, the sequence must still have been caused, and this model wasn’t the cause. Just like in the dice example, where the probability of two snake eyes, according to our simple model, was low. That low probability did not prove, one way or the other, that the casino cheated.

Mann calls the casino not cheating the “null hypothesis”. Or rather, their “null hypothesis” is that their ARMA model (they actually created several) caused the anomaly sequence, with the false dichotomy alternate hypothesis that global warming was the only other (partial) cause. This, we now see, is wrong. All the calculations Mann provides to show probabilities of the sequence under any assumption (one of their ARMA models or one of their concocted CMIP5 “all-forcing experiments”) have no bearing whatsoever on the only relevant physical question: What caused the sequence?

Fourth Error

It is true that global warming might be a partial cause of the anomaly sequence. Indeed, every working scientist assumes, what is almost a truism, that mankind has some effect on the climate. The only question is: how much? And the answer might be: only a trivial amount. Thus, it might also be true that global warming as a partial cause is ignorable for most questions or decisions made about values of temperature.

How can we tell? Only one way. Build causal or determinative models that have global warming as a component. Then make predictions of future values of temperature. If these predictions match (how to match is an important question I here ignore), then we have good (but not complete) evidence that global warming is a cause. But if they do not match, we have good evidence that it isn’t.

Predictions of global temperature from models like CMIP, which are not shown in Mann, do not match the actual values of temperature, and haven’t for a long time. We therefore have excellent evidence that we do not understand all of the causes of global temperature and that global warming as it is represented in the models is in error.

Mann’s fourth error is to show how well the global-warming-assumption model can be made to fit past data. This fit is only of minor interest, because we could also get good fit with any number of probability models, and indeed Mann shows good fit for some of these models. But we know that probability models are silent on cause, therefore model fit is not indicative of cause either.

Conclusion

Calculations showing “There was an X% chance of this sequence” always assume what they set out to prove, and are thus of no interest whatsoever in assessing questions of cause. A casino can ask “Given the standard assumptions about dice, what is the chance of seeing N snake eyes in a row?” if, for whatever reason, it has an interest in that question, but whatever the answer is, i.e. however small that probability is, it does not answer what causes the dice to land the way they do.

Consider that casinos are diligent in trying to understand cause. Dice throws are thus heavily regulated: they must hit a wall, the player may not do anything fancy with them (as pictured above), etc. When dice are old they are replaced, because wear indicates lack of symmetry and symmetry is important in cause. And so forth. It is only because casinos know that players do not know (or cannot manipulate) the causes of dice throws that they allow the game.

It is the job of physicists to understand the myriad causes of temperature sequences. Just like in the dice throw, there is not one cause, but many. And again like the dice throw, the more causes a physicist knows the better the predictions he will make. The opposite is also true: the fewer causes he knows, the worse the predictions he will make. And, given the poor performance of causal models over the last thirty years, we do not understand cause well.

The dice example differed from the temperature because with dice there was a natural (non-causal) probability model. We don’t have that with temperature, except to say we only know the possible values of anomalies (as the example above showed). Predictions can be made using this probability model, just like predictions of dice throws can be made with its natural probability model. Physical intuition argues these temperature predictions with this simple model won’t be very good. Therefore, if prediction is our goal, and it is a good goal, other probability models may be sought in the hope these will give better performance. As good as these predictions might be, no probability will tell us the cause of any sequence.

Because an assumed probability model said some sequence was rare, it does not mean the sequence was therefore caused by whatever mechanism that takes one’s fancy. You still have to do the hard work of proving the mechanism was the cause, and that it will be a cause into the future. That is shown by making good predictions. We are not there yet. And why, if you did know cause, would you employ some cheap and known-to-be-false probability model to argue an observed sequence had low probability—conditional on assuming this probability model is true?

Lastly, please don’t forget that everything that happened in Mann’s calculations, and in my examples after the First Error, is wrong because we do not know with certainty the values of the actual temperature/anomaly series. The probabilities we calculate for this series to take certain values can take the uncertainty we have in these past values into account, but it becomes complicated. That many don’t know how to do it is one reason the First Error is ubiquitous.