aka back and better than the same as ever
MK has been dormant since June 2013 — almost 2 years! It’s been a while since I wrote a bona fide blog post, but I miss doing that, so I figured I would, and on one of my favourite old topics (see HERE). This one is dedicated to my handfuls upon handfuls of readers.
I believe in climate change. I also am not all that bothered by it.
That attitude seems to raise a few eyebrows. Most people assume that if you believe in climate change, then you must see a desperate need to “take action” against it and, conversely, if you do not care much about climate change, then you are obviously one of those “climate change deniers” (a term that’s a little too close to “Holocaust denier” for my liking).
I don’t fall into either category. My thoughts can be encapsulated quite neatly in four points (and I think I may be paraphrasing John Humphreys):
1. Is the climate changing? Yes.
2. Are humans causing that? Probably.
3. Is it as bad as we think? No.
4. Does it warrant drastic government intervention? Almost definitely not.
As points 1 and 2 have been adequately canvassed elsewhere, and point 4 follows from point 3, I’ll concentrate on point 3 for the balance of this post. Before I do that, I should give this qualification: I’ll admit that I can’t claim to be an expert on the subject, but I do have a statistics major, so I am at least somewhat qualified to comment on the research findings that people like to throw around. And I have read the most authoritative material out there, like the Intergovernmental Panel on Climate Change (IPCC) reports and the major Royal Society reviews (see HERE).
So with the power of that limited knowledge, and drawing on my hours of research, here is what I think:
The future hasn’t happened yet
People have been predicting the end of the world for as long as there have been people, and that includes in this “enlightened” age of “science” that we now live in. Yet the doomsayers have been proven wrong each time.
The problem with predicting the future is that it hasn’t happened yet. That may seem obvious, but it is constantly overlooked by “scientists” the world over. The standard way of predicting the future using maths, “time series analysis”, boils down to this: take what has happened in the past, figure out what the average was, and assume that the future will be the same.
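That recipe is easy to state in code. Here is a minimal sketch of the “future equals the past average” forecast; the data are made-up numbers, not real measurements:

```python
import statistics

def naive_forecast(history, horizon):
    """Forecast every future period as the historical mean: the
    'assume the future will look like the past' approach."""
    mean = statistics.mean(history)
    return [mean] * horizon

# Made-up yearly observations hovering around 10
history = [9, 12, 10, 11, 8]
print(naive_forecast(history, 3))   # a flat line at the past average: [10, 10, 10]

# Even after a Black-Swan-sized jump, the forecast is still a flat
# line, and the single outlier is simply averaged away
print(naive_forecast(history + [40], 3))  # [15, 15, 15]
```

Notice that no matter what the history looks like, the forecast is a flat line: the method is structurally incapable of predicting anything it hasn’t already seen.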
This might seem intuitive — after all, the best indication of what will happen in the future that we have is what has already happened — but it is in fact an extremely flawed way of looking at the world. The best and most well known critic of the formula is probably Nassim Taleb. He makes the following criticisms:
1. Black swans
The past is peppered with what Taleb calls “Black Swan” events and what everyone else calls “outliers”: rare events unlike anything that came before them. Because they resemble nothing in the historical record, they cannot be predicted, and any statistical projection will invariably miss them. A projection built on the past average misses them most of all.
This results in things like financial analysts missing the Global Financial Crisis (bad outlier), or Thomas Malthus predicting that all the food would run out and missing the productivity improvements of the industrial revolution (good outlier, back in Malthus’s time).
2. Knowing it all
Time series predictions involve a degree of hubris. They assume that we understand the past and why everything in the past has happened, and can confidently reduce the infinitely complex universe into a few variables that will inevitably explain anything, and so if we know how one or two of these will behave then we can comfortably predict everything else.
We give ourselves too much credit. Our actual understanding of complex systems is much weaker than we’d like to think. “Experts” modelling complex systems mathematically are constantly even getting the past wrong, so how anyone thinks they can predict the future with much accuracy I have no idea.
3. Proxies and correlations
Some things are easier to measure than others. Whenever an analyst wants to measure something complex that cannot really be measured directly, they will use a “proxy variable” that generally correlates with the unmeasurable one. For example, it is not possible to measure “health”, so if you want to measure the health of a population, you might measure their average life expectancy. After all, if people tend to live longer, you would assume that they are healthier.
Makes sense right? Well maybe. One problem is that you might be missing some other variables that are affecting the situation. For example, maybe your “unhealthy” group are actually super fit and super healthy, but have an unfortunate habit of dying in car crashes. So perhaps life expectancy doesn’t correlate as well with health as you would expect.
But assume that the two variables correlate perfectly. That itself may be a problem.
Take this example: Christian Rudder from online dating website OK Cupid has found that regardless of gender, OK Cupid users who like the taste of beer tend to prefer having sex on the first date. That statistic is quite amusing, but no one would seriously suggest that this means that drinking beer changes the way someone thinks about sex, right?
Wrong. “Scientists” do that all the time, and the journalists who report their findings do it even more.
That example makes it especially obvious that the correlation between beer and sex is not causative. Liking beer does not cause someone to want to have sex on a first date, and wanting sex on a first date does not cause someone to like the taste of beer. More likely, there is a third factor at play that causes a lot of people who like beer to also want sex on a first date — probably youth culture or something. Or it could simply be a coincidence.
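A toy simulation makes the “third factor” point concrete. Everything here is invented for illustration: a hypothetical “party scene” factor drives both traits, and neither trait influences the other, yet the two come out strongly correlated:

```python
import random

random.seed(0)

def simulated_user():
    # Invented confounder: membership of a party-going youth culture.
    # It drives BOTH traits below; neither trait affects the other.
    party_scene = random.random() < 0.5
    likes_beer = random.random() < (0.8 if party_scene else 0.2)
    first_date_sex = random.random() < (0.7 if party_scene else 0.1)
    return likes_beer, first_date_sex

users = [simulated_user() for _ in range(100_000)]
beer_drinkers = [sex for beer, sex in users if beer]
everyone_else = [sex for beer, sex in users if not beer]

rate_beer = sum(beer_drinkers) / len(beer_drinkers)
rate_rest = sum(everyone_else) / len(everyone_else)
print(f"first-date preference among beer lovers: {rate_beer:.0%}")
print(f"first-date preference among the rest:    {rate_rest:.0%}")
```

With these made-up numbers, the beer lovers report the preference at roughly two to three times the rate of everyone else, even though the simulation contains no causal link between the two traits at all.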
But that doesn’t stop people saying that hormone replacement therapy can help stop heart disease. Observational studies found that women on HRT had lower rates of heart disease; when randomised trials were finally run, they suggested the opposite. The original correlation apparently had a third factor behind it: the women taking HRT tended to be better off and healthier to begin with.
4. The Wayne Swan error*
Ever wondered why the government’s budget always seems to blow out? Here’s why. Say the government projects that next year’s budget will balance, with a 2% margin of error and 95% confidence. This means that there is a 95% chance that budget will be within 2% of a balanced budget (a pipe dream right now, I know).
In reality, it is almost impossible that the budget will come in below the projection: once allocated money to spend, very few (if any) government departments will choose not to spend it. On the other hand, it is quite likely that the budget will blow out, as government departments have many unforeseen expenses. So there is not so much a 95% chance that the budget will be within 2% of balanced as a 95% chance of a deficit of 2% or less, and a 5% chance of a deficit of more than 2%. I like to call that the “Wayne Swan error”, after the former Australian Treasurer who managed to blow out the budget every year he was in office (it is also fast becoming the “Joe Hockey error”).
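The asymmetry is easy to see in a toy Monte Carlo. The distribution below and its rate are my own illustrative choices, tuned so that about 95% of simulated years land within the stated 2% band; nothing here comes from real budget data:

```python
import random

random.seed(1)

def budget_outcome():
    """One year's budget deviation, in % of a 'balanced' projection.
    Departments spend whatever they are allocated, so the outcome is
    never below projection; unforeseen expenses only push it above."""
    return random.expovariate(1.5)  # one-sided overrun with a long right tail

outcomes = [budget_outcome() for _ in range(100_000)]
under = sum(o < 0 for o in outcomes)
small_deficit = sum(o <= 2.0 for o in outcomes) / len(outcomes)
blowout = sum(o > 2.0 for o in outcomes) / len(outcomes)

print(f"years under projection: {under}")
print(f"deficit of 2% or less:  {small_deficit:.0%}")
print(f"blowout beyond 2%:      {blowout:.0%}")
```

The headline error margin looks symmetric, but every single simulated year is a deficit: the “within 2%” band is really “a deficit of up to 2%”, and the remaining 5% are all blowouts.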
Getting to the point
The reason I don’t think that climate change is so bad is that the predictions that I have seen of the impact of climate change fall into all of the above traps, along with an unhealthy dose of confirmation bias. Arctic sea ice at record lows? We’re doomed! Arctic sea ice at record highs? We’re still doomed!
Remember Professor Tim Flannery? The “climate expert” who predicted unending drought when we had a drought, then unending floods when we had floods? My point exactly.
Even the most respectable science journals make outlandish predictions about mass-extinctions, rising sea levels, and economic misery based on people trying to predict the future from past averages and assuming that they understand complex systems.
Their predictions are constantly wrong. It turns out that nature is a lot more robust than we give it credit for. We forget that life on Earth has not been eliminated despite ice ages, periods of warming, super-volcanoes, floods, hurricanes, earthquakes, and everything else that nature throws at us. I seriously doubt that the atmosphere warming a couple of degrees will mean the end of the world as we know it.
Further, because nature is more robust than we think and humanity has a propensity for alarmism, climate scientists’ projections are subject to the same kind of one-sided, “Wayne Swan error”-style bias that I was talking about earlier.
Scientific papers wrongly predicting the end of the world are much more likely to be published than ones predicting that everything will carry on the way it has in the past, and are much more likely to attract attention once published. Also, scientists are more likely to miss mitigating factors than exacerbating ones, and therefore overestimate both global warming and its effects. We know what causes warming — greenhouse gas levels — but not what mitigates it. Accordingly, our measurements of warming are biased towards warmer rather than cooler, and our projections are biased towards “worst case” rather than “best case” scenarios.
The biggest problem with the way we think about projections is that people are not held to account for getting it wrong. Climate forecasts made 20 years ago have proven woefully inaccurate, yet they are somehow touted as being correct. A couple of years ago, the IPCC released a report saying how accurate its 1990 projections were, and headlines around the world said “climate predictions come true”. What had in fact happened was that the world had consistently warmed more slowly than the IPCC’s projections, but (big whoop!) the warming had stayed within the range the IPCC predicted. See this graph:
Now, remember that the predictions were made in 1990. Notice how the model “predicts” that temperatures before 1990 (which would have been factored into the model) would be roughly evenly distributed around the middle line, but that temperatures since 1990 (which obviously were not known when the projections were made) have been consistently below that line.
Sure enough, according to the IPCC’s projections, the world should have warmed about 0.55 degrees between 1990 and 2010. It actually warmed 0.39 degrees. That’s 30% less than projected — a pretty dismal result really. Although I’ll admit that sea levels seem to have been rising at the top end of what was projected, despite the rise in temperature being lower than projected.
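For what it’s worth, the arithmetic behind that 30% figure, using the two numbers quoted above:

```python
projected = 0.55  # degrees of warming 1990-2010 under the IPCC projection
observed = 0.39   # degrees of warming actually recorded
shortfall = (projected - observed) / projected
print(f"observed warming came in {shortfall:.0%} below the projection")
```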
Anyway, the point is that a PhD in climate science is about as useful as a crystal ball and a red and white tent when it comes to making soothsayers. Meanwhile, both humanity and nature constantly surprise with their ability to not be destroyed by whatever calamity we are predicting at the time.
All this is not to say that we shouldn’t be reducing our CO2 emissions and switching to renewable energy. But a carbon tax? No.
* Taleb makes some other criticisms which are a lot more technical and would be lost on most readers without a mathematical background. I encourage everyone to read his books, where he explains his ideas in a very accessible way.
For people who do understand this kind of thing, the Wayne Swan error is this: most models use a 95% confidence level to compute “statistically significant” findings; if you’re lucky, 99%. Not only does this a priori overlook the 5% or 1% of outliers, which can have a far greater impact on whatever the model is measuring than the 95-99% of “normal” cases, but another common oversight makes it likely that the stated confidence is substantially overstated: the assumption that the error terms are random and independent. Often they are neither, which adds unseen biases to the model and makes its confidence intervals much narrower than they should be.
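A sketch of how that last assumption bites. The footnote talks about structured (non-random) errors in general; purely for illustration I use autocorrelated errors, modelled as an AR(1) process, which is one common way the randomness assumption fails. The standard error computed as if the errors were independent comes out far smaller than the spread actually observed across simulations, so confidence intervals built from it are far too narrow:

```python
import random
import statistics

random.seed(2)

def ar1_series(n, phi=0.9):
    """Errors with strong positive autocorrelation (an AR(1) process)."""
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + random.gauss(0, 1)
        out.append(x)
    return out

n = 100

# Standard error of the series mean computed as if the errors were
# independent: sample sd / sqrt(n), averaged over a few series
naive_se = statistics.mean(
    statistics.stdev(ar1_series(n)) / n ** 0.5 for _ in range(200)
)

# The actual spread of that mean, measured across many simulated series
true_se = statistics.stdev(statistics.mean(ar1_series(n)) for _ in range(2000))

print(f"standard error assuming independence: {naive_se:.2f}")
print(f"actual standard error:                {true_se:.2f}")
```

With these illustrative parameters, the “independent errors” figure understates the real uncertainty several times over; a 95% interval built from it would cover the truth far less than 95% of the time.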