Tuesday, November 06, 2012

Psychohistory and the Polls

Isaac Asimov's famous Foundation series was based on the idea of psychohistory, a fictional science that let mathematicians predict human behavior using specialized mathematics. With it, its inventor, Hari Seldon, was able to predict and influence the founding of a new galactic empire. The basic principle of psychohistory is that people in large enough groups will react in predictable ways.

The theory is seductive, and many pollsters began by reading this series.

One problem is identifying and quantifying all of the possible inputs. Right now there is no way to do this. The best we can do is to take a snapshot of opinions and make guesses from that. For example, Romney surged in the polls following each debate, even the second one, in which the moderator took on the role of fact-checker. Does that mean that Romney won the debates, or that winning and losing didn't matter and just looking presidential was enough? Did Obama close the gap with Romney because of his campaign or because of his initial response to Hurricane Sandy? Did he fade at the last minute because of continuing stories about Sandy? The pollsters cannot tell us.

In the last few days, multiple companies have done national polls and hundreds of companies have done state polls. As of Election Day the race is too close to call. In all of the polls, the lead is within the margin of error. The race is going to be decided by things like voter turnout. In 2004 a record number of people voted. Democrats showed up in record numbers but were overwhelmed by record numbers of Republicans. In 2008, Democrats showed up in record numbers while Republicans stayed home or switched parties. In 2010, fewer people voted and, of those who did, Republicans turned out in record numbers while the Democrats stayed home.
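To make "within the margin of error" concrete, here is a minimal sketch of the standard textbook calculation for a simple random sample. The poll numbers (1,000 respondents, 48% to 47%) are hypothetical and chosen only for illustration; real polls use weighting and design adjustments that this ignores.

```python
import math

def margin_of_error(share, sample_size, z=1.96):
    """Approximate 95% margin of error for one candidate's share."""
    return z * math.sqrt(share * (1.0 - share) / sample_size)

def lead_margin_of_error(share_a, share_b, sample_size, z=1.96):
    """Approximate 95% margin of error for the lead (share_a - share_b).

    Both shares come from the same sample, so they are negatively
    correlated and the lead is noisier than either share alone.
    """
    variance = (share_a + share_b - (share_a - share_b) ** 2) / sample_size
    return z * math.sqrt(variance)

# Hypothetical poll: 1,000 respondents, 48% to 47%.
moe_share = margin_of_error(0.48, 1000)
moe_lead = lead_margin_of_error(0.48, 0.47, 1000)
lead = 0.48 - 0.47
within_margin = abs(lead) < moe_lead

print(f"margin of error on one share: +/- {moe_share:.1%}")
print(f"margin of error on the lead:  +/- {moe_lead:.1%}")
print(f"lead within the margin of error: {within_margin}")
```

Under these assumptions the margin of error on the lead is roughly twice the margin on a single candidate's share, so a one-point lead in a poll of this size says very little by itself.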

Pollsters try to estimate the likelihood of individuals showing up, but it is difficult. Most people don't even want to talk to a pollster. So they guess what the turnout will be and apply this guess to their actual sample to get an idea of what the actual vote will be. The result is that the polls are close to each other but none are identical. In the swing states, the percentage of undecided voters is high enough to tip the election to either candidate. These are states where neither candidate has 50% and the leader is ahead by two points or fewer.
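The effect of the turnout guess can be sketched with a simplified reweighting by party identification. This is only one possible approach, and every number below is hypothetical; pollsters also use likely-voter screens and other adjustments not shown here. The function name project_vote is made up for this example.

```python
def project_vote(support_by_group, turnout_share):
    """Weight each group's candidate support by its assumed share of the electorate."""
    total = sum(turnout_share.values())
    candidates = {name for group in support_by_group.values() for name in group}
    return {
        name: sum(
            turnout_share[group] / total * support_by_group[group].get(name, 0.0)
            for group in turnout_share
        )
        for name in candidates
    }

# Hypothetical sample results, broken out by party identification.
support = {
    "Democrat": {"Obama": 0.92, "Romney": 0.06},
    "Republican": {"Obama": 0.05, "Romney": 0.93},
    "Independent": {"Obama": 0.45, "Romney": 0.47},
}

# Two hypothetical guesses about who actually shows up.
dem_heavy = {"Democrat": 0.39, "Republican": 0.32, "Independent": 0.29}
even = {"Democrat": 0.35, "Republican": 0.35, "Independent": 0.30}

for label, turnout in (("Democratic-heavy turnout", dem_heavy), ("even turnout", even)):
    result = project_vote(support, turnout)
    print(f"{label}: Obama {result['Obama']:.1%}, Romney {result['Romney']:.1%}")
```

With the same interview data, the first turnout guess puts Obama ahead and the second puts Romney ahead, which is why the turnout assumption matters so much in a close race.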

Enter the poll aggregators. They reach a conclusion based on combining multiple polls. Real Clear Politics does this.
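As a rough illustration of combining polls, here is a minimal sketch that assumes a plain unweighted average of each candidate's share. The four polls are hypothetical, and the exact method any particular aggregator uses is not described in this post.

```python
from statistics import mean

def average_polls(polls):
    """Unweighted average of each candidate's share across polls."""
    candidates = polls[0].keys()
    return {name: mean(poll[name] for poll in polls) for name in candidates}

# Hypothetical recent polls (percent support).
polls = [
    {"Obama": 48.0, "Romney": 48.0},
    {"Obama": 49.0, "Romney": 47.0},
    {"Obama": 47.0, "Romney": 49.0},
    {"Obama": 50.0, "Romney": 47.0},
]

averages = average_polls(polls)
spread = averages["Obama"] - averages["Romney"]
print(f"Obama {averages['Obama']:.2f}, Romney {averages['Romney']:.2f}, spread {spread:+.2f}")
```

Averaging smooths out the random noise in individual polls, but it cannot remove an error that all of the polls share, such as a common wrong guess about turnout.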

Then there is Nate Silver. He claims to be able to take unreliable data and coax reliability out of it by weighting the various polls according to secret formulas. He does not explain these except to say that he treats state polls as more accurate than national polls. He uses this to predict the winner. In this case, he says that President Obama has a 92% probability of winning.
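Since the actual formulas are not given here, the following is not Silver's method. It is only a sketch of the general idea: weight each poll (here by sample size and recency, both hypothetical choices), then turn the resulting margin into a win probability by assuming the error on that margin is normally distributed. The function names, the seven-day half-life, the poll numbers, and the standard errors are all assumptions made for illustration.

```python
import math

def poll_weight(sample_size, days_old, half_life_days=7.0):
    """Hypothetical weight: larger and more recent polls count for more."""
    recency = 0.5 ** (days_old / half_life_days)
    return math.sqrt(sample_size) * recency

def weighted_margin(polls):
    """Weighted average of the margin (candidate A minus candidate B, in points)."""
    weights = [poll_weight(p["n"], p["days_old"]) for p in polls]
    return sum(w * p["margin"] for w, p in zip(weights, polls)) / sum(weights)

def win_probability(margin, std_error):
    """P(true margin > 0), assuming a normal error with the given standard error."""
    return 0.5 * (1.0 + math.erf(margin / (std_error * math.sqrt(2.0))))

# Hypothetical polls: margin in points, sample size, age in days.
polls = [
    {"margin": 3.0, "n": 600, "days_old": 10},
    {"margin": 1.0, "n": 1200, "days_old": 2},
    {"margin": -1.0, "n": 900, "days_old": 1},
]

margin = weighted_margin(polls)
print(f"weighted margin: {margin:+.2f} points")

# The same one-point lead under two different assumed uncertainties.
for std_error in (1.0, 3.0):
    print(f"std error {std_error}: win probability {win_probability(1.0, std_error):.0%}")
```

The last loop is the part worth noticing: the same one-point lead becomes a very different probability depending on how much uncertainty the model assumes, so the headline number is only as good as that assumption.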

So how does he control for such things as undecided voters and voter turnout? He doesn't. It doesn't help his case that he seems to give more weight to older, smaller, pro-Obama polls than to more recent pro-Romney polls.

The financial meltdown happened because of a reliance on models predicting human behavior. They showed that the default rate on sub-prime mortgages was low enough that the mortgages could be treated as a AAA investment. The problem was that the models assumed a constant default rate. That rate had been held down by rising house values, which allowed most people with financial problems to sell their houses rather than default. When the housing market crashed, people could no longer sell their homes and started to default. The people who had written the financial models knew that this was possible, but no one wanted to hear about it. There was too much money to be made by ignoring the risks.

Nate Silver's models have the same sort of flaw. They assume that the underlying polls are correct but, as I pointed out above, the race is within their margin of error. This is a much tougher election to predict than 2008, when Silver correctly called 49 out of 50 states.

At this point Obama will win or he will not. Further predictions will not change things. But I would hesitate before investing money in Silver's models.

