The US elections provide a lot of fun for data junkies. While we no longer have the talents of Hunter S. Thompson and Ralph Steadman to entertain us, there is still plenty to enjoy if you are a researcher.
The 2008 election cycle was the first in which the internet allowed independent analysts, combining the opinion polls published by different media outlets, to reach a mass audience. The best known of these, Real Clear Politics, pulls together all the polling by reputable pollsters into a single average. Interestingly, the RCP average was almost exactly correct for the 2008 election and predicted the 2010 mid-terms significantly better than most of the individual polls.
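To make the idea of a polling average concrete, here is a minimal sketch in Python. It assumes the simplest possible approach, an unweighted mean of each candidate's share across polls. The pollster names, candidate labels and figures are made up for illustration, and nothing here is claimed to be how Real Clear Politics actually computes its number.

```python
from statistics import mean

# Hypothetical poll results (percent support). Names and numbers are
# illustrative only, not real polling data.
polls = [
    {"pollster": "Pollster A", "candidate_x": 48.0, "candidate_y": 46.0},
    {"pollster": "Pollster B", "candidate_x": 50.0, "candidate_y": 45.0},
    {"pollster": "Pollster C", "candidate_x": 46.0, "candidate_y": 47.0},
]


def polling_average(polls, key):
    """Unweighted mean of one candidate's share across all polls."""
    return mean(p[key] for p in polls)


def average_margin(polls):
    """Lead of candidate_x over candidate_y in the averaged figures."""
    return polling_average(polls, "candidate_x") - polling_average(polls, "candidate_y")


if __name__ == "__main__":
    # Each poll's own margin, to show how much they disagree individually.
    for p in polls:
        print(p["pollster"], p["candidate_x"] - p["candidate_y"])
    print("Average margin:", average_margin(polls))
```

The individual polls in this toy data disagree about who is ahead and by how much, while the average settles on a modest lead. That smoothing is the appeal of an aggregate over any single poll.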
Former baseball number cruncher Nate Silver has taken this type of analysis to a whole new level. Silver made his name during the 2008 election at the aptly named fivethirtyeight.com when he predicted the result correctly in 49 states. He now resides at the New York Times and has powered his model up again for the 2012 elections. What sets his work apart, aside from the fact that it combines polling data with a mass of economic data that affects voting, is that rather than predicting what would happen if the election were held now, it extrapolates the data to include the impact of events between now and polling day.
The result is a model in stark contrast to the over-excited narrative of the individual polls and their often rapid changes in predicted outcome. The best example of this is the so-called convention bounce, where first one candidate gains an increase in their ratings from the media coverage around their party convention and then the second candidate gets a similar lift, often cancelling the first one out. Silver’s model factors all this in, compares it to historic data, and then assumes that much of this coverage will have been completely forgotten by November anyway, since there will have been the Presidential debates and several months’ economic news by then.
Whether Silver’s approach becomes the norm in future remains to be seen. Certainly the idea of the media adopting a model which suggests that not much is happening in response to the news, and that hardly anybody is changing their minds about the candidates, seems unlikely to this cynic.
Meanwhile, 3,000 miles away on the West Coast, Google has been using the Presidential debates to shake down its own polling technology. Worryingly perhaps for Google, their poll during the first Presidential debate found that Obama performed better by a small majority, in contrast to almost every other poll and the consensus of commentators. A second poll they conducted after the debate found a two-to-one advantage for Mr. Romney. One characteristic both polls shared, however, was a highly skewed sample once the demographics were checked. Based on these results, the author assumes that there is some head scratching going on in Mountain View this week.
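A skewed sample matters because a group that is over-represented among respondents pulls the headline number its way. One common remedy is to reweight respondents so that each demographic group counts in proportion to its share of the population. The sketch below shows that idea, assuming a single demographic variable with known population shares. The group names, answer labels and counts are hypothetical, and this is not a description of how Google processes its polls.

```python
from collections import Counter, defaultdict


def raw_share(responses):
    """Unweighted share of each answer among all respondents."""
    counts = Counter(choice for _, choice in responses)
    return {c: n / len(responses) for c, n in counts.items()}


def weighted_share(responses, population_shares):
    """Reweight respondents so each group matches its population share.

    responses: list of (group, choice) tuples.
    population_shares: dict mapping group -> assumed share of the population.
    """
    n = len(responses)
    sample_counts = Counter(group for group, _ in responses)
    # Weight = population share / sample share for each group.
    weights = {g: population_shares[g] / (sample_counts[g] / n) for g in sample_counts}
    totals = defaultdict(float)
    for group, choice in responses:
        totals[choice] += weights[group]
    total_weight = sum(totals.values())
    return {c: w / total_weight for c, w in totals.items()}


# Hypothetical sample: 80% of respondents are under 45, but the assumed
# population is split evenly. All figures are illustrative only.
responses = (
    [("under_45", "option_a")] * 60 + [("under_45", "option_b")] * 20
    + [("45_plus", "option_a")] * 5 + [("45_plus", "option_b")] * 15
)
population_shares = {"under_45": 0.5, "45_plus": 0.5}

if __name__ == "__main__":
    print("Raw:", raw_share(responses))
    print("Weighted:", weighted_share(responses, population_shares))
```

In this made-up sample the raw figures show a comfortable lead for one answer, and the reweighted figures show a dead heat. That is the kind of gap a badly skewed sample can open up between a poll and everyone else's.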