06 August 2008

Who's doing election prediction by simulation

Here's a list of all the web sites I know of where one can find simulations of the upcoming 2008 US presidential election. (I'm going to stay out of this game, because who wants to track down polling data every day, re-run the simulations, and so on?) Generally these start by estimating, for each state, the probability that one candidate or the other will win it, using polls and sometimes demographic data as well. These state-level probabilities are then combined by running many "simulated" elections. The results are often accompanied by a probability that Obama will win the election, or would win it if it were held today, and in many cases by a probability distribution of the number of electoral votes won by Obama.
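
The aggregation step described above can be sketched in a few lines of Python. This is only a minimal illustration, not any particular site's method: the three "states", their electoral votes, and their win probabilities are made up, and each state is assumed to be decided independently of the others.

```python
import random

# Hypothetical inputs: state -> (electoral votes, probability Obama wins it).
# These numbers are invented for illustration; they are not real polling data.
STATES = {
    "A": (10, 0.9),
    "B": (20, 0.5),
    "C": (5, 0.2),
}

def simulate_once(states, rng):
    """Run one simulated election; return Obama's electoral-vote total."""
    return sum(ev for ev, p in states.values() if rng.random() < p)

def simulate(states, trials, seed=0):
    """Estimate the distribution of electoral votes from many simulated elections.

    Returns a list dist where dist[k] is the fraction of trials with exactly k votes.
    """
    rng = random.Random(seed)
    total_ev = sum(ev for ev, _ in states.values())
    counts = [0] * (total_ev + 1)
    for _ in range(trials):
        counts[simulate_once(states, rng)] += 1
    return [c / trials for c in counts]

def win_probability(dist, needed):
    """Probability of getting at least `needed` electoral votes."""
    return sum(dist[needed:])

if __name__ == "__main__":
    dist = simulate(STATES, trials=20000)
    # 18 is a majority of the 35 votes in this toy example.
    print("estimated win probability:", win_probability(dist, 18))
```

For the real election one would use all 51 contests and a threshold of 270 electoral votes; the estimate fluctuates from run to run, which is why these sites use a large number of trials.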

Yes, Obama. Not McCain. People who create these sites generally have a choice to make -- since nobody other than Obama and McCain has any reasonable chance of getting electoral votes, stating everything from the Obama point of view or from the McCain point of view has the same content -- but it seems that there's a leaning towards choosing Obama for this purpose in this corner of the blogosphere. (This is an entirely unscientific sample, though.)

Note that although Sam Wang says what he does isn't simulation, that's only because his method lets him account for every possible combination of state outcomes at once, using the magic of generating functions. This trick only works if you make the simplifying assumption that the outcome in each state is independent of the outcome in every other state. This is reasonable if you're trying to predict what would happen if the election were held today -- there's no obvious reason for sampling error in different states to be correlated. But if you're trying to predict what will happen in the actual election, this assumption is very risky: between now and election day, voter opinion in different states is likely to move together.
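
To make the generating-function idea concrete: if state s has E_s electoral votes and Obama wins it with probability p_s, independently of the other states, then the coefficient of x^k in the product over all states of ((1 - p_s) + p_s x^(E_s)) is the probability that he wins exactly k electoral votes. The sketch below multiplies out that product. It is an illustration of the general technique under the independence assumption, not a reproduction of Sam Wang's actual code, and the example numbers are made up.

```python
def ev_distribution(states):
    """Exact electoral-vote distribution, assuming independent states.

    `states` is a list of (electoral_votes, win_probability) pairs.
    Returns a list dist where dist[k] is the probability of exactly k votes.
    """
    dist = [1.0]  # the polynomial "1": zero votes with certainty
    for ev, p in states:
        # Multiply the current polynomial by (1 - p) + p * x**ev.
        new = [0.0] * (len(dist) + ev)
        for k, prob in enumerate(dist):
            new[k] += prob * (1 - p)   # state lost: total stays at k
            new[k + ev] += prob * p    # state won: total moves to k + ev
        dist = new
    return dist

if __name__ == "__main__":
    # Hypothetical three-state example; not real polling data.
    example = [(10, 0.9), (20, 0.5), (5, 0.2)]
    dist = ev_distribution(example)
    # 18 is a majority of the 35 votes in this toy example.
    print("exact win probability:", sum(dist[18:]))
```

With 51 contests there are 2^51 possible outcomes, far too many to list, but this product needs only a few tens of thousands of multiplications, and unlike a simulation it has no sampling noise. The catch is the one noted above: multiplying the factors together is only valid when the states are independent.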

Here's the list:

This list is by no means complete.

5 comments:

topologicalmusings said...

It is very hard to imagine people/websites wasting their time/money/effort in trying to "predict" (as if that means anything) who the winner of a presidential election is going to be! It seems some people suffer from the delusion that "election processes" can somehow be modeled just like we model processes when studying physical phenomena. Unless people derive some kind of weird fun in playing with numbers just for the heck of it - no, I am not talking about number theory - I don't see how, from the point of view of an average person, getting involved in the business of "predicting" election results is productive at all. One slip, and the frontrunner can lose it all in just one week before the election.

Unless, of course, all this "information" is being used by the gambling industry!

Allen said...

FYI, Sam Wang's blog at Princeton allows users to submit comments.

He had one blog post criticizing 538.com for having a predicted electoral vote distribution that was not Gaussian. I posted a comment pointing out that you only get a Gaussian distribution if you assume independence between the state outcomes, and Nate models correlations, which breaks that assumption. I posted another comment further explaining how leaving out state-to-state correlations when making certain predictions can give you a false sense of confidence in your answer. (Sam Wang's results do not include correlations.)

It turns out, however, that Sam Wang moderates the comments on his blog. He deleted the comments I made, and also deleted an earlier comment I made pointing out that the number of EVs on one of his maps did not add up to 538.

Because he is moderating thoughtful comments that he does not agree with or that point out problems in his assumptions or methodology, I believe both his site and his blog should be boycotted, i.e., simply ignored.

Thank you.

topologicalmusings said...

I sorta realized later that it may have looked as if my critical comments were directed at Isabel. I just wish to point out that that is not the case!

Sam Wang said...

topologicalmusings, Sam Wang here. I agree with you on both points. First, the models that people construct are quite intricate. I don't think they quite understand that they are simply playing with numbers. Second, in fact I do think people do it because they get off on playing with numbers just for fun.

In 2004 I did something much simpler, which seems to me to be perfectly adequate. However, the fad this year has lurched in the direction of simulation overkill. It's a consequence of having a lot of computing power on one's desktop but not quite as much ability to think about how much modeling is too much.

Allen, you can post again. But you know, not every blog in the world has to be wide open. FiveThirtyEight has chat threads that appear to be unmoderated.

Nick Barrowman said...

Prediction markets may be a better way to make predictions about who's going to win an election. They have a fairly good track record. But they can produce strange results, as I noted in this blog post.