12 June 2008

51,199,463,116,367: A fuller solution to Tuesday's electoral vote problem

A problem that recently was posted on Nate Silver's FiveThirtyEight was this one, which I've mentioned before:
How many unique ways are there to acquire at least 270 electoral votes without any excess?
Somewhat more rigorously, Michael Walsh (who might be someone I know from MIT; I'm not sure, though, because it's a fairly common name) defined a minimal winning set as a subset S of all the states such that the sum of the number of electoral votes of those states is at least 270, but that sum minus the minimum number of electoral votes of any state in the set is less than 270.

I'm going to give my solution, and also attempt to summarize the other solutions that have been given, to the extent that I can. (I am not fluent in the programming languages in which some of the solutions are written, so I can't guarantee perfection.)

The number, as I stated in my solution there, is 51,199,463,116,367. (I'm including this number in the post -- and as the title of the post -- because if you have reasonable Google skills and are looking for more information on this, you'll probably Google that number.) I gave a quick sketch of the solution there, but I want to explain it more fully.

As is usual in combinatorics, it's best to count the minimal winning sets by some parameter, and then sum over that parameter. In this case the natural parameter is the minimum number of electoral votes of an element of S. So I'll begin by counting the number of ways we can pick a set of states S, where:

• the smallest state in the set has three electoral votes;

• together all the states have 270 or more electoral votes;

• upon removing the smaller state, we're left with 269 or less electoral votes.

Thus our set must contain at least one state with three electoral votes, and a total of 270, 271, or 272 electoral votes. We can start by counting all the sets which have 270, 271, or 272 electoral votes, and then remove those which contain no 3-EV states.

So, how to count those sets? Generating functions. I'll illustrate with a smaller example. Say we live in a nation with three states, which have 3, 4, and 5 electoral votes respectively. I will call these states Wyoming, Idaho, and Utah. Now, consider the polynomial

(1 + x3)(1 + x4)(1 + x5).

Here we have a factor (1 + xk) corresponding to a state with k electoral votes. When we expand this polynomial, the result is

1 + x3 + x4 + x5 + x7 + x8 + x9 + x12.

Each term here corresponds to one of the subsets of the set {Wyoming, Idaho, Utah}; for example the x8 term corresponds to the set {Wyoming, Utah}. We can read off from the coefficients of the polynomial that there is one way each to receive 0, 3, 4, 5, 7, 8, 9, or 12 electoral votes in this nation; no other numbers are possible.

Of course, in reality there are many ways to get each number of electoral votes, because there are 251 possible sets S and only 435 electoral votes. To give a simple example of this repetition, in the above example replace Utah (5 EV) with Oregon (7 EV), and then we get

(1 + x3)(1 + x4)(1 + x7)

which expands to

1 + x3 + x4 + 2x7 + x10 + x11 + x14.

In this hypothetical mini-nation there are 2 ways to get seven electoral votes -- Wyoming and Idaho together, or Oregon by itself. That explains the 2 in front of x7.

So now how many ways are there to get 270, 271, or 272 electoral votes in the actual nation? We consider the much larger polynomial

f3(x) = (1 + x55) (1 + x34) (1 + x31) ... (1 + x4)5 (1 + x3)8)

which has a factor of 1 + x55 for California (55 EV), 1 + x34 for Texas (34 EV), and so on down to the five states with 4 EV and the eight states with 3 EV. Expanding this out gives

f3(x) = x538 + 8x535 + 5x534 + ... + 17032469851307x272+17046339123934x271+17054665123395x270 + ... + 8x3 + 1

and we see that in our actual nation, there is one way to get 538 EV (winning all the states), eight ways to get 535 (winning all the states but one of the 3-EV states), and so on. The coefficients of x272, x271, x270 in the middle are the number of ways to win 270, 271, or 272 EV.

So we just add those up, right? Not so fast. Remember that we wanted the number of ways to get 270, 271, or 272 EV including at least one 3-EV state. So we construct a similar polynomial which excludes the 3-EV states, namely

f4(x) = (1 + x55) (1 + x34) (1 + x31) ... (1 + x4)5

and the coefficient of xn here is the number of ways to get n electoral votes using only states with 4 EV or more. Expanding, the terms we care about are

f4(x) = ... + 64405297719 x272 + 64713359463 x271 + 65001219892 x270 + ...

The number of ways to get 270, 271, or 272 EV is thus 17032469851307 + 17046339123934 + 17054665123395 = 51133474098636. The number of ways to do that without using any 3-EV states is 64405297719 + 64713359463 + 65001219892 = 194119877074. The difference, 50939354221562, is the number of ways to get 270, 271, or 272 EV using at least one 3-EV state, and therefore the number of minimal winning sets which contain at least one 3-EV state.

Now we can proceed in the same way to get the number of minimal winning sets with smallest state having 4, 5, ..., 15 EV. There are 250611676072 with smallest state having 4 EV, for example, all the way down to 1 with the smallest state having 15 EV. (The states having at least 15 EV together have 271 EV, so you have to include all of them.)

Adding the numbers you get in this way gives the answer 51,199,463,116,367. The proportion of minimal winning sets which don't use at least one 3-EV state is one in 197; it's not surprising that this is quite low, because there are eight 3-EV states and so it's hard to avoid all of them!

A lot of the commenters at FiveThirtyEight used the technique of "dynamic programming" to get the same answer. This basically amounts to multiplying out the polynomials term by term and keeping track of just the coefficients, not writing down all the x's and the exponents. (Of course, the people that went that route would probably say that my approach via generating functions is just dynamic programming with a bunch of extra symbols floating around for no good reason.)

A couple of particularly good estimates also occurred. One person (Brian, 4:43 pm) simulated a million random outcomes, found that 22,790 of them gave minimal winning sets, and therefore gave a 95% confidence interval of (0.02249, 0.02309) for the probability that a set chosen uniformly at random is a minimal winning set; there are 251 minimal winning sets, so a 95% confidence interval for their number was (50.6 trillion - 52.0 trillion). The actual answer is about 51.2 trillion, and falls in that interval.

Someone signing himself as "ray" had a clever solution as well. If we assume that the states each vote independently and each have probability 1/2 of voting for our candidate, then the number of electoral votes our candidate gets is itself a random variable. Since it's a sum of a bunch of smaller independent random variables, we can assume it's close to being normal; this is a heuristic application of the Central Limit Theorem. The mean is clearly 269 (half the total number of electoral votes). The variance of the number of electoral votes received from a state with k electoral votes is k2/4, and variances add, so the variance is one-fourth of the sum of the squares of the numbers of electoral votes. This is 2466.5; the standard deviation is the square root of this, 49.66. The probability that a normally distributed random variable with mean 269 and standard deviation 49.66 falls between 269.5 and 272.5 (that is, that its value rounded to the nearest integer is 270, 271, or 272) is 0.0241. This is the approximate probability of getting 270, 271, or 272 EV; that's pretty close to the event we want. We can approximate the answer as 251 times this, or 5.42 · 1013, which is high by a few percent -- but not bad for a solution that totally ignores the combinatorics!

Anonymous said...

I'm the anonymous author of the Perl dynamic programming solution posted as a comment to your earlier post. Just wanted to add two points: I think it's surprising how quick a good dynamic solution can run. The Perl runs nearly instantaneously on any modern computer. Second, it is critical to memoize (which is, of course, sound dynamic programming practice). I don't think everyone who attempted a solution remembered to do that, even if they were otherwise on the right track in terms of how to divide and conquer the problem.

Oh, and, my first thought about your solutoin, as a programmer, was exactly what you expected: that the polynomial calculation was basically just a framework for employing a dynamic programming tool provided by your computer algebra software of choice. But seriously, it was interesting to see how you approached the problem, and this has been one of my personal favorites of your posts.

CarlBrannen said...

Congratulations to all! I would have guessed a much smaller number, until the comment on the size of 2^52.

rsc said...

Thanks for the post. I've written up the dynamic programming approach here.

Anonymous said...

Most of this looks good, but I'm a little confused by something. I get why you want to remove the solutions for 271 and 272 with no 3-EV states because you could then swap a state for a lower valued one. But why are you removing those solutions for the 270 total? If you're exactly at 270, pretty much by definition there are no excess votes no matter what the combination of states is.

Michael Lugo said...

Jeff,

the number of EVs needed to win is 270, and no state has less than 3 EV. So 270, 271, and 272 are all essentially the same, in that if a candidate gets any of those numbers they need every state they got.

Anonymous said...

Ok, now I do actually have a disagreement with the result. What you're computing is not the number of minimal winning sets.

For the 270, 271, or 272 EV totals, it doesn't matter what the smallest state is, because if you remove any of those you lose. Instead you want to add up:

1) All ways to get 270
2) All ways to get 271
3) All ways to get 272
4) All ways to get 273 without any 3 EV states
5) All ways to get 274 without any 3 or 4 EV states
etc...

The generating function approach works very nicely for this as well, but the total is going to be different.

Anonymous said...

Never mind, the number is right. I misunderstood what you meant when you said you repeated the process with 4, 5, 6, etc as the minimum. Obviously for 4 you added in the 273 coefficient as well as the 270, 271 and 272 ones.

That being said, I think you made the process a bit harder than you needed to. Instead of doing the subtraction, just add the 270, 271, 272 coefficients for the full generating polynomial, then the 273 coefficient for the 4+ EV polynomial, the 274 coefficient for the 5+ EV polynomial, and so on.

Michael Lugo said...

Jeff,

you're right -- that would be a better way to do the calculation. The solution I present here basically reproduces the way I got to the answer the first time, so it's not surprising that there are some inefficiencies.

Anonymous said...

The real issue is not how well Obama or McCain might do in the closely divided battleground states, but that we shouldn't have battleground states and spectator states in the first place. Every vote in every state should be politically relevant in a presidential election. And, every vote should be equal. We should have a national popular vote for President in which the White House goes to the candidate who gets the most popular votes in all 50 states.

The National Popular Vote bill would guarantee the Presidency to the candidate who receives the most popular votes in all 50 states (and DC). The bill would take effect only when enacted, in identical form, by states possessing a majority of the electoral vote -- that is, enough electoral votes to elect a President (270 of 538). When the bill comes into effect, all the electoral votes from those states would be awarded to the presidential candidate who receives the most popular votes in all 50 states (and DC).

The major shortcoming of the current system of electing the President is that presidential candidates have no reason to poll, visit, advertise, organize, campaign, or worry about the voter concerns in states where they are safely ahead or hopelessly behind. The reason for this is the winner-take-all rule which awards all of a state's electoral votes to the candidate who gets the most votes in each separate state. Because of this rule, candidates concentrate their attention on a handful of closely divided "battleground" states. Two-thirds of the visits and money are focused in just six states; 88% on 9 states, and 99% of the money goes to just 16 states. Two-thirds of the states and people are merely spectators to the presidential election.

Another shortcoming of the current system is that a candidate can win the Presidency without winning the most popular votes nationwide.

The National Popular Vote bill has been approved by 18 legislative chambers (one house in Colorado, Arkansas, Maine, North Carolina, Rhode Island, and Washington, and two houses in Maryland, Illinois, Hawaii, California, and Vermont). It has been enacted into law in Hawaii, Illinois, New Jersey, and Maryland. These states have 50 (19%) of the 270 electoral votes needed to bring this legislation into effect.

See http://www.NationalPopularVote.com

susan