30 June 2008

A tail bound for the normal distribution

Often one wants to know the probability that a random variable with the standard normal distribution takes value above x for some positive constant x.

(Okay, I'll be honest -- by "one" I mean "me", and the main reason I'm writing this post is to fix this idea in my head so I don't have to go looking for my copy of Durrett's text Probability: Theory and Examples every time I want this result. Durrett gives a much shorter proof -- two lines -- on page 6 of that book, but it involves an unmotivated-seeming change of variables, which is why I have trouble remembering it.)

The probability density function of the standard normal is ${1 \over \sqrt{2\pi}} \exp( -x^2/2)$, and so the probability in question is
f(x) = \int_x^\infty {1 \over \sqrt{2\pi}} \exp (-t^2/2) \, dt
It's a standard fact, but one that I can never remember, that this is bounded above by ${1 \over \sqrt{2\pi} x} \exp(-x^2/2)$ (and furthermore bounded below by $(1 - 1/x^2)$ times the upper bound, so the upper bound's not a bad estimate).

How to prove this? Well, here's an idea -- approximate the tail of the standard normal distribution's density function by an exponential. Which exponential? The exponential of the linearization of the exponent at t = x. The exponent has negative second derivative, so the new exponent is larger (less negative) than the old one and this is an overestimate. That is,
f(x) < \int_x^\infty {1 \over \sqrt{2\pi}} \exp(-x^2/2-x(t-x)) \, dt
where the new exponent is the linearization of $-t^2/2$ at t = x.

Then pull out factors which don't depend on t to get
{\exp(x^2/2) \over \sqrt{2\pi}}\int_x^\infty  \exp(-xt) \, dt
and doing that last integral gives the desired bound.

Basically, the idea is that since the density to the right of x is dropping off as the exponential of a quadratic, most of it's concentrated very close to x, so we might as well approximate the density of the function by the exponential of a linear function, which is easier to work with.
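As a sanity check, here's the comparison in a few lines of Python -- a sketch assuming scipy is available (the function names are mine):

    from math import sqrt, pi, exp
    from scipy.stats import norm

    def upper(x):
        # the upper bound exp(-x^2/2) / (sqrt(2 pi) x)
        return exp(-x * x / 2) / (sqrt(2 * pi) * x)

    def lower(x):
        # the lower bound, (1 - 1/x^2) times the upper bound
        return (1 - 1 / x**2) * upper(x)

    for x in [1, 2, 3, 5]:
        print(x, lower(x), norm.sf(x), upper(x))  # norm.sf(x) is the exact P(Z > x)

At x = 2, for instance, this gives roughly 0.0203 < 0.0228 < 0.0270, and the bounds tighten rapidly as x grows.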

By similar means one can show that the expectation of a real number selected from the standard normal distribution, given that it's greater than x, is something like x + 1/x. The tail to the right of x looks like an exponential random variable with mean 1/x. For example, the expectation of a real number selected from the standard normal distribution, conditioned on being larger than 10, is 10.09809.... But this is probably useless, because the probability of a real number selected from the standard normal distribution being larger than 10 is, by the previous bound, smaller than 1 in 10(2π)^{1/2}e^{50}, or about one in 1.3 × 10^23.
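That x + 1/x claim is easy to check numerically, since the conditional expectation is just φ(x)/P(Z > x) -- again a sketch with scipy:

    from scipy.stats import norm

    x = 10.0
    print(norm.pdf(x) / norm.sf(x))  # E[Z | Z > x] = 10.09809...
    print(x + 1 / x)                 # the heuristic gives 10.1

(The tail probabilities involved, around 7.6e-24, are still comfortably within floating-point range.)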

28 June 2008

Baseball bats

Oh, apparently baseball people use "length-to-weight ratio" to describe a bat, as I learned from the people who talk too much before the Saturday afternoon game on Fox today. This is calculated by taking the weight (in ounces) minus the length (in inches), and in the major leagues can't be less than -3.5.

Of course, it's actually a difference, not a ratio.

It looks like some people call it the "differential", though, which is fine with me -- to me "differential" has other connotations, but expecting mathematical terminology not to collide with terminology used in other things is a Bad Idea. (Although why not just call it the "difference"?)

A thing I'm tired of hearing

"X does especially well/badly in interleague play."

Small sample size, people.

27 June 2008

A variant of the traveling salesman problem

Josh Robbins attempts to see a baseball game in all thirty major league baseball parks in 26 days.

Yes, you read that right.

And Major League Baseball doesn't make that easy. As you can guess, he has to see two games in one day four times -- but in markets with two teams (New York, Chicago, San Francisco/Oakland, Los Angeles) they try to schedule the two teams to be at home at opposite times. That makes sense, because that way if you think "I want to see a baseball game today" you've got a good chance.

In fact, his four doubleheaders are Dodgers-Padres (which apparently was a bit of a tight squeeze, since the Dodgers went into extra innings); Yankees-Mets (that one should be easy; every few years the Mets and the Yankees play a game at one park in the afternoon and at the other park the same night; they're doing it today); Phillies-Nationals (which will be tight even if the games go the ordinary length; they start six hours apart, average game length is three hours or so, and the cities are two and a half hours apart with no traffic -- oh, and he's doing it on a Thursday); and Cubs-Brewers.

My point is that 25 days might be possible -- but probably not. Most baseball games are scheduled for around 1 PM or around 7 PM, and games last about three hours, so to see two in one day the sites must be no more than about three hours apart. The pairs that are doable in one day are probably:

Mets-Yankees, Mets-Phillies, Yankees-Phillies
Phillies-Orioles, Phillies-Nationals, Orioles-Nationals
Dodgers-Padres, Padres-Angels, Angels-Dodgers
White Sox-Cubs, White Sox-Brewers, Cubs-Brewers
Giants-A's

but of course one can do only one from each row, so it's only possible to double up on five days. Basically, this is the problem of finding a maximum matching in the graph whose vertices are teams and whose edges join teams within about three hours' driving distance of each other.
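Finding that matching by hand is easy here, but for the record, here's how you'd let a computer do it -- a sketch in Python assuming networkx, with the edge list being exactly my guesses above:

    import networkx as nx

    # edges join teams I'm guessing are within ~3 hours' drive of each other
    G = nx.Graph([
        ("Mets", "Yankees"), ("Mets", "Phillies"), ("Yankees", "Phillies"),
        ("Phillies", "Orioles"), ("Phillies", "Nationals"), ("Orioles", "Nationals"),
        ("Dodgers", "Padres"), ("Padres", "Angels"), ("Angels", "Dodgers"),
        ("White Sox", "Cubs"), ("White Sox", "Brewers"), ("Cubs", "Brewers"),
        ("Giants", "A's"),
    ])

    matching = nx.max_weight_matching(G, maxcardinality=True)
    print(len(matching))  # 5 doubleheader days, so 30 parks in 25 days at best

Note that the two Phillies rows interact: the matching can use the Phillies only once, but Mets-Yankees and Orioles-Nationals still give two doubled days in the northeast.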

(Oddly enough, each two-team market (and yes, I know, Baltimore and Washington may or may not be the same market) seems to have another team a couple hours away. In two cases that team is the Phillies. As you may know, this blog likes the Phillies.)

So 25 is theoretically possible, if the Scheduling Gods worked in one's favor -- but I'd be scared to even look at the schedules to try and figure it out. And what happens if there's a rainout?

As a problem in actually scheduling things, the other tricky part is that Denver really isn't near any other team. And Robbins' schedule had him at a 7:05 game in San Diego, followed by a 1:05 game in Denver the next day -- but Denver's a time zone to the east of San Diego, so that's seventeen hours between starts. Fourteen hours driving time. For 1,078 miles.

For some other variants of the traveling salesman problem which involve the road network, see Barry Stiefel's 50 states in a week's vacation (driving, with flights to Alaska and Hawaii) and 21 states in one day. The last one cheats a bit -- it's a 26-hour day, since he started in the Eastern time zone during daylight savings time (GMT-4), and did the trip on a day when we went back to standard time (GMT-5) and then crossed into the Central time zone (GMT-6). The difference here is that you only have to enter each state instead of reaching a point.

Oh, and I feel obliged to point out that I find the meme of going on a long road trip this summer because "this is the last summer it'll ever be possible" kind of stupid. (Not that anybody here brought it up.)

Edited (Saturday morning): Google Maps says Cleveland to Detroit can be driven in 2:46. I didn't realize they were that close together. They'd be even closer if someone built a bridge across Lake Erie.
(Saturday afternoon): Cleveland to Pittsburgh in 2:18. I'll admit the reason I forgot this one is that mentally I think of Pittsburgh as being in the same state as me and Cleveland as not being in it, so they must be far apart. This is despite the fact that I live about five miles from New Jersey.
Anyway, you could shave off yet another day by combining the Indians with either the Pirates or the Tigers.

Mathifying the oil bubble

House passes bill to reverse oil price increases.

Why am I writing about this? Because of the following sentence, which opens the article (emphasis added):
After the close of the markets Thursday, as the fear of a continued parabolic rise in the price of oil was still fresh on the minds of investors, the U.S. House of Representatives approved a bill that could help to reverse the direction of oil prices.
I don't know, the runup looks linear to me, at least in the short term -- although everything looks linear in the short term. That doesn't mean it hasn't been fast. This is mathification at work.

By the way, in The 2006-2008 Oil Bubble and Beyond (arXiv:0806.1170), D. Sornette, R. Woodard, and W.-X. Zhou claim that the growth is actually super-exponential, which they claim is diagnostic of a bubble. (Via the physics arXiv blog.) So "parabolic" may be a massive understatement.

25 June 2008

White People hate math, but like statistics

Did you know White People hate math, but like statistics?

This is from Stuff White People Like. There are three things you should know about SWPL, if you don't already. First, it's SATIRICAL. Second, it is not actually about "white people" (i. e. people whose ancestors originally hail from Europe) but "White People". These are best defined as people who like things on this list, like irony, Netflix, Wes Anderson movies, indie music, having two last names, Oscar parties, having black friends, indie music, The Wire, and the idea of soccer. (This is actually a randomly chosen sample from the list, which is conveniently numbered; random.org gave me 41 twice, and indie music is #41, hence the duplication. I was going to just pick a few things at "random", but I realized that I was kind of biased towards the things that I like.)

By "statistics" is meant not the mathematical field but various interesting-sounding numbers. For example, if each White Person has a favorite thing from the list of Stuff White People Like, and you pick ten white people at random, there's a 36% chance that two of them will have the same favorite White Person Thing. (This is the White Person version of the birthday paradox.)

Also of interest there: the entry on graduate school, which I think pretty clearly refers to grad school in the humanities.

22 June 2008

Nonconvergent sums and evolution

It's often said that evolution works in such a way as to maximize the number of descendants that an individual has.

More formally, X's children share one-half of their genes with X, X's grandchildren share one-quarter of their genes with X, and more generally X's nth-generation descendants share 1/2^n of their genes with X. So if you buy the whole "selfish gene" theory that genes act in such a way as to maximize the number of copies of them which are made, the quantity individuals should be attempting to maximize might be half the number of children, plus one fourth the number of grandchildren, plus one eighth the number of great-grandchildren, and so on.

It's an infinite sum.

What's more, if you assume a "total fertility rate" of k -- that is, the average female bears k children -- then this sum is k/2 + k^2/4 + k^3/8 + .... And if k = 2, which corresponds to population not growing or shrinking, this sum is just 1 + 1 + 1 + 1 + ... which doesn't converge. (Similarly if k > 2, but populations which grow indefinitely don't seem sustainable.)
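(For the record, when k < 2 this is a convergent geometric series:

{k \over 2} + {k^2 \over 4} + {k^3 \over 8} + \cdots = {k/2 \over 1 - k/2} = {k \over 2 - k}

so only a shrinking population gives a finite answer.)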

Of course, biologically populations don't live for infinitely long. And in reality people don't, at least consciously, think more than a couple generations into the future. So practically speaking this is all a bit meaningless.

edited, Monday, 8:42 am: I did mean maximizing the number of copies of genes, in the sense of Dawkins' idea of the selfish gene, and this post is not meant to be taken seriously.

20 June 2008

Gallons per mile?

Experts find key to saving fuel: say gallons per mile.

I'll summarize: d/dx (1/x) = -1/x^2.

Research shows that people believe an improvement of 1 mpg will always save them the same amount of gas. But this is obviously false, as some simple arithmetic shows. Let's say I drive 12,000 miles a year. (Why 12,000? Because everything that follows will work out to be integers.) If I improve from 15 mpg to 16 mpg, I go from using 800 gallons of gas a year to 750, a 50-gallon reduction. But if I improve from 24 mpg to 25 mpg, I go from using 500 gallons of gas a year to 480, a 20-gallon reduction.

A driver driving m miles per year in a car getting x miles per gallon will of course use m/x gallons of gas; the derivative of this is -m/x^2. So if you get x miles per gallon already, improving by one mile per gallon saves m/x^2 gallons. (I'm assuming here that 1 is small compared to x.)
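Here's that arithmetic as a sketch in Python (the function name is mine; 12,000 miles as above):

    def gallons_saved(mpg, miles=12000, improvement=1):
        # exact yearly savings from an `improvement`-mpg gain
        return miles / mpg - miles / (mpg + improvement)

    print(gallons_saved(15))   # 50.0
    print(gallons_saved(24))   # 20.0
    print(12000 / 15**2)       # 53.3..., the derivative estimate m/x^2 at x = 15

The derivative estimate overshoots a bit, as you'd expect, since m/x^2 is decreasing in x.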

The article claims that this means people wanting to get better gas mileage, if they have multiple vehicles, should always target the least efficient vehicle -- but that's going too far. It might well be more expensive to get a 1-mpg improvement in a less fuel-efficient car. There's no reason that the cost of a car should be linear in the number of miles per gallon it gets, all else being held constant.

Note that I'm also not saying the cost of a car should be linear in the number of gallons per mile it gets! In fact this would be impossible, because it would predict that cars that get zero gallons per mile could be made for a finite amount of money.

"Gallons per mile" is kind of an annoying unit, though, because all cars get less than 1. Perhaps "gallons per 100 miles" would be a good way to go, with most cars measuring between perhaps 3 and 6 on this scale. And people can picture driving 100 miles. (For example, if they have a ten-mile commute each way, it's five round-trips to work.) But on the other hand, there's a temptation to not have to deal with decimals, and there's a big difference between 4 gallons per 100 miles and 5 gallons per 100 miles. Perhaps "gallons per 1000 miles" works nicely; typical values are now 2-digit integers, and rounding to the nearest integer gives roughly the same precision as the current system.

(Readers from other countries: please spare me the "in my country we measure fuel economy in liters per 100 km" comments. I know this.)

18 June 2008

The probability of making a putt

I don't know much about golf. Basically I know the following things:
  • you try to get the ball in the hole;
  • scoring works by counting the number of times you hit the ball with the club; lower scores are better;
  • "par" is really good, not average.

Anyway, Tiger Woods won the U. S. Open, and Ian Ayres, guest-blogging at Freakonomics, asks why golf commentators don't give the probability that a putt from distance x will go in. Commentators in just about every other sport do; Ayres' example is free-throw percentage in basketball, mine would have been batting average in baseball.

Ayres then gives a plot showing the success rate of golf putts as a function of distance from the hole, taken from this paper. Not all that surprisingly, the probability of making a putt from distance x scales like 1/x; essentially you can assume that the angular error in putting doesn't change as the distance of the putt increases. But the apparent size of the target is smaller at longer distances. Basically, from twice as far away the hole looks half as big.

It turns out that's what Andrew Gelman and Deborah Nolan thought, too (Andrew Gelman and Deborah Nolan, "A Probability Model For Golf Putting", Teaching Statistics, Volume 24, Number 3, Autumn 2002; this is the source for Ayres' figure). Their actual model is a bit more complicated, because they actually do the trigonometry correctly, and they assume that errors in putting are normally distributed while I'm assuming they're uniform. Read it, it's two and a half pages. Their model also fixes the problem my crude 1/x model would have: at five feet, pros make 59 percent of their putts, so the crude model would predict that at two and a half feet, they make 118 percent!

The result of Gelman and Nolan is that the probability of the success of a putt from distance x is
2 \Phi \left( {1 \over \sigma} \arcsin {R-r \over x} \right) - 1

where R and r are the radii of the hole and the ball; 2R = 4.25 inches and 2r = 1.68 inches. σ is the standard deviation of the angular error of a shot (in radians), which can be found from empirical data to be 0.026 (about 1.5 degrees). Φ is the cumulative distribution function of the standard normal.

If you assume that x is large enough that (R-r)/x is small enough that we can make the small angle approximation arcsin x ≈ x, then this becomes 2Φ((R-r)/(σx)) - 1. But Φ(z) is approximately linear near z = 0, with Φ(0) = 1/2 and Φ'(0) = 1/(2π)^{1/2}. So the probability of succeeding from distance x, for large x, is approximately
P(x) = 2 \left( {1 \over 2} + {1 \over \sqrt{2\pi}} {R-r \over \sigma x} \right) - 1 = \sqrt{2 \over \pi} {R-r \over \sigma x}

R-r is 1.285 inches, or .10708 feet.

So we get that the probability of making a putt from distance x, in the limit of large x, is about (3.29 feet)/x, although this is really only a good approximation above x = 6 feet or so. This has the advantage of being easy to remember -- well, somewhat easy, because you still have to remember the constant. But if you measure in meters, there are 3.28 feet in a meter, so the constant is basically 1; clearly golf should be done in metric.
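If you want to play with the model, here's a sketch in Python, assuming scipy (constants as above; note that the exact formula reproduces the 59 percent figure at five feet):

    from math import asin, sqrt, pi
    from scipy.stats import norm

    R_MINUS_r = 0.10708  # feet
    SIGMA = 0.026        # radians

    def p_exact(x):
        # Gelman and Nolan's formula
        return 2 * norm.cdf(asin(R_MINUS_r / x) / SIGMA) - 1

    def p_approx(x):
        # the large-x approximation, about (3.29 feet)/x
        return sqrt(2 / pi) * R_MINUS_r / (SIGMA * x)

    for x in [5, 10, 20, 40]:
        print(x, round(p_exact(x), 3), round(p_approx(x), 3))

At five feet the approximation gives 66 percent against the true 59; by twenty feet the two agree to within a percentage point.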

Incidentally, I think it's a good idea to put citations that at least include the author and title in blog posts, even though strictly speaking they're not necessary as pointers if a link is provided. Why? Because that makes it more likely that people who Google the paper's title or authors find the people talking about it. (Occasionally I've received comments from self-Googling authors.)

Crop circles and π

Baffling crop circles equal pi.

The main thing here is a picture of a crop circle which encodes the first ten decimal digits of π (with the last digit rounded). The article doesn't explain how, but I will. "Read" from the inside out; notice that the main part of the figure consists of ten concentric arcs joined by short segments. The arcs have angular lengths 3, 1, 4, 1, 5, 9, 2, 6, 5, and 4 tenths of a circle.

The article says that "This may cause more controversy in the debate whether crop circles are a result of extraterrestrial activity." It seems to me evidence against extraterrestriality; there's no reason why extraterrestrials would use decimal. The usual hypothesis seems to be that if aliens want to send us numbers, they'll do so in binary; this seems reasonable because 2 is really the only "special" numerical base, being the smallest practically usable one. (You can't send arbitrary real numbers in "base 1", and the time to send the integer N scales like N, as opposed to like log N in any base greater than 1.

And I thought that crop circles had pretty thoroughly been shown to be the work of human pranksters.

Of course, a much simpler way to encode π in a circle is to just have the circle itself.

17 June 2008

Tegmark's mathematical universe

In his paper The Mathematical Universe, Max Tegmark explores the physics implications of what he calls the External Reality Hypothesis, namely that there exists observer-independent reality. From this he argues for what he calls a Mathematical Universe Hypothesis, namely that the universe is mathematics. This is in the same sense that, say, Newtonian gravitation is just the study of curves in R^4 (that is, three space dimensions and one time dimension) with minimum action.

It's interesting, but at thirty pages kind of long; Tegmark has also written a three-page article on the same topics called Shut Up and Calculate!.

I am led to wonder if Tegmark is deliberately referring to the Flying Spaghetti Monster here, in his discussion of Newtonian gravitation. (The "frog" sees only the universe as it exists right now; the "bird" sees the universe "as a whole" and in particular sees all time at once.)
If the frog sees a particle moving with constant velocity, the bird sees a straight strand of uncooked spaghetti. If the frog sees a pair of orbiting particles, the bird sees two spaghetti strands intertwined like a double helix. To the frog, the world is described by Newton’s laws of motion and gravitation. To the bird, it is described by the geometry of the pasta, obeying the mathematical relations corresponding to minimizing the Newtonian action.
Also of interest is the claim that the Mathematical Universe Hypothesis "banishes... the classical notion of randomness", essentially because all probability can be recast as measure theory. The randomness of quantum mechanics, according to Tegmark, is essentially "epistemological" -- we perceive what looks like randomness because we do not have perfect knowledge. This, of course, is not believed by all physicists; the Copenhagen interpretation of quantum physics does fundamentally include randomness.

And why aren't there other interpretations of quantum mechanics named after cities?

16 June 2008

Novelists who are mathematicians

Professor Finds the Art in Both Numbers and Letters -- a New York Times interview with Manil Suri, mathematician and novelist. I believe a copy of his first novel, The Death of Vishnu, is somewhere in my parents' basement; I remember enjoying it.

Another mathematician-novelist is Jordan Ellenberg, who has a blog.

Perhaps relevant here is Alex Kasman's mathematical fiction webpage, although this is a list of fiction about mathematics or mathematicians, not by mathematicians. Did you know Tolstoy wrote in War and Peace about history as the integral of the actions of individuals? (I didn't, because I haven't read War and Peace, because it's Really Long.)

On a related note, Ellenberg's blog is, as of right now, a year plus two days old. Mine is a year minus two days.

14 June 2008

A consequence of seven not dividing thirty

I use Google Reader to read a large number of blogs and other RSS feeds. It displays charts of the number of posts I've read in each day of the last thirty, and also the number of posts I've read in each hour of the day and day of the week over a 30-day period.

There's pronounced periodicity in the number of posts read on each day, which comes from periodicity in posting frequency (I probably read blogs twice a day or so). In particular certain days of the week see more posts than others -- in most weeks I read the most posts on Wednesday; then Tuesday and Thursday; then Monday and Friday; then Saturday and Sunday. I've known this for a while.

But in the last thirty days, I have read more posts on Friday than any other day. The numbers are as follows: Sunday, 369; Monday, 636; Tuesday, 719; Wednesday, 784; Thursday, 724; Friday, 883; Saturday, 448.

Where are all these extra Friday posts coming from?

Well, it's Saturday. The last 30 days are a period going from a Friday (May 16) to a Saturday (June 14), including five Fridays, five Saturdays, and four of each other day of the week.

If I consider the last twenty-eight days instead, I get 696 for Fridays and 378 for Saturdays, and the profile for the month looks like the profile for any given week.
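The window arithmetic is easy to check (a sketch in Python):

    from datetime import date, timedelta
    from collections import Counter

    end = date(2008, 6, 14)  # a Saturday
    window = [end - timedelta(days=i) for i in range(30)]
    print(Counter(d.strftime("%A") for d in window))
    # Friday and Saturday appear five times each; every other day, four

Since 30 = 4 × 7 + 2, whichever two days of the week end the window get the extra appearance.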

But the fact that thirty isn't divisible by seven has an interesting effect there.

13 June 2008

When it comes to oil, addition is hard

Have we underestimated total oil reserves?, from New Scientist (and, it appears, every other source in the British-speaking world).

Richard Pike, of the Royal Society of Chemistry and previously of the oil industry, points out that it's generally the convention in the oil industry to give, as a one-number estimate for the output of a particular oil source, the 10th percentile of the distribution that they expect. This is an inherently conservative estimate. That's fine -- but then when they combine the estimates from every oil source they do it by just adding those together. If I'm understanding this correctly, the estimates that are out there of how much oil there is correspond to what happens if every oil well, etc. performs only at the 10th percentile of its expected distribution -- which just isn't going to happen. The 10th percentile is called the "proven reserves", the 50th the "proven plus probable reserves".

I don't know what the underlying distributions are, but it seems like they generally report the 10th, 50th, and 90th percentiles of the underlying distribution -- so says Pike in this friction.tv video. They add these distributions together correctly internally, but Pike claims that governments don't want to think about the probabilistic logic. Pike then claims that security analysts use the resulting very pessimistic estimates; perhaps the current run-up in prices is a result of this, although I'm not sure if he'd say this. (It sounds like it's not a secret that this is the way things are done, and perhaps the analysts know this.)

Mathematically, the idea is simple. Let's say that a certain oil field is expected to produce 4 megabarrels, and the production is normally distributed with standard deviation 1 megabarrel. The 10th percentile of a normal distribution is 1.28 standard deviations below its mean, so the "proven reserves" of this field would be 2.72 megabarrels, and the "proven plus probable" 4 megabarrels.

But now say we have four such fields. The oil industry's technique would say that those fields together have "proven reserves" of 2.72 megabarrels times four, or 10.88 megabarrels. But for uncorrelated distributions -- and I'm going to assume that the distributions here are uncorrelated -- the variances add. The variance of the distribution for one field is 1 megabarrel^2, so for four fields it's 4 megabarrels^2; the standard deviation is the square root of this, 2 megabarrels. The mean is still 16 megabarrels, but the 10th percentile is now 13.44 megabarrels. The chances of getting as low as 10.88 megabarrels are about one half of one percent; this is what Pike means when he calls this a pessimistic estimate. And of course with more fields the pessimism becomes more extreme.
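In Python, with scipy (numbers as in the example):

    from math import sqrt
    from scipy.stats import norm

    mean, sd, n = 4.0, 1.0, 4

    p10_one = norm.ppf(0.10, loc=mean, scale=sd)
    print(p10_one)            # 2.718..., one field's "proven reserves"
    print(n * p10_one)        # 10.87..., the industry's added-up figure

    # the correct 10th percentile of the four-field total
    print(norm.ppf(0.10, loc=n * mean, scale=sqrt(n) * sd))         # 13.43...

    # how likely the added-up figure actually is
    print(norm.cdf(n * p10_one, loc=n * mean, scale=sqrt(n) * sd))  # about 0.005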

The first reference I can find to this is a May 2006 press release of the Royal Society of Chemistry, referring to a June 2006 article by Pike in Petroleum Review but it seems to have swept through the British blog world in the last few days after the Times of London made a quick reference to it in an article on North Sea oil, as this press release of the RSC mentions. (For some reason the Times article refers to "this month"'s issue of Petroleum Review, which I can't seem to see online even though Penn's library claims to have an electronic version.)

If this is true (I hesitate to think it is, just because it seems surprising that people wouldn't know this!) then it's good news and bad news. Good news because it means we're not going to run out of oil as soon as we think, which means less economic shock. But it also means more carbon for us to spew into the air before we finally run out.

I encourage you to not read the comments in most places where this has been posted, because it's basically people just ranting about global warming and saying either "we've reached peak oil, anybody who says we haven't is a poopyhead" or "we haven't reached peak oil, anybody who says we have is a poopyhead".

The moral here: probabilistic forecasting is tricky. See also Nate Silver's appearance on cnn.com regarding polling for the presidential election, which is totally irrelevant here except that it also goes to show how a lot of people don't know how to aggregate probabilistic data.

First prime numero

From Merv Griffin's Crosswords, a crossword clue: "First prime numero", three letters.

Three of the contestants answered UNO.

Of course, the answer is DOS.

12 June 2008

51,199,463,116,367: A fuller solution to Tuesday's electoral vote problem

A problem that recently was posted on Nate Silver's FiveThirtyEight was this one, which I've mentioned before:
How many unique ways are there to acquire at least 270 electoral votes without any excess?
Somewhat more rigorously, Michael Walsh (who might be someone I know from MIT; I'm not sure, though, because it's a fairly common name) defined a minimal winning set as a subset S of all the states such that the sum of the number of electoral votes of those states is at least 270, but that sum minus the minimum number of electoral votes of any state in the set is less than 270.

I'm going to give my solution, and also attempt to summarize the other solutions that have been given, to the extent that I can. (I am not fluent in the programming languages in which some of the solutions are written, so I can't guarantee perfection.)

The number, as I stated in my solution there, is 51,199,463,116,367. (I'm including this number in the post -- and as the title of the post -- because if you have reasonable Google skills and are looking for more information on this, you'll probably Google that number.) I gave a quick sketch of the solution there, but I want to explain it more fully.

As is usual in combinatorics, it's best to count the minimal winning sets by some parameter, and then sum over that parameter. In this case the natural parameter is the minimum number of electoral votes of an element of S. So I'll begin by counting the number of ways we can pick a set of states S, where:

  • the smallest state in the set has three electoral votes;

  • together all the states have 270 or more electoral votes;

  • upon removing the smallest state, we're left with 269 or fewer electoral votes.


Thus our set must contain at least one state with three electoral votes, and a total of 270, 271, or 272 electoral votes. We can start by counting all the sets which have 270, 271, or 272 electoral votes, and then remove those which contain no 3-EV states.

So, how to count those sets? Generating functions. I'll illustrate with a smaller example. Say we live in a nation with three states, which have 3, 4, and 5 electoral votes respectively. I will call these states Wyoming, Idaho, and Utah. Now, consider the polynomial

(1 + x^3)(1 + x^4)(1 + x^5).

Here we have a factor (1 + x^k) corresponding to a state with k electoral votes. When we expand this polynomial, the result is

1 + x^3 + x^4 + x^5 + x^7 + x^8 + x^9 + x^12.

Each term here corresponds to one of the subsets of the set {Wyoming, Idaho, Utah}; for example the x^8 term corresponds to the set {Wyoming, Utah}. We can read off from the coefficients of the polynomial that there is one way each to receive 0, 3, 4, 5, 7, 8, 9, or 12 electoral votes in this nation; no other numbers are possible.

Of course, in reality there are many ways to get each number of electoral votes, because there are 2^51 possible sets S and only 539 possible electoral vote totals. To give a simple example of this repetition, in the above example replace Utah (5 EV) with Oregon (7 EV), and then we get

(1 + x^3)(1 + x^4)(1 + x^7)

which expands to

1 + x^3 + x^4 + 2x^7 + x^10 + x^11 + x^14.

In this hypothetical mini-nation there are 2 ways to get seven electoral votes -- Wyoming and Idaho together, or Oregon by itself. That explains the 2 in front of x^7.
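You can check these toy expansions with a computer algebra system -- for instance, in Python with sympy:

    from sympy import symbols, expand

    x = symbols("x")
    print(expand((1 + x**3) * (1 + x**4) * (1 + x**5)))
    # x**12 + x**9 + x**8 + x**7 + x**5 + x**4 + x**3 + 1
    print(expand((1 + x**3) * (1 + x**4) * (1 + x**7)))
    # x**14 + x**11 + x**10 + 2*x**7 + x**4 + x**3 + 1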

So now how many ways are there to get 270, 271, or 272 electoral votes in the actual nation? We consider the much larger polynomial

f_3(x) = (1 + x^55)(1 + x^34)(1 + x^31) ... (1 + x^4)^5 (1 + x^3)^8

which has a factor of 1 + x^55 for California (55 EV), 1 + x^34 for Texas (34 EV), and so on down to the five states with 4 EV and the eight states with 3 EV. Expanding this out gives

f_3(x) = x^538 + 8x^535 + 5x^534 + ... + 17032469851307 x^272 + 17046339123934 x^271 + 17054665123395 x^270 + ... + 8x^3 + 1

and we see that in our actual nation, there is one way to get 538 EV (winning all the states), eight ways to get 535 (winning all the states but one of the 3-EV states), and so on. The coefficients of x^272, x^271, and x^270 in the middle are the numbers of ways to win 272, 271, or 270 EV.

So we just add those up, right? Not so fast. Remember that we wanted the number of ways to get 270, 271, or 272 EV including at least one 3-EV state. So we construct a similar polynomial which excludes the 3-EV states, namely

f_4(x) = (1 + x^55)(1 + x^34)(1 + x^31) ... (1 + x^4)^5

and the coefficient of x^n here is the number of ways to get n electoral votes using only states with 4 EV or more. Expanding, the terms we care about are

f_4(x) = ... + 64405297719 x^272 + 64713359463 x^271 + 65001219892 x^270 + ...

The number of ways to get 270, 271, or 272 EV is thus 17032469851307 + 17046339123934 + 17054665123395 = 51133474098636. The number of ways to do that without using any 3-EV states is 64405297719 + 64713359463 + 65001219892 = 194119877074. The difference, 50939354221562, is the number of ways to get 270, 271, or 272 EV using at least one 3-EV state, and therefore the number of minimal winning sets which contain at least one 3-EV state.

Now we can proceed in the same way to get the number of minimal winning sets with smallest state having 4, 5, ..., 15 EV. There are 250611676072 with smallest state having 4 EV, for example, all the way down to 1 with the smallest state having 15 EV. (The states having at least 15 EV together have 271 EV, so you have to include all of them.)

Adding the numbers you get in this way gives the answer 51,199,463,116,367. The proportion of minimal winning sets which don't use at least one 3-EV state is one in 197; it's not surprising that this is quite low, because there are eight 3-EV states and so it's hard to avoid all of them!

A lot of the commenters at FiveThirtyEight used the technique of "dynamic programming" to get the same answer. This basically amounts to multiplying out the polynomials term by term and keeping track of just the coefficients, not writing down all the x's and the exponents. (Of course, the people that went that route would probably say that my approach via generating functions is just dynamic programming with a bunch of extra symbols floating around for no good reason.)
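To make that concrete, here's the whole computation as a dynamic-programming sketch in Python. The EVS list is the 2000-census apportionment (51 entries, DC included) that the figures above are based on; subset_counts builds the coefficient table of the corresponding generating function, and the names are mine:

    # electoral votes of the 50 states plus DC, 2000-census apportionment
    EVS = [55, 34, 31, 27, 21, 21, 20, 17, 15, 15, 15, 13, 12,
           11, 11, 11, 11, 10, 10, 10, 10, 9, 9, 9, 8, 8,
           7, 7, 7, 7, 6, 6, 6, 5, 5, 5, 5, 5,
           4, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 3, 3]

    def subset_counts(values):
        # dp[t] = number of subsets of `values` summing to exactly t
        dp = [0] * (sum(values) + 1)
        dp[0] = 1
        for v in values:
            for t in range(len(dp) - 1, v - 1, -1):
                dp[t] += dp[t - v]
        return dp

    def coeff(dp, t):
        return dp[t] if t < len(dp) else 0

    total = 0
    for m in sorted(set(EVS)):
        at_least_m = subset_counts([v for v in EVS if v >= m])
        above_m = subset_counts([v for v in EVS if v > m])
        # minimal winning sets whose smallest state has m EV:
        # total in [270, 269 + m], with at least one m-EV state used
        for t in range(270, 270 + m):
            total += coeff(at_least_m, t) - coeff(above_m, t)

    print(total)  # 51199463116367

This runs in a couple of seconds, and the subtraction in the inner loop is exactly the f_3-minus-f_4 trick from above, applied once for each possible smallest state.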

A couple of particularly good estimates also occurred. One person (Brian, 4:43 pm) simulated a million random outcomes, found that 22,790 of them gave minimal winning sets, and therefore gave a 95% confidence interval of (0.02249, 0.02309) for the probability that a set chosen uniformly at random is a minimal winning set; there are 2^51 possible sets, so a 95% confidence interval for their number was (50.6 trillion, 52.0 trillion). The actual answer is about 51.2 trillion, and falls in that interval.

Someone signing himself as "ray" had a clever solution as well. If we assume that the states each vote independently and each have probability 1/2 of voting for our candidate, then the number of electoral votes our candidate gets is itself a random variable. Since it's a sum of a bunch of smaller independent random variables, we can assume it's close to being normal; this is a heuristic application of the Central Limit Theorem. The mean is clearly 269 (half the total number of electoral votes). The variance of the number of electoral votes received from a state with k electoral votes is k2/4, and variances add, so the variance is one-fourth of the sum of the squares of the numbers of electoral votes. This is 2466.5; the standard deviation is the square root of this, 49.66. The probability that a normally distributed random variable with mean 269 and standard deviation 49.66 falls between 269.5 and 272.5 (that is, that its value rounded to the nearest integer is 270, 271, or 272) is 0.0241. This is the approximate probability of getting 270, 271, or 272 EV; that's pretty close to the event we want. We can approximate the answer as 251 times this, or 5.42 · 1013, which is high by a few percent -- but not bad for a solution that totally ignores the combinatorics!

11 June 2008

"Torsion" as an adjective

I had an algebra professor my first year of grad school who would describe groups in which all elements had finite order as "torsion". For example, he'd say something like "since S_4 is torsion"...

Today I was browsing through Tao and Vu's Additive Combinatorics. They give the following definition:
Definition 3.1 (Torsion) If Z is an additive group and x ∈ Z, we let ord(x) be the least integer n ≥ 1 such that n · x = 0, or ord(x) = +∞ if no such integer exists. We say that Z is a torsion group if ord(x) is finite for all x ∈ Z, and we say that it is an r-torsion group for some r ≥ 1 if ord(x) divides r for all x ∈ Z. We say that Z is torsion-free if ord(x) = +∞ for all nonzero x ∈ Z.
In the ensuing sections they often refer to "a torsion group". But they never use "torsion" alone, or if they do I didn't see it. This seems to me to be good evidence that the use of bare "torsion" isn't universal. (It also sounds weird to my ears, but a lot of things this particular professor said sounded weird to my ears and turned out to be standard usage among mathematicians. Since I was a first-year I had some learning to do.)

And the way that the word "torsion" is used here seems different than, say, an "open set". You can start a proof by saying "Let U be open." But "Let G be torsion." seems lazy. Perhaps mathematical English has two sorts of adjectives -- adjectives of the second kind like "open", which can be used without the implied noun they modify, and adjectives of the first kind like "torsion", which can only be used with the implied noun.

(The swapping of "second" and "first" is deliberate; it's like the Stirling numbers. I can never remember which is which.)

xkcd on science

Scientific fields arranged by purity, from xkcd.

What about applied mathematicians?

10 June 2008

Some electoral math from fivethirtyeight.com

A "homework assignment" from FiveThirtyEight.com: Electoral Projections Done Right: in a US presidential election, "[h]ow many unique ways are there to acquire at least 270 electoral votes without any excess?" That is, how many sets of states with at least 270 electoral votes, such that when any state is removed less than 270 electoral votes remain?

For the most part the comments so far are:
  • people misunderstanding the problem (which is not surprising; it's the sort of problem mathematicians are better at understanding than the general population, and the audience there is not necessarily mathematicians);
  • programmer types whining "waah, this problem is in NP, I can't do it!". Is it in NP? I don't feel like thinking about that -- but I did it in a few minutes, and more importantly with a couple seconds of computation time. NP doesn't stand for "not possible";
  • people saying "who cares, most of those combinations are politically unrealistic!" (which I have to admit is true);
  • a few estimates of the answer based on some quick and dirty statistical assumptions, which look basically right;
  • and a couple people (me included) who have gotten a numerical answer.
I'd explain my solution here, but I think some of you might want to try out this problem. If you poke around in the comments at FiveThirtyEight.com you can find my solution.

What's interesting to see is that a lot of people seem to believe you have to actually generate the sets in order to count them. That would of course be hopeless. The explanation of combinatorics as "counting without counting" comes to mind.

09 June 2008

Rubik's deck of cards?

By Igor Kriz and Paul Siegel, at Scientific American: Rubik's Cube Inspired Puzzles Demonstrate Math's "Simple Groups".

The Rubik's cube can be said to be a physical embodiment of the Rubik group, which is a certain subgroup of S_48. (Why 48? There are 54 "facets" of the Rubik's cube, nine on each side; six of these don't move, leaving 48.) This subgroup has an easy presentation with six generators, namely rotations of the six faces. As the authors point out, other "Rubik's-type" puzzles embody groups of the same general sort.

The authors have invented puzzles based on certain sporadic groups, namely the Mathieu groups M_12 and M_24 and the Conway group Co_0. These right now only exist as computer programs, although the authors claim that a physical version of the M_24 puzzle could be built.

A possibly interesting, although not-at-all well-defined question -- which groups are "buildable", in the sense that one can build a physical object that represents them?

The programs (which run on Windows machines) can be downloaded from Igor Kriz's home page.

Thanks to John Armstrong for pointing me to the article.

07 June 2008

Link to review of The Drunkard's Walk

Here's a review of Leonard Mlodinow's book The Drunkard's Walk. I will resist the temptation to review the review. (Although I've been meaning to write down what I think of the book. I might do so.)

For some difficult-to-explain reason it is illustrated with a couple images from Jessica Hagy's indexed.

Virtual laboratories in probability and statistics

Virtual laboratories in probability and statistics offers several dozen applets for the visualization of probabilistic systems.

Most of them are pretty simple -- they graph some theoretical distribution and then show the results of repeating, over and over, an experiment that samples from that distribution.

For example, the random walk experiment -- although I link to that one because it shows the arcsine distribution for a random walk quite well! The "arcsine distribution" as applied to random walks is as follows: let X_1, X_2, ... be independent random variables which are each +1 with probability 1/2 and -1 with probability 1/2. Let S_n = X_1 + X_2 + ... + X_n. Consider the sequence (S_1, ..., S_{2n}) -- now, what's the probability that the last zero in this sequence comes between time 2an and 2bn? It turns out that as n approaches infinity, this probability approaches
\int_a^b {1 \over \pi} {1 \over \sqrt{x(1-x)}} \, dx = {2 \over \pi} \left( \sin^{-1} \sqrt{b} - \sin^{-1} \sqrt{a} \right)

and in particular, the last zero is surprisingly likely to be near 0 or 2n, and surprisingly unlikely to be somewhere in the middle.
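A quick simulation (a sketch with numpy) shows the U shape:

    import numpy as np

    rng = np.random.default_rng(0)
    n, trials = 500, 20000
    fracs = []
    for _ in range(trials):
        s = np.cumsum(rng.choice([-1, 1], size=2 * n))
        zeros = np.nonzero(s == 0)[0]
        last = zeros[-1] + 1 if len(zeros) else 0  # time of the last zero
        fracs.append(last / (2 * n))

    print(np.histogram(fracs, bins=10, range=(0, 1))[0])
    # the counts pile up in the first and last bins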

If you like pretty pictures (and you should -- a picture is worth a thousand words and all that), check out the section on interacting particle systems, which includes the fire process (which models the spread of a fire in a forest, where the trees form a grid and fires only spread between adjacent trees in discrete time -- so a bit unrealistic as a model for real fires, but who cares?) and the voter process. Here particles in a grid change state in order to match their neighbors; "segregation" of the different particle types occurs, not because anybody's enforcing it from on high but just because of these local constraints, and some particle types just die out completely.

The web page not only includes the applets, but explanations of what's going on, since the intended audience here seems to be students taking a first course in probability.

06 June 2008

Trust the polls

According to an op-ed in today's New York Times, a couple of physicists (J. Richard Gott and Wes Colley) have discovered that in "swing states" in US presidential elections, the median of the results of polls conducted in the weeks prior to the election is a good predictor of how a state will vote.

That's true -- but it's also incredibly obvious. Asking a sample of people how they're going to vote is apparently a good way of predicting how they're going to vote. As for why they take the median, it's because the median is more robust than the mean; there are some highly partisan or incompetent pollsters out there. The fact that it's newsworthy says something, although I'm not entirely sure what.

This was apparently something they knew in 2004.

As for the article's claim that according to the polls, Clinton would beat McCain but McCain would beat Obama, the conventional wisdom seems to be that that's precisely because Obama and McCain have been campaigning against each other and ignoring her for the last month or more. If Clinton were to somehow get back into the race (let's not get into how that would happen), then McCain would campaign against her, and she'd go down in the polls.

Of course, this is all analysis of the "if the election were held today" type, which it's not. I only support this type of analysis when it predicts that the Phillies will make the playoffs, which it does as of right now.

05 June 2008

Fermat's last theorem is "liberal" mathematics?

From change notation and integrate by parts, I found Conservapedia's examples of bias in Wikipedia. Most of them are the typical things that get trotted out in conservative complaints about liberals. (I mean these words in the US sense.) Buried in the list, though, is:
22. Mathematicians on Wikipedia distort and exaggerate Wiles' proof of Fermat's Last Theorem by (i) concealing how it relied on the controversial Axiom of Choice and by (ii) omitting the widespread initial criticism of it.

I am not making this up.

Now, I don't know Wiles' proof anywhere near well enough to comment on its use of the axiom of choice. But the placement seems to imply that the people behind Conservapedia think the proof of FLT is somehow "liberal" mathematics. Why they single out FLT is unclear. Apparently conservative mathematics mentions God a lot. (The previous link goes to descriptions of math courses at a Christian high school, which I previously mentioned back in August.)

By the way, the "God" of my blog's title is not the God of those course descriptions. My blog's title is an answer to Einstein's famous aphorism "I shall never believe that God plays dice with the universe" -- and the Einsteinian God has basically nothing in common with the Judeo-Christian God except name.

04 June 2008

The British are bad at maths

According to Tom Geoghegan of BBC News, "The British are uniquely happy to admit being bad at maths", citing examples such as television personalities and politicians.

Well, yeah, the British are unique in this way -- if "maths" and "math" are different things. Americans don't like admitting they're bad at maths, because they don't want to sound British. But they do seem to like admitting they're bad at math.

Quantum mechanics as noncommutative probability theory?

From The Quantum Pontiff: Quantum mechanics as noncommutative probability theory?

As I pointed out in a comment there, there are quantum random walks, which are a generalization of classical random walks, but display quantum interference and "spread out" much faster than classical random walks, namely at a rate proportional to time, instead of proportional to the square root of time as in the case of classical random walks. The comments at the Quantum Pontiff, which address other aspects of quantum mechanics as noncommutative probability theory, could be worth a quick read if you're interested in this sort of thing. (I actually don't know enough quantum mechanics to comment intelligently.)

But ordinary ("commutative") probability theory is counterintuitive enough...

02 June 2008

Richard Stallman: not a mathematician

Something I didn't know: Richard Stallman ("rms") of free software fame took Harvard's Math 55, often called "the hardest math class in the country". (I think I've linked to the Math 55 article before, but I can't find it.) He then took more math courses, and turned out to be quite good at it, but for some reason -- his biographer speculates it was because he shied away from the competitive aspects of mathematics -- he gravitated to computer science.

David Harbater (professor of mathematics at Penn) said of that class:
"only 10 really knew what they were doing." Of that 10, 8 would go on to become future mathematics professors, 1 would go on to teach physics.
"The other one," emphasizes Harbater, "was Richard Stallman."
Daniel Chess also makes an interesting point, about a "brilliant" proof that Stallman came up with his sophomore year:
"That's the thing about mathematics," says Chess. "You don't have to be a first-rank mathematician to recognize first-rate mathematical talent. I could tell I was up there, but I could also tell I wasn't at the first rank. If Richard had chosen to be a mathematician, he would have been a first-rank mathematician."
It's interesting that someone would even say this. Nobody would say that you don't have to be a great chef to recognize great cooking, because it goes without saying.

I learned this from Free as in Freedom: Richard Stallman's Crusade for Free Software by Sam Williams, which is available online for free.

01 June 2008

Betting on his life

Man set to win on 'lifetime' bet, from the BBC.

In April 2007, a British man, Jon Matthews, was diagnosed with lung cancer and given nine months to live. He placed a bet with a bookmaker, William Hill, of 100 pounds at 50 to 1 odds. He's still alive, and he's collecting his 5000 pounds today.

Now, I have to think that the bookmaker offered odds which were much too long. Assuming that "nine months to live" actually means anything, I'd guess it's the median survival time. So this man had a probability of something like 1/2 of living the nine months. The bookmaker should have been offering 1 to 1 odds. (Actually, not even that, because they've got to make a profit!)

And even if it's the mean survival time that's meant, I doubt the distribution is sufficiently skewed to make a real difference. I suspect that "you've got X months to live" is just something that doctors say to patients informally, though, and it may not be sufficiently well-defined to quibble about this.

Of course, as the article points out, bets like this don't get made often. If they did, the bookies would pretty quickly figure out that they shouldn't be offering such large payoffs.

Game theory and toilet seats

A game-theoretic approach to the toilet seat problem -- when someone who pees standing up and someone who pees sitting down share a bathroom, what should they do?

(My solution to this problem is living alone, but this may not work for everyone.)