## 31 December 2007

### Some football probabilities

Back in September I answered a question about probabilities in Major League Baseball's wild-card race.

John Armstrong (of The Unapologetic Mathematician fame) asked me about NFL football: assuming all teams are evenly matched, what is the probability that a division sends three teams to the playoffs? For those of you who aren't familiar with the NFL's playoff structure, there are two conferences of sixteen teams each (the NFC and the AFC); each conference is divided into four divisions of four. The winner of each division gets into the playoffs, and the two best teams among non-division winners in each conference get in as "wild cards". (For an example, see this year's final standings at nfl.com.)

This is equivalent to the following question: what is the probability that both wild cards in a conference come from the same division? I suspect that this question is inspired by the fact that this actually happened in both conferences this year, in the AFC South and NFC East respectively. (This must be especially frustrating for fans of the fourth-place team in each of those divisions. The fourth-place team in the NFC East was the Philadelphia Eagles. I think I'll avoid Eagles fans today. But that might be hard.)

Anyway, the probability that this event happens in a given conference in a given year is 2/11. Let the teams play out their season. Consider the twelve non-division winners in a given conference, and rank them in order from 1 to 12. (You might think: in the NFL each team only plays sixteen games, so won't there be ties? Perhaps, but the NFL's wild-card tiebreakers don't care what division a team is in.) The top two teams get in. The divisions of the twelve teams (from top to bottom) form a word with three N's, three E's, three S's, and three W's, picked uniformly at random. Say without loss of generality that the word starts with an N; then the remainder of the word is picked uniformly at random from arrangements of NNEEESSSWWW. The chance that the remainder also starts with an N -- that is, that both wild cards come from the same division -- is 2/11.

Alternatively, there are 12!/(3!)^4 = 369600 words which contain three each of N, E, S, and W. The number of these that begin with two copies of the same letter is 4 × 10!/(3!)^3 = 67200 -- pick the letter that appears twice at the beginning in one of four ways, then arrange the ten remaining letters (one more copy of that letter, and three each of the other three) in any distinguishable order. So the probability of the two wild cards coming from the same division is 67200/369600, or 2/11. I don't like this approach quite as much, though, because it involves big numbers that cancel out; it seems cleaner to me not to invoke the big numbers in the first place.
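Both derivations are easy to check by direct counting; here's a quick sketch in Python that just redoes the arithmetic above with exact fractions:

```python
from fractions import Fraction
from math import factorial

# Distinct orderings of the twelve non-division-winners' division
# labels (three each of N, E, S, W):
total = factorial(12) // factorial(3) ** 4          # 369600

# Orderings whose first two letters agree: 4 choices for the repeated
# letter, then arrange the remaining ten letters (one more copy of
# that letter, three each of the other three):
same_start = 4 * factorial(10) // (factorial(1) * factorial(3) ** 3)  # 67200

prob = Fraction(same_start, total)
print(total, same_start, prob)  # 369600 67200 2/11
```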

By the way, the NFL has had its current structure since 2002, and the years and conferences in which this event has actually occurred are the 2007 AFC South, 2007 NFC East, and 2006 NFC East, making three out of twelve.

The question I asked in September was not the analogue of this one (MLB only has one wild card, so there is no analogue); there I found the probability that the four MLB teams making the playoffs from any league were actually the four teams with the best records. But I don't have a general method to find the solution for some league with arbitrary structure, and the NFL would require slogging through more cases than MLB (two wild cards instead of one) so I won't bother for now. (Or probably ever -- I don't particularly care about that number.)

## 30 December 2007

### Why are knots so common?

Why are knots so common? This article in Science News, which I found from Slashdot, gives a bit of an explanation. (There are links to further, more scholarly, articles in the Science News piece; it's Sunday morning, though, so I'm not bothering to read them.)

My instinct is that any measure of knottedness is bounded below, but not above. You can't get more unknotted than, well, an unknotted string, but conventional measures of knottedness (say, the crossing number) can get arbitrarily large. So we can imagine a random walk on the positive half-line whose value is the crossing number produced by some random process that generates knots; the crossing number will tend to be some medium-sized positive number, just because whenever it's zero it gets forced away.
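This is only a cartoon of that instinct, but even the simplest walk of this kind shows the effect. In the toy simulation below, the ±1 steps are an arbitrary stand-in for whatever random process actually adds and removes crossings; the only feature that matters is the reflecting barrier at zero:

```python
import random

# A cartoon "knottedness" walk: reflected at zero (you can't be less
# knotted than unknotted) but unbounded above.  Because the walk is
# pushed away whenever it hits zero, it spends most of its time at
# positive values.
random.seed(0)
x, total, steps = 0, 0, 200_000
for _ in range(steps):
    x = max(0, x + random.choice((-1, 1)))
    total += x

print(total / steps)  # time-averaged "crossing number": well above zero
```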

## 29 December 2007

I just discovered Scott Aaronson's blog, Shtetl-optimized. You should discover it as well.

I am currently being amused by the comments to his post soliciting unparadoxes, which are what they sound like -- results that superficially resemble well-known paradoxes, but are actually not paradoxes at all. For example, "The Banach-Tarski Unparadox: If you cut an orange into five pieces using a standard knife, then put them back together, the result will have exactly the same volume as the original orange." This seems to me to be a new class of mathematical jokes; only someone who had heard of the Banach-Tarski paradox would appreciate why one would ever expect it to be possible to cut an orange up in any other way!

Take a look through the comments to Aaronson's post, which give about a hundred more examples; I won't try to come up with any of my own.

### Gravity pods

Gravity Pods is a very fun game, and it vividly illustrates the principle of sensitive dependence on initial conditions. You're asked to fire a projectile in an area which includes some gravitational attractors, in an attempt to hit a given target. (At later levels you can move some of the attractors around. At even later levels there are repulsive objects, a nice twist. There might be even more tricks later, but I haven't gotten that far.)

It was an even more vivid illustration of sensitivity before, because in earlier incarnations of the game there were levels that could not be beaten -- you could fire at an angle of 10 degrees from the horizontal, or 10.1 degrees (0.1 degrees is the finest control you get over the firing angle), when 10.05 degrees (say) was what you really needed. Similarly, the movable attractors and repellers can only be placed to the nearest pixel. I don't think this problem exists any more, but there are still annoying (although perhaps fascinating as well) levels in which sub-pixel positioning abilities would be nice.

## 28 December 2007

### On the Mian-Chowla sequence

An interesting problem from Additive Combinatorics by Terence Tao and Van Vu. (This is almost a direct quote; I've modified it slightly in order to be able to render it entirely in HTML.)

Exercise 6.2.5. Let S = {1, 2, 4, 8, 13, 21, 31, ...} be the Sidon set of positive integers constructed by the greedy algorithm (this set is sometimes known as the Mian-Chowla sequence). Show that the kth element of S does not exceed (k-1)^3 + 1, and hence the number of elements of S which are less than n is Ω(n^(1/3)) as n goes to infinity.

A Sidon set, for those of you who don't know, is a set for which all the pairwise sums are distinct. Tao and Vu credit this result to Stöhr; see below. A rough translation of the title of that paper is "Solved and unsolved questions on bases of sequences of natural numbers", although my German isn't so good. (I've been looking for mathematics written in German, though, because my department requires me to be able to read two of mathematical French, German, and Russian -- I've already passed the French and I know a little German and no Russian -- so I might try to read this, or at least parts of it.)

First, an example of where these numbers are coming from. Let s_1 = 1, s_2 = 2, s_3 = 4, and so on form the sequence S. Say we know the first four terms, s_1 through s_4, are 1, 2, 4, 8. Then 9 can't be in the sequence, because 9 + 1 = 8 + 2. 10 can't be in it, because 10 + 2 = 8 + 4. 11 can't be in it, because 11 + 1 = 8 + 4. And 12 can't be in it, because 12 + 4 = 8 + 8. But none of 13 + 1, 13 + 2, 13 + 4, and 13 + 8 are pairwise sums of {1, 2, 4, 8}, so the fifth term is 13. It turns out to be easier, I think, to think in terms of differences instead of sums; we say that 9 can't be in the sequence because 9 - 2 = 8 - 1, and so on. So if s_1, ..., s_k are known, we choose s_{k+1} to be the smallest integer N > s_k such that the intersection {N - s_1, ..., N - s_k} ∩ {s_j - s_i : 1 ≤ i < j ≤ k} is empty.
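The hand computation above is easy to mechanize; a small Python sketch of the difference-set test:

```python
# The differences among the first four terms {1, 2, 4, 8}:
terms = [1, 2, 4, 8]
diffs = {b - a for i, a in enumerate(terms) for b in terms[i + 1:]}
# diffs == {1, 2, 3, 4, 6, 7}

# A candidate N is rejected when some N - s_i repeats an old difference:
for n in range(9, 14):
    print(n, "rejected" if {n - t for t in terms} & diffs else "accepted")
# 9 through 12 are rejected; 13 is accepted as the fifth term.
```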

The solution is straightforward. By the "greedy algorithm" it's meant that if we know s_1 through s_k, then s_{k+1} is the smallest integer greater than s_k which doesn't violate the definition of our sequence. So it's enough to bound the difference t_k = s_{k+1} - s_k, which is at most the smallest positive integer not occurring among the pairwise differences of elements of {s_1, ..., s_k}. There are k(k-1)/2 such differences, so the smallest positive integer which doesn't occur among them is at most k(k-1)/2 + 1. Then the s_k are just the partial sums of the t_k (starting from s_1 = 1), so we get s_k ≤ k + k(k-1)(k-2)/6, which is asymptotically equal to k^3/6 (and in particular at most (k-1)^3 + 1).

The same method gives a lower bound -- the consecutive differences t_1, ..., t_{k-1} are among the pairwise differences of the sequence, so they're all distinct, and their sum s_k - s_1 is therefore at least 1 + 2 + ... + (k-1) = k(k-1)/2. This gives a lower bound for s_k of order k^2/2.
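Here's a sketch of the whole greedy construction, which also checks both bounds numerically on the first twenty terms:

```python
def mian_chowla(n):
    """First n terms of the greedily-constructed Sidon sequence."""
    terms, diffs = [1], set()
    while len(terms) < n:
        c = terms[-1] + 1
        # Accept c once none of its differences with earlier terms
        # repeats an existing pairwise difference.
        while {c - t for t in terms} & diffs:
            c += 1
        diffs |= {c - t for t in terms}
        terms.append(c)
    return terms

s = mian_chowla(20)
print(s[:9])  # [1, 2, 4, 8, 13, 21, 31, 45, 66]

# The bounds derived above: k(k-1)/2 + 1 <= s_k <= (k-1)^3 + 1.
for k, sk in enumerate(s, start=1):
    assert k * (k - 1) // 2 + 1 <= sk <= (k - 1) ** 3 + 1
```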

The continuation of the sequence can be found in the Encyclopedia of Integer Sequences; it shares its first eight terms with this sequence, with which I confused it at first. The condition for computing this "impostor sequence" is only that {N - s_k} ∩ {s_j - s_i : 1 ≤ i < j ≤ k} be empty -- that is, only the difference from the most recent term has to be new. The impostor sequence begins {1, 2, 4, 8, 13, 21, 31, 45, 60, ...}, and it's at the term 60 that it first differs from the Mian-Chowla sequence. We have 60 - 45 = 15, and 15 never occurs as a difference between any of the first eight terms. But 60 - 31 = 31 - 2 = 29, so 60 isn't the ninth term of the Mian-Chowla sequence; that turns out to be 66. The divergence only gets wider after that. But the upper and lower bounds that I gave above still apply!
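The impostor's weaker condition is a one-line change to the greedy loop; a sketch:

```python
def impostor(n):
    # Weaker greedy rule: only the gap from the *last* term has to
    # avoid all the previous pairwise differences.
    terms, diffs = [1], set()
    while len(terms) < n:
        c = terms[-1] + 1
        while c - terms[-1] in diffs:
            c += 1
        diffs |= {c - t for t in terms}
        terms.append(c)
    return terms

print(impostor(9))  # [1, 2, 4, 8, 13, 21, 31, 45, 60]
```

The ninth term is 60, where the Mian-Chowla sequence (whose rule also rejects 60, since 60 - 31 = 31 - 2) has 66 instead.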

It appears (from computations of Frank Chu) that the nth term of the Mian-Chowla sequence grows like n^2.8. I don't know why this should be, though. However, the sequence I confused it with is the partial sums of the segmented numbers; the nth segmented number is about n lg lg n (this is just a conjecture, but a reasonable one), and so my impostor sequence should grow like (n^2 lg lg n)/2, which fits the known values reasonably well. Since my impostor sequence has fairly weak conditions allowing us to extend it, I would conjecture that its growth rate is close to the lower bound, of order n^2, and so n^2 times a subpolynomial function seems reasonable. I'd actually conjecture that the Mian-Chowla sequence grows like n^3 divided by some subpolynomial function -- but which one? Finally, something about the fact that we're looking at collisions between the difference sets makes me think that there might be a probabilistic model for this problem.

References
Frank Chu. Notes on the Mian-Chowla Sequence. http://www.cumc.math.ca/2004/mc.pdf
A. Stöhr, Gelöste und ungelöste Fragen über Basen der natürlichen Zahlenreihe, I, II. J. Reine Angew. Math. 194 (1955), 40-65; 111-140.
Terence Tao and Van Vu, Additive Combinatorics. Cambridge: Cambridge University Press, 2006.
Encyclopedia of Integer Sequences: A004978, A005282.

## 27 December 2007

### A couple articles from the Princeton Companion to Mathematics

Differential forms and integration (10 pp.), from the forthcoming Princeton Companion to Mathematics. (From Terence Tao, who has also written other articles for the PCM.) I think it would have been rather nice to receive such an article at the outset of the class in which I learned about this stuff, as it would have helped to give some context; although one can often get context from a textbook, it's difficult to do so because all the little explanatory bits are mingled with the details. The PCM articles I've seen do not suffer from this flaw; indeed, one might describe a lot of them as what the preface of a textbook covering the same material should be.

In a less theoretical, and less general, vein, here's Frank Kelly's The Mathematics of Traffic in Networks, also from the PCM, which I learned about from a post from Tao last week. One thing this mentions is Braess's paradox: under certain conditions, adding roads to a road network can make trips slower, even though each driver tries to minimize their own time spent on the road. (Fortunately, my life is arranged so that I walk almost everywhere; thus at the moment I only appreciate the weirdness of traffic on a theoretical level. But I know it won't always be like that.) I'd be interested to see if this paradox occurs in actual road networks or only in the contrived example that Kelly gives.
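The textbook illustration of the paradox (these are the standard numbers, not necessarily the ones in Kelly's article) can be worked out in a few lines:

```python
# Braess's paradox, textbook version: 4000 drivers travel from S to E.
# Link S->A takes n/100 minutes when n drivers use it, A->E takes a
# flat 45 minutes; route S->B->E is the mirror image (flat 45, then
# n/100).
drivers = 4000

# Without a shortcut the equilibrium splits traffic evenly, and each
# driver takes 2000/100 + 45 = 65 minutes:
cost_without = (drivers / 2) / 100 + 45

# Now add a free shortcut A->B.  For any individual driver, S->A
# (at most 4000/100 = 40 minutes) beats a flat 45, and likewise B->E,
# so everyone ends up on S->A->B->E, which takes 40 + 0 + 40 = 80:
cost_with = drivers / 100 + 0 + drivers / 100

print(cost_without, cost_with)  # 65.0 80.0
```

Adding the road makes everyone's trip fifteen minutes longer, even though no driver can do better by switching routes.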

The PCM will sell for about $100, so I probably won't buy it when it comes out. But I suspect I will spend some time browsing it when our library gets it.

### A postal nomogram

Nomograms are collections of curves drawn on a sheet of paper that are used to do a specific calculation. See, for example, this Google Image Search. A typical example calculates some real-valued function of two variables by having three scales. Two of these scales, usually, are straight lines (although they don't have to be); these lines are marked with values of the two inputs to the function. To compute a value of the function f(a,b), you find the point marked "a" on the first scale and the point marked "b" on the second scale; then you draw a straight line between those two points, and the marking of the point where that line intersects the third scale is the value of f(a,b).

The simplest possible example is probably the nomogram that computes the function f(a,b) = ab/(a+b): the "input scales" are just the x and y axes marked off as usual, and the "output scale" is the line y = x, with the marking z at the point (z,z). The line from (a, 0) to (0, b) intersects the line y = x at (ab/(a+b), ab/(a+b)). This is known as the "parallel resistance nomogram", since the resistance of an a-ohm resistor and a b-ohm resistor in parallel is ab/(a+b). More complicated computations can be done with more complicated curved scales.

You might think that these don't exist any more -- isn't it possible to compute things to any desired accuracy with modern computers? But today I saw an example of something similar at a United States Post Office. Apparently the Post Office has regulations saying that packages with volume under one cubic foot are to be treated differently than those with volume over one cubic foot. You could ask the people behind the counter to do the multiplication, but that might be asking for trouble.
So the postal service has made a mat with the curves xy = 1728/h plotted on it, for various values of h; the curves are labelled with h. (So, for example, the curve passing through (16, 12) is marked 9, since a 16-by-12-by-9 package is one cubic foot, or 1728 cubic inches.) The postal employee can place a package on the mat with one corner at (0, 0); the opposite corner will be at (l, w), where l and w are the length and width of the package. Then the curve which passes through the opposite corner gives the maximum height allowed for the package. I wish I could find a picture of it online, but I can't.

This isn't exactly a nomogram (at least under the definition I've seen), but it's very much the same principle. And it's quite interesting to see one "in the wild".

## 25 December 2007

### Merry Christmas

Merry Christmas!

Brain doping from the LA Times. (A few days old, but I just found it.)

And why do real programmers confuse Halloween with Christmas? Because 31 OCT = 25 DEC.

## 24 December 2007

### Why W?

Here's an interesting article I found while poking around the Web for something else: Why W?, by Brian Hayes, from the March-April 2005 issue of American Scientist. W is not the president here, but the inverse function of x → xe^x. Apparently the name didn't become popular until the Maple software was written, because Maple needed a name for the function.

Empirically speaking, when I type random things I'm curious about into Maple, the W function is one of the functions that most frequently appears in its analytic solutions. Bessel functions are also quite common, as are various instances of the hypergeometric series; since so many familiar functions can be written in terms of the hypergeometric series, that is hardly surprising.

The coefficient of x^n in the Taylor series for -W(-x) is n^(n-1)/n!; the number of labelled trees on n vertices is n^(n-2) (Cayley), so it's not that surprising that W is involved in the enumeration of trees.
In fact, -W(-x) is the exponential generating function for labelled rooted trees. (There are n rooted trees for each unrooted one, since we can choose the root from among all n vertices.) Unfortunately most of these trees are very oddly shaped and would not look nice as a Christmas tree.

## 23 December 2007

### Reciprocal fuel economy

Eric de Place notes: "You save more fuel switching from a 15 to 18 mpg car than switching from a 50 to 100 mpg car."

This sounds counterintuitive at first. But the "natural" units for fuel consumption, at least in this case, are not distance per unit of fuel but units of fuel per distance. In some parts of the world fuel usage is given in liters per 100 km; let's say we were to give fuel usage in gallons per 100 miles. (The constant "100" is just there to make the numbers reasonably sized.) Then switching from a 15 mpg car to an 18 mpg car is switching from a car that uses 6.67 gal/100 mi to one that uses 5.56 (lower is better); switching from a 50 mpg car to a 100 mpg car is switching from 2.00 gal/100 mi to 1.00. (Another interesting consequence -- switching from 50 mpg to 100 mpg saves as much fuel as switching from 100 mpg to ∞ mpg, i.e. a car that uses no fuel at all.)

The idea is that chopping off the low-fuel-economy tail of the distribution (by legal means) would be a much easier way to reduce oil consumption than trying to make very-high-fuel-economy cars. But not all incremental achievements are created equal; it was probably a lot harder to get from 1 mpg to 2 mpg than it will be to get from 100 mpg to 101 mpg.

Also, note that a pair of cars that get, say, 20 mpg and 50 mpg will average "35 mpg" in the way that the new regulations for the average mileage of an automaker's fleet are written; but for each car to go 100 miles, it'll take a total of seven gallons of fuel, for a fuel economy of 200/7 = 28.6 mpg. (This is the harmonic mean of 20 and 50.)
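The arithmetic here is easy to redo in a few lines (gal/100 mi is just 100 divided by mpg):

```python
def gal_per_100mi(mpg):
    # Fuel consumption in the "natural" units: gallons per 100 miles.
    return 100.0 / mpg

# Fuel saved per 100 miles by each switch:
print(gal_per_100mi(15) - gal_per_100mi(18))    # ~1.11 gallons
print(gal_per_100mi(50) - gal_per_100mi(100))   # 1.0 gallon

# A 20 mpg car and a 50 mpg car: the regulations' average vs. reality.
arithmetic_mean = (20 + 50) / 2           # 35.0 "fleet mpg" on paper
harmonic_mean = 2 / (1 / 20 + 1 / 50)     # ~28.6 mpg actually achieved
print(arithmetic_mean, harmonic_mean)
```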
The regulations aren't necessarily flawed -- they probably should be stated in terms of the measures of fuel economy that are most commonly used -- but there's room for possible misunderstanding.

Another place I can think of where the "natural" units are the reciprocal of the ones that are habitually used is in statistical mechanics; there are tons of formulas there that have temperature in the denominator, and for the purposes of statistical mechanics it makes more sense to use inverse temperature. (I've written about this before, I think; it basically comes out of the fact that the partition function involves inverse temperature.) Are there others?

(I found this from Marginal Revolution.)

### Tracking an intruder

From Yuliy Baryshnikov: an animation described as follows: "In the 40-by-40 array of sensors, a running intruder is represented as a black dot. Can you spot her? The catch: the sensors are prone to spontaneous signals ('false positives'). These signals are rare (just 2.5% probability), but what a mess they make!" But of course, if you stare long enough you can see the intruder.

The trick is that the intruder moves slowly, say one grid square per second. If you overlay the grids at times n and n+1 on top of each other, there will be about eight pairs of (kingwise) adjacent squares such that the first square is a false positive at time n and the second square is a false positive at time n+1. (There are roughly 40^2 × 8 ordered pairs of kingwise adjacent squares, and each lights up with probability (1/40)^2.) If we consider three consecutive times, the expected number of false positives is the 40^2 × 8^2 triples of kingwise adjacent squares, times the 40^(-3) probability that the three squares of such a triple are occupied at times n, n+1, n+2, for an expected 1.6 false positives. And so on -- there are one-fifth as many false positives for each extra consecutive step we look at.
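A quick Monte Carlo check of the "about eight pairs" figure (the exact expectation is a bit below 8, because squares on the boundary have fewer than eight neighbors):

```python
import random

# A 40-by-40 grid of sensors, each firing spuriously with probability
# 1/40 per frame; count hits at time n with a kingwise-adjacent hit
# at time n+1.
random.seed(1)
N, p, frames = 40, 1 / 40, 400

def frame():
    return {(i, j) for i in range(N) for j in range(N)
            if random.random() < p}

def neighbors(i, j):
    return [(i + di, j + dj) for di in (-1, 0, 1) for dj in (-1, 0, 1)
            if (di, dj) != (0, 0)]

pairs = 0
prev = frame()
for _ in range(frames):
    cur = frame()
    for (i, j) in prev:
        pairs += sum(q in cur for q in neighbors(i, j))
    prev = cur

print(pairs / frames)  # close to 8 (slightly less, from edge effects)
```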
If false positives occurred one-eighth of the time, though, then we'd have no hope; the number of "false positives" when looking at a window of length k would be constant, instead of decreasing exponentially with k. In general, you'd expect that if the intruder has r choices for her position at each step, then as the probability of false positives approaches 1/r the intruder becomes impossible to track.

A quick skim of the titles of Baryshnikov's papers doesn't tell me if this is an illustration of something from his research. I'm curious, so I've sent him an e-mail to ask.

## 22 December 2007

### What is power?

From Walter Lewin's 8.01 Lecture 14, about 14 minutes in: "What is power? Power is work that is done in a certain amount of time. dw/dt, if w is the work, is the instantaneous power at time t. We also know power in terms of political power. That's very different. Political power, you can do no work at all in a lot of time and you have a lot of power. Here in physics, life is not that easy."

This reminds me of the classic claim that, say, bricklayers do more work than lawyers, because work is force times distance, and bricklayers exert much larger forces when they lift things, since bricks are heavier than pieces of paper. This is a sort of translingual pun, the two languages being English and scientific jargon.

## 21 December 2007

### When asteroids attack

Asteroid May Hit Mars in Next Month (AP):

Scientists tracking the asteroid, currently halfway between Earth and Mars, initially put the odds of impact at 1 in 350 but increased the chances this week. Scientists expect the odds to diminish again early next month after getting new observations of the asteroid's orbit, Chesley said.

If they expect the odds to diminish, why haven't they set them lower in the first place? I think this may be another issue of mean-median confusion -- say, there's a 1 in 3 chance that the odds will go up to 1 in 25 next month and a 2 in 3 chance the threat will go away completely.
But the statement seems kind of silly. Apparently efficient market theory doesn't apply to asteroids.

### Links for 2007-12-21

Mathematicians solve the mystery of traffic jams, which has been making the rounds lately. Basically, sometimes traffic jams come out of nowhere because someone taps their brakes and the disturbance propagates backwards. I remember reading this ten or fifteen years ago. It must be a slow news day.

Thanks to Wing Mui: The trouble with five, by Craig Kaplan. Why is it that we can tile the plane with triangles, quadrilaterals, or hexagons -- but not pentagons? And what sort of structure do tilings with some sort of fivefold symmetry have?

Also, a few days ago I wrote about Galileo's argument that the thickness of bones should scale like the 3/2 power of an animal's height, which I learned about from Walter Lewin's lectures, but that this turns out not to be true. Having gotten a bit further into Lewin's lectures, I found out that the real thing we're protecting against is buckling, not crushing.

Greg Mankiw a couple months ago provided a link to a paper on optimal mortgage refinancing by Sumit Agarwal, John Driscoll, and David Laibson. The Lambert W function (the inverse of f(x) = xe^x) turns out to be good for something! If you use this, I want a cut of your savings for pointing you to it.

## 20 December 2007

### A sighting of mathematics on Jeopardy!

Today's Final Jeopardy question (alas, I didn't write down the category, but it was something like "brands"): "Each unit in this brand, introduced in 1968, is a hyperbolic paraboloid, & they fit together for perfect storage." The answer: What are Pringles? Two of the three contestants got it right; the third answered "What is Orville popcorn?"

I wonder if the inclusion of "hyperbolic paraboloid" makes the question harder or easier. I think it made it harder for me, because I got confused and was picturing a hyperboloid of one sheet instead.
Fortunately I realized that those wouldn't fit together in any reasonable way, at least if they were all the same size. I don't know the background of tonight's contestants; I suspect most people, even most people who know enough random things to appear on Jeopardy!, would probably just filter out those words, or at best replace them with "funny shape" (because they've heard of hyperbolas and parabolas). "Funny shape" is probably the right way to think of it here; the fact that Pringles are oddly shaped is a much more salient fact about them than their precise shape, unless of course you work in a Pringle factory.

(Before you ask: No, I've never tried out for Jeopardy! Yes, I probably should.)

### Weather forecasting

One way I procrastinate and/or pass time is by reading the archives of long-running blogs. Today it's glenn mcdonald's furialog, because I was thinking of buying the My So-Called Life box set, but decided against it because it's $70; mcdonald used to write a music review column called The War Against Silence which I liked, the first installment of which was written five days after the last episode of that tragically short-lived television show aired. mcdonald was a fan of the show, and I was bored this morning, so I thought of his blog. (For the record, he's a programmer by trade but not by training, although this may be irrelevant.)

Anyway, he wrote on 13 January 2005, commenting on weather forecasting:
But what I would hold someone responsible for, if I thought it weren't a pervasive cultural flaw, is the destructive precision with which uncertain predictions are communicated. Weather is merely the most obvious daily public manifestation of a fundamental reluctance, or perhaps an inability, to say what we really know, rather than what we wish we knew.

Similarly, I've watched a bunch of Walter Lewin's introductory physics lectures in the last few days (I've mentioned this before) and he has drummed into my head that a measurement without uncertainty is useless. What is true for measurements is doubly true for predictions.

It would be interesting to see error bars on weather predictions, but we never will. One can get something like this, though, for the tracks of tropical storms at Weather Underground; they show the outputs of various computer models. If these models all show a storm going in the same direction, one can be fairly sure it will actually go in that direction; if the projected paths are spread out, one knows not to be so sure. I wonder if it is possible to determine the accuracy of more mundane weather predictions by, say, tuning into a large number of different TV channels, web sites, and so on which make independent forecasts. If it were possible to get archived weather forecasts I'd take a look; as it is I don't think I can find out what people on July 14, 2006 (say) thought the weather would be like on July 18, 2006, so the only way I could collect data would be to wait, or perhaps to call up some weather-forecasting entity and ask, and I don't care quite that much.

The fundamental reason we don't see error bars on weather forecasts, though, is that they are brought to us for the most part by profit-seeking entities, and such entities really don't want to advertise "we don't know what's going to happen!" -- even though everybody knows this. This is the "pervasive cultural flaw" that mcdonald refers to above.

Also, although I cannot produce data for this, I seem to perceive that at least certain weather forecasting outlets use only even numbers of (Fahrenheit) degrees in their forecasts, not odd numbers. If this is true, I suspect it means that they believe they can't be accurate to within one degree.

## 19 December 2007

### What if gravity reversed itself?

A friend of mine, not long ago, noticed that it was snowing up outside her office window. (When I've seen this happening, it's usually due to something like an internal courtyard in a building, so the air currents get sufficiently weird to push the snow upwards. Don't worry, physics isn't broken.)

This raises the (somewhat silly, I must admit) question: what if gravity had actually reversed itself outside my friend's office? Three scenarios are possible:

• gravity instantaneously reverses itself;

• the downward-pointing vector of gravitational acceleration decreases in magnitude, goes through zero, and then becomes an upward-pointing vector;

• the gravitational acceleration vector always has the same magnitude, but swings around through different angles to point up instead of down.

I claim that the third of these is not reasonable. Basically, this is for reasons of symmetry. Imagine gravity that neither points straight down nor straight up. How would gravity know in which of the many possible "sideways" directions to point? There are only two points that could possibly have any effect on the direction of gravitational acceleration, namely the point that you are at and the center of the Earth. The direction of gravity must be invariant under any isometry which fixes those two points -- thus it must point straight towards the center of the Earth or straight away from it.

Apparently I am more attached to gravity being a central force than to its particular strength or even to the fact that it is attractive, not repulsive.

(I'm not sure whether the first or second of the models suggested above is reasonable. This post is silly enough as it is.)

## 18 December 2007

### Scaling arguments

There's a classic argument, supposedly due to Galileo, for why mice and elephants are shaped differently, and there are no organisms that have the general mammalian body plan but are a hundred feet tall.

The argument goes as follows: assume that all animals have roughly similar skeletons. Consider the femur (thigh bone), which supports most of the weight of the body. The length of the femur, l, is proportional to the animal's linear size (height or length), S. The animal's mass, m, is proportional to the cube of the size, S^3. Now consider a cross-section of the bone; say the bone has diameter d, so its cross-sectional area is proportional to d^2. The pressure in the bone is proportional to the mass divided by the cross-sectional area, or l^3/d^2. Bones are probably not much thicker than they have to be (that's how evolution works), and have roughly the same composition across organisms. So l^3/d^2 is a constant, and d is proportional to l^(3/2), or to S^(3/2). Similar arguments apply to other bones. So the total volume of the skeleton, if bones are assumed to be cylindrical, scales like d^2 × l, or S^4 -- and so the proportion of the animal taken up by the skeleton scales like S. What matters here is that if S is large enough then the entire animal must be made up of bone! And that just can't happen.
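A quick numerical check of the key step: with d taken as the bone's diameter, d ∝ l^(3/2) is exactly the scaling that keeps the pressure constant. (Arbitrary units; this is just the bookkeeping.)

```python
# If mass grows like l^3 and the bone's diameter grows like l^(3/2),
# the pressure (mass over cross-sectional area d^2) is scale-invariant.
def pressure(l, d):
    mass = l ** 3
    area = d ** 2
    return mass / area

base = pressure(1.0, 1.0)
for c in (2.0, 10.0, 100.0):
    print(c, pressure(c, c ** 1.5))  # equals the base pressure at every scale
```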

Unfortunately, the 3/2 scaling exponent isn't right, as I learned from Walter Lewin's 8.01 (Physics I) lectures, available at MIT's OpenCourseWare. It's one of the canonical examples of dimensional analysis, but it turns out that it just doesn't hold up; I suspect, although I can't confirm, that this is because elephant bones are actually substantially different from mouse bones. It looks like for actual animals, d scales with l (or perhaps like l^α for some α slightly greater than 1), not with l^(3/2). Lewin also predicts, by the same sort of dimensional analysis, that the time it takes an apple to fall from a height is proportional to the square root of that height, which is true, but that's such a familiar result that it seems boring.
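The apple calculation, for completeness, follows from h = (1/2) g t^2:

```python
from math import sqrt

g = 9.8  # m/s^2

def fall_time(h):
    # Solve h = (1/2) g t^2 for t.
    return sqrt(2 * h / g)

# Doubling the height multiplies the fall time by sqrt(2);
# quadrupling it doubles the time:
t1, t2, t4 = fall_time(5), fall_time(10), fall_time(20)
print(t2 / t1, t4 / t1)  # ~1.414 (sqrt(2)) and ~2.0
```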

(Watching the 8.01 lecture is interesting. The videotaped lectures available on OCW are from the fall of 1999, which is two years before I entered MIT; occasionally the camera operator pans to the crowd of students, and a few of them look vaguely familiar.)

P. S. By the way, this blog is six months old. Thanks for reading!

Edited, Wednesday, 9:14 AM: Mark Dominus points out that the argument is in fact due to Galileo, and can be found in his Discourses on Two New Sciences. This is available online in English translation.

### Particularly numerically egregious typos

Language Log's Mark Liberman wrote this morning about The biggest typo in history, from this New York Times article, which rendered 10^500 as 10500. (They've fixed the mistake now; I wonder if they found it independently or if someone pointed out the Language Log post to them?) The number here refers to the size of the string theory landscape, which I don't pretend to understand.

The Times made a similar mistake in The Death of Checkers, December 9, writing:
The brute-force method is slow, which is its big limit. Schaeffer says he suspects you couldn’t use it to solve chess, because that game — with between 1040 and 1050 possible arrangements of pieces — is far more complicated than checkers, which has 5 × 1020 positions. “Chess won’t be solved in my lifetime,” he predicts. “We need some new breakthrough in technology to do that.”

(James Kushner, a reader of this blog, pointed this out to me; I meant to mention it at the time but didn't.)

This particular mistake is actually so common that I barely notice it any more, especially because I usually have a decent idea of what the number should be. But to someone who doesn't know, it could make a big difference in how they perceive the article. For one thing, if you interpret the checkers quote literally, you get the idea that checkers is more complicated than chess: after all, 5 × 1020 = 5100, which is bigger than 1050.

Another paper that's been known to make these mistakes is The Tech, MIT's student newspaper. The difference here is that The Tech ought to know better; their entire staff is MIT students, who ought to be familiar with exponents. The Times at least has an excuse. (Disclaimer: I went to MIT as an undergraduate.)

Now if only I could get the Times to write about tilings of the plane, where they could say that an 80 by 80 rectangle has "10800" domino tilings. (That's "ten thousand eight hundred", as opposed to the correct number, which is around "ten to the eight hundredth".) Or just in general to write about statistical mechanics, which is where numbers like this come up.

(The Times article which provoked all this -- on the question of whether the laws of nature are inherently mathematical or whether we've just gotten lucky -- is interesting as well. The way I see it, even if a nonmathematical physics is possible, the mathematical physics is so far ahead that we might as well just act as if physics is ultimately governed by mathematics. But I don't think a nonmathematical physics that has the same explanatory power as the physics we have is possible. That might just be a matter of faith.)

### Monkeys can do math. Sort of.

Monkeys and college students equal at mental math? (Reuters)

Monkeys and college students were asked to do addition problems mentally, by being shown two sets of dots and then being asked to pick out a third set of dots that had the same number of dots as the first two sets combined. Apparently (according to the Reuters article) the college students were told not to count or verbalize as they did the math; this doesn't seem to make a huge difference, though, since the average response time was about one second, so it really wouldn't have been practical to count anyway. Still, it seems like an unfair handicap; we're so used to doing math verbally that doing it any other way is difficult. I also wonder if calculator use among humans has contributed to this. What would have happened if the same study had been done fifty years ago?

The headline is a bit inaccurate, though; the monkeys responded equally quickly to the problems, but they were less accurate (94% accuracy for humans, 76% for monkeys).

All snarking aside, though, it's interesting that there are parts of mathematics that appear to not depend on linguistic abilities. And it seems a bit surprising that monkeys would be nearly as good at these tasks as college students, because the college students have had quite a bit more mathematical experience! But the task in question had to be done quickly enough that it really couldn't be verbalized, and most of the students' mathematical experience has been mediated through language.

The actual article is available online: Jessica F. Cantlon, Elizabeth M. Brannon, "Basic Math in Monkeys and College Students", PLoS Biology 5(12): e328. (It's nice to be able to actually read the article that the journalists are hastily generalizing about! Often a subscription is required to do so, which is incredibly frustrating.)

## 17 December 2007

### What is infinity, anyway?

At this post from A Dialogue on Infinity, Alexandre Borovik writes about an experiment that one of his teachers tried once, in which the calculus teacher at his boarding school tried to build all of calculus in terms of finite elements. The logic here is basically the following: the main use of calculus is to solve differential equations, and the differential equations basically come from approximating some discrete calculation as a continuous one. (The example that comes to mind for me is the way in which the equation of the catenary, the shape that a chain hangs in if it's only under the influence of gravity, is usually derived; see, for example, this page, which asks us to "Consider the equilibrium of the short length of chain subtending a distance dx on the x-axis.") And then if you're reduced to solving a differential equation numerically (a common state of affairs), this is basically done by this same sort of finite element analysis -- Euler's method and the like.
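Euler's method itself fits in a few lines; here's a minimal Python sketch (the test equation y' = y, whose exact solution is e^t, is my choice of example, not Borovik's):

```python
from math import exp

def euler(f, y0, t0, t1, steps):
    """Approximate y(t1) for y' = f(t, y), y(t0) = y0, by Euler's method."""
    h = (t1 - t0) / steps
    t, y = t0, y0
    for _ in range(steps):
        y += h * f(t, y)  # step along the tangent line
        t += h
    return y

# y' = y with y(0) = 1; the exact answer at t = 1 is e = 2.71828...
approx = euler(lambda t, y: y, 1.0, 0.0, 1.0, 10000)
```

The global error of Euler's method shrinks linearly in the step size, so with 10000 steps the answer agrees with e to three or four decimal places.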

I'd be interested to see how that worked.

But on the other hand, I'm not sure how valuable it is. Sometimes finite numbers are large enough that one doesn't want to deal with them as finite numbers, and one wants to make infinitesimal approximations. Basically, sums are hard, and integrals are easy. This is the insight behind the Euler-Maclaurin formula, which approximates sums by integrals and then lets you know how good the approximation is. It's hard to come up with the formula for the sum of the first n 8th powers; but do you really need the exact number, or do you just need to know it's about n^9/9, which is what integration would tell you?
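A quick sanity check of that approximation, in Python (the choice n = 1000 is just for illustration):

```python
# compare the sum of the first n 8th powers with the integral approximation n^9/9
n = 1000
exact = sum(k ** 8 for k in range(1, n + 1))
crude = n ** 9 / 9
# the relative error of the crude approximation is about 4.5/n;
# adding the next Euler-Maclaurin correction term, n^8/2, does much better
better = n ** 9 / 9 + n ** 8 / 2
```

For n = 1000 the crude approximation is off by about half a percent, and the corrected one by a few parts in a million.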

A commenter at an earlier post from the same blog wrote:
Your example from Computer Science reminds me of something often forgotten or overlooked in present-day discussions of infinite structures: that some of the motivation for the study of the infinite in mathematics in the 19th and early 20th centuries came from physics where (strange as it may seem to a modern mathematician or a computer scientist) the infinite was used as an approximation to the very-large-finite.

I didn't realize that historical point. But somehow that doesn't seem strange to me. This may mean I spent too much time hanging around physicists in my formative years.

## 16 December 2007

Saw Friday's episode of Numb3rs, which I usually skip because the mathematical content is usually laughable from my point of view. But I learned about the elevator paradox, which says that when you're at the bottom of a building, usually the first elevator which comes will be going down, but when you're at the top of a building, it'll usually be going up. The reason, according to the Wikipedia article, is:
Essentially, the explanation seems to be this: a single elevator spends most of its time in the larger section of the building, and thus is more likely to approach from that direction when the prospective elevator user arrives. An observer who remains by the elevator doors for hours or days, observing every elevator arrival, rather than only observing the first elevator to arrive, would note an equal number of elevators traveling in each direction.

But this raises an interesting question: how do we know that an equal number of elevators travel in each direction? This is, it appears, basically conservation of elevators: if in general more elevators went up than down past a certain point, then the number of elevators below that point would steadily decrease. Surprisingly, none of the online expositions seem to mention this. They seem to take it as obvious that, over the long run, elevators pass a given point equally often in each direction. But I've spent enough time with probability to know how ridiculously counterintuitive it can be.
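Conservation aside, the asymmetry for the first elevator is easy to see in a toy simulation (Python; the single elevator sweeping at constant speed, the ten-story building, and the uniformly random arrival time are all modeling assumptions of mine):

```python
import random

floors = 10          # the elevator sweeps between floor 0 and floor 10
observer = 1         # the observer waits near the bottom, on floor 1
cycle = 2 * floors   # one full up-and-down sweep, at unit speed

random.seed(0)
trials = 100_000
down_first = 0
for _ in range(trials):
    t = random.uniform(0, cycle)   # arrival time within the cycle
    # the elevator passes the observer's floor going up at time `observer`
    # and going down at time `cycle - observer`
    if observer <= t < cycle - observer:
        down_first += 1            # the next pass is the downward one

frac = down_first / trials  # should be near 1 - observer/floors = 0.9
```

In this model the first elevator is going down about 90% of the time for the observer on floor 1, even though over a long stretch of time equal numbers pass in each direction.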

Also, should Wikipedia have mathematical proofs?, from Slashdot. I haven't thought about this; if I think of anything worth saying I'll post it here, but the link alone didn't seem worth its own post.

## 15 December 2007

### Playing around with sums of squares

An interesting random fact, learned from Arcadian Functor:

1^2 + 2^2 + 3^2 + ... + 24^2 = 70^2

This doesn't seem quite so weird if you recall that the sum of the first n squares is n(n+1)(2n+1)/6; then the sum of the first 24 squares is (24)(25)(49)/6.

But are there any other n for which the sum of the first n squares is also a square? Not under a million. I tested. It seems kind of tricky to prove this directly, because knowing the factorization of any of n, n+1, or 2n+1 doesn't tell you much about the factorizations of the others.
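Here's the brute-force check in Python. (This is in fact the classical "cannonball problem"; Watson proved in 1918 that n = 1 and n = 24 are the only solutions, so the search is guaranteed to come up empty beyond 24.)

```python
from math import isqrt

def f(n):
    # the sum of the first n squares
    return n * (n + 1) * (2 * n + 1) // 6

def is_square(k):
    r = isqrt(k)
    return r * r == k

# search for n such that 1^2 + 2^2 + ... + n^2 is a perfect square
square_pyramids = [n for n in range(1, 100_001) if is_square(f(n))]
# only n = 1 and n = 24 turn up
```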

Let f(n) = n(n+1)(2n+1)/6; for what proportion of values of n is f(n) squarefree? For random integers the proportion of squarefree numbers is 6/π^2 ≈ 0.61. Let g(n) be the number of integers m in [1, n] such that f(m) is squarefree; then g(1000) = 504, g(10000) = 5029, g(100000) = 50187.

Hmm, it looks an awful lot like g(n) ~ n/2, doesn't it? I have no idea if this is true. It seems quite obvious that g(n) ~ cn for some constant c, but I'm not sure if c is exactly 1/2. (And I'd be surprised if it did come out to be 1/2; these sorts of probabilities are rarely rational.)
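A sketch of the computation in Python (trial division by squares is crude, but fine at this scale):

```python
from math import isqrt

def f(n):
    # f(n) = 1^2 + 2^2 + ... + n^2 = n(n+1)(2n+1)/6
    return n * (n + 1) * (2 * n + 1) // 6

def squarefree(k):
    # k is squarefree iff no square d^2 > 1 divides it
    return all(k % (d * d) for d in range(2, isqrt(k) + 1))

# count m in [1, 1000] with f(m) squarefree; roughly half qualify,
# versus the 6/pi^2 ~ 0.61 one would expect of random integers
g_1000 = sum(squarefree(f(m)) for m in range(1, 1001))
```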

### Tilings of the plane

For various reasons, I've been thinking about tilings of the plane a lot lately. This reminded me of a paper I haven't thought about in a while, by Cris Moore, Some Polyomino Tilings of the Plane, which "calculates generating functions for the number of tilings of rectangles of various widths by" various polyominoes. Let T(m,n) be the number of tilings of an m-by-n board by a given set of tiles; then the bivariate generating function is in general a quite complicated object. (I don't know of any explicit examples of this function, which I think just goes to show that it's complicated!) Even the "diagonal" generating function, which has as the coefficient of x^n the number of tilings of an n-by-n square, seems to be on the ugly side. By contrast, the generating function for tilings of a strip of fixed width, with x marking the length, is usually a rational function. In a sense tilings of strips of fixed width are a "one-dimensional" problem -- for example, one could encode tilings of fixed-width strips with a certain set of tiles as words over some finite alphabet, I suspect. The fact that these generating functions are rational seems to be a "folk theorem" -- anybody who's thought about tilings knows it, but I've never seen a proof.

For example, the numbers of tilings of 2-by-n rectangles with dominoes are just the Fibonacci numbers! (See Mathworld's pictures.) This is easy to see. Let T(n) be the number of ways to tile a 2-by-n rectangle with dominoes. The left edge of a 2-by-n rectangle (that's 2 rows, n columns) tiled with dominoes is either a single vertical domino, in which case there are T(n-1) ways to finish the tiling, or two horizontal ones, in which case there are T(n-2) ways to finish the tiling. So T(n) = T(n-1) + T(n-2); noting that T(1) = 1 and T(2) = 2 finishes the proof.
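The recurrence is easy to sanity-check against a naive brute-force count; here's a Python sketch that places a horizontal or vertical domino at the first empty cell and recurses:

```python
def count_tilings(m, n):
    """Count domino tilings of an m-by-n board by naive backtracking."""
    filled = [[False] * n for _ in range(m)]

    def go():
        # find the first empty cell, scanning row by row
        cell = next(((i, j) for i in range(m) for j in range(n)
                     if not filled[i][j]), None)
        if cell is None:
            return 1  # the board is completely tiled
        i, j = cell
        total = 0
        if j + 1 < n and not filled[i][j + 1]:      # horizontal domino
            filled[i][j] = filled[i][j + 1] = True
            total += go()
            filled[i][j] = filled[i][j + 1] = False
        if i + 1 < m and not filled[i + 1][j]:      # vertical domino
            filled[i][j] = filled[i + 1][j] = True
            total += go()
            filled[i][j] = filled[i + 1][j] = False
        return total

    return go()

two_by_n = [count_tilings(2, n) for n in range(1, 9)]
# the Fibonacci numbers: 1, 2, 3, 5, 8, 13, 21, 34
```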

But the number of tilings of an m-by-n rectangle by dominoes is given by the quite difficult formula

T(m,n) = product over j = 1, ..., ceil(m/2) and k = 1, ..., ceil(n/2) of [4 cos^2(jπ/(m+1)) + 4 cos^2(kπ/(n+1))],

which is one of those curious formulas that occur in combinatorics in which some expression that has no business even being an integer is not only an integer, but manages to count something! (Notice that if both m and n are odd, then T(m,n) should be zero; the formula takes care of this nicely. Namely, if m and n are both odd then the j = (m+1)/2, k = (n+1)/2 term of the product is 4 cos^2(π/2) + 4 cos^2(π/2) = 0. I'm quoting Graham, Knuth, and Patashnik for this formula; it's originally due to Kasteleyn. Anybody who mentions domino tilings is required by law to cite Kasteleyn's paper.) There's a rather strange interpretation of this formula I read somewhere once: given an m-by-n board, you can write the number 4 cos^2(jπ/(m+1)) + 4 cos^2(kπ/(n+1)) in the (j, k) square, and the product of those numbers over a quarter of the board is the number of domino tilings of the board.
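The formula is pleasant to evaluate numerically; here's a Python sketch (rounding the floating-point product to the nearest integer, which is fine at these sizes):

```python
from math import cos, pi, prod

def kasteleyn(m, n):
    # Kasteleyn's product formula for domino tilings of an m-by-n board
    return round(prod(
        4 * cos(j * pi / (m + 1)) ** 2 + 4 * cos(k * pi / (n + 1)) ** 2
        for j in range(1, (m + 1) // 2 + 1)      # j = 1 .. ceil(m/2)
        for k in range(1, (n + 1) // 2 + 1)))    # k = 1 .. ceil(n/2)
```

Small cases agree with the Fibonacci recurrence (2 tilings of the 2-by-2, 3 of the 2-by-3), an odd-by-odd board gives 0, and the ordinary 8-by-8 chessboard has 12,988,816 domino tilings.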

Furthermore, if you consider log T(m,n), the product becomes a sum:

log T(m,n) = sum over j = 1, ..., ceil(m/2) and k = 1, ..., ceil(n/2) of log[4 cos^2(jπ/(m+1)) + 4 cos^2(kπ/(n+1))],

and we can approximate that sum by an integral, since it's a Riemann sum; thus

log T(m,n) ≈ (mn/π^2) ∫_0^(π/2) ∫_0^(π/2) log(4 cos^2 x + 4 cos^2 y) dx dy.

The integral isn't elementary, but it can be evaluated numerically; thus we get log T(m,n) ≈ 0.29156 mn. (The constant here is G/π, where G is Catalan's constant.)

Perhaps a bit surprisingly, this only depends on the size of the rectangle being tiled, not its shape; that's not quite true, though, because the approximation by an integral is a bit crude. Still, for example, an 80 by 80 rectangle has 1.18 × 10^800 domino tilings, while a 40 by 160 rectangle has 4.26 × 10^797; in terms of the logarithm (which is how these things should be viewed) those are quite close. There are deep connections to statistical physics here that I don't quite understand, although I could if I put my mind to it; the limit of log T(m,n)/mn as m and n get large is called the entropy of a set of tiles (Moore), and I believe something like this is true for any set of tiles, although entropies aren't exactly easy to calculate.
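Those counts can be checked against Kasteleyn's formula in logarithmic form (Python; summing logarithms avoids overflowing floating point with 800-digit numbers):

```python
from math import cos, pi, log

def log_tilings(m, n):
    # log of the number of domino tilings, as a sum of logs of Kasteleyn's factors
    return sum(
        log(4 * cos(j * pi / (m + 1)) ** 2 + 4 * cos(k * pi / (n + 1)) ** 2)
        for j in range(1, (m + 1) // 2 + 1)
        for k in range(1, (n + 1) // 2 + 1))

square = log_tilings(80, 80) / log(10)    # log10 of the 80-by-80 count
strip = log_tilings(40, 160) / log(10)    # log10 of the 40-by-160 count
```

Here `square` comes out near 800.07 and `strip` near 797.63, matching 1.18 × 10^800 and 4.26 × 10^797; and log T(80,80)/6400 is already within about 0.004 of the limiting entropy 0.29156.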

References
1. Ronald L. Graham, Donald E. Knuth, Oren Patashnik. Concrete Mathematics, 2nd edition. Addison-Wesley, 1994.
2. P. W. Kasteleyn, "The statistics of dimers on a lattice, I: The number of dimer arrangements on a quadratic lattice." Physica 27 (1961) 1209-1225.
3. Cris Moore, "Some Polyomino Tilings of the Plane". Available from the author's web site, http://www.santafe.edu/~moore/.

## 14 December 2007

### Credit card "points"

I got a credit card today.

## 01 December 2007

### Boiling eggs and dimensional analysis

A formula for boiling an egg. We learn that the time it takes to cook an egg is

t = [m^(2/3) c ρ^(1/3) / (K π^2 (4π/3)^(2/3))] ln[0.76 (T_egg - T_water)/(T_yolk - T_water)],

where m is the mass, ρ the density, c the specific heat capacity, and K the thermal conductivity of the egg. T_egg is the initial temperature of the egg, T_water is the temperature of the cooking water, and T_yolk is the temperature of the yolk-white boundary when the egg is done. (The egg is modeled as a homogeneous, spherical object; the yolk-white boundary is just a proxy for a certain distance from the center of the egg.)

What I keep noticing about formulas that purport to model real-world situations is that no matter how complicated they are, when numerical exponents appear they are always rational numbers. This is also true for purely mathematical phenomena. If someone told me that some statistic of some combinatorial object, for large n, had mean an and standard deviation bn^(7/22), for some bizarre, probably-transcendental constants a and b, I would think "ah, that must be a complicated proof, to generate such a large denominator." But if they said that this statistic had mean n and standard deviation n^(1/π), I would know they were pulling my leg. Of course, these would be fairly difficult to distinguish "experimentally". I suspect the reason that such exponents are rational in physics problems has to do with dimensional analysis. For example, in the case of this problem, someone with sufficient physical intuition would probably guess that, fixing the temperatures, the important variables are the mass, density, thermal conductivity, and specific heat capacity of the egg. How can we multiply these together in some way to get a time? Well:

• mass (m) has units of kg

• density (ρ) has units of kg m^(-3)

• thermal conductivity (K) has units of W/(m K), or kg m K^(-1) s^(-3)

• specific heat capacity (c) has units of J/(kg K), or m^2 s^(-2) K^(-1)

Thus m^α ρ^β K^γ c^δ has units kg^(α+β+γ) m^(-3β+γ+2δ) K^(-γ-δ) s^(-3γ-2δ). But we know this is a time, so it must have units of seconds; thus we have the system of equations

α + β + γ = 0, -3β + γ + 2δ = 0, -γ - δ = 0, -3γ - 2δ = 1

which has the solution α = 2/3, β = 1/3, γ = -1, δ = 1, exactly the exponents we see. Similar analyses are possible for many physics problems, and in the end one gets a system in which integer combinations of the exponents are integers; thus the exponents themselves must be rational. And that's the really important part of the problem. In fact, all that really matters is that α=2/3. In practice, if you're cooking eggs, the thermal properties will be the same from egg to egg, but some will be larger than others; the exponent of mass tells you how to adjust for that, and everything else can be calibrated experimentally.
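That little system can be solved exactly in a few lines of Python (using the fractions module to keep the arithmetic rational; the matrix rows are just the four equations above):

```python
from fractions import Fraction

# augmented matrix [kg, m, K, s equations] in the unknowns alpha, beta, gamma, delta
rows = [
    [1, 1, 1, 0, 0],    # kg: alpha + beta + gamma = 0
    [0, -3, 1, 2, 0],   # m : -3 beta + gamma + 2 delta = 0
    [0, 0, -1, -1, 0],  # K : -gamma - delta = 0
    [0, 0, -3, -2, 1],  # s : -3 gamma - 2 delta = 1
]
a = [[Fraction(x) for x in row] for row in rows]

# Gauss-Jordan elimination with exact rational arithmetic
size = 4
for i in range(size):
    p = next(r for r in range(i, size) if a[r][i] != 0)  # find a pivot row
    a[i], a[p] = a[p], a[i]
    a[i] = [x / a[i][i] for x in a[i]]                   # scale pivot to 1
    for r in range(size):
        if r != i and a[r][i] != 0:                      # clear the column
            a[r] = [x - a[r][i] * y for x, y in zip(a[r], a[i])]

solution = [a[i][size] for i in range(size)]
# solution is [2/3, 1/3, -1, 1] -- i.e. alpha = 2/3, beta = 1/3, gamma = -1, delta = 1
```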

Mathematicians don't worry about this so much... but lots of these sorts of problems have some sort of physical model, which probably explains why rational exponents are common in combinatorial problems but I can't ever remember seeing an irrational exponent. (That's not to say they don't exist -- just that the solution to any such problem would be quite unconventional.)