God Plays Dice: August 2008

29 August 2008

The Olympic poset

During the Olympics (yeah, I know, you've all forgotten about the Olympics), there was much argument about whether the right thing to do, if we want to determine which country "won the Olympics", is to count the country that got the most gold medals (China), the most medals (the US), or something else -- say a points system that allocates three points for a gold, two for a silver, and one for a bronze.

Simon Tatham has prepared a Hasse diagram of the medal table. The main idea is:

So we want to say that one country has done strictly better than another if the medal score of the latter can be transformed into the former by a sequence of medal additions and medal upgrades.

This gives a partial order on the countries.

Alternatively, we could say country A does strictly better than country B if and only if A gets more points than B under all weighting schemes in which we assign x points for a gold, y points for a silver, and z for a bronze, with x ≥ y ≥ z ≥ 0. This seems like it's equivalent to Tatham's order, but I haven't thought that hard about it.

One could extend this to include the population of a country; the natural order there would be that A did strictly better than B if the medal score of B can be transformed into that of A by a sequence of medal additions, medal upgrades, and population reductions. The idea here, of course, is that if two countries win the same assortment of medals, the one with lower population did better. But going there is dangerous; do we then take into account GDP? Popularity of sports in general in the country? The fact that the particular set of sports in the Olympics is more popular in some countries than others?

28 August 2008

Gowers on the PCM

Here is an MP3 interview with Tim Gowers about The Princeton Companion to Mathematics, of which he is one of the editors.

It's $100. Who wants to buy it for me?

26 August 2008

Predicting the next Mersenne prime

Robert at Casting Out Nines reports the new that the Great Internet Mersenne Prime Search claims to have found the 45th Mersenne prime, but they're waiting for verification before they report its identity. He invites guesses on the number of digits.

Here's my guess. A remark in Sloane's encyclopedia says the number of Mersenne primes up to exponent N is, conjecturally, about K log N for some constant K. (Here's a plot illustrating that.)

Let N_r denote the exponent of the rth Mersenne prime. The 44th Mersenne prime has exponent N₄₄ = 32582657, so we have approximately

44 ~ K log 32582657

from which we get K ~ 2.5434. Thus the 45th Mersenne prime has exponent N₄₅ satisfying approximately 45 ~ K log N₄₅; thus N₄₅ ~ 48276546. (Alternatively, we could have raised 32582657 to the 45/44 power, but it's not obvious why!) So the number of digits is approximately N₄₅ log₁₀ 2 ~ 14532688; I'll take 14.5 million digits.

This method, it turns out, might show some sort of systematic bias. If I'd predicted N_r = (N_r-1)^(r/(r-1)) with r = 2, 3, ..., 44, I would have had 16 underpredictions (the predicted exponent less than the true exponent) and 27 overpredictions. That's not quite statistically significant but it's not something you'd just neglect either. But this is just for fun -- if I were seriously trying to make a prediction I would have defined things more precisely, for one. (Oddly enough, since Mersenne primes get sparser as exponents get larger, the "most likely" single exponent is probably the single prime after N₄₄ -- unless there's something that keeps them from clustering!) But if I were going to bet on this I'd also take into account the history of GIMPS; how long has it been since they've reported a prime, and how quickly does their search appear to move?

If you want to take a shot at that, be my guest. (I may regret what I've just unleashed.)

Best. Title. Ever.

Birth Control for Giants, by Joel Spencer and Nick Wormald.

It's about avoiding the birth of the "giant component" in a random graph.

Fibonacci powers of two

Are there Fibonacci numbers which are also powers of 2, other than the obvious ones?

The sequence of Fibonacci numbers begins

1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, ...

where each number is the sum of the two previous ones. By some tortuous chain of logic I found myself wondering if any of them are powers of two, other than the obvious ones (1, 1, 2, 8).

It seems like there shouldn't be, because both sequences are exponentially sparse. The number of Fibonacci numbers in [2ⁿ, 2ⁿ⁺¹-1] is at most two. So the "probability" that 2ⁿ is Fibonacci is at most 2/2ⁿ, since there are 2ⁿ integers in this interval; if we sum this over n it converges, so we expect there to be finitely many, and the exponential decay seems to indicate that those Fibonacci numbers which are powers of two should appear early.

Now, it turns out that a positive integer z is a Fibonacci number if and only if one of 5z² + 4 and 5z² - 4 is a perfect square. I learned this from Wikipedia, which gets it from a 2007 book by Alfred Posamentier and Ingmar Lehmann entitled The (Fabulous) FIBONACCI Numbers. So the question is equivalent to asking which numbers that are five times a power of four, plus or minus four, are perfect squares. That is, if one of 5(4ⁿ) + 4 and 5(4ⁿ) - 4 is a perfect square, then 2ⁿ is a Fibonacci number, and conversely. We have 5(4⁰) + 4 is a square, as is 5(4¹) + 4 and 5(4³) + 4; these are 9, 25, and 324 respectively.

By looking at the Fibonacci numbers modulo 8, we see that multiples of 8 occur exactly in positions which are divisible by 6 -- the sixth, twelfth, ... Fibonacci numbers are the only multiples of 8. The "plus or minus" in the previous paragraph seems to depend on the parity of the position -- a Fibonacci number z occuring in an even position has 5z² + 4 a perfect square, and a Fibonacci number occuring in an odd position has 5z² - 4 a perfect square. Powers of two have to occur in even positions (they're multiples of eight) and so we're looking for positive integers n such that 5(4ⁿ) + 4 is a perfect square; then 2ⁿ will be a Fibonacci number.

It might also be interesting to look at when the first Fibonacci number which is divisible by 2ⁿ appears. (This is related to, but not the same as, the Pisano period, which is the period of the Fibonacci numbers modulo some integer.) If 2ⁿ is a Fibonacci number, it is the first Fibonacci number divisible by 2ⁿ.

Does someone who's better at number theory than I am want to finish this?

Edit (3:43 pm): Stregatto points out that my implicit conjecture that 1, 2, 8 are the only Fibonacci numbers which are powers of two is true, and is a consequence of Carmichael's theorem for Fibonacci numbers.

25 August 2008

Sign errors and upside-down things

When was the last time you made a sign error?
When was the last time you visualized something upside-down by mistake?
I thought so.

Marcello Herreshoff writes at Overcoming Bias on using the native architecture, i. e. taking advantage of the things that the human brain is naturally optimized for when trying to do more recently invented things, like mathematics.

(Although I did visualize something upside-down today. In particular, I drew the Hasse diagram of a certain poset and then realized that the way I was thinking of things, I really wanted the reverse order. But that doesn't really count, because I'd never seen this particular poset before.)

23 August 2008

Mathematical trivium

Trivium Mathématique, V. I. Arnold (in French) is a list of 100 problems that Arnold set as a list of mathematical problems that (university) students in physics ought to be able to solve. This is preceded by Arnold stating that the system of oral exams in Russia should be replaced with written exams, and that a system based around the ability to solve certain problems is superior to a system based around the ability to regurgitate certain theorems. He then goes on to give a list of such problems.

As someone who tends to prefer problem-solving to theory-building, I find the following particularly interesting (my translation):

A student who takes more than five minutes to calculate the average of sin ¹⁰⁰ x with a precision of 10% has no mastery of mathematics, even if he has studied nonstandard analysis, universal algebra, supervarieties, or plongement [I don't know this word] theorems.

Go ahead, try that problem! (Curiously, it's not one on the list.) More generally, it's an interesting bunch of questions in various branches of mathematics, biased towards but by no means exclusively in calculus and differential geometry; this is no surprise as it's meant for physicists.

(Why the French version, you ask? The article was originally in Russian; there's an English translation but it's not free. Links to free versions in Russian or English, if they exist, would be appreciated.)

Edit, 12:07 am Sunday: An anonymous commentator provides the English version, and Dmitri Pavlov the Russian.

20 August 2008

Telescopic text

Telescopic Text, by Joe Davis.

The web page starts out with the words "I made tea. by Joe" and various words can be clicked on; when you click on them they expand, so "Joe" becomes "Joe Davis", for example, when you click on it. Clicking on "I" reveals the word "Yawning" preceding it; "tea" becomes "a cup of tea", and so on. As you expand the text, some biscuits that weren't there before, comments on how to make tea, and so on materialize.

This reminds me of something that's been thrown around the mathematical blogosphere as a possible way to write papers that might be well-adapted to our present computer technology; start with a very high-level sketch of a proof, and make each step clickable. Upon clicking on a word, the proof is expanded to remind you what that word means, how exactly one uses that particular technique here, etc.

This would require more work than writing a paper in the usual way, though; it's not clear whether it's worth the trouble. And there's always the issue that some people like to read papers away from the computer, they eventually end up in journals which are printed on paper, and so on; what level of detail should be published there?

19 August 2008

Trying to explain the Olympic gymnastics tiebreaker

Nastia Liukin of the USA wins silver on the uneven bars; He Kexin of China wins gold. This is news because the two of them had the same score. I've seen a lot of bad explanations of how the tiebreaker works, and implications that it involves some Big Scary Mathematics.

The way gymnastics scoring currently works is that each contestant receives a score for the difficulty of their routine (I think this is open-ended), called the "A score", which is essentially the sum of the difficulties of the various things they attempted to do. Then six judges give a score out of 10, in multiples of 0.1, for how well they did it; the lowest and highest scores are thrown out and the other four are averaged, and this is the "B score". The two scores are added to give the score for that routine.

Both Liukin and He received 16.725 points -- so they're tied, right? Wrong. The first tiebreaker, in this case, is that the contestant who had the higher A score wins -- which rewards the contestant that attempts a more difficult routine. But both had A score 7.700, B score 9.025.

The impression I got (watching NBC's broadcast last night) is that if there's still a tie, then the B scores given by the four middle judges are looked at individually. In this case, for He the six judges gave 9.3, 9.1, 9.1, 9.0, 8.9, 8.9; for Liukin they were 9.3, 9.1, 9.0, 9.0, 9.0, 8.8. In both cases the middle four scores add up to 36.1. The lowest of these scores (so the second-lowest of the original scores) is thrown out. This leaves 27.2 for He, 27.1 for Liukin, so He wins. See the tiebreaker page at the official Beijing Olympics site; there's no explanation here, but he various numbers shown there seem to bear it out. Note that instead of reporting a score of x, they sometimes use 10-x, which is the number of points deducted from the highest possible B score, which is 10.0. This explains the phrasing in some sources that refers to an "average of deductions".

I'm not sure what the logic behind this is. At first I thought that it rewarded inconsistency -- the competitor who has their scores more tightly clustered will probably have a higher second-lowest score. But this isn't the right interpretation, because the scores weren't received on different routines, but on different people's measurements of the same routine -- so does the tiebreaker reward having a routine which is hard to score? Also, it was stated many times that there are no ties in the current scoring system, but what would have happened had He and Liukin received identical scores from each judge?

The math here isn't that hard; I think the big flaw was that nobody seemed to know what the rules were.

14 August 2008

iTip

At Freakonomics: Is Tipping Really So Hard? Apparently the iPhone App Store features a rather large number of "tip calculators".

Is it really so hard to multiply (approximately) by .15? (Or some other number; I don't wish to debate the "correct" tipping percentage here.) I've always figured that the hard part of tipping is not the multiplication, but the knowing who to tip. For example, do you tip food delivery people? I do on those rare occasions that I get food delivered; they bring the food to your house, which you could argue is more work than bringing it to your table in the restaurant! Plus a lot of them pay for their own gas. But in reality, I don't get food delivered, because I know I should tip, but I don't want to. Do you see how this gets complicated?

However, I am better than average at math and worse than average at "social skills", so I hesitate to generalize from my experience.

But as some people have pointed out there, this gives you an excuse to pull out your iPhone. And it's also the case that it's basically the simplest application one could write that actually does something useful.

A bill-splitting calculator I can see being useful, in cases where the amount of food people ordered varies enough that people aren't willing to split it equally; although I personally would never be in a party that would need it, because I can do the arithmetic. (And yes, my friends know this, so on occasion they'll ask me how much they should pay instead of figuring it out themselves.) But I can't eat with everybody! But even a bill-splitting calculator suffers from the fact that the hard part of splitting the bill is not the math, it's remembering who ordered what.

12 August 2008

Variance in Olympic events

It's often claimed that the reason that there are many more men than women in certain academic disciplines (mathematics is one, but that's not the point of this post) is not that men and women have different mean abilities, but rather that the standard deviation of male ability is larger than the standard deviation of female ability. (Of course, it is unwise to espouse these views publicly, for political reasons; that's what got Larry Summers in a lot of trouble.)

It occurs to me, having watched lots of the Olympics in the last few days, that something similar might be true in athletic events. I'm not claiming that men and women are physically identical (I'm not blind), or that their average performance in physical feats is the same. But it may be the case that the difference between the very best men and the very best women in physical feats (say, times in some sort of race, because these are the most easily quantified) is larger than the difference between the average man and the average woman, because there could be more variance among men than women.

Is there any evidence for this? I'm obviously not a student of this sort of thing (in fact, I don't even know what "this sort of thing" is called, although it's clearly some subfield of biology or medicine).

Oh, and Jordan Ellenberg wrote an explanation of why the new gymnastics scoring system is good. I'm glad he did, because I'd had a feeling it was better than the old system but was having trouble articulating why.

11 August 2008

Lucky babies redux

Two babies born at 8:08 am on 8/8/08, weighing eight pounds, eight ounces, both in the United States.

How many would you expect?

The 2007 crude birth rate for the US is 14.2 per 1000, per year; the estimated US population is 304,843,316. The product of these is 4,328,775 births per year, or 8.25 births per minute.

From here I can find the distribution of birth weights (in Norway, 1992-1998 -- better figures would be appreciated). About six percent of babies weigh between 3850 and 3950 grams, which is a 3.5-ounce-wide interval; thus about 6%/3.5 = 1.7% of babies weigh 8 pounds, 8 ounces (to the nearest ounce) at birth.

So the expected number of babies born in the US at that particular minute, at that weight, is about 1.7% of 8.25, or 0.14.

There were two. The probability of this happening, assuming births are a Poisson process, is about one in 112. I wouldn't trust this number too much, because birth weights are supposedly growing with time and the Norwegian distribution is probably different from the US distribution.

So if I had to guess, people at the hospitals are fudging the numbers; if we were being totally honest, those babies might turn out to have been born at 8:09 am and weighed eight pounds, seven ounces, or something like that. Not that there's anything wrong with that.

(This post borrows a lot from a post I just remembered I made, lucky babies, about babies born on July 7, 2007 at 7:07 and weighing seven pounds, seven ounces -- but those were fictional babies.)

Google ads disappoint again...

Google ad, via gmail: "Quantum Corporate Funding - Quantumfunding.com - Factoring For Contractors Direct Contractor Funding Source".

I was hoping it would be about quantum computers to do integer factoring, say using Shor's algorithm, but something seemed a bit off about the text; it had "quantum" and "factoring" in it, but also words like "funding" and "contractor".

"Factoring" in this context is apparently a banking service that involves selling your accounts receivable (i. e. money people owe your business) in exchange for a (slightly smaller) amount of cash, which you get now. Wikipedia has more. I'm not quite sure why it has that name.

But it wouldn't have surprised me to see somebody saying that they have invented a quantum computer large enough to do non-trivial factorization via a Google advertisement. I'm not saying they'd be right, but I've seen other mathematics/physics crackpottery delivered through that channel.

10 August 2008

Big numbers confuse the New York Times

From Chinese basketball builds towards podium (August 9):

"With 300,000 million people playing basketball across the country — roughly the same number as the population of the United States..."

Really? I mean, it's almost a cliche by this point that There Are Lots Of People In China -- but I didn't know there were quite that many.

09 August 2008

Things the Olympics broadcast won't tell you

Things we're not hearing about during coverage of Olympic swimming: apparently the construction of the "Water Cube" (the Beijing National Aquatics Center) is based on the Weaire-Phelan sturucture. More specifically, the Weaire-Phelan structure is apparently the best known solution to the problem of partitioning space into cells of equal volume with minimal suface area. The edges of the cells in this structure make up the steel frame of the building; in order to make a more "organic"-looking pattern the pattern was sliced at an oblique angle.

Somehow I missed this in the New York Times on Tuesday. See also the Guardian from 2004 and Science News a few weeks ago.

08 August 2008

A story of Google, Wikipedia, and languages I only sort of read

The Viquipèdia article "Mathemàtiques" has a bunch of amusing pictures that are meant to be icons of different types of mathematics: a Rubik's cube for abstract algebra, a Koch snowflake for fractal geometry, the Lorenz attractor for chaos theory, dice for probability, an elliptic curve for number theory, and so on.

Some areas don't translate into pictures as well: for category theory they have a commutative diagram, for combinatorics the six permutations of [3], etc.

Also, here are Representacions matemàtiques de diversos camps. (The English version of the article does not have this picture currently; they have a picture of Euclid.)

Much of the article seems to be a straight translation of the English version, but I find myself focusing more on the pictures when reading the Catalan version, because I don't actually read Catalan. But I read French and, to a lesser extent, Spanish, so I can figure things out.

(As to why I'm looking at the Catalan wikipedia -- well, I ended up there because Kowalski said he googled 1.70521 when it appeared in some of his work, and I wanted to see the results.))

Such a search finds Wikipedia articles, usually tables of mathematical constants, in Serbian, English, Esperanto, Catalan, Japanese, Thai, Czech, Turkish, Japanese, Serbo-Croatian, and Bosnian. This is both an illustration of the universality of mathematics and the extent to which Wikipedia is an international enterprise. (My apologies if I misidentified any of these languages!)

But the first hit upon Googling 1.70521 (upon this writing) was Kowalski's post, though it's twenty minutes old. That shows you how fast Google is at indexing. (By the time you read this, who knows? This post might be the first hit.)

07 August 2008

Weather and political polls

From a Philadelphia TV weather man, Glenn "Hurricane" Schwartz, upon observing that the low and high temperature for Philadelphia today (70 and 85, respectively) were the "normal" temperatures for the day:

"Exactly normal! It's not normal to be exactly normal!"

which, of course, is true. That's how distributions work.

The weather people on television try a lot less to explain why the weather did what it did than the political people; John and Zeno have talked about this "roller-coaster polling". I suspect this is because once the weather has happened we don't care why it happened that way, while the whole point of polls is to use them to forecast the upcoming election.

06 August 2008

Who's doing election prediction by simulation

Here's a list of all the web sites I know of where one can find simulations of the upcoming 2008 US presidential election. (I'm going to stay out of this game, because who wants to track down polling data every day, re-run the simulations, and so on?) Generally these start by obtaining a probability that one candidate or the other will win each state, from polls and sometimes from demographic data as well. These are then in some way aggregated in "simulated" elections. Often these are accompanied with a probability that Obama will win the election, or would win it if it were held today, and in many cases a probability distribution of the number of electoral votes won by Obama.

Yes, Obama. Not McCain. People who create these sites generally have a choice to make -- since nobody other than Obama and McCain has any reasonable chance of getting electoral votes, stating everything from the Obama point of view or from the McCain point of view has the same content -- but it seems that there's a leaning towards choosing Obama for this purpose in this corner of the blogosphere. (This is an entirely unscientific sample, though.)

Note that although Sam Wang says what he does isn't simulation, that's only because his method allows him to do all the possibilities at once. This is because he uses the magic of generating functions. This trick only works if you can make the simplifying assumption that winning in each state is independent of each other state. This is reasonable if you're trying to predict what would happen if the election were held today -- there's not any big reason for sampling error in different states to be correlated. But if you're trying to predict what will happen in the actual election, this assumption is very risky. It seems that the actual movement of voter opinions in different states should be correlated.

Here's the list:

Stochastic Democracy (David Shor)
Race to 270 (Ben Schak)
Election-Projection.Net (If you want to do your own simulations, they have an interactive calculator.)
FiveThirtyEight.com (Nate Silver)
Princeton Election Consortium (Sam Wang)
270towin.com 2008 Election Simulator

This list is by no means complete.

Worms doing calculus?

Worms do calculus to find food. (Um, not really.)

But apparently worms use salt concentration to find food, and tend to head in the direction of the gradient of salt concentration. That is, they go where there's more salt. This is due to neuroscientist Shawn Lockery and his students at the University of Oregon. I think the paper is the following:

Suzuki H, Thiele TR, Faumont S, Ezcurra M, Lockery SR, Schafer WR (2008). "Circuit motifs for spatial orientation behaviors identified by neural network optimization." Nature 454:114-117.

but I can't be 100 percent sure -- Penn's libraries don't allow access to the electronic version of papers from Nature until twelve months have passed, and I'm not on campus right now. (This is, however, the only paper on Lockery's list with a title fitting the description.)

Saying "worms do calculus to find food" seems a bit disingenuous to me, though. It seems like saying that baseball players do calculus to catch fly balls. The larger point, though, is that neural processes -- of worms or of humans -- can be modeled using mathematical techniques, which may be of use to people trying to develop artificial systems that do these things.

(From John Scalzi, via 360. Apparently this first appeared in blogs a couple weeks ago, but I'm posting it here anyway, because it's new to me, which means it's probably also new to a lot of you.)

04 August 2008

Continued fractions and baseball hitting streaks

The title of this post is misleading, because you might think there's a substantial connection between its two halves. The connection is only that I happened to come across both of these things this morning and it seemed silly to make separate posts about them.

Todd Trimble gives two proofs of the continued fraction expansion for e, namely e = [2, 1, 2, 1, 1, 4, 1, 1, 6, 1, 1, ...]. This is in the usual notation for continued fractions, so it actually means

e = 2 + 1/(1 + 1/(2 + 1/(1 + 1/(1 + 1/(4 + ...)))))

but all the extra 1s obscure the pattern. This is one of those things that it's much easier to state than it is to prove. And I must admit it's always seemed a bit strange to me that e has such a nice continued fraction expansion, while π doesn't -- e has a special place in continued-fraction land.

From the arXiv, via the physics arXiv blog: A Monte Carlo Approach to Joe DiMaggio and Streaks in Baseball, by Samuel Arbesman and Steven Strogatz, which is what it sounds like. This expands upon this piece in the New York Times on the same subject, which I wrote about back in March. And no, I'm not sure why it's in a physics category at the arXiv.

02 August 2008

Fox Sports' World Series odds

At this article: Fox Sports expects at least 1.5 teams to win the World Series this year.

No, I'm not making this up.

They give odds for fifteen Major League Baseball teams to win the World Series, ranging from the Marlins at 40 to 1 to the Angels at 5 to 2. I'll just give the top five teams here: Angels 5:2, Yankees 3:1, Cubs 7:2, Red Sox 4:1, Mets 8:1. So they figure that the Angels has probability 2/7 of winning the World Series, the Yankees 1/4, and so on.

Add up their probabilities for the top five teams: 2/7 + 1/4 + 2/9 + 1/5 + 1/9 = 449/420, which is greater than one. If you add up the probabilities for the fifteen teams they give probabilities for, you get about 1.506.

I'm not arguing with their ranking of the teams. (There might be things to argue with, but contrary to what you might come to believe in the next two months, or to what longtime readers might remember from last August and September, this isn't a baseball blog.) But a bookie that offered these odds would be broke pretty quickly.

Also, about the Phillies (to whom they give 16 to 1 odds to win at all), we're told that "Odds say they can't mash their way in two years in a row." Okay, it may be true that they can't rely on just their offense. But the fact that they did it last year really has no bearing on whether they can do it this year. If anything, the fact that they did make the playoffs last year is evidence that their offense is enough.

01 August 2008

Bourbaki and the AMS

An interesting fact from the August 2008 Notices of the American Mathematical Society (in the article "From the AMS Secretary", by John Ewing, which is mostly a history of the AMS): Nicolas Bourbaki is the only mathematician known to be denied membership in the AMS. Bourbaki was a member of the Société Mathematique de France and applied for reciprocal membership. But Bourbaki was denied membership in the AMS, basically on the grounds that Bourbaki was neither an individual nor an institution.

(Incidentally, tortured grammar in this post can be blamed on the fact that it's not clear whether I should use singular or plural pronouns when referring to Bourbaki.)