31 December 2008

Best-seller encore impatiemment attendu

From Robert Sedgewick's web site: slides for a talk entitled Impatiemment Attendu, given at a conference in honor of Phillipe Flajolet's 60th birthday. The gist of this talk appears to have been something like "The book will be out soon, after about thirty years of gestation." (That's Analytic Combinatorics, in case you're wondering.)

I mention it because apparently, at the beginning of this month in Paris, this post I made in September was projected on a big screen in front of a bunch of important people. I am of course amused. (The title "impatiemment attendu" is not mine, though; I took it from a paper by Nicolas Pouyanne. I suppose the English translation "impatiently awaited" is mine, but this was not a translation that required some huge inspiration.)

At the time, Amazon said that the book would be out on December 31. It's December 31. It's not out yet, as far as I know. I'll be at ANALCO '09 on Saturday. A slide in the presentation says that the book will be available at SODA (ANALCO takes place the day before); maybe the book will be there?

E-mail address change

I have a new e-mail address.

To figure it out, concatenate the first three letters of my first name, my entire last name, and "@gmail.com".

The intrepid reader can figure out my "academic" e-mail address. Once I get things set up those should redirect to the same place anyway. (I'm tired of checking multiple addresses.)

And I apologize for obfuscating the address like this, but it's a new address, and I'd like to keep the spammers at bay for at least a little while.

Happy New Year! (Do I have any readers in Japan, Korea, Australia, or anywhere else where it's 2009 already? And Kate, if you're reading this, I urge you to remember that you're on vacation and you should get off the Internet.)

30 December 2008

Tetrahedra with arbitrary numbers of faces

While reading a paper (citation omitted to protect the "guilty"), I came across a reference to an "n-dimensional tetrahedron", meaning the subset of Rn given by

x1, ..., xn ≥ 0 and x1 w1 + ... xn wn ≤ τ

for positive constants w1,..., wn and τ.

Of course this is an n-simplex. But calling it a "tetrahedron" is etymologically incorrect -- that means "four faces", while an n-simplex has n+1 faces. This probably occurs because most of us tend to visualize in three dimensions, not in arbitrary high-dimensional spaces.

I'm not saying that "tetrahedron" shouldn't be used here -- I'm just pointing out an interesting linguistic phenomenon.

29 December 2008

A combinatorial problem from Crick

I recently read What Mad Pursuit: A Personal View of Scientific Discovery
, which is Francis Crick's account of the "classical period" of molecular biology, from the discovery of the double helix structure of DNA to the eventual figuring out of the genetic code. It differs from the more well-known book by James Watson, The Double Helix: A Personal Account of the Discovery of the Structure of DNA, which focuses more on the characters involved and less on the science.

Crick was trained as a physicist, and learned some mathematics as well, and every so often this pokes through. For example, back when the nature of the genetic code wasn't known, combinatorial problems arose to prove that a genetic code of a certain type was or was not possible. One idea, due to Gamow and Ycas was that since there are twenty combinations of four bases taken three at a time where order doesn't matter, perhaps each one of those corresponded to a different amino acid. This turned out to be false. Another, more interesting problem comes from asking how the cell knows where to begin reading the code. What is the largest size of a collection of triplets of four bases such that if UVW and XYZ are both in the collection, then neither VWX nor WXY is? The reason for this constraint is so that the "phase" of a genetic sequence is unambiguous; if we see the sequence UVWXYZ, we know to start reading at the U, not the V or the W. Thus the collection can't contain any triplet in which all three elements are the same, and it can contain at most one of {XYZ, YZX, ZXY} for any bases X, Y, Z, not necessarily distinct. There are sixty triplets where not all three elements are the same, thus at most twenty amino acids can be encoded in such a code. There are solutions that acheive twenty; see the paper of Crick, Griffith, and Orgel.

Note that the "20" in the two types of code here arises in different ways. If we assume a triplet code with n bases, then the first type of code can encode as many as n(n+1)(n+2)/6 amino acids, the second (n3-n)/3.

Crick says that the more general problem of enumerating the number of codes which imply their own "reading frame" was considered by Golomb and Welch, and separately Freudenthal. Based on the title and the date, I think the first of these is the paper I point to below -- but our library doesn't have that journal in electronic form, and the physical library is closed this week!

F. H. C. Crick, J. S. Griffith, L. E. Orgel. Codes Without Commas. Proceedings of the National Academy of Sciences of the United States of America, Vol. 43, No. 5 (May 15, 1957), pp. 416-421.

George Gamow, Martynas Ycas. Statistical Correlation of Protein and Ribonucleic Acid Composition Statistical Correlation of Protein and Ribonucleic Acid Composition. Vol. 41, No. 12 (Dec. 15, 1955), pp. 1011-1019.

Golomb, S.W., Gordon, B., and Welch, L.R., "Comma-Free Codes", The Canadian Journal of Mathematics, Vol. 10, 1958. (Citation from this list of Golomb's publications; I haven't read it.)

28 December 2008

Tao on calibration of exponents

Terence Tao: Use basic examples to calibrate exponents. This article, for the eventual Tricki, gives many examples of the following basic procedure. In many problems there is a "size" parameter N, and the problem has an "answer" that we believe for some reason behaves like Nk for some constant k. A quick way to find N is to look at "basic examples" (say, random graphs in a graph-theoretic problem).

The interesting thing about this article -- and about the Tricki as a whole, once it finally launches -- is that its organizational principles are not the same as most mathematical exposition. A typical lecture or section of a textbook gives problems with similar statements but not necessarily with similar proofs; the Tricki will group together problems with similar proofs but not necessarily with similar statements.

27 December 2008

Comfort with meaninglessness?

Comfort with meaninglessness the key to good programmers, from Boingboing.

Is this true for mathematics as well as computer programming?

23 December 2008

Housing prices drop 13 percent -- what does this mean?

The New York Times reports on bad housing news:
The median price of a home plunged 13 percent from October to November, to $181,300 from $208,000 a year ago. That was the lowest price since February 2004.
They mean that house prices have gone down 13 percent in a year, i. e. from November 2007 to November 2008. That's what the National Association of Realtors press release says.

But one sees this pretty often -- the confusion between monthly declines and annual declines. And sometimes a 1% decline in a month might be reported as a "12% per year" decline -- but then the "per year" gets dropped, the statement "prices of X dropped 12% this month" is made, and those who aren't familiar with how people who care about the price of X report their numbers get confused.

Don't get me wrong -- a drop of 13% in a year is still a big deal. But a drop of 13% in a month would be a much bigger deal.

22 December 2008

Is wind chill misleading?

From Daniel Engber at Slate: Wind Chill Blows. Back when nobody read this blog, I wrote about how the heat index doesn't make sense to me, because I know what 95 degrees with "typical" humidity for my location feels like, and telling me it "feels like" 102 is misleading. (That's Fahrenheit, not Celsius; we're not literally boiling here in the summer.)

Something similar is true for wind chill. Both of these measures only take into effect two of the many variables that effect comfort -- temperature and either humidity or wind speed. They assume that these are the only two variables which actually vary -- clothing, amount of sunlight, weight, etc. are held constant. In reality, comfort is a function of many variables, and it's misleading to create an index that assumes it's just a function of two variables. People know that they should take more than the temperature into account, but I've seen quantitiatively unsophisicated people think of the wind chill as some perfect index of the weather.

But let's face it, a wind chill of zero sounds scarier than a temperature of sixteen. (Those are approximately the numbers I heard reported this morning in Philadelphia.) That means more people watch the news.

Logarithmic jihadism

From the Wall Street Journal:
U.S. counterterrorism officials have been on guard for homegrown recruitment by radical groups. Intelligence analysts from the New York Police Department, in a study of radicalization in Western Muslim communities, warned that "jihadist ideology" is "proliferating in Western democracies at a logarithmic rate."
Maybe it's just me, but logarithmic proliferation doesn't seem all that scary.

19 December 2008

Uncyclopedia's list of proof methods

Uncyclopedia has a list of methods by proof. My favorite:


Of course, you have to be careful which letters you use as variable names in stating your result -- you can only use one of a, α, and A, for example.
And some of them are methods of proof that are actually used:
Proof by Diagram

Reducing problems to diagrams with lots of arrows. Particularly common in category theory.

And here's an interesting point about mathematical writing:
Proof by TeX

The proof is typeset using TeX or LaTeX, preferably using one of the AMS or ACM stylesheets. When laid out so professionally, it can't possibly have any flaws.

Erdos papers available online, and the Erdos-Turan law for the order of elements in the symmetric group

The collected papers of Paul Erdös are available online.

I'm glad I found this. There's a theorem of Erdös and Turan that I've been curious about for a while. Namely, let Xn be the order of a permutation in Sn selected uniformly at random. Then

\lim_{n \to \infty} {\rm Prob} \left( {\log X_n - {1 \over 2} \log^2  n \over \sqrt{{1 \over 3} \log^3 n} } < y \right) = \Phi(y)

where Φ is the cumulative distribution function of a standard normal random variable. Informally, log Xn is normally distributed with mean (log2 n)/2 and variance (log3 n)/3. Unfortunately the proof, in On some problems of a statistical group-theory, III, doesn't seem to explain this fact in any "probabilistic" way, so I'm not quite as excited to read the paper as I once was. But I had believed the proof was in the first paper in the (seven-paper) series, which is in storage at our library, and it's nearly the holidays, so I probably would have had to wait quite a while to get a copy just to see that it wasn't the one I wanted.

In fact, it was worrying that I had the wrong paper that led me to find this resource in the first place -- seeing the "I" in the citation I had got me curious, so I went to Google. What I expected to see in the Erdos-Turan paper, and what I actually wanted to see, was a "probabilistic" proof somehow based on the central limit theorem. This exists; it's in the paper of Bovey cited below. Also, Erdös seems to have not been good at titling papers; titles "On some problems in X", "Problems and results in Y", "Remarks on Z", "A note on W", etc. are typical. I guess he was too busy proving things to come up with good titles.

Bovey, J. D. (1980) An approximate probability distribution for the order of elements of the symmetric group. Bull. London Math. Soc. 12 41-46.
Erdos, P. and Turan, P. (1967) On some problems of a statistical group theory. 111. Acta Math. Acad. Sci. Hungar. 18 309-320.

Guessing the answer without knowing the question

Ian Ayres asks a question at Freakonomics. Which of the following is the correct answer? 4π, 8π, 16, 16π or 32π square inches?

No, I didn't forget the question. But it's possible to make a reasonable guess by trying to reverse-engineer the question.

(Don't read the comments. They're full of people who didn't get it.)

17 December 2008

Jean Chretien is secretly a mathematician?

"I don't know. A proof is a proof. What kind of a proof? It's a proof. A proof is a proof, and when you have a good proof, it's because it's proven."
-- Jean Chretien, former prime minister of Canada. The context appears to be something having to do with Canada's involvement in the Iraq war, but I'm having trouble finding details. It seems that this was a Big Thing in Canada when it happened, so perhaps I have Canadian readers who can explain?

NYT profiles of Jessica Fridrich

Specializing in Problems That Only Seem Impossible to Solve , by Bina Venkataraman, in yesterday's New York Times.

This is an article about Jessica Fridrich, a professor at Binghamton University, who at one point held the world record for the fastest solving of the Rubik's Cube. She currently specializes in the research of information hiding in digital imagery.

15 December 2008

Pairing up the states

A Ballot Buddy System, by Randall Lane, an op-ed in today's New York Times.

As you may remember, there was a presidential election six weeks ago in the United States. But Barack Obama isn't officially elected president until today; today is the day that the electors cast their votes. This is the first time since 1892 that a state will have electors voting for more than one candidate. Maine and Nebraska both have laws in which two electors go to the winner of the popular vote in the state and one goes to the winner of each congressional district. Nebraska went for McCain, but the 2nd congressional district (Omaha and some of its inner suburbs) went for Obama.

It's been suggested that all states should apportion their electoral votes in this way, on the assumption that less people live in "safe districts" than "safe states". (I'm not sure if this is the case, especially with the way some districts are gerrymandered these days.) But the problem with this is that the majority of people (and legislators) in any state would see their party hurt by the passage of such a law in their state.

Lane's suggestion is that Republican-leaning states and Democratic-leaning states with approximately the same number of electoral votes (say, Texas and New York) could agree to pass these laws together. The problem is that in each pairing, it seems that you'd want two states that are roughly of equal size and are equally far from the political center; it seems that it might not be possible to construct such a pairing. The obvious problem is what to do with California? It's easy to state a few plausible pairs, as Lane does, but I'm not sure that all the states could be paired off in this way. Furthermore, things probably get weird, in terms of how much "power" each state holds in presidential elections, if some substantial number of states have enacted such laws.

13 December 2008

π really does equal 3

Okay, not really. But here's a fake proof that π = 3, which I hadn't seen before.

12 December 2008

The Xie brothers' advice collection

Tao Xie and Yuan Xie (brothers, in case you're wondering) maintain an advice collection consisting of links to things other people have written about how to succeed in scientific careers -- on getting a PhD, writing papers, and so on. Those links that seem to be aimed towards people in certain subjects are mostly aimed at computer scientists, but at least some of what I'm finding in their links seems reasonable.

Of course, one piece of advice they probably should give is "don't spend lots of time reading this sort of advice".

Since I'm talking about brothers, I feel obliged to mention the following paper:
Michalis Faloutsos, Petros Faloutsos and Christos Faloutsos, On Power-Law Relationships of the Internet Topology, SIGCOMM 1999. It has nothing to do with the Xie brothers' advice collection, but I wanted to mention it anyway because I saw a citation to it and I was amused.

Where are the mathematicians?

Why are people in Iowa interested in combinatorics? Combinatorics is more popular in Iowa than in any state but Massachusetts.

Google now has a feature called "Google Insights"; you can type in a search term and see where people are searching for it, how frequency of searches varies with time, etc. In states where there's a lot of volume it's possible to zoom in; in Massachusetts it's possible, for example, and most of the interest is in Cambridge. Given that there is a Big Important University and a liberal arts school that has a well-known mathematics department in Cambridge, that's not surprising. But I can't zoom in on Iowa.

(It's possible to get results by country, too, but these results seem ridiculously skewed; I suspect that Google may be normalizing by the number of Internet users in a given area, and the user pool is different in different places.)

Another interesting one: "probability" is popular in Maryland, and among cities in that state it's most popular in College Park and Laurel. College Park is where the University of Maryland is. Laurel is where the NSA is. You can see similar things in other states; for example, in New York, "probability" is most common in Stony Brook, Troy (RPI), and Ithaca (Cornell). In Pennsylvania, it's University Park (Penn State), Bethlehem (Lehigh), and State College (Penn State again). The general pattern seems to be first a few college towns, then the big cities -- the places with the fourth and fifth highest numbers for "probability" in Pennsylvania are Pittsburgh and Philadelphia.

Most mathematical search terms I could think of are highly seasonal -- they're less common in the (Northern Hemisphere) summer, when schools aren't in session. That seems to imply that lots of the people doing the searching are students. I couldn't find a mathematics-related search term that didn't show this seasonality; I don't know if it can be done, because only search terms that receive a reasonably large amount of traffic are reported on the site at all, and things which are important enough to get lots of traffic are probably studied in schools.

11 December 2008

How do you pronounce ≤ and ≥?

I'm taking a break from proofreading a paper. I'm reading it out loud, because I find this is the best way to catch mistakes; it forces me to look at every word.

There are inequalities in this paper, so the signs ≤ and ≥ come up a lot. How do you pronounce these? When I was in college I pronounced them "less than or equal to" and "greater than or equal to". But sometime around the first year of graduate school I seem to have shifted to "at most" and "at least", which have the obvious advantage of being shorter.

Edit (11:15 pm): It appears I've mentioned this before.

McMullen on the geometry of 3-manifolds

The Geometry of 3-Manifolds, a lecture by Curt McMullen. This is one of the Science Center Research Lectures; in which Harvard professors talk about their research to the interested public; the series doesn't appear to have a web page, but here's a list of videos available online in that series; these include Benedict Gross and William Stein on solving cubic equations. There are non-mathematical things too, but this is at least nominally a math blog, so I won't go there.

McCullen apparently also gave this lecture at the 2008 AAAS meeting in Boston, and has a couple other video lectures available online.

And now I want a do(ugh)hnut-shaped globe with just North and South America on it. This is a fanciful example of what an "armchair Magellan" might suspect the world looked like if humans had reached the North and South poles starting from somewhere in the Americas but had never crossed the Atlantic or Pacific; they might suspect that the cold area to the north and the cold area to the south are actually the same. McMullen uses to illustrate that tori and spheres are not the same, since loops on the sphere are contractible but loops on the torus are not. The lecture, which leads up to telling the story of the Poincaré conjecture, begins by using this as an example of how topology can distinguish between surfaces.

Finally, here's an interesting story, which may be well-known to some people but wasn't to me: Wolfgang Haken, one of the provers of the Four-Color Theorem, may have intended that (famous, computer-assisted) proof to be a "trial balloon" for a brute-force proof of the Poincare conjecture.

09 December 2008

Danica McKellar, my mother, and Cuban food

In honor of my age now being a square, my parents took me out to dinner. (Okay, so it had nothing to do with my age being a square.)

My mother: "I read in the US Air in-flight magazine that... um... that girl from Boy Meets World is writing books about math now."

Me: "You mean the girl from The Wonder Years??"

Yes, this actually happened. The two shows are not all that different, and the male leads in them are brothers in real life and look similar, so it's a natural mistake to make.

Danica McKellar majored in math at UCLA (where Terence Tao was one of her teachers, and has written two books aimed at middle school girls to tell them that math doesn't suck, namely Math Doesn't Suck: How to Survive Middle School Math Without Losing Your Mind or Breaking a Nail and Kiss My Math: Showing Pre-Algebra Who's Boss

By the way, if you're in Philadelphia and want "modern Cuban cuisine", Alma de Cuba has it. I especially recommend that you go to this restaurant if someone else is paying.

On birthdays

Yesterday, my age was a factorial.

Today, my age is a square.

This raises an interesting problem.

On translation of games

At Language Log recent discussion has gone on about how you can translate from one language to another, but you can't translate from one game to another. For example, you can't take a game of chess and translate it into poker.

I'm reminded of the Subjunc-TV in Douglas Hofstadter's Godel, Escher, Bach: An Eternal Golden Braid, which has characters tuning into a baseball game that has been made to look like a football game. Of course this doesn't work perfectly, which is intended to illustrate Hofstadter's points about the imperfection of analogies.

At Language Log, I learned that there are certain "logical games" for which a notion of translation is possible. These are apparently of interest to logicians; you can read more at the Stanford Encyclopedia of Philosophy.

But in combinatorial game theory, we can associate each position in certain games with a "number"; is it meaningful to say that positions in different games which have the same number are the "same position"? In this case, translations between games would become possible, except that those numbers are apparently difficult to calculate.

08 December 2008

Ken Jennings is not a mathematician

Ken Jennings, the 74-time Jeopardy! champion, has a blog.

Today he wrote about polyominoes, inspired by the question of how many pieces there would be in Tetris if it were played with pieces made of five or six squares instead of the canonical four. He's not a mathematician, and he finds it surprising that there's no nice formula for the number of polyominoes with n cells. I suppose it is kind of surprising to someone with no mathematical training; by this point in my life I've gotten used to the fact that things just don't work that way.


Via X=Why?, I found an article on how a criminal investigation lab in California is inviting students to come in and showing them that math is useful for solving crimes. (In the CSI way -- figuring out how blood splatters, say -- not in the Numb3rs way.) This is certainly a good thing to do.

But can you make sense of this?
Craig Ogino, the department's crime lab director, started the event by offering a prize of $10 to the student who could use trigonometry to determine the number in gallons of a mixture used to make methamphetamine, based on his sketch.
I'm assuming that trigonometry was actually used for something else -- like, say, the aforementioned blood splattering analysis, seen later in the article -- and that the reporter made a mistake. But I'm not totally sure. Any thoughts?

07 December 2008

Kids these days...

Shitload of math due Monday, from The Onion:
Making matters worse, students said, was their math textbook, which reportedly doesn't even have any of the freaking answers in the back.
How would these kids feel if they learned that eventually the questions aren't even in the book, but you have to come up with them yourself?

2008 Putnam problems

This year's Putnam exam problems, via 360.

I haven't thought about these, but I might as a break from writing things up over the next few days.

04 December 2008

How could you guess the formula for the sum of the first n fifth powers?

The following three formulae are reasonably well known:

1 + 2 + 3 + ... + n = n(n+1)/2

12 + 22 + 32 + ... + n 2 = n(n+1)(2n+1)/6

13 + 23 + 33 + ... + n 3 = (n(n+1)/2)2

(The sums of first and second powers arise pretty often naturally; the sum of cubes is rare, but it's easy to remember because the sum of the first n cubes is the square of the sum of the first n natural numbers.)

The first member of this series I can't remember is the following:

14 + 24 + 34 + ... + n 4 = n(n+1)(2n+1)(3n2+3n-1)/30

and generally, the sum of the first n kth powers is a polynomial of degree k+1.

I ran into these formulas, which I'd seen plenty of times before while perusing the book Gamma: Exploring Euler's Constant by Julian Havil, which I had heard of a few years ago, forgotten about, and then found while browsing the library shelves. (Also in German, under the title GAMMA: Eulers Konstante, Primzahlstrände und die Riemannsche Vermutung.) This is a most interesting book, at least if you're someone like me who pretends to know number theory but really doesn't.

Anyway, back to the main story. Say I wanted to know the sum of the first n fifth powers. Well, there's a general method for finding the formula of the first k powers; it involves the Bernoulli numbers. But let's say I didn't know that. Let's say somebody hands me the sequence

1, 33, 276, 1300, 4425, 12201, 29008, 61776, 120825, 220825

in which the nth term, sn, is the sum 15 + 25 + ... + n5 -- but doesn't tell me that's where the sequence comes from -- and challenges me to guess a formula for it in "closed form". (Smart-asses who will say that there are infinitely many formulas are hereby asked to leave.) How would I guess it?

Well, it can't hurt to find factorizations for these numbers. And if you do that you get

1, 31 111, 22 31 231, 22 52 131, 31 52 591, 31 72 831, 24 72 371, 24 33 111 131, 33 52 1791, 52 112 731

and this seems interesting; these numbers seem to have lots of small factors. Furthermore, a fair number of them seem to have one largeish prime factor, which I've bolded. (Yes, I realize, 11 times 13 isn't prime, but I actually did think of it as a large factor.) What are the large factors that I observe in these numbers? They are

?, 11, 23, ?, 59, 83, ?, 143, 179, ?

and the nth of these is easily seen to be 2n2 + 2n - 1. (Some terms don't show up from inspection of the factorizations because they get "lost in the noise", as it were.)

From there the rest is pretty easy. We see that often (but not always), sn is divisible by 2n2 + 2n - 1. You can check that for n = 1, 2, ..., 10, the term sn is always divisible by (2n2 + 2n - 1)/3. So now consider the sequence tn = sn / ((2n2 + 2n - 1)/3). The numbers t1 through t10 are

1, 9, 36, 100, 225, 441, 784, 1296, 2025, 3025

and I recognized that all of these are squares; in particular tn = (n(n+1)/2)2.

Putting everything together, I get the conjecture that the sum of the first n fifth powers

sn = n2(n+1)2(2n2+2n-1)/12

which could be proven by induction, but actually writing out the proof is best left to undergrads.

The method here is reminiscent of Enumeration of Matchings: Problems and Progress by James Propp. In that article, Propp lists various unsolved problems in the enumeration of tilings, and conjectures that some of them might have answers which are given by simple product formulas, because actually counting the tilings in question gave numbers with nice prime factorizations.

Edit, 9:13 pm: of course this is not the only method, or even the best method; it's just the method I played around with this morning. See the comments for other methods.

How to break into a keyless-entry car

Weak security in our daily lives (in English): basically, you can use a de Bruijn sequence to break into a car with keyless entry in what might be a non-ridiculous amount of time. I'm referring to the sort which have five buttons marked 1/2, 3/4, 5/6, 7/8, and 9/0, and a five-digit PIN that has to be entered. This trick takes advantage of the fact that the circuitry only remembers the last five buttons pressed, so if you press, say, 157393, then the car will open if the correct code is either 15739 or 57393. It is in fact possible to arrange things so that each key you press, starting with the fifth, completes a five-digit sequence that hasn't been seen before.

Of course, you shouldn't do this.

Via microsiervos (in Spanish).

03 December 2008

Logic as machine language

Gil Kalai mentions a metaphor I hadn't heard of before about the foundations of mathematics:
To borrow notions from computers, mathematical logic can be regarded as the “machine language” for mathematicians who usually use much higher languages and who do not worry about “compilation.”
Of course there would be analogues to the fact that certain computer languages are higher-level than others as well. To take an example dear to me, the theory of generating functions might be at a higher level than the various ad hoc combinatorial arguments it's often introduced to students as a replacement of. I don't want to press this metaphor too hard because it'll break -- I don't think there are analogues to particular computer languages. But feel free to disagree!

02 December 2008

You can't say you're a liar

From Family Guy:
"Chris, everything I say is a lie. Except that. And that. And that. And that. And that. And that. And that. And that."

30 November 2008

A Russian teacher in America

A Russian teacher in America, by Andrei Toom.

It's what it sounds like. Toom repeats the familiar litany that in America, students learn for grades, and only incidentally for learning; it's an interesting read, mostly because of the perspective that he's able to bring to it as somebody who didn't grow up within the American system.

I'm thankful for the Borel-Cantelli lemma

Cosmic Variance is thankful for the spin-statistics theorem, because it enables the division between matter and force, which is kind of important.

In this spirit, I am thankful for the second Borel-Cantelli lemma, which states that if countably many events E1, E2, E3, ... are independent and the sum of the probabilities of the En diverges to infinity, then the probability that infinitely many of them occur is 1. Let these events be "something interesting happens at time n", for a suitable quantization of time; then given infinite time, infinitely many interesting things will happen. (Of course I'm making an independence assumption here.) I like interesting things.

I am also thankful for whoever it was that took a picture of a monkey at a typewriter.

29 November 2008

Neal Stephenson's Anathem

I haven't really digested it yet (in fact, I'm only a third of the way through it), but Anathem by Neal Stephenson is making a strong run at the title of My Favorite Book. Mostly because there's math hidden everywhere inside it. And how could I resist this, which almost feels like something out of The Glass Bead Game:
Three fraas and two suurs sang a five-part motet while twelve others milled around in front of them. Actually they weren't milling; it just looked that way from where we sat. Each one of them represented an upper or lower index in a theorical equation inolving certain tensors and a metric. As they moved to and fro, crossing over one another's paths and exchanging places wile traversing in front of the high table, they were acting out a calculation on the curvature of a four-dimensional manifold, involving various steps of symmetrization, antisymmetrization, and raising and lowering of indices. Seen from above by someone who didn't know any theorics, it would have looked like a country dance. The music was lovely even if it was interrupted every few seconds by the warbling of jeejahs.
Please, Internet, deliver me video of this mathematical dancing. Somewhat more seriously, though, moving pictures often float in my mind (and I suppose the minds of others) as I attempt to understand various mathematical structures.

Stephenson gave an interesting
talk/Q-and-A at Google about the book
, if you've got an hour to kill. I think if you liked Cryptonomicon
you'll like this one; on the other hand I was disappointed by the Baroque Cycle, which lots of people seem to have liked. I suspect this has to do with the times in which they're set; the Baroque Cycle takes place a few centuries ago, Cryptonomicon during World War Two, and Anathem on another planet entirely, but one in which the secular world is roughly comparable to present-day Earth. (Perhaps a bit too comparable; Earth intellectual history and the intellectual history inside Anathem are essentially the same thing with different names.) Except in Anathem, the mathematicians live in what are essentially monasteries cut off from the outside world. I don't think I could handle that. I suppose some would argue that universities aren't the Real World, though...

26 November 2008

The asymptotics of "27 lines on a cubic"

Charlie Siegel, whose office is just down the hall from mine, had for a while the following statement on the blackboard in his office (paraphrased and renotated a bit): Let g(n) be the number of lines on the generic hypersurface of degree 2n-1 in complex projective (n+1)-space. Then
g(n) = [z^n] (1-z) \prod_{j=0}^{2n-1} (2n-1-j+jz)
where the notation $[z^n] f(z)$ means "the coefficient of zn in f(z)". For example, g(2) is the coefficient of z2 in (1-z)(3)(2+z)(1+2z)(3z), which turns out to be 27; this means that the number of lines on the generic hypersurface of degree 3 in complex projective 3-space is 27. Informally, "there are 27 lines on the cubic". Want to know why? For the cubic case, see this post by Charlie or go to his talk last week.

The first few values of g(n) are 1, 27, 2875, 698005, 305093061, 210480374951, 210776836330775, 289139638632755625... -- so this sequence grows pretty quickly. How quickly? I couldn't tell at first, but it kept nagging at me every time I was in Charlie's office. The sequence is A027363 in the Encyclopedia of Integer Sequences. It turns out that it's in an appendix of the paper that the Encyclopedia links to (arXiv:math/0610286, specifically its appendix by Zagier) but the derivation there is long and involved and it was more fun with the formula myself. There are a few gaps, but here it is.

We can deal easily with the 1-z factor out front, and we want
g(n) = [z^n] f_n(z) - [z^{n-1}] f_n(z)
f_n(z) = \prod_{j=0}^{2n-1} (2n-1 - j + jz)

We can already see it's going to be kind of tricky; the coefficients of fn(z) are probably going to be pretty big and not that far apart.

Now, we can pull out a 2n-1 from each factor in the expression for fn(z), and we get
 f_n(z) = (2n-1)^{2n} \prod_{j=0}^{2n-1} {(2n-1 - j + jz) \over 2n-1 }

Call the product pn(z). Now this is where, out of nowhere, I start having Fun With Probability. (It's kind of amusing because there is nothing probabilistic about the original question.) The term corresponding to j in that product is the probability generating function of a random variable which is 1 with probability (j/(2n-1)) and 0 otherwise. The product is thus the p.g.f. of the sum of such random variables for j = 0, 1, ..., 2n-1.

Now, this sum has mean which is the sum of the means of those random variables, that is,
\sum_{j=0}^{2n-1} {j \over 2n-1} = n
and variance which is the sum of their variances,
\sum_{j=0}^{n-1} {j \over 2n-1} \left( 1 - {j \over 2n-1} \right) = {2n(n-1) \over 3(2n-1)}
which is approximately n2/3. Furthermore, since the sum is a sum of a bunch of small contributions, it's approximately normal. So pn(z) is the probability generating function of a random variable which is roughly normal with mean n and variance n2/3, but integer-valued, and its kth coefficient is the probability that such a random variable is equal to k.

Therefore [z^k] p_n(z) is approximately
q_n(k) = {3 \over \sqrt{2\pi n}} \exp \left( {-(k-n)^2 \over 2n/3} \right)

which is the density of such a normal random variable. We want to know [zn] pn(z) - [zn] pn-1}(z), and this is approximately qn(n) - qn(n-1). You'd think this would be qn'(n), but qn(n) is actually zero -- the fact that the coefficients were "close to each other" is even more troublesome than you'd expect. Still, we can make a Taylor series for qn at n, and the linear term is zero, so we get
q_n(n) - q_n(n-1) \sim {q_n^{\prime\prime}(n) \over 2} = \sqrt{27 \over 8\pi n^3}
And g(n) = [zn] fn(z) - [zn] fn-1(z) = (2n-1)2n ([zn] pn(z) - [zn] pn-1(z)); using the approximation we get
g(n) \sim \sqrt{27 \over 8\pi n^3} (2n-1)^{2n}

and now note that
(2n-1)^{2n} = (2n)^{2n} \left(1-{1 \over 2n}\right)^{2n} \sim {(2n)^{2n} \over e}.

g(n) \sim \sqrt{27 \over 8\pi e^2 n^3} (2n)^{2n}
which matches (with a bit of computation) the result given by Zagier in the arXiv, and the terms reported in the OEIS. I wouldn't have trusted it otherwise; that "take the second derivative" step is particularly sketchy, although it can probably be justified as there are results on the rate of convergence of the central limit theorem. But asymptotic analysis is nice in this way; if it's easy to compute the terms of some sequence, then we can often be confident in results like this even without

Besides, I'd never seen the trick of using the second derivative to approximate a difference before. At least not that I can remember.

Eulerian beer

At mathlog: Jever Bierdeckel-Mathematik und die Brücken von Kaliningrad. That is, "Jever coaster mathematics and the bridges of Konigsberg." (Why this funny translation? Because you haven't heard of the bridges of Kaliningrad, even though that's what the city is called now.)

It includes a solution to the general problem of identifying graphs with Eulerian tours. In English, in the form of a poem.

And what do coasters have to do with this? Jever is a beer, and the problem of finding an Eulerian tour is on a coaster they distribute (in German).

Edit, 1:28 pm: In the comments, Wing says that he thought the post would be about Euler beer. Apparently there is a beer named Euler. Someone's auctioning off a coaster on ebay.

25 November 2008

On foods of genus one

It seems that some people describe the torus as the shape of a bagel, and others as the shape of a doughnut.

I wonder if this is somehow correlated with geography; bagels are more common in some places, doughnuts in others.

Heuristic derivation of the Prime Number Theorem

One way of stating the Prime Number Theorem is that the "probability" that a large number near x is prime is 1/log(x). (Here, as always, all logs are natural.)

A heuristic derivation of the prime number theorem, Frank Morgan, via Andrew Gelman, with some embellishment by me: let P(x) be the "probability" that a large integer x is prime. Then how do P(x+1) and P(x) compare? Say x is prime, which it is with "probability" P(x); it "divides" x+1 (and all larger numbers) with probability 1/x. (Of course, this doesn't actually make sense; the idea is that we're modeling the numbers divisible by x as a random set with density 1/x, which should have the same large-scale properties.) Other than this, x and x+1 are equally "likely" to be prime. So the function P satisfies

(*) P(x+1) = P(x) [P(x) (1 - 1/x)] + (1 - P(x)) P(x)

since x+1 is "prime" with probability P(x) (1-1/x) if x is "prime", and P(x) otherwise. Divide through by P(x) to get

P(x+1)/P(x) = P(x) (1-1/x) + (1-P(x))
P(x+1)/P(x) = 1 - P(x)/x.

Now, P(x+1) can be approximated by P(x) + P'(x), so we have

1 + P'(x)/P(x) = 1 - P(x)/x
P'(x)/P(x) = -P(x)/x

which has the general solution P(x) = 1/(C + log x).

This is a bit of a lie. For one thing, the difference equation (*) actually seems to have solutions that differ from those of the differential equation by a constant factor, which seems to depend on the initial conditions. (This amounts to changing the base.) For another thing, the assumption that x+1 might be divisible by x is, um, stupid if we're actually talking about prime numbers. (It's probably possible to rephrase this heuristic derivation as a rigorous result about random sets, though.) Still, it gets the prime number theorem to within a constant factor, which isn't bad for such a simple argument.

24 November 2008

The Know-It-All

I'm reading The Know-It-All: One Man's Humble Quest to Become the Smartest Person in the World, by A. J. Jacobs -- I remembered hearing good reviews, and they're for the most part right. It's an amusing book to read, because it's part trivia and part A. J. Jacobs, who's an editor at Esquire, read the entire Encyclopedia Britannica in a year or so, which is long enough that while he was reading it amusing stories unfolded in his life -- most of which have some trivia worked in. Is it great literature? No. But it's fun. And there's the inevitable moment when you know something that Jacobs doesn't mention he knows; that makes you feel a little smarter when you're reading any book, but especially one with this book's subtitle.

It's amusing and interesting to be reading this during my less ambitious attempt to read the The Princeton Companion to Mathematics
. But I'm mostly making this post because I couldn't resist sharing this, from when Jacobs joins Mensa:
Of course, I'm terrified that I'll be rejected. In fact, I'm pretty sure that they'll send me a letter thanking me for my interest, then have a nice hearty laugh and go back to their algebraic topology and Heidegger texts and Battlestar Galactica reruns.

From britannica.com, it looks like the EB does not have an article titled "algebraic topology". But there are articles titled "topology" and "mathematics" with sections named "algebraic topology", and they do seem to have serious treatments of mathematics in there. Jacobs admits (p. 338) that "the math sections... are my bête noire.

23 November 2008

Kiyoshi Ito dead

Kiyoshi Ito (of Ito calculus fame) is dead.

(When? I'm not sure. The New York Times says Monday, the 17th, but the Japan Times published an obituary on Saturday the 15th which said he died "Monday" -- so I'm guessing the 10th.)

See the obituaries, the MacTutor biography and Wikipedia article, or this Notices article upon his receipt of the Gauss Prize for an idea of his contributions.

A couple amusements from the Princeton Companion

From The Princeton Companion to Mathematics, specifically the article "Algebraic Geometry", by János Kollár:
Finally, if we marry a scheme to an orbifold, the outcome is a stack. The study of stacks is strongly recommended to people who would have been flagellants in earlier times.
I feel this way about most of algebraic geometry, but that's only because Penn has a high enough concentration of algebraic geometers that I get tired of hearing people walking around and talking about it all the time.

Also, I find myself screaming at the book quite often. But I do it in a good way; it's often some crucial insight that makes me think "why didn't anybody tell me that before?" (Of course, it's possible they did and I wasn't listening.) And sometimes it's "oh, damn, I thought I came up with that myself". For example, I just read the article on computational number theory, by Carl Pomerance, in which he explains a heuristic reason why Fermat's last theorem is true. I won't give it in full here, but it's basically the following. First, Euler showed it for n = 3. Second, consider all the positive integers which are nth powers for some n ≥ 4. The "probability" that a number m is in this set is about m-3/4. So replace the set of fourth-or-higher powers with a random set S, which contains m with probability m-3/4. Then the probability that a givennumber n can be written as a sum of two such elements of this set S is proportional to n-1/2; independently, it has probability n-3/4 of being in S. So the probability that n can be written as a sum of two elements of S and is also in S itself is proportional to n-5/4, and so we expect finitely many examples. This isn't quite true, because the set of fourth-or-higher powers has some nontrivial structure. But it also took a couple hundred words, instead of a couple hundred pages like the real proof.

But I figured out something like this when I was in college, and I was so proud of myself! So it saddens me to learn I'm not the only one who thought of it. (It also makes me happy, though, because the idea is due to Erdos and Ulam, and there are worse people to be imitating.)

Etymology of "theorem"

Is there some connection between the etymology of "theorem" and words like "theology" or "theist"?

For "theorem" the OED says: theorem, from the late Latin theorema, from the Greek θεωρημα, spectacle, speculation, theory, (in Euclid) a proposition to be proved, from θεωρειν to be a spectator (θεωροσ), to look at, inspect. (This isn't an exact quote; I've expanded some of the abbreviations, and suppressed some of the accent marks. But if you're the sort of person who could actually answer my question you probably already knew that.)

But for the words where "the-" or "theo-" is god-related, of which there are a lot, it just says things like "from Greek θεοσ, God" and doesn't go any further. And maybe you could imagine that people "look at" or "inspect" God. Of course I recognize that the OED is not the best possible source for these things -- but I'm suspecting that someone in my audience has also noticed this apparent coincidence of words and knows the answer.

(I just want to reiterate that the title "God Plays Dice" is not a religious thing; it's alluding to the quote of Einstein, as I've written before.)

22 November 2008

NYT on the Netflix Prize

In this week's New York Times Magazine, Clive Thompson writes about people trying to win the Netflix Prize.

Early in 2007, Netflix, the video-rental-by-mail service, made (anonymized) data on its users preferences available, and is offering $1,000,000 to the first person or team that creates an algorithm for recommending movies that makes a certain substantial improvement upon the existing algorithms. More precisely, Netflix attempts to predict the star rating, on a one-to-five scale, that users will give to a movie; the root mean square error in their old algorithm was about 0.95 stars, and they'll pay the $1,000,000 for improvement of this by 10%. (They also pay $50,000 for improvements of 1%.)

Why are they doing this? Well, Netflix recommends movies to lots of people, and they want their recommendations to be good. And it sounds like they're getting a lot more than a million dollars worth of time from the various competitors working on this; if they actually had these people on their payroll it would cost more.

It's interesting that there are certain movies that are very difficult to predict; Napoleon Dynamite is one of the examples the article gives, which won't surprise anybody who's seen it; poking around the forums on the Netflixprize.com site I found a list of movies like that, although it's a year and a half old so there's been progress since then. (Hmm, I'm not sure if I like Napoleon Dynamite -- sometimes I do and sometimes I don't.)

Apparently one of the major mathematical tools that has emerged as being useful here is singular value decomposition. I'm not surprised that some Big Fancy Linear Algebra tool was necessary, as a lot of the algorithms essentially seem to work by identifying various dimensions along which movies can vary. (Look, it's a hidden metric space!) Hmm, I wonder if the space of movies has some nontrivial topology... no, I will not get sucked into thinking about this!

21 November 2008

Classification is hard, and libraries are not perfect

When I go to the library, even if it's just to get a specific book, I tend to browse a bit. I've found some interesting books this way.

So I was just there, and Anatomy of Integers, the proceedings of this conference, was shelved near the calculus textbooks. In particular, it was near various books that collect integrals. "Anatomy of integers" is a name that seems to be used for a certain branch of number theory that looks at the distribution of prime factors; for a nice introduction to the concept, see Andrew Granville's anatomy of integers and permutations. (This paper, still in development, talks about the analogy between the prime factorizations of integers and cycle structure of permutations. For example, the distribution of the number of cycles of a random permutation of {1, 2, ..., n} is approximately normal with mean and variance near log n; the Erdos-Kac theorem says that the distribution of the number of prime factors of a random integer less than or equal to en is asymptotically normal with mean and variance near log n. Lots of results about prime factorizations or about permutations seem to have counterparts in the other realm. Anyway, this isn't calculus! But it was with the calculus books.

Similarly, I found a book entitled Mathematical Essays: In Honor of Su Buchin, which was a volume of mathematical research articles in honor of the Chinese differential geometer Su Buchin, shelved with various books that consist of reflections by mathematicians on mathematics -- the sort of nontechnical writing that "essay" usually means.

Is there something wrong with Penn's library? Not at all. They were in the right place according to the Library of Congress Classification, but whoever does that classification seems to make the occasional error. Of course the reasons for these particular errors were obvious from the titles, which is why I spotted them. I am not sure if the classification is done by people who aren't familiar with mathematics; that would explain the integer/integral mistake, at least.

20 November 2008

Tao's blog is soon to be a book

In the issue of the Notices of the American Mathematical Society, which arrived in the mail today, I learned that Structure and Randomness: Pages from Year One of a Mathematical Blog will be coming out soon.
I saw a draft copy of it when Tao posted it on his blog back in April; he announced the book was going to exist here and posted the draft here. The book is a collection of Tao's posts on various open problems, expository posts, summaries of lectures, and so on, and is in large part intended to be, so far as I can tell, what one might think of as an "archival" version of the blog. A lot of Tao's blog posts tend to be longer than typical blog posts, so his blog seems particularly suited for the book medium.

The title of the draft version is "What’s new - 2007: Open questions, expository articles, and lecture series from a mathematical blog", which describes the content well -- but I like "Structure and Randomness" better. Here's a long post I made back in July of 2007 on "psuedorandomness" in the primes, in large part inspired by watching a talk Tao gave entitled "Structure and Randomness in the Prime Numbers".

19 November 2008

Worst math joke I've heard today

A joke that I've seen in several places today (first from my friend Karen): "An infinite number of mathematicians walk into a bar. The first one orders a beer. The second orders half a beer. The third, a quarter of a beer. The bartender says "You're all idiots", and pours two beers."

I can "improve" this joke. An infinite number of mathematicians walk into a bar. The first one orders a beer. The second orders two beers. The third orders three beers, and so on. The bartender takes a twelfth of a beer from the first one and they all walk away happy.

(Why, you ask? Because ζ(-1)=-1/12.)

People on Slashdot don't know probability

This is a few days old, but I didn't mention it when I first saw it: Microsoft Exploit Predictions Right 40% of Time, from Slashdot. (The link is to Slashdot; here's the original article.)

Apparently Microsoft has a mechanism for predicting which parts of its code are most likely to be exploited by malevolent hackers; they made nine predictions in October, and four actually got exploited. Of course, the Slashdot commentary is filled with reflexive Microsoft-bashing, and people pointing out that that's worse than flipping a coin. But the correct comparison isn't to flipping a coin. The correct comparison is to the number of hits that would have been obtained if Microsoft had picked nine pieces of their code at random, which presumably is much less than four. There are a few people in the comments trying to put numbers on this, but nothing that really sounds informed.

18 November 2008

Folklore for divergent series

From Emanuel Kowalski, quoting what is apparently folklore: "The only 'truly' divergent series is the harmonic series."

The idea is that one can assign a value somehow to basically any other divergent series; see the link for a more thorough explanation.

(But don't tell the calculus students! They already can't remember the harmonic series is divergent.)

And I usually think of the harmonic series as being "equal" to log n, although of course log n isn't a number. So I amend the folklore, somewhat facetiously: "there are no divergent series."

Navigability of complex networks

Here's an interesting idea, which falls in the very large set of "things I thought about but have no idea how to implement": M. Boguna, D. Krioukov, kc claffy. Navigability of Complex Networks, arXiv:0709.0303v2. (Published in the November 16, 2008 issue of Nature Physics; I learned about it from physorg.com.) The basic idea is that it is possible to navigate in complex networks by navigating in some underlying hidden metric space. Nodes are close together in the hidden metric space if they are similar to each other in some abstract sense, and they are more likely to be connected to each other in the network if they are closer together.

To take an example where the metric space isn't hidden, one navigates between airports by thinking about distances on the surface of the Earth; if you route greedily, by at each airport taking the flight which gets you closest to where you want to go, you usually will end up at your destination. Of course, this isn't a priori true -- you could get stuck in an infinite loop -- but the point is that real networks seem to obey this.

This is something that I think a lot of people have an intuitive sense of. For example, there are people I've met that when I meet them, I want to ask "why haven't we met before?" -- some combination of shared geography and shared interests makes it seem like we should know each other. (In my own life, there is at least one case of somebody who I never met when I was an undergrad, though there were plenty of amusing stories about mutual friends of ours that we'd both heard; we met shortly after I moved back to Philadelphia for grad school.)

A couple of questions on the mechanics of writing mathematical papers

I'm writing a paper, and of course this requires the use of LaTeX. As many of you know, the way one creates cross-references inside a LaTeX document is a two-step process. First, you insert the command \label{big-important-theorem} where your Big Important Theorem is in the paper. Then when you want to refer to your theorem, you write something like "And as a consequence of Theorem \ref{big-important-theorem} we can prove Corollary \ref{million-dollar-problem}, and so I claim to be the winner of a million dollar prize."

No, I have not written that sentence. Nor do I plan to. In fact, I would add "claims to be the winner of one of the Clay prizes in the body of the paper" to John Baez's crackpot index. Baez couldn't have put that in his list, because it was written in 1998. He does mention the Nobel, though, which carries a similar monetary value.

But this leads to a question -- how do you label your theorems, equations, etc. in your own LaTeX code? I try to come up with names that reflect what the labeled object is about, but this isn't so easy, because sometimes there are lots of objects that are "about" the same sort of thing. I'm tempted to just find some source of extra-mathematical names. So I could name my theorems \label{market}, \label{chestnut}, \label{walnut}, \label{locust}, ... (Philadelphia streets), say. Of course, this sequence has the disadvantage that the streets come in order; a set of names with no natural order is probably better, because then I won't feel like I'm moving things out of order when I move them around.

On a related note, often one sees bibliographical references of the form
[Be74] Edward A. Bender. Asymptotic methods in enumeration. SIAM Review,
Vol. 16, No. 4, October 1974.
I prefer this form to the form where [Be74] is replaced by a number, because if something is cited more than once in the same paper, I only need to look at the reference once; the second time I see [Be74] in the paper I know it's the paper by somebody whose name starts with Be, written in 1974. (And if it's a paper I've heard of already, sometimes I don't have to look at the citation at all.)

But what's the convention for picking the letters to be used? First two letters of the name seems common for a single-authored paper, but by no means universal; I've definitely seen one-letter citations, the problem being that if you have a reasonably extensive bibliography you'll want to cite two different authors with the same initial. I'm currently using just the first letter of each name for multiple-author works -- [FS08] for Flajolet and Sedgewick's Analytic Combinatorics (coming out sometime in December, preorderable now! and readable online!), for example. I've tried to reverse-engineer whatever convention there is from other people's reference lists and I can't. Is there actually no convention?

Of course, there is no need for a convention, as long as each work cited has a unique identifier.

17 November 2008

Monopoly: electronic banking edition

There is, I just learned from a television commercial, a Monopoly Electronic Banking Edition.

The children of the future, it appears, will perhaps not be able to claim that Monopoly is educational because it requires them to do math. (Although I'll admit that I always found Monopoly frustrating because other people's mental math is slower than mine.)

Somewhat more seriously, are the children of the future (or, perhaps, the present) going to feel that they have less reason to learn mathematics because they don't deal with cash? That seems like it would be something that would motivate at least some students to learn basic arithmetic.

15 November 2008

How does the orbit of the moon look?

The orbit of the moon around the sun doesn't look like what you'd expect.

(Although now that I've told you this, you might think a little harder about what the orbit looks like.)

13 November 2008

The geometry of a game

From 360: Shinju, a geometrical game. You're given an arrangement of shells in some of the squares of a square grid. One of the shells hides a pearl. Your goal is to find it, opening at most four shells. When you open a shell, it either contains the pearl, or a number. That number is the distance to the pearl, in the (T)chebyshev distance.

There's a nice little result hiding here:
Problem 1: Given any arrangement of shells, prove that it is possible to find the shell in four clicks.
Since you always get four clicks (at least as long as I played), the game becomes trivial if you can find a constructive proof of that fact. (If you're the sort of person that likes to rack up points in video games, though, I think you get more points if you don't use all your clicks -- so how do you set things up so that you're likely to solve the puzzle in less than four clicks?)

Problem 2: Generalize (to higher dimensions, different metrics, etc.)

In which I buy the Princeton Companion to Mathematics

I cashed in 5.793 kilograms of assorted coins today at the bank. That's $115.19, for a density of $19.88 per kilogram; I know the weight because the bank's coin counting machine gave a receipt that had the numbers of each type of coin I received. I then immediately proceeded to the bookstore and purchased the The Princeton Companion to Mathematics (That was $106 with sales tax; perhaps I should have bought it from Amazon.com, where it's cheaper. Oh well, it's too late now. But at ten cents a page it's still a good deal.) I told myself I wouldn't buy it, but I'm doing some private tutoring on the side, and so there's a bit of extra money floating around, and I couldn't help myself...

I want to compliment the design of it; the side of the book which faces out on the shelf has "The Princeton Companion to" in small letters and then "Mathematics" in large letters, perhaps giving the impression that the book contains all of mathematics. This is not true, but from the portions of it I've seen online and the part I've read so far, it seems to admirably solve the optimization problem of squishing down mathematics to an object that only weighs five pounds. That's less than the coins I hauled to the bank to get the cash I paid for it! For links to various reviews, see this entry from Tim Gowers' blog. Gowers is the editor, and also wrote substantial parts of the book, although it has many other contributors; you can play the party game "how many of these names do I recognize?" I was interested to see that I recognize many more than I would have when I started grad school.

Yes, I'm actually trying to read it cover-to-cover. Like many mathematicians, there are a bunch of things that people seem to assume I'm familiar with, but I only heard about very briefly in some course in my first year of grad school in which I was holding on by the skin of my teeth. Indeed, this is one of the uses that's recommended in the introduction! I will resist the urge to turn this blog into a Companion to the Princeton Companion to Mathematics.

Etch-a-Sketch graphing

Mr. K points out that the Etch-a-Sketch toy helps students understand graphing. You turn one knob and the x-coordinate changes; you turn the other knob and the y-coordinate changes.

That's a good point -- but I was surprised to learn that eighth graders (that's who Mr. K teaches) are familiar with the Etch-a-Sketch.

11 November 2008

Combinatorics of universal remotes

From the wrapping a univeral remote I recently bought: "Controls VCR/DVD combos, TV/VCR combos, and TV/DVD combos!"

This is because it controls two devices, and those are the three types of device that people have. But I guess "controls two devices" doesn't work from a marketing standpoint, so they have to list all 3-choose-2 subsets. (And I also find elsewhere on the packaging "manage up to 8 separate devices with one remote control". Something doesn't add up here.)

They can take our labs, but they can't take our brains!

Chemistry hobbyists face a labyrinth of local and state regulations -- an article about how increasing regulation makes amateur chemistry more and more difficult. Via slashdot.

Fortunately mathematics is not so easily regulated. Although if mathematics comes to rely increasingly on computers to do the "dirty work", then one can imagine a future where amateur mathematics is concentrated in certain fields -- the fields which don't require computation -- if high-powered computers become regulated. But in the end mathematics is pure thought, and they can't regulate what goes on inside our brains.

(I studied chemistry as well as mathematics in college. I would never have laboratory equipment in my home. But this is because whenever I touched laboratory equipment it broke, which is why I got out of chemistry. In the hands of a competent chemist, there's nothing to fear.)

10 November 2008

A quirk of electoral apportionment

I was curious: how will the electoral vote apportionment change between now and 2012? (Reapportionment is done after each census, and censuses take place in years divisible by 10; the apportionment takes effect the year after the census. Thus the 2004 and 2008 presidential elections were done under one apportionment, and the 2012, 2016, and 2020 elections will be done under another one.)

I don't know (my first attempt at programming the apportionment gave some really strange-looking results) but I wanted to share an amusing fact.

Each of the 50 states receives a number of electoral votes equal to its number of Representatives, plus two. So the question is really one of determining the number of Representatives that each state gets. The way this works is as follows. First, each state receives one seat. Then, let the populations of the states be P1, P2, ..., P50; let

Qi,j = Pi / (j(j-1))1/2

for 1 ≤ i ≤ 50 and all positive integers j. Sort these numbers, and take the 385 largest of these numbers. Now state i (the state with population Pi) gets r = ri representatives, where r is the unique integer such that Qi,r is one of the 385 largest Q's, and Qi,r+1 is not. (385 is 435-50; 435 is the number of seats in the House of Representatives, and 50 seats were already assigned in the previous step, one for each state.) Essentially, this assigns the seats in the House "in sequence", so we can speak of the 51st seat, 52nd seat, ..., 435th seat.

So what if there's a tie for 385th place in that ordering? This can occur, of course, if two states have the same population, and I bet some tiebreaker is written into the law. But what if two states have different populations, but after dividing by the square root factor, two of the Qi,j are the same? Surprisingly, this can happen. Let P1 = 6P2. Then it's not hard to see Q1,9 = Q2,2; that is, state 1 gets its ninth seat "simultaneously with" state 2 getting its second seat. More generally, if

P1 / (m(m-1))1/2 = P2 / (n(n-1))1/2

then state 1 gets its mth seat simultaneously with state 2 getting its nth seat. Note that P1/P2 is rational. So a tie can only occur when (m(m-1)/n(n-1))1/2 is rational; when does this happen?

When n = 2, this amounts to asking when (m(m-1)/2) is a square; this happens for m = 2, 9, 50, ... (the indices of the square-triangular numbers in the sequence of triangular numbers) So one state can receive its second seat at the same time another one gets its 9th seat, its 50th seat, ... if the larger state has 6, 35, ... times the population of the smaller one.

Somehow I doubt the law covering apportionment has a provision for this. I suspect the provision taken would be similar to what happens if there's a tie in an election; I know there are some jurisdictions that just flip a coin in that case.

Edit, 10:53 pm: Boris points out in the comments that somebody's done the projection. Texas gains 4, Florida and Arizona each gain 2; the Carolinas, Georgia, Utah, Nevada, and Oregon each gain 1. New York and Ohio each lose 2; Massachusetts, New Jersey, Pennsylvania, Michigan, Illinois, Minnesota, Iowa, Missouri, Louisiana, and California each lose 1. At first glance this shift seems like it would favor the Republicans in the presidential race; nine of the seats created are in states that voted for McCain in '08, and only two of the seats destroyed are. But I'm not sure about this analysis; states are made of people, so as a state's population grows or shrinks its political makeup changes as well. Maybe Nate Silver will have something to say about this?

05 November 2008

The notion of "typical" doesn't behave nicely

Matt Yglesias makes an interesting point. The "typical" American is white, in that more than half of all Americans are white. The "typical" American is Christian, in that more than half of all Americans are Christian. But does this mean that the "typical" American is a white Christian, in that more than half of all Americans are white Christians? Not necessarily; I don't have the numbers.

Moreover, the "typical" white Christian votes Republican. Thus typical people vote Republican, so the Republicans should have won last night. But they didn't.

The point is that most people are "typical" in some ways, but few people are "typical" in all ways. And a party that is based around just people who are "typical" in all ways (note that I'm not saying this describes the Republican party) is doomed to fail, because most people are unusual along some dimension. I don't think this deserves the name of "paradox", but it's just something worth keeping in mind about How Statistics Work.

04 November 2008

We win!

I heard that the featured articles at Wikipedia were the articles on McCain and Obama, and for the first time ever they had two featured articles at once.

So I went over to check.

But Wikipedia runs on GMT... and so it's Wednesday there. The featured article is Group (mathematics).

The "We" in the subject line is "mathematicians", of course.


Scott Aaronson points out that the probability of your vote changing the results of the election scales like N-1/2, where N is the number of people. But the amount of change your vote creates, if it tips the election, scales like N. So the expected amount of change you will cause, by voting, scales like N1/2. That's a big number, so you should vote. If you live in a big country, you should vote more, although that's irrelevant; if any other country is voting today, the US media has ensured that I don't know that.

Of course, N1/2 seems a bit high, and it comes from modeling people as flips of a fair coin; Aaronson points out that under a more realistic prior (due to Andrew Gelman), the expected probability that your vote flips the election is N-1, so the expected amount of change your vote causes doesn't depend on the size of the country.

I won't "officially endorse" anybody. But one of the candidates in the present election trained as a lawyer, taught in a law school for a while, and likes to compare himself to a certain Senator from Illinois. That senator, in order to equip himself to understand the law better, studied Euclid. Being a mathematician, I think this is pretty cool. Who I'm voting for is left as an exercise for the reader.

03 November 2008

AMS: Math in the Media

The AMS puts together a roundup of math in the (mainstream) media.

I think I may be the last person to know this, but in case I'm not, here it is.

JSTOR question

I just googled JSTOR in order to find it. (Of course, it's at jstor.org.)

Anyway, the Google result I get reads (with links omitted):
Scans of print journals, with 10 major math journals (requires subscription).
www.jstor.org/ - Similar pages - Note this

Does Google know I'm interested in math, or does it say this to everybody?

As it turns out, I was in fact looking for an article from the Bulletin of the American Mathematical Society, so telling me there were math journals in there worked out well for them.

A thought on expectation

Political campaigns should not campaign in such a way as to maximize their expected number of votes. Rather, they should campaign in such a way as to maximize their probability of winning.

Early in a campaign, these are pretty much the same thing, because one doesn't know how things are going to play out; late in a campaign they diverge. The candidate that's behind in the polls should, perhaps, start doing things that are, in expectation, Bad Ideas. If there were something that McCain could do that had a 1% chance of swinging 10% of the vote to him in the next 24 hours, and a 99% chance of scaring all the voters away, he should do it. (For example, let's say McCain turns out to secretly be an extraterrestrial; we probably don't want to be ruled by extraterrestrials, but who knows, we might change our mind? Of course, I'm being deliberately silly here.)

This is somewhat analogous to what happens in sports; strategies change late in a game. In the early innings of a baseball game you play, basically, in such a way as to maximize the difference between the expectation of your number of runs and your opponents' number of runs. In the late innings, when you have a better idea of how many runs you need, you change your strategies. See, for example, the bottom of the ninth inning of Game 3 of the World Series; tie game, bases loaded, nobody out. The Rays bring in one of their outfielders to create a fifth infielder; the idea is essentially that they want to maximize the probability of the Phillies scoring zero runs, and if the ball gets into the outfield a run is scoring anyway. (As it turns out, the Phillies won that game -- on a hit in the infield.) Of course, nobody saw that because it happened at two in the morning. Things are different when you're playing for one run.

Basically, what's happening as I write this is that Obama is running out the clock. (Yes, baseball doesn't have a clock. But Obama's a basketball player, so I think he'd like this metaphor.)

What politics and sports have in common, of course, is that there's a huge difference between second place and first place. If you're a company of some sort, does it matter if you sell 1% more or 1% less than your competitor? Not really, although it might have meaning for your pride. But if you're a politician, 1% of the vote makes a big difference.

01 November 2008

Conditional probability is subtle, part 2

Nate Silver returns to the idea of conditional probability: he's saying Pennsylvania is "in play" in this election, not because the polling in Pennsylvania is close, but because conditioned on the election being close, Pennsylvania is close. In general, I've heard quite a few people argue that the candidates should focus on the states that would be close if the election were roughly tied, not the ones that actually are close, because the details of which state a candidate deliberately tries to sway things in only matter in a close election anyway.

Unfortunately, subtleties like this seem to be lost on some of Silver's commenters.

In case you're wondering, my last post titled "Conditional probability is subtle" had nothing to do with politics.

31 October 2008

The Theorem

Via Keith Devlin: The Theorem. (To give more of a description would spoil the punchline; I'd recommend not reading Devlin's article until after you've seen the video.) This should be familiar to anybody who's seriously thought about mathematics, although my life doesn't come with background music like the video does.

30 October 2008

Mandelbrot, Knuth, and open-ended citations

A citation reproduced verbatim from Benoit Mandelbrot's The Fractal Geometry of Nature (1983):
Knuth, D. 1968-. The art of computer programming. Reading, MA: Addison Wesley.

The "1968-" is not Knuth's birth date, of course, but the date at which he started writing the work in question, which was fifteen years in the making when Mandelbrot wrote. It's still not done.

Incidentally, Mandelbrot's book is a good one, full of pretty pictures (although of course not as intricate as one might see now, because the book is a quarter-century old); it's also fairly casually written. Mandelbrot describes it as an "essay", in what I take to be the original Francophone sense of the word that means roughly an "attempt" at something, and so it's rather discursive, personal, and nonrigorous; these idiosyncrasies are good for a book one wants to read casually, as I do right now, but someone who wanted a rigorous understanding of the concepts might look elsewhere. I'm tempted to say it's a good series of lectures crystallized into paper form, although as far as I know it was never intended as such.

I think I'd heard of the book before, but it was Nova's Hunting the Hidden Dimension (aired on Tuesday night) that got me to actually head to the library and find it. I suspect I'm not alone, because apparently it's selling quite well on Amazon.

(The usual disclaimer on books I'm reading that I borrowed from the library applies: ignore my recommendations if you're at Penn, because I have the library's copy.)

No more Knuth checks

Donald Knuth is no longer giving reward checks for those who find errors in his books. Apparently people have been stealing his banking information and he's had to close some checking accounts. He's maintaining a web page with the amount of rewards people would receive, which is basically as good, because nobody was cashing the checks anyway. And perhaps it's better, because now everybody in the world can see that you found a bug in one of Knuth's books! If you just had a framed check on your wall, only the people you invited in could see it.

Congratulations to the Phillies

Really, one World Series win in twenty-eight years is pretty close to average; over the last twenty-eight years there have been between twenty-six and thirty teams in Major League Baseball, so the average team should have won the World Series about once in that time span.

Still, this blog congratulates the Philadelphia Phillies on their winning the World Series in five and a third games. Not that any of them are reading it.

(If I cared about sports other than baseball, things would be different; Philadelphia has four major professional sports teams, and until last night none of them had won the championship of their sport since the 1983 Sixers; that's about 100 sports seasons without a championship, which is noteworthy. At that time I was a small ball of undifferentiated cells. As was Cole Hamels.)

28 October 2008

People are not balls

The margin of error is only the beginning of political polling: "If one or more of the above statements [about certain red and blue balls] are true, then the formula for margin of error simplifies to Margin of Error = Who the hell knows?"

27 October 2008

Electoral sensitivity

By combing through my logs I found The electoral college and second terms, which is related to my post on translating popular votes to electoral votes. (Roughly speaking, in a close election, the electoral-vote margin is about five times the popular-vote margin.)

26 October 2008

Bert and Ernie teach probability

From Alan Schwarz in the New York Times:
You can hear it in the same broadcast booth. One announcer will say that Joe Hitter is in a slump, suggesting that he is somehow plagued and that his chances of success are lower than normal. Then, as if on cue, the color man (in jovial agreement) will say that, yes, Mr. Hitter is due to break out— implying that his chances of success are higher than usual.

These exchanges and more are collected on the recently released three-disc set, “Bert and Ernie Teach Probability.”

Things have been slow around here; between the World Series, obsessing over the election, and Real Work, the blog's kind of getting left behind. But I thought you might appreciate this. According to the model implied by What Announcers Say, the hitters who are most likely to do poorly in a given game are the ones who have been performing near their long-term average.

(Oh, and by the way, "Bert and Ernie Teach Probability" doesn't exist. I was up late last night -- the game didn't end until quarter of two in the morning -- so I'm not thinking quite straight. So I checked. It should exist. Dear Internet, get working on that.)