30 November 2007

Degree Clinical Protection

A commercial for "Degree Clinical Protection" deodorant said: "Do you know one in four people think they sweat more than normal?"

So another one in four people are in denial, since the number of people who sweat more than normal is probably about half. (I'm assuming that "sweating more than normal" is the sort of thing one would deny, which seems reasonable, at least in the perspective of a deodorant commercial.)

I'm reminded of the often-quoted fact that three-quarters of all incoming students at [insert prestigious university here] think they're going to be in the top quarter of their incoming fact.

(The amazon.com page has a description beginning "Did you know one in four Americans worry about excessive sweating?", which I don't have a problem with, because the word "excessive" doesn't have the same statistical connotations.)

Pattern avoidance redux: explaining part of Alon's conjecture

Not long ago I wrote a post about pattern avoidance, in which I mentioned the following conjecture of Noga Alon:
Conjecture (Alon): The threshold length t(k), for a random permutation to contain all k-permutations with substantial probability, has t(k) ~ k^2/4.
The paper by Richard Arratia from which I learned this conjecture doesn't really give any idea where it comes from; some rationale for the conjecture may be in one of Alon's papers (most of his post-1990 papers are available online) but I'm not particularly interested in tracking down the source.
Anyway, there are two things to understand about that conjecture -- the "2" and the "4". I have no idea where the 4 comes from, but I think I can heuristically explain the 2. The problem can be looked at as an instance of the classical "coupon collecting" problem. In order to see all k! of the k-permutations, we need there to be about (k! log k!) different places in which we could see a k-permutation. That is, we need
{t(k) \choose k} \approx k! \log (k!)

but by Stirling's approximation, the right-hand side is
{t(k) \choose k} \approx k! (k \log k).

Now, note that if n is much larger than k, then we have
{n \choose k} \approx {n^k \over k!}

since we can represent ${n \choose k}$ as a quotient of products; the numerator contains k factors which are each approximately n, and the denominator is k!. So we have
{t(k)^k \over k!} \approx k! (k \log k).

Solving for t(k), we get
t \approx \left( (k!)^2 k \log k \right)^{1/k}

and applying Stirling's approximation, we get
t \approx {k^2 \over e^2} (k \log k)^{1/k}.

But for large k, (k \log k)1/k is very close to 1. However, we get the wrong constant: e-2, not 1/4. This is because the dependence problem is quite serious -- if we pick two subsets of length k from a set of size (k/e)2, the probability that they intersect doesn't go to zero as $k$ goes to infinity is nonneglibible, and is something we can compute. Consider the probability that two randomly chosen k-subsets of a set of size ak2 do not intersect. This will be given by the quotient
{ak^2 - k \choose k} \over {ak^2 \choose k}

which is just the number of ways we can choose the second subset not to intersect the first, divide by the total number of ways to choose the second. But recalling our approximation ${n \choose k} \approx n^k/k!$, we see that this is approximately
\left( {ak^2 - k \over ak^2} \right)^k = \left( 1 - {1 \over ak} \right)^k

and as k gets large, this approaches e-1/a. In our case, a = e-2, so this probability is e-e2, or about 0.0006. So the probability that two randomly chosen subsets of size k do intersect is about 0.9994. So the independence assumption is so wrong it's not even funny; the dependence makes it more likely that we see the same patterns over and over again, explaining why we need to go further than (k/e)2 to expect to see all the patterns.

29 November 2007

Punitive damages

From Overcoming Bias: Unbounded Scales, Huge Jury Awards, & Futurism.

Apparently jurors are able to consistently rank-order the amount of money that they think should be given as damages in various cases -- they can say "I think person A deserves more money than person B", and other people will agree with them. And it seems implied in the post (although it's never actually stated) that they're fairly consistent in things like "I think person A deserves twice as much money as person B". But their assignments of dollar amounts to people are all over the map.

(If you're thinking "jurors always give ridiculous amounts in punitive damages!", that might be because the media seems to like the really out-there cases, like the guy who sued his dry cleaners for $54 million because they ruined his pants. He didn't get the money. I am not a lawyer, but I suspect that the amounts given in damages in most cases are more reasonable-sounding, if only because the people on juries are people like you and me.)

Similar things might hold for predictions of the future -- people are better at guessing the order in which the future will unfold than the speed at which it will unfold.

A puzzle, and some brief thoughts on education

A puzzle, from arpatubes.net.

There's no trick here. Still, only twenty percent of the people answering it have gotten it right, because it's the sort of thing you have to read very carefully.

The comments at fark.com (where I found this puzzle) are interesting. One common theme seems to be "this isn't really a math problem, this is a reading comprehension problem" or "this is a logic problem". But reading comprehension and logic are, of course, large parts of mathematics. I have a sense I'm preaching to the choir here.

On a similar but less frivolous note: from Pencils Down, A Letter to a Young Mathematician is a (fictional?) answer to what seems to be a student asking "what good are proofs?" (From vlorbik.) I'm reminded of the recent book Letters to a Young Mathematician, by Ian Stewart (who also wrote Does God Play Dice?; the resemblance of titles is almost certainly intentional. I suspect that students would, for the most part, feel differently about proofs and other things that aren't just rote calculation if they were introduced from the very beginning of their mathematical training, before they get the idea that all mathematics is arithmetic. But this might not be possible, because abstract reasoning is something that young children just aren't capable of, yet we need to start getting arithmetic into them at a young age. Their brains haven't matured enough yet for abstract reasoning.

As it is, students seem to resent being asked to prove something. A large part of this seems to be that proofs often require writing words, and mathematics, they feel, does not properly involve words. I actually spend a fair bit of time trying to counteract this in my teaching, but I can only do so much insisting that calculus students should know how to write complete sentences and be able to clearly explain their logic before they start to rebel.

And there is of course the problem that many students probably learn their first mathematics from an elementary school teacher that never much liked math when ey was in school.

28 November 2007

The binomial pinball machine

The Binomial Pinball Machine, from Kim Moser's repository of videos of Commodore 64 programs, via the Statistics Consulting Blog.

It's too bad the video stops before you actually start seeing a decent approximation to the normal distribution.

By the way, I almost named this blog quincunx, which is a name for this machine, but I decided that people might have trouble spelling it.

27 November 2007

Collections of images of mathematicians

The Sarong Theorem Archive, which claims to be "the only public repository of sarong-wearing mathematics images". If I had a sarong, and a digital camera with batteries in it, I'd be here. The people at the Secret Blogging Seminar have lamented the fact that this collection is not growing quickly. Incidentally, it seems to me that the best theorems for picture-taking are ones that have nice pictures associated with them, so I nominate Eric Paniagua proving the Pythagorean Theorem as the best of the lot.

Female mathematicians with teal hair. I'm not here because, well, my hair isn't teal. I don't know any of these people, although I went to college with a female mathematician named Teal.

On a more serious note, there are various pages which list a large number of mathematicians and have links to pictures of them; this is useful for putting a face to a name, although it would be more useful in the reverse direction of putting a name to a face. This is true in general; wouldn't it be nice to have something like identifont for faces? (Identifont asks a series of questions to help you identify an unknown font. After nineteen questions, it identified the font of this blog as Trebuchet, which is correct. I'm curious how exactly identifont works -- in particular, if I answer some questions wrong, can I still get the correct font? This is a question about error-correcting codes in disguise.) This isn't just for mathematicians, but for any population. There's a significant difference, though; usually when you're trying to identify a font, you have a sample of it in front of you, whereas if you have a picture of someone in front of you that would be good enough for such a method to be fruitful, you probably also know someone who knows their name. So a tree-like facial identifier patterned on Identifont would be more of a curiosity.

Counting LEGO towers

A Lego counting problem, from Soren Eilers, via Vlorbik on Math Ed.

The question: what is the number of ways to make a tower of height six out of two-by-four LEGO® blocks? There are 102,981,504 ways; this is pretty straightforward, once you take the relevant symmetries into account. But apparently this has been confused with the problem of how many ways there are to make any sort of configuration out of six blocks; Eilers and his collaborator Mikkel Abrahamsen claim this number is 915,103,765, by what sounds like a brute-force program. The most common height for such a tower is 4; almost half the towers have this height.

How are the heights of towers containing n blocks distributed? I don't have much faith that this problem could be answered; it reminds me of the problem of counting self-avoiding walks. Let's say you live on the integer lattice, and you go for a walk, going north, south, east, or west at each step. There are clearly 4n walks of length n. But in how many of these walks do you never come back to a point you've previously visited? Nobody knows. There are various classes of self-avoiding walks that are restricted in certain ways, which can be counted; for example, it's not hard to count walks that only go to the north or east, or even walks that only go to the north, east, or west. (Such a walk has the following structure: go north. Then go east or west a bunch of times. Then go north. Then go east or west a bunch of times. And so on.) It wouldn't even be that hard to determine the average number of times that a NEW-walk chosen uniformly at random from all those of length n takes a northbound step; if I remember correctly it's on the order of n1/2. But counting all self-avoiding walks? Nobody knows. It's known that if f(n) is the number of self-avoiding walks of length n, then f(n)1/n has a limit as n goes to infinity; this limit is the self-avoiding walk connective constant in two dimensions, which is somewhere between 2.62 and 2.68.

It's not hard to show it must be between 2.41 and 3, if it exists:

  • Let g(n) be the number of self-avoiding walks of length n which only make steps to the north, east, or west. Then g(n)1/n approaches 1+√2 as n goes to infinity. This can be shown by encoding the walks as words in the three-letter alphabet {N, E, W} and counting the number of words which don't include the subwords EW or WE;

  • Let h(n) be the number of walks of length n which don't include the sequences NS, SN, EW, or WE. The number of these is 4 × 3n-1; the first step can be anything you like, and each following step can be anything except the opposite of the previous step. So h(n)1/n approaches 3.

But the walks counted by g(n) are all self-avoiding, and self-avoiding walks are a subset of those counted by h(n). So g(n) ≤ f(n) ≤ h(n), which is enough. I believe that at least some of the results which tighten up the connective constant were derived in the same way, by finding classes of walks which contain or are contained by the self-avoiding class.

(The thoughts on self-avoiding random walks are basically coming from a talk on prudent self-avoiding walks by Mireille Bousquet-Melou that I heard a couple weeks ago; I was hoping her slides would be online, since there were some nice pictures, but they aren't.)

The LEGO®-tower-counting problem is similar. In the case where the tower goes straight up, we don't have to worry about the different blocks colliding with each other, so it's something like the "easier" problems above. But if we allow more than one block on each level, we have to take into account the "physics" of the problem, namely that only one block can be in any given piece of space. The LEGO® problem is actually more closely related to counting polyominoes; if you look through the list of Bousquet-Melou's papers, you'll see she's done a lot of work on that sort of problem as well.

26 November 2007

Graph paper

The history of graph paper, from Alexandre Borovik's Mathematics under the Microscope.

I often find myself thinking that graph paper is an innovation whose time has passed. Its main purpose, in my life, is to make my students' homework harder to read; there are some of them who write on graph paper despite the fact that we are very rarely asking them to graph something by hand, and the vertical lines end up distracting my eyes. And on those occasions when I do want to graph something, I use a computer.

Similarly, I can see how graph paper would be useful for numerical computation, because it makes it easier to line up the digits of various numbers one wants to add or subtract; but I use a computer for that, too.

There was a time when I was playing around with tilings of the square lattice a lot; during that period I liked graph paper. In the same vein, a few days ago Borovik illustrated multiplication of the Gaussian integers on (a simulation of) graph paper.

My normal approach is useless here

"My normal approach is useless here", from xkcd.

I'd seen this posted on a bulletin board in the hallway here, but I didn't know this was the source.

By the way, I assume that whatever ♥ represents, it has units. Therefore you can't take its cosine, since you can only take the cosine of a unitless quantity; similarly for the Fourier transform. And yes, I am deliberately missing the point here.

Rejecta Mathematica

Rejecta Mathematica "is a new, open access, online journal that publishes only papers that have been rejected from peer-reviewed journals (or conferences with comparable review standards) in the mathematical sciences. We are currently seeking submissions for our inaugural issue."

Not surprisingly, the second FAQ is "Is this some kind of joke?" -- the answer is in the negative.

Basically, the point is that papers in which people reprove known results, try seemingly promising techniques and show they don't work, and so on are of value. Papers will be accompanied by a letter from the author saying why the journal to which the paper was originally submitted rejected it.

It's an open-access journal, and the articles will be available online. I can't tell if there's a print version; the site seems to neither confirm nor deny this. It'll be interesting to see what comes out of this.

(Also, the journal's URL is math.rejecta.org; this seems to imply a possible branching out into other fields if this experiment goes well.)

25 November 2007

compressed calendars and the Doomsday algorithm

Infodesign challenge -- how to fit a calendar for a year onto a business card.

There are many solutions; basically it seems that in order to do this well one has to exploit the fact that our calendar has some sort of structure. In short, all months look the same if you just rename the days of the week.

#2, for example, takes this into account: in a non-leap year January and October "look the same", as do February, March and November; September and December; April and July.

Incidentally, this is not a problem I particularly worry about, because I know the Doomsday algorithm. In short:

  • It is relatively easy to determine what day of the week the last day of February falls on in a given year. (Because of leap years, a lot of calendar algorithms focus on this day. There are some cases where for the purposes of formulas it is convenient to think of January and February as the 13th and 14th months of the previous year.) This day of the week is called "Doomsday". Doomsdays run in a 400-year cycle, as does the entire Gregorian calendar. Doomsday in 1900 was a Wednesday, and in 2000 it was a Tuesday. For every twelve years after a multiple of 100 Doomsday moves forward one day; this is because twelve years include three leap years, so Doomsday actually moves forward 15 days. (So Doomsday in 2012 is a Wednesday, in 2024 a Thursday, and so on.) So take the last two digits of the year and divide by 12; call the quotient q and the remainder r. Move Doomsday forward from its day in the last century year by q+r days (each dozen years moves Doomsday forward fifteen (= one) days, then each year after that moves it forward one more day) plus r/4 days rounded down (for leap years). So, for example, Doomsday in 2000 was Tuesday; 7 divided by 12 is 0, with remainder 7. So Doomsday in 2007 is Tuesday, plus zero days, plus seven days, plus one day (7/4 rounded down is 1); that's Wednesday. Indeed, February 28 was a Wednesday this year.

  • certain days will always fall on the same day of the week as that day; the easiest set to remember is 4/4, 6/6, 8/8, 10/10, 12/12, 5/9, 9/5, 7/11, 11/7 [note that for the set named so far! it doesn't matter whether you put the month first or the day first!], 3/7, 2/(28 or 29), 1/(3 or 4). (A few others that stick in my head are 2/14 (Valentine's Day) in ordinary years, 7/4 (Independence Day), 10/31 (Halloween) and 12/26 (the day after Christmas).) All of these days were Wednesdays this year as well.

  • once you have a set that includes at least one day in each month, it's not hard to work within each month, probably because we have a decent amount of practice doing so. November 7 was a Wednesday; thus November 21 (fourteen days later) was also a Wednesday; thus November 25 is a Sunday. And my television agrees with me; it is showing that execrable show that only survives because of its cushy post-Simpsons time slot, King of the Hill.

A real challenge would be to figure out rules that let someone calculate the day of the week of a given date in the Hebrew calendar in their head. I suspect it's not possible.

24 November 2007

A simple probability puzzle,

A simple probability puzzle, from Doubting Toshuo via Anarchaia.

Paraphrasing a bit: imagine we flip a fair coin until either THH or THT appear on three consecutive flips. Which is more likely to come up first? Alternatively, which has the smaller expected time to its first appearance? (These are different questions; the first is "which wins the race" and the second is "which has the faster average time". They don't necessarily have the same answer in general.)

The answer to both is THH. This is analyzed in Flajolet and Sedgewick's forthcoming book Analytic Combinatorics, section I.4.2. If you want to figure out either the probability that THH appears before THT or the distribution of the first time at which they appear, it's a basically routine problem in the combinatorics of words.

Most people say that they're equally likely to come up first, and that they have the same expected time to the first appearance. This is because they're confusing this problem with a closely related problem: which pattern appears more often in the long run? And both patterns have probability 1/8 of appearing in the positions n through n+2 for any n; the expected value of the number of appearances of THH, or of THT, which start by position n is in fact n/8. But patterns like THT, which can overlap themselves (the technical term is "autocorrelation"), tend to occur in clusters; for example, we can get three occurrences of THT in just seven coin flips: THTHTHT -- but to get three occurrences of THH takes at least nine coin flips. As Flajolet and Sedgewick write:
On the other hand, patterns of the same length have the same expected number of occurrences, which is puzzling. The catch is that, due to overlaps of [pattern 1] with itself, occurrences of [pattern 1] tend to occur in clusters, but, then, clusters tend to be separated by wider gaps than for [pattern 2]; eventually, there is no contradiction.

They use different patterns, but the same effect occurs for their patterns as for Toshuo's. There might be a halfway decent bar bet here; unfortunately most of the people I would make a bar bet with are my friends, and my friends know I'm a probabilist, so they'd be suspicious. Especially since I've written about this here.

23 November 2007

Numbers that "look prime"

From MathNotations, which is a math education blog. A suggested activity for high schoolers:

Using positive integers, think of primes ending in 1 or 3 (you may want to use the technically precise phrase 'whose units digit is 1 or 3'). For example, 21 'ends in' 1, but it is not prime. You must go in order and you are not allowed to ask the number the previous person gave!

I tried doing this myself; it's surprisingly hard! I would never be so foolish as to say 51 is prime... but I said to myself "41... 43... um, what comes next?" It's something about looking at that list of numbers out of context; a significant part of my knowledge about which numbers are prime is apparently just being able to recite the first thirty or forty (up to 150 or so) pretty quickly. ("Recite" is not quite the right word; I tried to actually do this and once I got past 37 I realized that I was not quite working from memory so much as doing trial division.)

Also, I have to share the old joke that 91 is the first number that "looks prime" but isn't. The argument is that multiples of 2 and 5 "look composite" (by looking at their last digit); multiples of 3 "look composite" since people know the fact that a number is divisible by 3 if and only if the sum of its digits is; and (two-digit) multiples of 11 "look composite" because of their repeated digits. Furthermore, squares don't "look prime". So the smallest number that looks prime, but isn't, is 7 times 13, or 91. After that, numbers that look prime but aren't are:
119, 133, 143?, 161, 187, 203, 209?, 217, 221, 247, 253?, 259, 287, 299.
(I put a question mark after multiples of 11 because I'm not sure how to count multiples of 11 which are greater than 100. And of course, this criterion of "looks prime" is subjective; for example, I don't think 217 or 287 look prime, since they can easily be written as 210 + 7 or 280 + 7 and thus their factorizations as 7 × 31 and 7 × 41 are obvious)

so there are 14 "psuedoprimes" in the range from 100 to 300, compared with 32 actual primes in that range. So this test isn't too bad in that range. Of course, eventually it becomes quite bad; it identifies (1/2)(2/3)(4/5)(10/11) = 8/33 of all integers as primes (ignoring the fact that it identifies squares as nonprimes), when the actual density of the primes is zero. In the limit, almost all of the numbers identified by this test as primes are in fact composite.

strategy in Tammet's solitaire

This was written a few months ago; I'm just getting around to posting it now for some reason.

Recently I read Daniel Tammet's book Born on a Blue Day: Inside the Extraordinary Mind of an Autistic Savant. Tammet is a savant with Asperger's syndrome; he's very good as a mental calculator; he also is synaesthetic. In his particular case, he experiences numbers and words with certain colors, which explains the title of the book. He also sometimes sees numbers as shapes, to which he attributes his calculational ability; there's an interesting point where he explains that he sees multiplication of numbers by seeing the "shapes" that he associates them come together, and the space in between is the shape of the product. This seems to imply that each prime number would have to have some basic sort of shape.

At one point he mentions a solitaire game to be played with cards that he invented. The game works as follows:

Some of the time I played a simple form of solitaire of my own creationo, with a deck of cards in which each card was given a numerical value: ace as 1, jack 11, queen 12, king 13, while the numbers on the other cards determined their values. the object of the game is to keep as many cards as possible from the deck. At the start, the deck is shuffled and then four cards ar played onto a pile. If, after the first card, the total value of the cards in the pile is at any point a prime number, then those cars are lost....
The player now decides whether to risk putting another card onto the pile or to start a new pile from scratch. If the player decides not to risk a new card on the pile, then the cards from the pile are safe and are retained. If the player plays a new card and the total reaches a prime number at any point, then the whole pile of cards is lost and a new pile is started. The game ends when all fifty-two cards in the deck have been played into piles, some lost and some successfully held. The player counts up the total number of safely retained cards to work out his final score.

Presumably the reason for not declaring the pile dead if the first card is prime is because the chances of the number on any given card being prime are just too high -- six out of the first thirteen natural numbers are prime! The first natural question to ask is the probability that that initial four-card pile is allowed to live. It wouldn't be hard to compute this exactly, but I figured I'd get a better feel for the game by playing with actual cards. A fifty-two card deck naturally breaks up into thirteen such piles; I went through a deck of cards ten times, seeing 130 piles, and got fifty-three piles which were allowed to live -- about 41% of them.

I then ran the following MAPLE code:

S := 0;
for i from 1 to 13 do
for j from 1 to 13 do
for k from 1 to 13 do
for l from 1 to 13 do
if not(isprime(i+j) or isprime(i+j+k) or isprime(i+j+k+l)) then S := S+1; fi;
od; od; od; od;

This loops through all the quadruplets (i, j, k, l) with each member between 1 and 13; 1 is added to the accumulator S for each quadruplet which corresponds to a good pile. There are 10026 possible "good" piles, it turns out, out of a total of 28,561; that's about 35%. Going through the deck ten times as I did, you'd expect 46 good piles; I got 53, which isn't that far off given the small sample size.

(Note that implicit here is the assumption that each quadruplet is equally likely; in fact, those which contain repetitions are less likely when you're dealing with an actual deck of cards.)

Incidentally, if 13 is replaced by some other number n, then the probability of a "good" quadruplet increases as n is increased. If n=2 it's 1/16 -- the only good quadruplet is (2, 2, 2, 2). If n = 10 we get 0.3074. If n = 20 there are 64575 good quadruplets out of 160000, and the probability of a good quadruplet is 0.4035. If n=52 (this corresponds to each card in the deck having its own number) the probability of a "good" quadruplet is up around 48%. I suspect the probability approaches one as n gets large, but this code runs quite slowly as n gets larger. There are faster ways to do the computation, but they take more coding.

Now, let's say we're dealt four cards, and their sum is k. Do we want a fifth? As Tammet points out, you want a fifth card if there aren't that many primes in the next thirteen numbers. My instinct is that you only want to "hit" (somehow the blackjack terminology seems natural here) if the sum has probability less than one-fifth of being prime -- that is, if there are two or less primes in the numbers from k+1 to k+13. k can be at most 52; it turns out that you should only hit on 44 or 45, in this strategy.

But this is actually the correct strategy for maximizing the number of cards you get on a single trick. (We're trading in having four cards with certainty to having five cards with at least a 4/5 probability.) The object of the game, as stated, is to maximize the number of cards you get going through the whole deck. with the naive strategy of always staying after the fourth card, we get to keep on average 35% of the cards.

So let's say I have a four-card pile going, with sum k. Considering those four cards and the next one, I expect to keep 4.35 of them if I stay; I expect to keep 5m/13 of them if I hit, where m is the number of composites between k+1 and k+13. So hitting is the right move exactly when 4.35 > 5m/13, or m > 11.31; I only want to hit if the numbers from k+1 to k+13 contain zero or one primes. But k is at most 52. The sequence of primes begins

2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67...

and so I should never hit. I wonder if Tammet is aware of this?

Obviously, though, not hitting makes the game less interesting...

Daniel Tammet has a website and a blog

21 November 2007

Acres are strange.

Canada to Announce Vast New Park, from tomorrow's NYT.

The size of the park is referred to as "25.5 million acres", which seems kind of silly to me. The whole point of an acre is that it measures areas which are too large for, say, square feet and too small for square miles. I have no idea how large 25.5 million acres is, until I convert it to square miles -- 40,000 square miles. That's approximately the area of Pennsylvania, or five times the area of New Jersey. If the park were circular, it would have a radius of 112 miles -- that's kind of a useful way to picture it, since that says that if you were in the center of the park the borders would be over a hundred miles away. I have some idea what a hundred miles looks like.

But that relies on the fact that an area is the square of a distance. An acre is 43,560 square feet, or 1/640 of a square mile. What's the square root of an acre? 208.7 feet, which isn't any conventional length unit. The history is that an acre is a rectangle one furlong by one chain, or so Wikipedia says, so 208.7 feet is the geometric mean of a furlong (660 feet, or 1/8 mile) and a chain (66 feet, or 1/80 mile), or √10 chains. The weird definition is agricultural -- "[t]he acre was selected as approximately the amount of land tillable by one man behind an ox in one day", and it's easier to plow a long, narrow rectangle than a square. But haven't we moved past this?

Of course, in the end trying to rationalize the customary measurement system is silly. I suspect many of you are just saying "why don't you Americans use metric already?" In fact, one of my students, who is not American, actually wrote this once on a homework assignment, in his solution to a problem which applied calculus to physics where measurements were given in customary units.

20 November 2007

Pattern avoidance

Here's a paper I found while Googling for something else a while ago: On the Stanley-Wilf conjecture for the number of permutations avoiding a given pattern, by Richard Arratia, from the Electronic Journal of Combinatorics, Volume 6(1) (1999), note N1.

Pattern avoidance in permutations is a beautiful little subject, about which I don't know that much -- but not that much is known. For example, see Bridget Tenner's remarkably short database of permutation pattern avoidance; okay, so the state of knowledge isn't quite as bad as this makes it look, but it's not that good either. (A good introduction to the area seems to be Miklos Bona's Combinatorics of Permutations, which has a couple chapters on the area. However, if you're at Penn I don't recommend Bona's book, because I have the library copy and I don't want to give it up.

Let p1 p2 ... pn be a permutation of the integers from 1 to n. We say a permutation is, for example, 231-avoiding if there is no i < j < k such that pk < pi < pj. That is, a 231-pattern in a permutation is a set of three letters in the word representing it which, when read from left to right, fall in the same order as the numbers 2, 3, 1; a permutation is 231-avoiding if it has no 231-patterns. So, for example, 26415837 is not a 231-avoiding permutation of [8], because 2, 6, 1 form a 231-pattern. A similar definition can be made of a σ-avoiding permutation for any permutation (called a "pattern") σ. Rather surprisingly, the number of patterns avoiding any permutation of length 3 is the same, and this can be proven bijectively. But some permutations of length 4, for example, are "easier" to avoid than others; what makes a pattern easy or difficult to avoid isn't totally clear. The Stanley-Wilf conjecture (now proven, by Gabor Tardos and Adam Marcus) states that if we let F(n, σ) be the number of permutations of [n] which avoid the pattern σ, then
\lim_{n \to \infty} F(n, \sigma)^{1/n}

exists and is finite. For example, 231-avoiding permutations are in bijection with Catalan trees of size n, so F(n, 231) is the nth Catalan number. Thus
F(n, 231) \sim {4^n \over \sqrt{\pi n^3}}

and so we get the constant 4 for the limit above. The Stanley-Wilf conjecture thus associates a constant with each permutation, but not too many of these constants are known. (I won't attempt to tabulate the known constants; I suspect someone out there has already done it. At the very least, there are a fair number of results scattered throughout the applicable chapters of Bona's textbook.)

Anyway, a natural question to ask is "how many σ-patterns does a permutation of [n] have, on average?" This is fairly obviously
{1 \over k!} {n \choose k}

since there are ${n \choose k}$possible sites for such patterns, and each site actually contains such a pattern with probability 1/k!; from playing around with a few small cases it looks like the variance of the number of σ-patterns is asymptotically cσn2k-1, where the constants cσ don't appear to be anything nice in general. (The only way I know how to compute these variances basically comes down to considering all the ways in which two instances of the same pattern in the same permutation can intersect and enumerating a large number of cases; it's not surprising the results are ugly. I've only done it in a couple simple cases, and I don't want to quote the results here because I haven't had the patience to check my computations.) In the case of inversions, which are just 21-patterns, these are the classical results that the average number of inversions of an n-permutation is asymptotically n2/4, with variance n3/36. I have a faint hope that there might be some relation between the constant "1/36" there and the fact that the number of 21-avoiding permutations of n is just 1n (that is, 1 -- this is the simplest case of the Stanley-Wilf conjecture, that there is only one inversion-free permutation) but I don't actually know enough of these constants to see what's going on.

Anyway, what I want to bring attention to is the conjecture of Alon at the end of this note:
Conjecture (Alon): The threshold length t(k), for a random permutation to contain all k-permutations with substantial probability, has t(k) ~ k^2/4.
Why should this be true? The note by Arratia gives some idea -- it relates this problem to the longest common subsequence problem -- but I want an argument that stays purely within the pattern avoidance realm. I'm working on it.

the Tao of compactness

Terence Tao on compactness and compactification (from the forthcoming Princeton Companion to Mathematics). Particularly interesting is the fact that using ideas of compactification, one can rigorously view a continuous object as a limit of discrete objects: the circle group as a limit of cyclic groups, straight lines as the limit of circles (they're just circles with the center at infinity!), the Dirac delta function as the limit of taller and narrower "spike functions", and so on.

Since I seem to be on a nomenclature kick, "compact" is kind of misleading nomenclature at first; if you didn't know what "compact" meant, wouldn't you think that if the closed interval [0, 1] is compact then the open interval (0, 1) surely must be? After all, "compact" means something like "small" in everyday discourse, and (0, 1) is contained in [0, 1]. It seems even sillier when we say that the real line is compactified by adding a "point at infinity"... until you realize that this means you can just join the edges together and turn the line into a circle, and circles are "smaller" than lines. Intuitively "compact" should mean what "bounded" means, and of course the Heine-Borel theorem says that in Euclidean space "compact" means "closed and bounded". Why should something called "compact" and something called "closed" be connected? And why aren't "closed" and "open" logically opposite? When I first learned the definitions of those -- the definitions on the real line, in a real analysis class -- I must have flipped back to the page with those definitions on them in baby Rudin at least once a day. I eventually made my peace with those definitions... but it took a while.

A Mobius strip video

A video about the Mobius strip, with an interesting choice of background music.

This video illustrates certain classical facts about the Mobius strip:
- if you draw a line down the middle of the Mobius strip, it runs down "both sides" of it (by which I mean both sides of the original sheet of paper; the Mobius strip isn't orientable)
- if you cut along that line, you end up with a long twisted strip;
- if you cut along a line one-third of the way from one of the edges of the original strip of paper to the other, you get two linked strips, of equal thickness but different lengths;
and another fact I didn't know:
- if you cut along a line one-quarter of the way from one of the original edges to the other, you get three linked strips (I assume at least one of them is Mobius), but now one is twice as wide as the other two. (But I think there's something trickier going on with the cutting here; it's hard to get a good look.)

Also, breaking the underwater one-handed Rubik's-cube-solving record. I can't make this stuff up.

(These are both from is from sciencehack, which is a site for science videos that claims that "every science video on ScienceHack is screened by a scientist to verify its accuracy and quality". Apparently they don't host the videos; they just index other people's science-related videos.) Here is their index of mathematics videos.0

19 November 2007

The complexity zoo

From bit-player (Brian Hayes): Until NEXPTIME, on the proliferation of complexity classes.
Have you ever tried to explain to your grandmother why NP is named NP? Does she get it when you say that problems labeled NP-complete are the hardest problems in NP, but NP-hard problems might be harder, and not in NP?
Yes, complexity classes are confusing. As Hayes points out, "There are hints of structure in the naming scheme". I'm not sure if that makes it better or worse. My question is: do there exist complexity classes A, B, C, and D such that the pairs of names (A, B) and (C, D) are related in the same way but the actual classes are related in different ways? In other words, does this nomenclatures have the potential for false generalizations? Hayes points out that the chemists have come up with a good systematic nomenclature for chemical compounds, of which there are many more than there are complexity classes. It's even useful if you're not a chemist; I've caught myself on occasion using the chemist's nomenclature for alkanes to describe trees. (I studied chemistry and math in college.) In particular, if I remember correctly the systematic chemical nomenclature doesn't allow false generalizations.

The trivial chemical nomenclature does, though. ("Trivial" in chemistry doesn't have the mathematician's meaning; in chemistry it means the way in which names are assigned in a one-to-one, ad hoc manner to chemical structures.) Perhaps the Right Thing to do is not to try to regularize the current nomenclature for complexity classes, but to come up with a new systematic naming system that could overlay the current one. But that might require knowing more about complexity theory than we currently do.

There's an inclusion diagram for complexity classes at the Complexity Zoo (which also exists in a wikified version here which appears to be more current.)

Abusive dimensions and cohomology of Grassmannians

Yesterday I complained about the abuse of the word "dimension" by a lot of popular authors, namely the fact that they tried to order the dimensions.

Being a discrete mathematician, I don't think about dimensions all that often. But I'm taking a course in algebraic topology (it's required for my program); I have a habit of trying to twist the problems there into combinatorics problems whenever possible. Recently we were asked to compute the cohomology of the Grassmannian manifold GnRk. This Grassmannian is just the manifold of n-dimensional subspaces of k-dimensional space passing through the origin. It is of dimension k(n-k), as we can see by "dimension-counting". To obtain such a subspace we pick n orthogonal unit vectors in k-space. The first one is chosen from Sk-1 and thus we have k-1 "degrees of freedom", contributing k-1 to the dimension. The second one is chosen from a (k-2)-sphere in the (k-1)-dimensional orthogonal complement of this first vector; this contributes k-2 dimensions. And so on, down to k-n dimensions for the nth vector. But we've overcounted by 1 + 2 + ... + n-1, since we've actually chosen an n-dimensional flag in k dimensions; thus we need to quotient out by the orthogonal group SO(n), which has this dimension. Thus the dimension of the Grassmanian is

[(k-1) + (k-2) + ... + (k-n)] - [0 + 1 + ... + (n-1)] = n(k-n).

As you may have noticed, I used "dimension" in two senses there; the rigorous sense of the topological invariant which is always an integer, and the non-rigorous sense of "degree of freedom". At this point my complaint from yesterday is still valid, because I never ordered the dimensions, and that was the cardinal sin I attributed to people who don't know what they're talking about.

But then I caught myself saying things like "the cohomology of G2R5 in the second dimension is Z2." That's no good.

As it turns out, I rediscovered the fact that the dimension of Hm(GnRk) is the number of integer partitions of m into at most k-n parts, none of which exceed n; in the case I just mentiond we have m = 2, n = 2, k = 5, so we're looking for integer partitions of 2 into 2 parts which don't exceed 3. There are two of them, namely 2 and 1 + 1. This isn't too interesting, but say we want to know, for example, the dimension of H9(G3R10)? Then we can just write down all the partitions of 9 into three parts which are at most 7;

7+2+0, 7+1+1, 6+3+0, 6+2+1, 5+4+0, 5+3+1, 5+2+2, 4+4+1, 4+3+2, 3+3+3

and there are ten of them; thus H9(G3R10) = Z10. You can write down generating functions for these sequences of integers; I suppose one could even ask about the asymptotic behavior of these dimensions, although I don't think a topologist would care. (I can't find a source for this fact, but I'm actually pretty sure I've heard it before; the two obvious sources to me were Hatcher's Algebraic Topology and Stanley's Enumerative Combinatorics, which are both books I know reasonably well, and I found the place in each where this should be mentioned. Stanley just says that there is a connection to the cohomology of the Grassmannian and leaves it at that; Hatcher talks about symmetric polynomials but never sinks to finding actual numbers.)

But as I said, I caught myself saying things like "what's the second-dimensional cohomology of G2R5"? Second-dimensional. And I've heard this usage from Real Algebraic Topologists, too. So apparently what was bothering me yesterday was something more subtle than this abuse of language.

18 November 2007

The "tenth dimension"

Imagining the tenth dimension, by Rob Braynton. The place where I originally read about this link thought it was kind of cringeworthy, but I didn't agree at first; it's not a bad popular explanation at the beginning...

But at one point the voice claims that quantum wavefunctions exist in five dimensions, invoking some. This isn't reasonable; the space of wavefunctions is a hell of a lot larger than one-dimensional. Sometime around here I stopped paying too much attention. And all dimensions above four, in this explanation, seem to arise in the same way. Why stop at ten? (In the video, somehow seven-dimensional space gets collapsed to a single point. I won't pretend to explain it. Then ten is seven plus three.)

While I'm on the subject, it's kind of annoying that the narrator says "the tenth dimension" as if you can put "the dimensions" in order. Space may be ten dimensional, but there's no canonical ordering of the elements. "Because we live from moment to moment in the third dimension" -- no! We live in three dimensions. Insofar as the third dimension is anything, it's a single dimension, so this nomenclature seems to insist we live in one-dimensional space. Dimension is a number (usually an integer) which is an invariant of a space; there's not some canonical set of three things associated with our three-dimensional space. (Yes, a basis for our space has three dimensions, but there's no canonical choice of basis.)

And don't even get me started about science fiction that talks about "creatures from the fourth dimension". "Four-dimensional creatures" are at least mathematically acceptable.

In the end he's trying to sell a book, and there's a blog; in one post there's a list of where the ten dimensions arise from but it feels like the methodology (if one could call it that) could justify any particular number of dimensions.

17 November 2007

The thirtyfold way

Robert A. Proctor has written an article Let's Expand Rota's Twelvefold Way for Counting Partitions! I first learned about Rota's "Twelvefold Way" from Stanley's book Enumerative Combinatorics, where it concludes the first chapter; it's a set of twelve basic problems in counting partitions, compositions, etc., organized into a four-by-three table. Proctor's article expands the table to six by five. These are all problems of the sort "how many ways can we put n objects into k boxes", where the problems vary by how many objects are allowed in each box, whether the boxes or the objects are distinguishable or not, whether the order in which the boxes sit on our table or the order in which objects enter each box, and so on, matter. The objects in the table include certain classical objects (and classical sequences of numbers) like set partitions (Bell numbers), set partitions into a given number of parts (Stirling subset numbers), integer partitions (partition numbers), compositions (powers of two), and so on. Unfortunately there is quite a proliferation of idiosyncratic nomenclature for all of these objects; I'm not sure to what extent this can be avoided. (This isn't a complaint about Proctor's paper, but about the subject in general.)

The article has been submitted to the American Mathematical Monthly; as far as I can tell it hasn't been published there yet, and the tables of contents of future issues show that it wont be published by April 2008.

"Partition" is a bit problematic because it refers to both integer partitions and set partitions; furthermore it can refer to multiset partitions, which interpolate between the two, although not too many people seem to have thought hard about multiset partitions because there doesn't seem to be a nice way to compute those numbers. One can view an integer partition as a way of putting n black objects in unlabeled bins; one can view a set partition as a way of putting n all-different-colored objects in unlabeled bins. A mutliset partition is then a way of putting n objects, some of which have the same color, into unlabeled bins; we can associate the coloring of the balls with an integer partition. So, for example, with the integer partition 2 + 1 + 1 we associate the number of ways of putting four balls, two of which are red, one yellow, and one blue, into some bins. The ways of doing this can be written


and there are eleven of these; this is intermediate between the number of integer partitions of 4, which is 5, and the number of set partitions of a set of four elements, which is the fourth Bell number, 15. I'll say that these multiset partitions are associated with the integer partition 2 + 1 + 1. I took to calling the numbers of partitions of a multiset Knuth numbers -- the name seems to be taken but I like it anyway -- and writing them as K(integer partition), so for example we'd have K(2,1,1) = 11, which is between K(4) = 5 (the partition number) and K(1,1,1,1) = 15 (the Bell number). At one point a few weeks ago I calculated some of the Knuth numbers and stared at them for a while, but couldn't really get anywhere; part of the problem is just that it's incredibly frustrating working with objects which are indexed by integer partitions! There are some nice cases, though; for example, multiset partitions associated with the integer partition (n-1) + 1 are, in the colored-ball interpretations, ways of putting n balls into indistinguishable boxes, where one of them is marked. This is the number of distinct parts of the partition λ, summed over all integer partitions λ of n. For example, partitions of the multiset RRRB, which contains one marked object, are


of which there are seven. The integer partitions of 4 are 4, 3 + 1, 2 + 2, 2 + 1 + 1, 1 + 1 + 1 + 1, which contain one, two, one, two, and one distinct parts respectively, for a total of seven. So K(3,1) = 7. If you stare at the numbers there's some sense of order -- as partitions of n get more parts the Knuth numbers get larger -- but it's hard to make out much order beyond that.

Another question comes to mind -- are there examples of tables like this where not all the entries are filled in? These would seem to be the more intriguing ones, since the challenge is then to fill in the missing entries; it reminds me of the Periodic Table from chemistry, which was invented when a large number of elements still hadn't been discovered, and properties of the elements that hadn't yet been found could be guessed quite reliable from properties of the surrounding elements.

16 November 2007

Moebius transformations revealed

Moebius transformations revealed, from YouTube, via MathTrek -- a visual demonstration of the folk theorem that Möbius transformations correspond to moving around the Riemann sphere in various ways. By Douglas Arnold and Jonathan Rogness.

Robin Whitty's theorem of the day

The Theorem of the Day, by Robin Whitty. The content of this page is a hundred single-page things which state a theorem, along with giving a pretty picture related to the theorem and some historical motivation or examples. Whitty writes that "Each day offers a different theorem (or lemma, law, formula or identity), each one worthy of adorning the walls of a mathematical Alte Pinakothek, Guggenheim, Louvre, Tate, Uffizi or Zach Feuer." In idle moments I used to think about making a "mathematical coffee table book" of this sort, but I'm not so good at making pictures, so I blog instead. Now I don't have to.

My favorite out of the ones I've looked at is Ramsey's theorem, because the picture includes aliens; these are the aliens of the famous story of Erdös in which he says that if aliens want us to tell them what the Ramsey number R(5,5) is or else they will destroy the Earth, we should get all the mathematicians in the world working on it; but if they want to know R(6,6) our best chance is to try to kill the aliens.

Incidentally, it's not actually a "theorem of the day", in that there's not a new theorem every day; but don't let that stop you from looking at it. A new theorem is displayed every day, but they cycle back about every three months. How this is implemented (it's nontrivial since new theorems are inserted into the rotation every so often) is explained here.

I found this while Googling for more stuff on Robin's theorem, which I wrote about this morning; Robin's theorem is not one of the theorems listed here. Rather, the Theorem of the Day is maintained by Robin Whitty. The name "Robin's theorem" confuses me when I see it in print, because there are a few things I refer to as "Robin's theorem" since they're due to a mathematician of my acquaintance whose first name is Robin. But I found out that Guy Robin is French, so I just pronounce his name in my head in a very exaggerated French way.

Robin's theorem

One of my favorite theorems is Robin's theorem. Consider the function σ(n)/n, where σ(n) is the sum of the divisors of n; I've talked about this function before, when I wrote about the density of abundant numbers. In that post I asked about the distribution of σ(n)/n, so for example we can say that in the limit, σ(n)/n is greater than 2 slightly under a quarter of the time. But another question is just how large can it get?

Gronwall's theorem says that the lim sup of σ(n)/(n log log n) is exp(γ), where γ is Euler's constant. So this function doesn't get very large for "reasonably sized" n, since log log n grows quite slowly, and exp(γ) = 1.781... is not a very large constant. For example, log log 1024 = 4.012... and so σ(1024)/1024 can't be much larger than seven. (One has to be careful with these sorts of limiting results in number theory, though; the Erdos-Kac theorem is another one of my favorite theorems, and it states that the number of distinct prime factors of numbers up to n approaches a normal distribution with mean and variance log log n. But in order to get anything that really "looks like" a normal distribution you probably need log log n to be, say, 10 or so; that limiting distribution is approached very slowly.

Gronwall's theorem is not quite so poor at approaching the limit, assuming the Riemann hypothesis. Robin's theorem is as follows: if the Riemann hypothesis is true, there is a finite list of exceptions, to the inequality

σ(n)/n ≥ exp(γ) log log n

and the largest one is 5040; the full set of exceptions is 2, 3, 4, 5, 6, 8, 9, 10, 12, 16, 18, 20, 24, 30, 36, 48, 60, 72, 84, 120, 180, 240, 360, 720, 840, 2520, and 5040. Furthermore, Robin proved in the same paper that if the Riemann hypothesis is false, there are infinitely many counterexamples.

A very similar problem is An elementary problem equivalent to the Riemann hypothesis , which has the virtue of being written about in English and recently enough that it's online. (Robin's theorem is twenty-three years old and in French.) Lagarias shows that
{\sigma(n) \over n} \le {H_n + \exp(H_n) \log(H_n) \over n}

for all n if and only if the Riemann hypothesis is true, where Hn = 1 + 1/2 + ... + 1/n.

You might suspect these are equivalent, since H_n is about (log n) + γ thus exp(Hn) is about n exp(γ) and log(Hn) is about log log n. It's not obvious, though, but it's also not that hard to prove. Lagarias does it in his paper.

Unfortunately, this doesn't make proving the Riemann hypothesis all that much easier. It makes disproving it easier "in theory"; a number which violates Robin's theorem or Lagarias's very similar result constitutes a counterexample to the Riemann hypothesis, and furthermore, if Riemann was wrong then there are infinitely many of them! But pretty much everyone believes the Riemann hypothesis to be true.

And another question, which I've barely thought about; pick a constant C < exp(γ). Is the number of positive integers such that σ(n)/n ≥ C log log n finite? (In other words, are there a lot of "near-counterexamples" to Robin's theorem?)

J. C. Lagarias, An elementary problem equivalent to the Riemann hypothesis. Amer. Math. Monthly 109 (2002), 534--543. (arxiv.org/math.NT/0008177)
G. Robin, "Grandes Valeurs de la fonction somme des diviseurs et hypothèse de Riemann." J. Math. Pures Appl. 63, 187-213, 1984.

15 November 2007

Asymptotics of partition polynomials

I've previously mentioned Richard Stanley's page of interesting zeros of polynomials. For an example of this in action, you can see a couple papers which were justed posted to the arXiv, by Robert P. Boyer and William M. Y. Goh:

Partition Polynomials: Asymptotics and Zeros

Polynomials associated with Partitions: Their Asymptotics and Zeros

In the first paper, we take the polynomial which counts the partitions of n by number of parts; when n = 200 the zeroes of this polynomial are plotted at Richard Stanley's web site. What's surprising is that the zeroes are actually a good bit more subtle than they'd look from that picture; they come in three families, the third of which doesn't appear until n = 13,000.

The second paper makes a similar conjecture for the zeroes of the "rank polynomial" and "crank polynomial" of high degree, where partitions are counted by their rank (the difference between the largest part and the number of parts) or the crank (see p. 11 for definition). (In the rank case, at least, Boyer and Goh actually just count partitions with positive rank; by conjugation one can get those with negative rank.) The zeroes of the rank polynomial appear to lie approximately on the unit circle, and are spread out uniformly.

By the way, I've previously credited this idea of looking at the zeros of combinatorially defined polynomials to Stanley. Apparently it's actually due to Rota, who once said:
The one contribution of mine that I hope will be remembered has consisted in just pointing out that all sorts of problems of combinatorics can be viewed as problems of location of the zeros of certain polynomials and in giving these zeros a combinatorial interpretation. This is now called the critical problem. Over the years people have added examples of critical problems, though we still haven't gotten any idea of how to solve it in general. I'd like to see someone get such an idea before I die. The four-color conjecture-that with only four colors you can color every planar map so that no two adjacent regions have the same color-is one of these critical problems.

Rota was Stanley's thesis advisor.

14 November 2007

time-traveling mathematicians

Take a look at today's Brown Sharpie.

So what if someone with a time machine went back in time and gave Fermat a book about elliptic curves, therefore enabling him to prove his famous "theorem"? Then he probably wouldn't have written his famous marginal note. But large parts of number theory came into being precisely because people were looking to prove Fermat's last theorem -- it's such an innocuous-looking assertion that it feels like it "should" have a simple proof. But that means that the book that our time traveler brought back in time never would have been written. And so on.

If you remember one thing today, it's that time travelers shouldn't give people proofs.

There's a similar example in Brian Greene's book The Fabric of the Cosmos. Greene imagines that he travels into the future, checks online to see what the latest advances in string theory are, and finds out that his mother proposed some grand unified theory of everything. He reads her paper, and in the acknowledgements he finds that she thanked him for teaching her physics. But his mother, as far as he knows, doesn't know any physics! So he travels back in time and goes to teach her physics. But she just doesn't get it. Knowing that she writes the paper, he eventually just tells her what to write, which he can do because he read it in the paper. Who gets the credit for the paper? Brian Greene shouldn't, since he just learned what was in the paper from reading it; but his mother shouldn't get the credit, either, because her son told her what to write! So if backward time travel is possible, then knowledge can appear out of thin air. The resolution is that in backwards time travel one must also travel between parallel universes.

13 November 2007

Two examples of convolution

1. This morning it was raining in Philadelphia. I walked to school, and as I was walking I considered the question -- which route should I walk so that I get the least wet? (I have an umbrella, but it is a rather small one because I don't like carrying a full-sized umbrella, so I still get a bit wet.) There is one particular street I could have taken that has a lot of trees. I find that when I walk on this street while it's raining I get rained on less, because the trees keep some of the water from falling to the ground immediately. But all the rain eventually gets down to the ground (I'm assuming; perhaps some of it gets absorbed by the leaves?) and roughly speaking there is some sort of convolution here, where the "rainfall density function" gets convolved with some sort of operator that shifts the rain forward in time. The net effect is that sometimes you can be walking down a tree-lined street after the rain stops and yet it still falls on you.

2. Last night, grading exams. Conventionally at Penn in the calculus courses we set the curve on the exams so that 30 percent of the grades are A, 30 percent are B, 30 percent are C, and 10 percent are D or F. This seems to a lot of people like too many A's and C's and not enough B's. But it makes sense; since it will not be the same students that get the A's on every exam, my advisor (who also happens to be the coordinator for the course) said that the distributions for the various exams end up effectively getting convolved to give the final distribution of grades, which tends to end up more like 25-40-25 than 30-30-30. (This isn't entirely true, though, as the exam grades are correlated with each other; students who do well on one exam tend to do well on the others.)

11 November 2007

Genealogical nomenclature

The nomenclature of "kth cousin l times removed" clearly was not invented by a mathematician.

A friend of mine (hi, Kate!), in an effort to explain the nomenclature, said that "Nth cousins share an (N-1)th-great grandparent." This makes sense with respect to the traditional nomenclature, but that "-1" is kind of awkward. But you can't make it go away; the rephrasing is "Nth cousins share a common ancestor (N+1) generations back".

Therefore siblings are zeroth cousins, since they share an ancestor one generation back (their parents). (For the sake of simplicity I'm ignoring half-siblings, who share one parent, and all other half-relations.)

And in the degenerate case, you are your own negative-first cousin; you share an ancestor with yourself zero generations back (yourself!)

The relevant number is the number of generations between you and the first common ancestor; thus what we call "first cousins", for example, should really be indexed by the number two. This would make the indexing start at zero, not minus one.

The "removed" nomenclature gets even weirder. If two people share a common ancestor, who is k generations above person A and k+l generations above person B, where l is positive, then they are (k-1)th cousins l times removed. So if my grandparent is your great-grandparent, then we have k = 2, l = 3, so we're first cousins once removed. (Incidentally, if my grandparent is your great-grandparent, then you must live in the future; as of 2007 no such person exists.) The nomenclature is symmetrical, which is surprising because the only other English-language kinship terms that are symmetrical involve people of the same generation. (And this is only true if you equate "brother" and "sister", or use the gender-neutral word "sibling".) If person U is your uncle, you're not his uncle; you're his niece or his nephew.

Now, I share an ancestor with my uncle (my mother's brother); his mother is my grandmother. (His father is my grandfather, too, but my grandfather's dead, and I didn't much like him when he was alive.) So we have k = l = 1, and so he's my 0th cousin once removed.

And my grandmother? She's my negative first cousin twice removed. (In general, your direct ancestors, n generations backed, are your negative first cousins n times removed.)

The simplest fix would seem to be shift all the cousin numbers up by one, so your siblings are now your first cousins, the people currently called "cousins" (people with the same grandparent) are second cousins, and so on. That way the indexing starts at zero, not negative one. The "removed" nomenclature seems a bit funny, but I suppose that in a lot of cases the important quantity is the difference in generation number between two people, so I'd actually keep it the way it is if I were reforming the system -- although perhaps de-symmetrizing it, so that one can immediately tell which of the two people involved is of the older generation.

10 November 2007

The Menger sponge

Zooming in and out on the Menger sponge, from Microsiervos.

You always hear that fractal objects are "self-similar" (that's what makes them fractal) but it's hard to get your head around that if you're looking at an ordinary drawing, since intuitively you think that the "small" parts are somehow qualitatively different than the "big" parts. But that's not really true. And this video illustrates that quite vividly -- as you zoom into the Menger sponge it very clearly looks the same on smaller and smaller scales.


Edited, September 14, 2008: This site is now at pronunciationguide.info.

pronunciationguide.org: A Guide for Classical Radio Announcers (and whoever else is interested), by Chris Wendl, whose day job is as a mathematician. The purpose of the guide was originally to inform classical radio announcers how to pronounce the names they run across, but the information therein is general enough that it should be useful to anyone who needs to pronounce foreign names. That includes a lot of mathematicians, though -- mathematics, like music, is practiced by people all over the world. (In particular, welcome to my many Spanish-speaking readers! I'd welcome you in your own language, but I can only read Spanish, not write it.)

Unfortunately it's focused only on European languages, since classical music is for the most part a European art form; the language I'd really like to have a decent guide for pronouncing is Mandarin, because there are so many Chinese mathematicians. I suspect it's out there if I take the time to look.

09 November 2007

On the infinitude of primes

From Microsiervos: two videos: "Cómo Arquímedes calculó el volumen de una esfera" (5:52) and "Sobre la suma de los recíprocos de los números primos" 11:46 (That is, "how Archimedes calculated the volume of a sphere" and "On the sum of the reciprocals of prime numbers")

There's a bit of an apparent error. Translating the slide that's on the screen at 3:00 in the second video:
If P [the set of primes] is finite, P := {p_1, p_2, ..., p_n} in increasing order.
Let us form the number p_1 p_2 ... p_n + 1.
This number is greater than p_n, therefore it is not prime.
It is not divisible by p_1, nor by p_2, ..., nor by p_n, therefore it is prime.

(There's nothing in what the speaker says that contradicts this; it's just easier for me to translate from the written text than the speaker.)

Indeed, the original post at Microsiervos points this out:
La demostración de Euclides de que hay infinitos primos que se presenta es incompleta: el 30031 sería un contraejemplo (se construye como 2 × 3 × 5 × 7 × 11 × 13 + 1 pero aun no siendo divisible entre 2, 3, 5, 7, 11, 13 no es primo: 59 × 509). Así que faltaría añadir «… o bien [el número creado] está compuesto por la multiplicación de otros números primos mayores que los de la lista original».

That is,
Euclid's demonstration that there are infinitely many primes, as presented here, is incomplete; 30031 would be a counterexample (it can be constructed as 2 × 3 × 5 × 7 × 11 × 13 + 1, which despite not being divisible by 2, 3, 5, 7, 11, and 13 is not prime: 59 × 509. One must add: "or else [the created number] is composite and made of oter prime numbers larger than those from the original list.

I'd claim that this isn't really an error; we've assumed there are no other primes, so the number p1...pn+1 can't be divisible by some other primes. It's hard to keep things straight because we all know there are infinitely many primes.

This also includes the classic demonstration that the sum 1 + 1/2 + 1/3 + 1/4 + ... is infinite, which leads to Euler's analytic proof of the infinitude of primes (roughly, the number of primes "is ζ(1)") before eventually getting to the proof that the sum of the reciprocals of primes diverges. (Wikipedia says that means that the primes are a large set, which seems like a good name, but I don't know if people actually use this term.) Also see this English Wikipedia article for a lot of the content of the second video.

Oh, and ζ in (Castilian) Spanish apparently sounds the same as θ in (American) English. How do Castilians pronounce θ?

Monopoly strategy

How to Win at Monopoly, by Tim Darling.

This is basically a user-friendlization of some of the rather unwieldy tables Probabilities in the Game of Monopoly®, which uses a Markov-chain-based approach. If you can compute the probability that a given space gets landed on, then you know how often you'll collect rent for that space if you have it; this in turn lets you know which properties are good to buy, put houses on, and so on. (Incidentally, the best return on investment in the game seems to come from buying the third house on any given property.)

However, the assumption that underlies all this is that when playing Monopoly, you're trying to maximize your expected winnings. Unless you play for actual money (and maybe not even then), what you're actually trying to do is stay in the game longer than everybody else, which is different. In an actual game you might want to take bets that are more likely to pay out rather than bets that are risky but pay extraordinarily well when they do. This doesn't really apply to Monopoly the way I've stated it (all the squares have roughly equal probabilities of being landed on), but it seems to imply that if you want to play conservatively (say, because you're winning and you'd like to preserve that), you can do so by building up two sets of cheap properties than one set of expensive properties. Conversely, if you're behind you probably want to build on expensive properties and hope desperately that someone lands on them. Diversification is usually good, but not in those cases where the expected value is kind of crappy and only the best possible outcomes will keep you alive.

08 November 2007

The birthday paradox

Finding the Odds of Synchronized Couples, from The Numbers Guy.

The problem is as follows: how many opposite-sex couples need to be in a room for the probability that there are couples Mr. and Mrs. A, and Mr. and Mrs. B, such that Mr. A and Mr. B have the same birthday and Mrs. A and Mrs. B have the same birthday, to be more than one half?

This is a variation on the classic "birthday problem", which asks this for individuals. The answer in the case of individuals is 23; for couples it's 431. This number 431 is given in the article without justification, so I wanted to explain it. For same-sex couples, the answer is about 304. (There are less possible pairs of birthdays for same-sex couples because you can no longer distinguish between the two members of the couple.)

Let's say there are n possible types of objects (say, birthdays that a person can have) and objects of these types (people with different birthdays) come along one by one. Their types (birthdays) are chosen uniformly at random. The probability that the first person is not the same type as any of the people before em is, of course, 1. The probability that the second person is not the same type as any of the people before em is 1-1/n. The probability that the third person is not the same type as either of the first two people -- given that the first two people are of different types -- is 1-2/n. And in general the probability that the kth person is not the same type as any of the people preceding em is
\left( 1 - {1 \over n} \right) \left( 1 - {2 \over n} \right) \cdots \left( 1 - {k-1 \over n} \right)

But recalling that (1-x) ≈ e-x, we can rewrite this approximately as
\exp \left( -\sum_{i=1}^{k-1} {i \over n} \right)

which is, again approximately, exp(-k2/2n).

To have this probability be less than one half, then, we need
k > \sqrt{n(2 \log 2)} \approx 1.18\sqrt{n}

For birthdays of ordinary Earth people, n = 365, and we get k > 22.49, that is, we need at least 23 people (since people come in integers) to have probability 1/2 that two of them have the same birthday. For birthdays of opposite-sex couples, n = 3652, and this gives k > 430; for birthdays of same-sex couples, n = 365(365+1)/2, and k > 305.

Incidentally, the birthday problem is often called the "birthday paradox" because a lot of people are surprised by the result -- if you have 23 people in a room then there's a fifty-fifty chance that two of them have the same birthday. I don't find it surprising any more, so I won't call it a paradox.

(Somewhat more surprisingly, I don't seem to have blogged about the birthday paradox before!)

It would be interesting to note whether the birthdays of couples are actually evenly distributed; having them not be evenly distributed might be evidence for something. The obvious thing is astrology -- but it wouldn't surprise me if there are earthly reasons why people born around the same time of year are more likely to be attracted to each other. (I have no idea what the mechanism for this would be, though, and I suspect this isn't the case.)

RSS feeds for the arXiv

I learned from An American Physics Student in England that the arXiv has RSS feeds for its updates. There are individual feeds in each category: see for example combinatorics (math.CO) and probability (math.PR) This may be common knowledge, but it's something I didn't know. And I welcome it, because I check my RSS reader as a way of procrastination, and now perhaps procrastination will bring me interesting papers. (Of course, then it won't be procrastination.)

07 November 2007

The lunacy of our language

Raving Lunatic Obviously Took Some Advanced Physics, from The Onion, March 17, 2004. About the "raving lunatic" named Cosmic Stan, some physicist who doesn't actually exist says:
"He's definitely had some advanced training, though I'm not surprised that it went unnoticed for so long," Lundergaard said. "It's hard for the layperson to differentiate schizophrenic ramblings like 'Modernity chunk where the sink goes flying on the ping-pang' from legitimate terminology like 'Unstable equilibria lie on the nodal points of a separatrix in phase space.'"

Some parts of mathematics are the same way.

Partitions with maximal products

Consider all the ways that we can write an integer n as a sum of nonnegative integers. Then for each of these sums take the product of the summands. What is the largest such product?

For example, we can write
6 = 5+1 = 4+2 = 4+1+1 = 3+3 = 3+2+1 = 3+1+1+1 = 2+2+2 = 2+2+1+1 = 2+1+1+1+1 = 1+1+1+1+1+1
and the corresponding products are
6, 5, 8, 4, 9, 6, 3, 8, 4, 2, 1
of which 9 = 3 × 3 is the largest.

First, any partition of n (that's what these sums are called) which includes a 1 does not give rise to the maximal product. If we have n = a1 + ... + ak-1 + ak + 1, then this corresponds to the product a1...ak; clearly we can rewrite n as
n = a1 + ... + ak-1 + (ak + 1)
(where we've combined ak and 1 into a single part), and
is larger.
Similarly, any partition of n which contains a part which is 5 or larger can't give rise to the maximal product. Simply replace a part k ≥ 5 by (k-3)+3.

So any partition of n having maximal product has all its parts equal to 2, 3, or 4. Furthermore, we can break a part equal to 4 into two parts equal to 2 without changing the product; for example 4+3 and 3+2+2 both give the product 12. So if we just care about the value of the maximal product, it's enough to consider parts equal to 2 or 3.

Finally, any partition containing three or more parts equal to 2 doesn't have maximal product; we can replace 2+2+2 with 3+3 and increase the product. For example, 3+2+2+2+2 is not the partition of 11 having maximal product (it has product 48) since we can replace three of the parts equal to 2 with two parts equal to 3, giving 3+3+3+2, which has product 54.

So the maximal product of a partition of n must be given by a partition which has all parts equal to 3, except for zero, one, or two parts equal to 2. Furthermore, considering n modulo 3 tells us that only one of these three situations is possible. Thus, we see that:
- the maximal product of a partition of 3n has n parts equal to 3, and has product 3 × 3n-1;
- the maximal product of a partition of 3n+1 has n-1 parts equal to 3 and two parts equal to 2, and has product 4 × 3n-1;
- the maximal product of a partition of 3n+2 has n parts equal to 3 and one part equal to 2, and has product 6 × 3n-1.
For example, the partition of 12 having maximal product is 3+3+3+3, with product 81; the partition of 13 having maximal product is 3+3+3+2+2, with product 108; the partition of 14 having maximal product is 3+3+3+3+2, with product 162. These are unique, except that in the 3n+1 case we can replace the two 2s with one 4.

The sequence generated in this way is in the Encyclopedia of Integer Sequences; although I invented the problem myself, it's (unsurprisingly) well-known in problem-solving circles. For example, it was 1979 Putnam exam problem A1 (in the special case n = 1979) and 1976 IMO problem B1 (in the special case n = 1976).

Link: an analysis of Excel's "65535" bug

From Chris Lomont: An Analysis of the Excel 2007 "65535" bug. (This is the bug that made certain floating-point calculations with results near 65535 or 65536 appear as 100000 or 100001.)

Towards the end, Lomont writes:
[A]n amazing number of people guessed the bug had something to do with the 65536 row limit, showing the flaws in belief in numerology.

Um, it's a power of 2? Somehow it's not surprising that powers of 2 occur in a lot of distinct places with binary computers. (Hence the "correlation is not causation" tag on this entry -- the number of rows didn't cause the bug, nor vice versa, but both were caused by the fact that the internal architecture of the computer is binary.)

06 November 2007

Gosper's modification of Stirling's approximation

From Math Forum: an improvement of Stirling's approximation, credited to Gosper:
n! \approx \sqrt{\left(2n + {1 \over 3} \right) \pi} \left( {n \over e} \right)^n

The usual approximation is
n! \approx \sqrt{2\pi n} \left( {n \over  e} \right)^n

which is just the beginning of an asymptotic series
n! \sim \sqrt{2\pi n} \left( {n \over  e} \right)^n \left( 1 + {1 \over 12n} + O(n^{-2}) \right)

and we can rewrite the square root to get
\sqrt{2 \pi n} \left( 1  + {1 \over 12n} \right) = \sqrt{2 \pi n} \sqrt{1 + {1 \over 6n} + O(n^{-2})}

and combining the two square roots gives us
n! = \sqrt{\left( 2n+{1 \over 3} \right)\pi} \left( { n \over e} \right)^n (1 + O(n^{-2}))

which is what we wanted. This is basically a combination of the first two terms of the usual Stirling approximation into one term. For example, when n = 8 this gives 40316, compared to 39902 for the first term of Stirling's series and 40318 for the first two. (8! = 40320.) In general the error of Gosper's approximation is about twice that in the first two terms of Stirling's, so this isn't incredibly useful but it's an interesting curiosity.

West's glossary and open problems in graph theory and combinatorics

Douglas West has a list of open problems in graph theory and combinatorics and a glossary of terms in combinatorics which may be of interest. The open problems are keyed to the four volumes of his The Art of Combinatorics, which is a series of four advanced graduate textbooks/reference books, on "Extremal Graph Theory", "Structure of Graphs", "Order and Optimization", and "Arrangements and Methods"; each is intended to be (I think) a semester or so worth of material. The series is in preparation; the title is, of course, a nod to Knuth. I can't vouch for the books being any good, because I haven't seen them.

There seem to be a lot of lists of open problems scattered about the Internet. Has anyone compiled a list of lists of open problems?

(I found the glossary while Googling for information about hook length of partitions; it didn't help for that, because I already know what hook length is.)

Links for 2007-11-06

Visualization of Numeric Data, from desktop.de (in English), via Anarchaia. A silly related example is the equinational projection from Strange Maps, in which we see what the world would look like if every country had the same area.

Stat of the Day talks about the length of extra-inning games, extending upon my post from a few days ago (which in turn depended upon their data).

A napkin filled with research-related work at WebDiarios de Motocicleta, following up on this idea. I make sure to have paper with me most of the time, so this doesn't happen to me. But last Friday I did some math on a pizza-stained paper plate.k

05 November 2007

A cold lottery

British lottery games cause confusion over negative numbers. A winter-themed scratch-off lottery game required people to be able to compare two numbers to determine if they won; the numbers were "temperatures", and some of them were negative. The game was pulled from the shelves.

I'm not crazy enough to say that we should just use Kelvins so that we wouldn't have this problem. But temperature is special this way. There are two examples of negative numbers that "ordinary" people have to deal with -- cold temperatures and debts. But the difference between owing money and not owing money is much larger than the difference between "positive" and "negative" temperature.

(In statistical physics, inverse temperature comes up a lot. This only makes sense if one is using absolute temperature.)

The density of abundant numbers

Let σ1(n) be the sum of the divisors of n. So, for example, the divisors of 18 are 1, 2, 3, 6, 9, and 18; so σ1(18) = 1 + 2 + 3 + 6 + 9 + 18 = 39.

If σ1(n) > 2n, we call n abundant, so 18 is abundant. If σ1(n) = 2n, we call n perfect; if σ1(n) < 2n, we call it "deficient".

(If you're wondering about the subscript 1: σk(n), more generally, is the sum of kth powers of divisors of n. It turns out that σ1(n)/n = σ-1(n), so everything I'm about to say could be rephrased in that terminology.)

A basic result of analytic number theory (for example, see Apostol, Introduction to Analytic Number Theory, Theorem 3.4) is
\sum_{n \ge x} \sigma_1(n) = {1 \over 2} \zeta(2) x^2 + O(x \log x)
. If we write
\Sigma_1(n) = \sum_{k \le n} \sigma_1(k)

then the previous result says that Σ1(n) is about n22/12). But Σ1 is in a sense an "integral" of σ1; the "average value" of σ1, for numbers near n, should just be Σ1'(n), or about n(π2/6). That is, the sum of the divisors of a number near n is, on average, about 1.644n.

So, the logical question to ask is: what can we say about the distribution of σ(n)/n? It seems possible that almost all numbers could have σ(n)/n near π2/6; this would seem to mesh with the fact that large perfect numbers are very rare. But in the 1930's, Davenport and Erdos independently showed that this is in fact not the case, and σ(n)/n does seem to have a well-defined distribution; in particular, that
A(x) = \lim_{n \to \infty}  {\#\{ k \le n : \sigma(k) \ge xk \} \over n}

exists and is continuous. A(x) is just the proportion of numbers for which σ(n)/n is at least x, so A(2) is the proportion of numbers which are abundant. The best result I know of (although I just started thinking about this) is that of Deleglise, which is that A(2) is between 0.2474 and 0.2480. From numerical work one can get approximate A(x) for other x; for example, A(1.5) seems to be about 57%, and A(2.5) seems to be about 8.8%. I don't know the full distribution (although I haven't thoroughly read the last two papers I cite below, and I haven't read the first one at all) but I suspect that it is not one of the familiar ones. The Deleglise paper goes through a lot of work just to compute A(2); it seems like it would require equally much work to compute A(x) for each individual x, when what one would like is of course the entire distribution in one fell swoop.

Here are references for the papers I mentioned:
Davenport, H. "Über numeri abundantes." Sitzungsber. Preuss. Akad. Wiss., Phys.-Math. Kl., No. 6, 830-837, 1933.
Deléglise, M. "Bounds for the Density of Abundant Integers." Exp. Math. 7, 137-143, 1998.
Erdos, P. "On the Density of the Abundant Numbers." J. London Math. Soc. 9, 278-282, 1934.