29 February 2008

Leap days

Mark Dominus has written an interesting post about leap days, of which today is one. I welcome this turn of events, because it means I don't have to! But I have a few things to add.

The reason that today is February 29 -- and not March 1, as it would be in a normal year -- is that the year is almost a quarter of a day longer than 365 days. So we add an extra day every four years. But not quite -- it's actually more like 365.24 days -- so we leave out one of these extra days every century. But no, it's really more like 365.2425 days -- so we add back in one of the days we got rid of every four centuries. (We did that in 2000, as you may remember. 2000 was a leap year.)

The real number of days per year is something like 365.24219; thus what one wants to do is to find a rational approximation p/q to .24219 and add p leap years every q years. The approximation we use, 97/400, actually isn't that good for a number with such a large denominator; 8/33, a convergent of the continued fraction of .24219, is better. Dominus proposes having leap years in those years congruent to 4, 8, 12, ..., 32 mod 33, a scheme that was also suggested in 1996 by Simon Cassidy on Usenet, complete with the same choice 4, ..., 32 of leap years in each 33-year cycle. The reason for this coincidence is that 1984, 1988, ..., 2012 are leap years in both systems.
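A quick way to see where 8/33 comes from is to run the continued fraction algorithm; here's a throwaway sketch in Python (exact arithmetic via Fraction, with an arbitrary cutoff of six convergents):

from fractions import Fraction

def convergents(x, n=6):
    # first n convergents of the continued fraction of x, via the standard recurrence
    cs, h0, k0, h1, k1 = [], 0, 1, 1, 0
    for _ in range(n):
        a = x.numerator // x.denominator          # integer part of x
        h0, k0, h1, k1 = h1, k1, a*h1 + h0, a*k1 + k0
        cs.append(Fraction(h1, k1))
        if x == a:
            break
        x = 1 / (x - a)
    return cs

print([str(c) for c in convergents(Fraction(24219, 100000))])
# ['0', '1/4', '7/29', '8/33', '31/128', '752/3105']

Note that 31/128 shows up in the list too; that's the 128-year rule mentioned in the edit below.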

And although dividing by 33 is hard, as Dominus points out, it's not that hard to reduce a number modulo 33 in your head. The year abcd (where a, b, c, and d are digits) reduces modulo 99 to ab + cd; for example, 2008 reduces to 28 modulo 99. Then just subtract 33 or 66 as necessary. I'd rather have to do lots of reduction modulo 33 in my head than, say, modulo 37. (And with 33 there's probably some slick Chinese-remainder-theorem method that allows you to reduce a number mod 33 by reducing it mod 3 and 11, both of which are easy.) In any case, there's no reason we couldn't have a calendar that requires more complicated arithmetic than the current one; most people are not in the business of calculating calendars. (And calendars are calculated by computers now; Dominus points out his new rule actually requires less computation than the old one. The 97/400 rule is an artifact of the fact that we work in decimal.)
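Both the digit trick and the agreement between the two calendars are easy to check by brute force; a throwaway sketch (with blue-sky helper names of my own choosing):

def mod33(year):
    # 100 is congruent to 1 mod 99 (and hence mod 33), so the year ab|cd reduces to ab + cd
    r = year // 100 + year % 100
    while r >= 33:
        r -= 33
    return r

def leap33(year):
    # the 33-year-cycle rule: leap years are those congruent to 4, 8, ..., 32 mod 33
    return mod33(year) % 4 == 0 and mod33(year) != 0

def leap_gregorian(year):
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

assert all(mod33(y) == y % 33 for y in range(1000, 10000))
assert all(leap33(y) == leap_gregorian(y) for y in range(1984, 2016))
print(mod33(2008))   # 28; the two calendars agree through 2015 and part ways at 2016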

As is pointed out both by Cassidy and by this page, Jesus is traditionally said to have lived for 33 years; for some people that might lend additional appeal to a 33-year cycle.

Edit (8:23 am): There's also something to be said for a system that includes 31 leap years every 128 years -- have leap years in years divisible by 4, except those divisible by 128 -- this was apparently suggested by Johann Heinrich von Mädler, and would only be off by something like a quarter-second a year. But striving for such accuracy is silly, because the length of the year isn't actually constant.

27 February 2008

Number sense?

Numbers Guy, from the March 3, 2008 issue of The New Yorker, on human beings' intuitive "number sense".

Interestingly, we're quicker at comparing numbers that are further apart, which seems to imply some intuitive sense of "number line" -- and our number line perhaps has some sort of strange metric (not the usual one) in which the distance between 2 and 3 is longer than the distance between 7 and 8, since we're faster at saying 3 is larger than 2 than we are at saying 8 is larger than 7. (See, for example, yesterday's post on hyperbolic discounting which exploits a similar phenomenon with respect to timelines.)

And here's an interesting fact:
Because Chinese number words are so brief—they take less than a quarter of a second to say, on average, compared with a third of a second for English—the average Chinese speaker has a memory span of nine digits, versus seven digits for English speakers. (Speakers of the marvellously efficient Cantonese dialect, common in Hong Kong, can juggle ten digits in active memory.)

I wonder if this sort of thing is true in general -- does this correlation extend beyond just a pair of languages? Wikipedia's article on the "seven plus or minus two" phenomenon backs this up -- the actual limit is something like two seconds of speech -- although unfortunately there's no citation to follow.

(Via Seed's Daily Zeitgeist.)

26 February 2008

Combinatorial quotients

Consider a combinatorial class (in the sense of Flajolet and Sedgewick) or combinatorial species (in the sense of Bergeron, Labelle, and Leroux). There are certain generating functions associated with combinatorial classes/species (I'm conflating the two for the moment because the differences between the two theories aren't important here), and the operations of sum and product have nice "combinatorial" interpretations -- addition of generating functions corresponds to disjoint union of classes/species, and multiplication of generating functions corresponds to what I'll call here the "combinatorial product", i. e. for two species F and G, an FG-structure on the set [n] is a pair which consists of an F-structure on some subset S of [n] and a G-structure on [n]-S. Then the generating function of the FG-structures is the product of the generating function of the F-structures and that of the G-structures.

Other natural operations on functions have combinatorial interpretations: composition of functions, taking exponentials or logarithms, differentiation, integration, etc. Even subtraction has an interpretation, although one has to be careful: this is the "virtual species". (A virtual species is a pair of ordinary species (F, G), written as F-G, where (F, G) = (H, K) if F + K = G + H. Replacing species with integers and addition with multiplication, this looks like the construction of the rational numbers from the integers.)

But is there a natural quotient operation? If there is, I can't find a mention of it in Flajolet and Sedgewick, and Marko Petkovsek (who I was just talking about this with) tells me it's not in Bergeron et al. either; this is one of the sources he's using for our current topics course in combinatorics. Let F be a species with 1 structure on the set of size 0, and let F_n be its restriction to size n. Let F_+ be its restriction to all positive sizes. Then
F^{-1} = (1 + F_+)^{-1}

where 1 is the multiplicative identity species (hence the name); it has one structure on a set of size 0. We can of course write this as
1 - F_+ + F_+^2 - F_+^3 + F_+^4 - \cdots

and we collect the "positive" and "negative" terms to get
F^{-1} = \sum_{k=0}^\infty F_+^{2k} - \sum_{k=0}^\infty F_+^{2k+1}

So the inverse of the species F exists, and it's a virtual species in which the positive part is even-length tuples of nonempty F-structures, and the negative part is odd-length tuples of nonempty F-structures. This doesn't quite seem like a "combinatorially natural" operation -- but on the other hand I was able to describe it in a single sentence, so it seems plausible that it could be a natural thing to occur somewhere.

Notice that this also proves that if
F(z) = \sum_{n=0}^\infty f_n {z^n \over n!}, G(z) = \sum_{n=0}^\infty g_n {z^n \over n!}

and F(z)G(z) = 1, f_0 = 1, then all the g_n are integers, since the generating function G(z) actually counts something! (Well, a virtual something -- but that's close enough. It won't surprise you to learn that for two species H and K, the generating function corresponding to the virtual species H-K is H(z) - K(z). G here is a virtual species, so the g_n are (possibly negative) integers.) Of course, there are other ways to prove this, but I think this is a cute "combinatorial proof".
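Here's a quick empirical illustration of that integrality, taking F to be the species of involutions (an arbitrary choice; any f_n that count something, with f_0 = 1, will do):

from math import comb

N = 10
f = [1, 1]                          # f_n = number of involutions of [n]
for n in range(2, N):
    f.append(f[n-1] + (n-1) * f[n-2])
g = [1]                             # g_0 = 1/f_0 = 1
for n in range(1, N):
    # [z^n/n!] F(z)G(z) = 0 for n >= 1: solve sum_k C(n,k) f_k g_{n-k} = 0 for g_n
    g.append(-sum(comb(n, k) * f[k] * g[n-k] for k in range(1, n+1)))
print(g)   # [1, -1, 0, 2, -2, -6, 16, ...] -- integers, as promised

(Of course, once you write that recurrence down, the integrality is immediate anyway, since the binomial coefficients are integers; that's one of the "other ways to prove this".)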

"One Giant Leap For Babykind"

One Giant Leap for Babykind, from the (Feb. 25?) New York Post.

A mother born February 29, 1980 is due to have a baby on February 29, 2008. (Her doctors have said they'll induce labor that day if it doesn't happen naturally.)

Now, one out of every 1461 days is a leap day; assuming that the events of a mother and a baby being born on that day are independent (which seems reasonable), this should be true for about one in every (1461)^2 = 2,134,521 mother-baby pairs. So about 140 people in the U. S. should be born on February 29 and also have a mother born on that day.

I was actually reading the article expecting them to make some sort of mathematical mistake -- this feels like the sort of place where an "expert" is consulted and gives some ridiculous figure -- but they managed to restrain themselves.

25 February 2008

Hyperbolic discounting

Greg at The Everything Seminar posts about "hyperbolic discounting". Roughly speaking, theoretically one should value a payoff of an amount P of money at time t in the future with the same value as Pe^{-rt} today, where r is the inflation rate. But people actually seem to value P at that time in the future like they value P/(1+ct) today for some constant c. People probably have a decent sense of what the inflation rate is; thus Pe^{-rt} and P/(1+ct) should agree locally, so I'd guess c = r. (Set the two expressions equal to each other, so e^{rt} = 1 + ct, and take the first two terms of the Taylor series.)

Greg writes:
I think something like a logarithmic measure on actual time might give the hyperbolic discounting model.

That's true. Let's say we live at time 0; the correct (exponential discounting) value of a payoff of 1 at time t is e^{-rt}. The value of a payoff of 1 at time T under hyperbolic discounting is 1/(1+rT). Setting these equal, we get
e^{-rt} = {1 \over 1 + rT}
Solving for each variable in terms of the other,
t = {\log(1+rT) \over r}, T = {e^{rt}-1 \over r}
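A quick numeric check of that change of variables (a throwaway sketch; r = 0.05 and the horizons are arbitrary choices):

import math

r = 0.05
for T in [1, 10, 100]:
    t = math.log(1 + r*T) / r    # the log-compressed "subjective" time
    # exponential discounting at time t matches hyperbolic discounting at time T
    assert abs(math.exp(-r*t) - 1/(1 + r*T)) < 1e-12
    print(T, round(t, 3))        # 100 years hyperbolic ~ 36 years exponential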


So roughly speaking, from looking at the first equation, the discounting that people actually use instinctively is obtained by taking the logarithm of the time T they're discounting over (up to some scaling, which really just sets the units of time), and then applying the correct (exponential) model. This reminds me of a logarithmic timeline, but in reverse. People see the period from, say, 16 to 32 years ago as being as long as the period from 32 to 64 years ago. This is also why I don't believe in a technological singularity even though I'd like to; the arguments often seem to be based on "look! lots has changed in the past hundred years, more than changed in the hundred years before that!" but our memories of "change" are somewhat selective.

24 February 2008

Wikipedia on 0.999...

Wikipedia has an article entitled 0.999...

The article is quite good, and includes at least sketches of the various proofs that 0.999... = 1 and some interesting historical tidbits; I won't recapitulate those here.

The talk page and the arguments page are interesting, in a different way; one gets to watch people who don't really understand why this should be true, but want to understand it. (Of course there is the occasional crackpot.) It might give you a headache, though. It's like watching sausage being made.

23 February 2008

links for 2008-02-23

From A neighborhood of infinity: What is topology?, attempting to provide some intuition for the standard axioms that define a topology in terms of "observable sets" (which are just open sets, with different intuition attached to them!).

Also, on a totally unrelated note: John Baez has online notes from the course he gave in Fall '03, Winter '04, Spring '04. The winter notes are particularly interesting to me, because they combine combinatorics (which I know well) and a lot of category-theoretic notions that I'm not too familiar with.

22 February 2008

Hats, colors, and algebraic structures

Here's a game: there are a hundred people arranged in a circle. Some of them are wearing blue hats, some of them are wearing yellow hats. None of them can see the color of their hat, but everybody can see the color of everybody else's hats. Each of them in turn has to say "yellow" or "blue". What strategy can they use to ensure that the largest number of them say the color of their own hat? (And how many is that?)

Obviously the first person to speak can't be guaranteed to say the color of their own hat. But everyone else can! The strategy is as follows: the first person says "yellow" if they see an even number of blue hats, and "blue" if they see an odd number of blue hats. Say they say "yellow". Then each person after that counts the number of blue hats they see. If they see an even number of blue hats, then their hat must be the same color as the first person's; if they see an odd number of blue hats, their hat must be the opposite color from the first person's. (This is reversed if the first person said "blue".) We can do no better.
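Here's a simulation of that strategy (a sketch; I encode blue as 1 and yellow as 0, and note that the later speakers only need the first announcement, not each other's guesses):

import random

n = 100
hats = [random.choice([0, 1]) for _ in range(n)]        # 1 = blue, 0 = yellow

announcement = sum(hats[1:]) % 2        # person 0 announces the parity of blue hats seen
guesses = [announcement]                # ...which doubles as person 0's own guess
for i in range(1, n):
    seen = sum(hats[1:]) - hats[i]      # blue hats person i sees among persons 1..n-1
    guesses.append((announcement - seen) % 2)

print(sum(g == h for g, h in zip(guesses, hats)))       # 99 or 100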

The problem pretty easily generalizes to any finite number of colors of hats and finite number of people; the trick is to think of the n colors as the group Z/nZ, that is, the integers under addition modulo n. (It starts seeming very unnatural to express the strategy here in terms of colors, though.)

And it turns out -- this is a bit more surprising -- that we can always do this even when there are an infinite number of people or an infinite number of colors, or both. With an infinite set of colors and a finite number of people the key fact is that any set of colors can be given the structure of an abelian group, even if it's infinite. With an infinite number of colors and an infinite number of people things get kind of weird, and you have to endow the set of colors with the structure of a field and invoke Zorn's lemma.

This is basically a summary of Des chapeaux, des couleurs et des structures algébriques (in French, of course) by Florent Benaych-Georges, which I have translated into rather rough English for no good reason.

21 February 2008

Politics and winding numbers

Confusion about the changing positions of political parties in the U.S., from Andrew at Statistical Modeling, Causal Inference, and Social Science.

There are fairly standard models that allow one to plot political opinions in a two-dimensional "issue space", with the two dimensions being roughly "social" and "economic"; the two-party system dictates that these essentially get projected down to a single dimension. The post alludes to them, and refers to an argument that the Democrats and Republicans may have switched positions in the last century or so via a rotation of 180 degrees in this space. (As counterintuitive as it seems given the current ideological stances of the major parties, the Republicans started out as the anti-slavery party.) Andrew is not convinced -- the argument seems to rely on an assumption that states remain constant in their political leanings, which isn't true -- but it at least seems like something that could happen.

So in another century, could the parties rotate all the way around and get back where they started? And if the alignment of the two major parties in issue space in 2100 ends up being the same as that in 1900, is it more likely to happen by the rotation continuing in the direction it's going, or by reversal? This is basically a question about winding numbers, in a sense I really don't want to make precise because it's kind of silly.

20 February 2008

Wow, X suck[s] at math.

See John Armstrong deconstruct Monday's XKCD, which he also talked about when it came out. (For those who haven't seen it: one student writes ∫ x^2 = π on the board and a teacher says "Wow, you suck at math"; another student, with longer hair, writes the same thing and a teacher says "Wow, girls suck at math".)

Other people have responded to this as well: Ben Webster of Secret Blogging Seminar, Reasonable Deviations, ZeroDivides, and probably some others that I'm missing. It's not my intention to react here, just to point to other mathoblogospheric commentary on the matter. But you really should read John Armstrong's post, because it's funny. (John, I think you should try deconstructing some actual mathematics. I, for one, would be amused.)

Incidentally, is the correct way to cite a blog informally like this by author, or by blog-title? In other words, if you were writing a blog post and linking to me, would you refer to "God Plays Dice" or "Isabel Lugo"? (Secret Blogging Seminar is a group blog, so I had to put Ben Webster's name in.)

18 February 2008

A thing I didn't know about Pythagorean triples

For my class, I needed some examples of Gaussian integers (i. e. numbers of the form a + bi where a, b are both integers) which have Gaussian integers as square roots. Of course, the easiest way to generate such examples is to start squaring Gaussian integers, which gives, for example,

(2+i)^2 = 3 + 4i
(3+2i)^2 = 5 + 12i
(7+4i)^2 = 33 + 56i

etc. -- and you may notice that (3, 4), (5, 12), and (33, 56) are the legs of right triangles with hypotenuses 5, 13, and 65 respectively. In other words, (3, 4, 5), (5, 12, 13), and (33, 56, 65) are Pythagorean triples. (Okay, so you probably don't recognize (33, 56) as such a pair. But it is.)

Now, I've long known that Pythagorean triples -- that is, triples of integers (a, b, c) such that a^2 + b^2 = c^2 -- can be written in the form (r^2 - s^2, 2rs, r^2 + s^2), and all primitive Pythagorean triples can be written in this form for suitable integer choices of r and s. (r and s have to be relatively prime and not both odd.) When I first saw this -- I don't remember when, it may have been in a number theory course -- it seemed kind of strange. Sure, it's easy to check that any r and s give rise to a Pythagorean triple in this way, but where does it come from? But notice that

(r+si)^2 = (r^2 - s^2) + 2rsi

and the absolute value of the right-hand side is

((r^2-s^2)^2 + (2rs)^2)^{1/2} = r^2 + s^2.

So Pythagorean triples fall naturally out of the arithmetic of the Gaussian integers... which I didn't know.
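In code, the whole correspondence fits in a few lines (a sketch; the bound on r is arbitrary):

for r in range(2, 8):
    for s in range(1, r):
        a, b = r*r - s*s, 2*r*s      # real and imaginary parts of (r + si)^2
        c = r*r + s*s                # the absolute value of (r + si)^2
        assert a*a + b*b == c*c
        print((a, b, c))             # includes (3, 4, 5), (5, 12, 13), (33, 56, 65)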

The back of the universe

Brent Yorgey is working on a book intended to introduce high-school-level students to mathematics through hands-on problem solving. (His working title is The Math Less Traveled.) In his preface, he writes:
If you're having trouble with a problem, don't look at the answer -- put it down and come back later. This is how 'real' mathematics works: there's no 'Back of the Universe' where mathematicians can go look up the answer when they can't solve a problem!
If there were a "Back of the Universe", of course, that still wouldn't be practical, because it would be really far away.

Some might argue that the "Back of the Universe" is in fact the universe itself, at least in problems that have some interpretation in terms of real-world things. But just like the back of the textbook, physical experiment only gives answers, not solutions. If you throw a ball into the air and see that its trajectory is parabolic, that's something -- but experiment can't generally tell you why something is true.

The other wonderful thing about "real mathematics" is that, unlike for homework, you don't have to do all the problems; I believe I recently came across something stating that if a research mathematician solves one out of every one hundred problems they look at, they're doing well. This counterbalances the fact that real problems are much harder than textbook problems. (I wish I could cite this claim; it's on one of the many "advice for graduate students in math" web pages out there.)

17 February 2008

Things I found from browsing David Aldous' web page

David Aldous has a list of non-technical books related to probability, complete with short reviews, and a much shorter list of technical books. This was constructed for his undergraduate seminar From Undergraduate Probability Theory to the Real World.

Also found from Aldous' page: Current and Emerging Research Opportunities in Probability, a report dating from 2002. This document summarizes views on that subject by the probabilists at a workshop with a similar name. Very short summary: probability is important -- here are a bunch of examples -- and we should make sure more people learn it. Since I talked about pretty pictures yesterday, I should point out that this report contains many pretty pictures.

A random fact about Aldous: very often when reading papers related to things he's done (which therefore cite him), the first reference in the bibliography is to one of his papers. This is, of course, because his name falls very early in the alphabet. This is true often enough that if I see [1] in the right sort of paper, and I just want to know the author of the work in question, I don't bother flipping back to the bibliography. It's very important here that he's at the beginning of the alphabet; the same phenomenon doesn't happen with a researcher at the end of the alphabet, because bibliographies are not all the same length. Also, papers with multiple authors tend to end up early in an alphabetized bibliography because there is something of a convention for collaborators to put their names in alphabetical order. (See, for example, the most recent submissions to the arXiv in mathematics. As that list stands right now, you'll have to go back a few pages to find a collaborative paper that's not alphabetized -- I was actually starting to think the arXiv might automatically alphabetize author names.)

16 February 2008

A theorem a day brings promotion and pay...

Within spitting distance of my apartment (okay, maybe not literally), tonight, will be the Artclash Collective's Fun-a-Day show. Basically, for the entire month of January, a bunch of people each do some creative thing each day, and then these are exhibited. (The exhibit takes place tonight, at Studio 34 Yoga, 4522 Baltimore Avenue, Philadelphia.)

Some friends of mine, this year, did Breakfast Pastry a Day (I helped -- by eating some pastry), SEPTA Haiku a Day (which is still ongoing -- the person in question commutes by SEPTA and I guess once you start making haiku it's a tough habit to break), Scrabble a Day (no link -- although I actually took part in this one by being an opponent), Internet Troll a Day (or so I've heard), and so on. (More conventional "art" like drawings, comic strips, songs, etc. also is pretty well represented.)

Unfortunately most of my creativity lies in mathematics... and theorem-a-day would just be too hard. I googled "theorem a day" and most of what comes up are variations on Erdős' quote "A theorem a day means promotion and pay; a theorem a year and you're out on your ear." (What does a theorem a week bring? Or a theorem a month? And does anything rhyme with month? The best I can do is "n-plus-oneth", which isn't a word in ordinary English -- but it is in mathematical English!) Although something like "proof-without-words-a-day" seems feasible. And a project involving pretty pictures, of course, has the advantage that the non-mathematical people in the audience (which, in a general audience, is most of them) could enjoy it too. But I tend to be the sort of mathematician who has much better pictures in my head than I can put on paper. Making a good mathematical picture is hard.

Then again, maybe it's precisely for this reason that I should be making more pictures that really do reflect what's going on in some mathematical problem. So come to Fun-a-Day in 2009... my work might be there! (We'll see how I feel next January.)

15 February 2008

Searching for mathification

Over at Language Log (which a friend of mine claims is very widely read among grad students at Penn -- and it does come from Penn -- but I read it before I came here), one often finds references to "linguification". This is the rhetorical device of expressing true things about the world in terms of false statements about language. "It snows a lot where Eskimos live" is true. "The Eskimos have eight zillion words for snow" is false, but if you say it, or in general "The language spoken by X has lots of words for Y", everyone knows that you mean "Y is important to X-speakers" -- even though it doesn't matter that such a fact might not actually be true. (The snow example is both linguification and a snowclone).

A frequent example is claims that certain words are often followed by certain other words, like "It's difficult to find a piece of writing in the mainstream press which mentions the word 'bisexual' without finding that it is immediately followed by the word 'chic'." The folks at Language Log don't like this much, in part because the right word there is not "difficult" but "trivial", especially in the age of Google.

Anyway, around the same time I came across that Language Log post, I came across "The knights who say "nerd": 20 pop-cultural obsessions even geekier than Monty Python", from The Onion's AV Club. It begins:
It's the elephant in the nerdy-obsessions room, and in the Venn diagram of nerd-dom, it may be the meeting point for everything else on this list, with good reason.
Venn diagrams surface again later on in the article, when they're talking about Joss Whedon: "We need a Venn diagram for this one, too. (Maybe diagram-making deserves its own entry?)" Why do Venn diagrams always come up as what seems like a bad example of something that the author seems to think is mathematical? And who was Venn, anyway? So I'm starting to keep an eye out for what one might call "mathification" -- mathematical statements which occur in the popular media which clearly intend to get across a true point about the real world by making a false point about mathematics. I could swear I see this a lot, but I don't think I've seen it since seeing the Onion article ten days ago.

14 February 2008

Pretty pictures!

From MathTrek: Math on Display. Apparently there was an exhibition of mathematical art at the Joint Mathematics Meetings in San Diego last month. Some of it arises naturally from problems people were actually working on. Since I mostly deal in discrete mathematics I don't see the really nice pictures too often in my own work -- they seem to arise more naturally on the continuous side of things -- although certainly there are a lot of results in discrete mathematics that are essentially limit theorems, which often give rise to nice "smooth" pictures.

Although it doesn't arise from a mathematical problem, my favorite of the pictures shown in that post is the picture of Sierpinski made out of tiles which are different iterations of the Sierpinski carpet. There's something deliciously self-referential about it. Now if only there were some set of pictures commonly associated with, say, Gödel...

13 February 2008

Scott Aaronson's "Great Theoretical Ideas in Computer Science"

Scott Aaronson writes on his blog that he's giving a course Great Theoretical Ideas in Computer Science. (Note that it seems impossible to see some of the course materials if you're not at MIT; I suspect this is intentional.)

He's posting his lecture notes online; see the first lecture, which addresses questions such as:
- what is a computer, anyway?
- how do you run an online gambling site, in such a way that your customers trust they're not getting cheated by your random number generators?
- how are the ancient Greek compass and straightedge a model of computation? (And why didn't the ancient Greeks know this?)

I look forward to following this.

Also, it appears that the lecture notes will be for the most part scribed by members of the class. This system is apparently widespread in computer science; why is this not done in mathematics?

Is Liouville's number interesting?

Mark Dominus writes about uninteresting numbers. In particular he claims that Liouville's number,
\sum_{i=1}^\infty 10^{-i!} = .110001000000000000000001\ldots
is uninteresting; the only thing that's interesting about it is its transcendentality, and that's not a big deal, because almost all real numbers are transcendental. (In fact, "almost all" seems too weak here, to me, although it is technically correct in the measure-theoretic sense.)

But I think this number is interesting. Why? Because an expression that gives it can be written with a very small number of characters (bytes of TeX code, strokes of a pen, etc.) and most numbers can't be.
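To make that concrete, here are the first thirty digits from a correspondingly tiny amount of code (a sketch; terms with i ≥ 5 can't affect these digits, since 5! = 120):

from fractions import Fraction
from math import factorial

x = sum(Fraction(1, 10**factorial(i)) for i in range(1, 5))
print("0." + str(int(x * 10**30)).zfill(30))
# 0.110001000000000000000001000000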

Of course, this means every number anyone's ever written down is interesting, by this definition -- even if it took them, say, ten thousand characters to define it. Say we work over a 100-symbol alphabet; then there are at most 100^{10000} or so numbers which can be defined in less than that many characters! (There are multiple definitions for the same numbers; most things one could write are just monkeys at a typewriter, and so on -- but this is an upper bound.) But this number is finite! The complement of a finite set is still "almost all" of the real numbers.

That last paragraph, I don't actually believe. But any number that can quickly be written down is "interesting" in some (weak) sense.

(By the way, "Liouville" is not pronounced "Loo-ee-vill". And I'm told that the city of Louisville in Kentucky is not pronounced this way either.)

Edit, 10:22 pm: Take a look at Cam McLeman's The Ten Coolest Numbers. (The list, in reverse order, is: the golden ratio φ = 1.618..., 691, 78557, π^2/6, Feigenbaum's constant δ = 4.669201..., 2, 808017424794512875886459904961710757005754368000000000, the Euler-Mascheroni constant γ = 0.577215..., the Khinchin constant K = 2.685252..., and 163.)

12 February 2008

Presidential science debate?

The Presidential Science Debate is scheduled for April 18, at Philadelphia's Franklin Institute. (The Pennsylvania primary is April 22.)

The New York Times asks: will the candidates come? A lot of people commenting there seem to think that it would be a bad move for a candidate to go, basically because they either have to claim that global warming and evolution are real (and thus anger the right) or that they're not (and thus anger the left). Yes, I'm caricaturing. But science shouldn't be a political football.

I would clearly support candidates going to this thing, if only because we may actually get an idea of the extent to which they're members of the reality-based community. (I'm still not endorsing a candidate, but you can probably guess I'm not endorsing Mike Huckabee.) And as a lot of people point out, such a debate will almost certainly include questions about scientific education; as an educator I'd like to see how those get handled. More funding for the schools. And stop sending us, at the college level, students who can't do algebra properly. This could be tied into funding -- from what I've heard a lot of the school teachers don't know how to do it, because teaching doesn't pay well enough to hire competent people.

Also, if anyone denies evolution on the grounds that there's no way a process based on "random chance" would create an organism, they will lose the all-important probabilist vote. (Okay, that might just be me.)

11 February 2008

The IRS quadratic formula

If the IRS had discovered the quadratic formula -- the algorithm for solving a quadratic equation, in the style of an IRS tax form. (This has been floating around for a while; I'm just reviving it.)

It seems silly here because everyone knows the quadratic formula. But the cubic formula is quite ugly, and takes up a couple lines; it's better expressed as an algorithm, which basically says to make a few changes of variables that give a quadratic and then solve the resulting quadratic. (Certainly if one were in the business of solving cubics, one would probably memorize the algorithm, not the formula.) And I don't even want to think about writing down an explicit formula for the general solution of the quartic in terms of radicals.
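To illustrate, here's the cubic algorithm written as code rather than as a formula (a rough sketch of Cardano's method; it makes no attempt to be numerically careful about repeated roots):

import cmath

def cubic_roots(a, b, c, d):
    # solve a*x^3 + b*x^2 + c*x + d = 0, a != 0
    # step 1: substitute x = t - b/(3a), giving the depressed cubic t^3 + p*t + q = 0
    p = c/a - b*b/(3*a*a)
    q = 2*b**3/(27*a**3) - b*c/(3*a*a) + d/a
    # step 2: solve the resolvent quadratic z^2 + q*z - (p/3)^3 = 0
    root = cmath.sqrt(q*q/4 + p**3/27)
    z = -q/2 + root if -q/2 + root != 0 else -q/2 - root
    u = z ** (1/3)                   # one complex cube root of z
    # step 3: the depressed cubic's roots are w^k*u - p/(3*w^k*u) for k = 0, 1, 2
    w = cmath.exp(2j * cmath.pi / 3)
    ts = [w**k * u - p/(3 * w**k * u) if u != 0 else 0 for k in range(3)]
    # step 4: undo the substitution
    return [t - b/(3*a) for t in ts]

print([round(x.real, 6) for x in cubic_roots(1, -6, 11, -6)])   # [3.0, 1.0, 2.0]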

But tax forms are really just walking ordinary people through dealing with calculations involving certain equations. Sometimes I think taxation would make more sense to me if I actually saw the equations.

10 February 2008

Splitting permutations of [n] into two classes

The number of permutations of [n] with an even number of cycles and the number with an odd number of cycles are equal, for all n ≥ 2. (I assume this is a standard result, just because anyone who's stared at the Stirling numbers for long enough would think of it, but I don't recall hearing it before.)

Here's a proof which relies on bivariate generating functions. Note that a permutation can be viewed as a set of directed cycles. In the notation of combinatorial classes, we write this
\cal{P} = \hbox{Set}(\mu \hbox{ Cyc}(\cal{Z}))

Now, there are (n-1)! cycles on an n-element set (order the n elements in n! ways, but then divide by n for the rotation) giving the generating function
z + 1! {z^2 \over 2!} + 2! {z^3 \over 3!} + 3! {z^4 \over 4!} + \cdots

which simplifies to
z + {z^2 \over 2} + {z^3 \over 3} + {z^4 \over 4} + \cdots

which is just log 1/(1-z).
The mark μ then becomes the variable u, and taking sets corresponds to exponentiation, so we have
P(z,u) = \exp( u \log (1-z)^{-1}) = (1-z)^{-u}

as an exponential generating function for permutations, where z marks size and u marks number of cycles. That is,
(1-z)^{-u} = \sum_{n,k} {1 \over n!} \left[ {n \atop k} \right] z^n u^k

where $\left[ {n \atop k} \right]$ is the number of permutations of [n] with k cycles.

Finally, let u = -1. Then we get
1-z = \sum_{n,k} {1 \over n!} \left[ {n \atop k} \right] z^n (-1)^k

and taking the coefficient of z^n on both sides for any n ≥ 2 and multiplying through by n!, we get
0 = \sum_{k}  \left[ {n \atop k} \right] (-1)^k

But on the right-hand side, we get a contribution of +1 for each permutation of [n] with an even number of cycles, and -1 for each permutation with an odd number of cycles. Thus those two sets are equal in cardinality.

I feel like this is a standard result, but I hadn't seen this proof before coming up with it myself. There's a similar proof that doesn't require the bivariate generating function machinery; the generating function of the numbers $\left[ {n \atop k} \right]$ for fixed n is
\sum_k \left[ {n \atop k} \right] z^k = z(z+1)(z+2) \ldots (z+n-1)

and let z = -1 here. And I may have even seen a bijective proof of this fact, which gives a bijection between permutations of [n] with an odd number of cycles and those with an even number of cycles.

By the way, the analogous result that, say, one-third of permutations of [n] have a number of cycles divisible by 3, one-third have 3k+1 cycles for some k, and one-third have 3k+2 cycles for some k isn't true. (It should be approximately true, but that doesn't seem particularly interesting.)
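Brute force confirms both claims for small n; a throwaway sketch:

from itertools import permutations

def cycle_count(p):
    seen, cycles = [False] * len(p), 0
    for i in range(len(p)):
        if not seen[i]:
            cycles += 1
            while not seen[i]:
                seen[i] = True
                i = p[i]
    return cycles

for n in range(2, 8):
    parity, mod3 = [0, 0], [0, 0, 0]
    for p in permutations(range(n)):
        c = cycle_count(p)
        parity[c % 2] += 1
        mod3[c % 3] += 1
    print(n, parity, mod3)   # the two parity classes agree; the three mod-3 classes don't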

Edited, 3:24 pm: See the comments for a bijective proof offered by Jeremy Henty; it really is quite simple but eluded my grasp this morning.

09 February 2008

A mathematician goes to the post office

There's a pretty standard theorem that one proves at the beginning of, say, a differential geometry course -- that any (n-1)-dimensional manifold embedded in n-dimensional Euclidean space is locally the graph of some function. (To any geometers reading this: I apologize if I haven't stated this correctly.)

This morning I had to go to the post office to pick up a piece of certified mail. I live on one side of a large set of railroad tracks; the post office is about a mile away on the other side of the tracks. I generally don't go past the tracks, because very little that interests me is there, and I hadn't looked at a map before setting out. It's important to know where the tracks are when walking around in that neighborhood, because there are a lot of streets that exist on both sides of the tracks but are interrupted by them. The street on which the post office is, Florence Avenue (link goes to Google satellite imagery), may be such a street; it exists on both sides of the tracks, and doesn't cross over the tracks. It might cross under; Google Maps won't tell me, and I'm not curious enough to head out there again and look. So the scheme of "follow the street with the same name" doesn't always work.

I knew the tracks run basically from northwest to southeast. (For those who know where I live, or who are looking at the map I linked to: all directions are "nominal", i. e. by "west" I mean "the direction in which the street numbers get higher".) At one point on the way back I crossed a bridge and I was wondering "could that have been the tracks I just walked over?" Then I thought "hmm... I'm further north than I was when I crossed the tracks heading to the post office, so the tracks must be further west here."

Why must they? Because the tracks are basically a one-dimensional manifold embedded in two-dimensional Euclidean space. So locally there's a function mapping east-west streets to north-south streets that gives the position of the tracks. Everyone knows this intuitively. If you asked someone who knows that neighborhood well "Where does [insert street here] cross the tracks?" you'd get an answer, even if they're totally ignorant of geometry.

08 February 2008

One hundred thousand! And the Collatz conjecture

As of 8:30 PM last night, this blog has received one hundred thousand hits. Thanks for reading! It really means a lot to me.

The lucky winner, who receives e+1 dollars, is someone from Atlanta, Georgia, who looked at this post from December about a Putnam problem; they came there from a google search on "3K+1 probability problem". The post in question probably didn't help them; it was just a post that involved the notation "3k+1" and talked about probability at one point or another. (A lot of my Google search strings seem to be like that -- people looking for pointers on homework with somewhat misguided search strings. It's not surprising they're misguided, since they don't know the subject that well! I'm sure I've committed the same offense at one point or another.)

I'm assuming that this person was looking for something about the Collatz conjecture. This conjecture is as follows -- let a_0 be a positive integer. Let a_{n+1} = a_n/2 if a_n is even, and a_{n+1} = 3a_n + 1 if a_n is odd. Then it appears that the sequence eventually reaches the number 1, at which point it repeats 1, 4, 2, 1, 4, 2, 1, ... forever. For example, if a_0 = 22 we get

22, 11, 34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1.
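In code (a throwaway sketch):

def collatz(a):
    seq = [a]
    while a != 1:
        a = a // 2 if a % 2 == 0 else 3*a + 1
        seq.append(a)
    return seq

print(collatz(22))   # [22, 11, 34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1]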

This statement has been checked up to some ridiculously high number. The Wikipedia article is interesting. It seems like this conjecture should be true for the following reason: let {b_n} be the subsequence of odd numbers obtained when running the Collatz recursion; so in the above example we have

11, 17, 13, 5, 1.

Note that we get from 11 to 17 by multiplying by 3, adding 1, and dividing by 2. We get from 17 to 13 by multiplying by 3, adding 1, and dividing by 2 twice. In general, b_{n+1} is the largest odd factor of 3b_n + 1. Half the time, 3b_n + 1 only has one factor of 2 in its prime factorization; then b_{n+1} is about 3b_n/2. One fourth of the time, there are two factors of 2, and b_{n+1} is about 3b_n/4. In general, the "geometric expectation" of b_{n+1}/b_n (that is, the exponential of the expected value of its logarithm, which is what's relevant here since the effects we care about are multiplicative) is
\left( {3 \over 2} \right)^{1/2} \left( {3 \over 4} \right)^{1/4} \left( {3 \over 8} \right)^{1/8} \cdots = {3 \over 4}

(Yes, it's exactly 3/4. Don't believe it? Well, the product can clearly be rewritten as
{3^{1/2} 3^{1/4} 3^{1/8} \cdots \over 2^{1/2} 2^{2/4} 2^{3/8} \cdots}

and the exponents in the numerator add to 1; the exponents in the denominator add to 2.)
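For the skeptical, a two-line numerical check:

import math

log_ratio = sum((math.log(3) - k * math.log(2)) / 2**k for k in range(1, 60))
print(math.exp(log_ratio))   # 0.75, up to floating-point fuzz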

So "on average" bn+1 = 3bn/4. Thus the sequences shouldn't diverge. If I could make that "on average" made sufficiently rigorous my name would be trumpeted from the rooftops, or so I'd like to think. Perhaps the eventual proof will not be of this particular conjecture but of some more general conjecture which would consider other recursions where an+1 = kan + l, where k and l are rational constants depending on the residue class of an modulo some integer and are chosen so that an+1 will always be an integer. For example, we could take an+1 = an/5 when that's an integer, and an+1 the nearest integer to 7an/5 otherwise; the analogue to 3/4 is around 0.6537... here. (I think. I may have miscalculated it.) I have no idea how that behaves -- I just made it up. But on the other hand the eventual proof could hinge on peculiar properties of, say, binary expansions, in which case it would be specific to the original problem of Collatz.

This may or may not help the person who gave me the hundred thousandth hit. I'm surprised I'd never actually talked about the Collatz conjecture here before; I thought I had, but in retrospect it seems that I just thought about it on the way to school one morning and never wrote down what I was thinking.

07 February 2008

How to load airplanes faster

Optimal boarding method for airline passengers (arXiv:0802.0733), by Jason Steffen. Via Cosmic Variance. Apparently Steffen is a physicist who was sufficiently annoyed while boarding airplanes to start thinking there has to be a better way.

The conventional method (boarding from back to front), although it looks efficient from the point of view of someone in the airport, is quite bad; Steffen describes it as the second worst method, after the obviously bad front-to-back boarding. If you stop to think about it you realize that the real sticking point is not getting onto the plane, but loading luggage into the overhead compartment. So the trick is to parallelize by, say, boarding all window seats first, then all middle seats, then all aisle seats. Heuristically, you'd expect a speedup factor on the order of the number of people sharing each overhead compartment. This isn't exactly Steffen's method -- his simulations show that it's best to divide the people on the plane into four groups, which correspond to people in even- or odd-numbered rows on the left and right sides, and board each group in turn -- but his simulations as far as I can tell treat the entire plane as one long row. Still, either of these makes you realize that there's work to be done.

Plus, if you board people so that the people loading their luggage at the same time aren't on top of each other, then you get fewer people hitting each other with their bags. Everybody wins! Except perhaps the people in the aisle seats, in my scheme -- since they get to load their bags last, and people try to get some pretty big carry-ons on planes these days, there might not be room in the overhead compartment for them. But that's got to happen to somebody.

Journal of the Empty Set, or how to fail your defense

Mark Dominus writes about a notional "Journal of the Empty Set", which would publish papers on results about mathematical objects that don't actually exist. He asks: "But on the other hand, suppose you had been granted a doctorate on the strength of your thesis on the properties of objects from some class which was subsequently shown to be empty. Wouldn't you feel at least a bit like a fraud?"

There's a (perhaps apocryphal) story that's made the rounds in my department about one possible way to fail your thesis defense. Namely, it's said that a student went to their defense and began by defining the sort of group they were studying. There was a rather long list of conditions, and one of the examiners quickly proved there was no such group. The student therefore failed.

The good thing about combinatorics? That sort of thing doesn't happen. Usually we can actually construct explicit examples of the things we're talking about.

05 February 2008

Concentration of measure, or, God plays dice.

Today (well, technically, yesterday) I came across Concentration of Measure for the Analysis of Randomised Algorithms, a draft of a monograph by Alessandro Panconesi and Devdatt Dubhashi. I'm giving an expository seminar talk on, basically, the concept of negative dependence next week. There are certain sets of random variables that are inversely correlated, in the sense that if one of these variables gets larger (for example, if the variables are indicator variables, the indicated event gets more likely) then the other variables get smaller, and this can be made precise. The prototypical example occurs when balls are thrown into bins, and there's one random variable for each bin, which is 0 when the bin is empty and 1 when the bin is occupied. (Bins can be multiply occupied in the example I'm thinking of; the case where only single occupancy is allowed should also show negative dependence, but it's not as interesting.)
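Here's a quick simulation of that prototypical example (a sketch; the parameters are arbitrary):

import random

def occupancy_covariance(balls=10, bins=5, trials=200000):
    # covariance of the indicators "bin 0 is occupied" and "bin 1 is occupied"
    x_sum = y_sum = xy_sum = 0
    for _ in range(trials):
        occupied = {random.randrange(bins) for _ in range(balls)}
        x, y = 0 in occupied, 1 in occupied
        x_sum += x
        y_sum += y
        xy_sum += x and y
    return xy_sum/trials - (x_sum/trials) * (y_sum/trials)

print(occupancy_covariance())   # negative, as negative dependence predicts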

Not surprisingly, if you have a large collection of such random variables, and you add them all up, the distributions you get tend to be pretty tightly concentrated -- intuitively you think they should be more concentrated than they would be if you add up a bunch of independent random variables. I don't know of formal results in this direction, but at the very least they're not less concentrated than a sum of independent random variables. (In the extreme cases, the sum is constant, so we get concentration on a point.) This particular result is a Chernoff bound for negatively dependent random variables.

I don't want to go into the details here. (I'll probably try to make a blog post summarizing the talk, but the usual procedure for these things seems to be that the post should follow the talk, and the talk isn't quite ready yet.) But I just want to give a quote (edited for typos) from the draft monograph I linked to above:
In more sophisticated forms, the phenomenon of the concentration of measure underlies much of our physical world. As we know now, the world is made up of microscopic particles that are governed by probabilistic laws – those of quantum and statistical physics. The reason that the macroscopic properties determined by these large ensembles of particles nevertheless appear deterministic when viewed on our larger scales is precisely the concentration of measure: the observed possibilities are concentrated into a very narrow range.

In three words? God plays dice.

Perhaps the last post about delegate allotment

The Democratic primaries all allocate delegates proportionally; a fair proportion of the Republican contests are winner-take-all.

The conventional wisdom seems to be that this means that, "all other things being equal", the Republicans will choose a nominee sooner than the Democrats, because a Republican candidate can build up a large lead in delegates more easily than a Democratic candidate.

But the arithmetic works both ways. A Republican candidate can build up a larger lead in delegates... but a lead in delegates of, say, 10% of the total number of delegates is a lot less safe for a Republican. I suspect if one analyzed it properly -- asking questions about how the probability of a given candidate winning changes during the primary season -- the two systems wouldn't seem all that different.

A system like the Republican one in which some states are winner-take-all and others are proportional, though, just seems too asymmetrical to be stable. Quite a bit probably depends on whether the election is close... so in the end it probably turns into a question where candidates and their people try to argue for one method or another on ideological grounds when they're really just trying to calculate what makes them most likely to win. I suppose you can't blame them for trying.

Maximum entropy distributions

Maximum entropy probability distributions, from Wikipedia.

For example, the normal distribution has maximal entropy among all distributions with a given mean and variance; the exponential distribution has maximal entropy among all distributions with positive support and a given mean; the uniform distribution has maximal entropy among all distributions supported on an interval.

These are all special cases of a theorem due to Boltzmann of which I was surprisingly unaware.
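For reference, the general statement has the exponential-family form (this is from memory, so treat it as a sketch): among densities on a fixed support satisfying constraints E[f_i(X)] = a_i, entropy is maximized by

p(x) = {1 \over Z(\lambda)} \exp \left( \sum_i \lambda_i f_i(x) \right)

where the \lambda_i are chosen to satisfy the constraints and Z(\lambda) normalizes. Taking f_1(x) = x and f_2(x) = x^2 gives the normal distribution; taking f_1(x) = x on [0, ∞) gives the exponential; taking no constraints at all, on an interval, gives the uniform.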

Win probabilities in elections

What's the probability that Romney is leading in California?, from Ian Ayres at Freakonomics. Ayres makes the good point that even when a race is called a "statistical tie" or a "statistical dead heat", which seems to be a common term these days for polls where the two candidates are polling within the margin of error of each other, there's still information to be had. Just because a 2-point lead is within the margin of error doesn't mean that a candidate would freely trade being 2 points down for being 2 points up.

04 February 2008

Political parity

This blog will not endorse a candidate in the US presidential election. (That will change if somebody can convince me that one or more of the candidates is particularly good or bad for mathematics.)

There appears to be some confusing mathematics behind the way that delegates in the presidential primaries are allotted, at least when there are more than two candidates getting delegates, which isn't the case in either party any more. (Basically, partial delegates can't be awarded, so how do you handle the rounding if there are, say, 24 delegates to allot and three candidates appear to be entitled to, say, 7.7, 9.7, and 6.6 delegates? You have to round up for two of the three -- but which two? This problem also occurs in congressional reapportionment, and I suspect similar solutions are used.)
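One standard fix, for what it's worth, is the largest-remainder method (this is, I believe, the Hamilton method once used for congressional apportionment): round everyone down, then hand the leftover seats to the largest fractional parts. A sketch, using the example above:

import math

def largest_remainder(quotas):
    # quotas: fractional entitlements summing to the number of seats, e.g. 7.7 + 9.7 + 6.6 = 24
    seats = [math.floor(q) for q in quotas]
    leftover = round(sum(quotas)) - sum(seats)
    by_remainder = sorted(range(len(quotas)), key=lambda i: quotas[i] - seats[i], reverse=True)
    for i in by_remainder[:leftover]:
        seats[i] += 1
    return seats

print(largest_remainder([7.7, 9.7, 6.6]))   # [8, 10, 6]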

Anyway, in a lot of states delegates are apportioned by congressional district; according to Slate, California's Democratic primary is an example, where each congressional district gets between three and six delegates. (Since congressional districts have equal population, I can only assume that the districts with more delegates are the ones with a greater proportion of Democrats. That article says the maximum is seven, but they link to a chart that only goes up to 6.)

There are no rounding problems with two candidates. A district with six delegates, for example, would have two delegates going to Obama if he gets between 1.5/6 and 2.5/6 of the vote; three delegates if he gets between 2.5/6 and 3.5/6 of the vote; four if he gets between 3.5/6 and 4.5/6 of the vote; and so on. (Similarly for Clinton.) So if the district is close, the delegates get split 3-3. In a district with five delegates, by contrast, there can be no tie.

The Slate article that I linked to claims this disenfranchises people in districts with even number of delegates -- which may be true, although I don't have district-level polling data. The state of California as a whole may be close, but that doesn't mean that every district is close. Still, it's interesting to think that the odd-versus-even distinction could have actual ramifications. (In some sense, though, one expects them to all cancel each other out.)

I would argue that perhaps we should do away with this system of having integer numbers of delegates; why can't there be fractional delegates? The mathematics would not be that hard. (Or use larger integers; say each delegate gets an integer number of votes at the convention which is the actual number of people which led to them being seated at the convention. In the end these would be equivalent, because any fractional delegations would almost certainly have sizes which were rational numbers.) I'd make the same statement for electoral votes. I understand that we have to have whole numbers of representatives and senators because no state's going to accept that it gets, say, 2.7 representatives -- but why not give the state that has population 2.7 times that of the average Congressional district 4.7 electoral votes?

03 February 2008

OEIS is down

The Encyclopedia of Integer Sequences appears to be down.

This is very unfortunate, because there are some things I had to figure out for myself instead of just looking them up. (I know, that's not such a horrible thing.)

How to extract bits from coin flips

For the first lecture of Michael Mitzenmacher's Algorithms and Data Structures class this semester, he gave a lecture on how to extract "fair" randomness from a biased coin. That is, given a coin which is possibly unfair, how can we use it to simulate a fair coin?

The canonical first solution is attributed to Von Neumann -- flip the coin twice. If the output is HT, call the result 0; if it's TH, call the result 1; if it's HH or TT, repeat. It turns out that this scheme can be extended, which isn't surprising since we just threw out quite a bit of information. One starts by saying that if the first four flips turn up HHTT or TTHH we call the result 0 or 1, respectively; this means that there's only a one-in-eight chance that we don't get a bit out of the first four flips. Not surprisingly, this can be iterated.
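Here's the basic trick in code (a sketch; the 70% heads bias is an arbitrary choice):

import random

def biased_flip(p=0.7):
    return 'H' if random.random() < p else 'T'

def fair_bit():
    # flip twice; HT -> 0, TH -> 1, try again on HH or TT
    while True:
        a, b = biased_flip(), biased_flip()
        if a != b:
            return 0 if a == 'H' else 1

print(sum(fair_bit() for _ in range(100000)) / 100000)   # close to 0.5

This works because HT and TH both have probability p(1-p); the price is that each emitted bit consumes 1/(p(1-p)) flips on average, well short of the entropy bound mentioned below.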

Furthermore, if we consider the problem of extracting a stream of bits 0 or 1 which are independent and identically distributed, with probability one-half of being 0 or 1, a more complex iteration scheme can actually do maximally well. That is, if the heads probability of the original coin is p, it's possible to get H(p) = -p log_2 p - (1-p) log_2 (1-p) bits per coin flip, which is the binary entropy of a single flip of the coin and thus the best one could hope to do.

02 February 2008

How many fixed points do involutions have?

It's a fairly well-known fact that if we pick a permutation of the set [n] at random, then the expected number of cycles of length k in that permutation is 1/k, for 1 ≤ k ≤ n. (Somewhat more memorably, the average number of elements in a permutation of [n] which are members of a k-cycle is 1.)

So what would you expect to happen if you just considered permutations that only have cycles of length 1 and 2 -- that is, involutions -- and sampled uniformly at random from them? If you're like me, your first thought is that the average involution will have twice as many 1-cycles as 2-cycles, and the same number of elements in 1-cycles as 2-cycles -- that is, the average involution on [n] will have n/2 1-cycles (i. e. fixed points) and n/4 2-cycles, for a total of n/2 fixed points and n/2 elements in 2-cycles.

But then you look at "n/2 fixed points" and think that that seems awfully large...

It turns out the average number of fixed points of an involution chosen uniformly at random from all involutions is about n^{1/2}. This follows from standard generating function arguments. The exponential generating function of involutions with fixed points marked is exp(uz + z^2/2); that is, n! times the coefficient of z^n u^k in that function is the number of involutions on [n] with k fixed points. Standard methods from, say, Flajolet and Sedgewick (which I will probably buy when it comes out in print later this year, because I seem to cite it constantly) give that the expected number of fixed points is
{[z^n] z \exp \left( z + {z^2 \over 2} \right) \over [z^n]  \exp \left( z + {z^2 \over 2} \right)}

and this can actually be rewritten as na_{n-1}/a_n, where a_n is the number of involutions on [n], that is, $a_n = n! [z^n] \exp \left( z + {z^2 \over 2} \right)$. (There's a nice interpretation for this -- a_{n-1}/a_n is the probability that any given element of an involution is actually a fixed point -- although it's hard to say exactly why this should be true.)

Then, if you're still like me, you think "alas, I have forgotten how to figure out the asymptotics of coefficients of entire functions". But the asymptotic number of involutions is the last example in Herbert Wilf's generatingfunctionology. After some work, the asymptotic formula he gives for a_n gives that the expected number of fixed points in an involution is n^{1/2} - 1/2 + o(1).
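That's easy to check numerically from the recurrence a_n = a_{n-1} + (n-1)a_{n-2} (condition on whether n is a fixed point or paired with one of the other n-1 elements); a sketch:

a = [1, 1]
for n in range(2, 1001):
    a.append(a[n-1] + (n-1) * a[n-2])   # a_n = number of involutions of [n]

for n in (10, 100, 1000):
    print(n, n * a[n-1] / a[n], n**0.5 - 0.5)   # the last two columns converge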

Once you know that fixed points are rare, then it's not hard to guess that their distribution should be approximately Poisson, and thus variance should be of the same order of magnitude as the mean -- and the variance result turns out to be true. (I don't know about the Poisson result.) The variance is, I believe, n^{1/2} - 1 + o(1), although this is only from numerical evidence. (The generating-function way to calculate the variance relies on the definition of the variance as the mean of the square minus the square of the mean; this means I need better asymptotics in order to verify this. The better asymptotics are certainly achievable, but they're not at my fingertips.)

The result is a bit surprising, though -- why does cutting out cycles of length 3 and greater so drastically change the relative numbers of 1-cycles and 2-cycles? But involutions make up a vanishingly small proportion of all permutations, and weird things can happen in these asymptotically negligible sets without the bulk of the population caring at all.

Mathematical infallibility

In proving the fundamental theorem of arithmetic to my students, I was establishing the fact that any number has a factorization into primes. The proof goes as follows:
By way of contradiction, say there are positive integers without prime factorizations. Then there is a smallest such integer; call it N. N is not prime, because then it would have a prime factorization. So N has some divisor a such that 1 < a < N, and we can write N = ab for some integers a, b greater than 1. By assumption, N was the smallest integer without a prime factorization, so a and b have prime factorizations and we can concatenate these to get a prime factorization of N.
I want to bring your attention to the bolded "we can write", which I definitely said while presenting the proof. (The other language might not be exactly what I used.)

Sure, we can write that. But we can also write "2 + 2 = 5". Or we could have just written "N = ab" at the beginning. When a mathematician says "we can write X" for some statement X, they mean something like "X is true, for suitable values of some variables which might be contained in X that we haven't mentioned yet, and which we'll talk about now."

In short, mathematicians are only capable of writing true things, or so we'd want people to think from our writing. If only it were so easy!

01 February 2008

Mathematica est omnis divisa in partes tres

That is, "all mathematics is divided into three parts". I don't actually know Latin. I fear I may have said that Mathematica, the software, is divided into three parts.

This occurred to me this morning, when I saw the following three-fold characterization of the conic sections, which may amuse. (I'm basically copying this from Leila Schneps' lecture notes.)

Conic sections can be defined geometrically, analytically, or algebraically.

  • Geometrically: a conic section is the curve obtained when a plane intersects with a cone. The conic section is an ellipse, parabola, or hyperbola according to whether the plane is less steeply slanted, exactly as steeply slanted, or more steeply slanted than the cone's generating line.

  • Analytically: given a point F the focus and a line D the directrix, a conic section is the locus of points P such that d(P,F)/d(P,D) is a constant e, the eccentricity. The conic section is an ellipse, parabola, or hyperbola according to whether the eccentricity is less than, equal to, or greater than 1.

  • Algebraically: a conic section is the solution set of Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0. The conic section is an ellipse, parabola, or hyperbola according to whether B^2 - 4AC, the discriminant, is negative, zero, or positive.


The three definitions are equivalent.
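The algebraic test, at least, is a one-liner; a sketch (ignoring degenerate cases like pairs of lines or empty solution sets):

def classify_conic(A, B, C):
    disc = B*B - 4*A*C               # the discriminant; D, E, F don't affect the type
    return "ellipse" if disc < 0 else "parabola" if disc == 0 else "hyperbola"

print(classify_conic(1, 0, 1))   # x^2 + y^2 - 1 = 0: an ellipse (in fact a circle)
print(classify_conic(1, 0, 0))   # x^2 - y = 0: a parabola
print(classify_conic(0, 1, 0))   # xy - 1 = 0: a hyperbola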

It's fitting that the quote is in Latin, because I don't know Latin, and I also don't believe that mathematics is divided into only these three parts. In particular, you all know that I like probability and combinatorics, which don't naturally fit into this tripartite scheme. But it's a nice division of what one might call "classical" mathematics (it's roughly the content of the semi-standard first-year graduate mathematics curriculum, for example).

Calling the second definition "analytic" is a bit of a stretch, though.

Sometimes crowds are stupid.

Chad Orzel of Uncertain Principles reminds us that "The 'point spread' for a football game is set at the level required to get equal numbers of bets on the two teams." Furthermore this is not the consensus of "experts". And since gambling serves as a form of entertainment for a lot of people, there are probably a lot of people that bet on the team they like, not the team they think will beat the spread. So you can't even say that the "wisdom of crowds" applies in situations like this. Sometimes crowds are stupid, especially when they have an emotional stake in the matter. (And not just in sports betting! Consider the current housing bubble, or the dot-com bubble.)

And it's not entirely clear that bookies even always act in the way Orzel says they do; they act to maximize their expected profits, which turns out not to be the same thing. (See Steve Levitt's paper for more detail.) Not surprisingly, this means there's a winning strategy for betting on football, or at least it looks like there is -- bet on home underdogs, or so Levitt says. (They tend to be undervalued.) I'd do it, but I'm pretty risk-averse when it comes to my own actual money, as opposed to other people's expected-value money.