30 September 2008

Organization of papers

I'm currently attempting to organize a paper out of a bunch of notes I've built up recently; a possibly useful suggestion I received is to write each theorem, definition, etc. on an index card, so that I can physically move them around to figure out how the paper should be organized.

Of course, definitions have to come before the theorems that use them, some theorems use other theorems in their proofs, and so on -- so to the extent that I'm remembering to do so, I'm indicating these sorts of dependencies on the index cards as well.

It occurs to me that what I am doing here is trying to extend a partial order (the ordering that comes from the dependency) to a total order. There are of course constraints on this order; certain results, although not logically related, are related in some philosophical sense and should perhaps be kept near each other. It's actually an interesting optimization problem.

Now if only I were writing a paper about extending partial orders to total orders...

(But my paper does talk quite a bit about permutations. And a total order will end up being a permutation of my index cards.)
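The card-shuffling is just topological sorting, which Python's standard library happens to do for you. A minimal sketch, with made-up card names (the real constraints about "philosophically related" results would have to be bolted on separately):

```python
from graphlib import TopologicalSorter

# Hypothetical dependency graph: each card maps to the cards
# that must appear before it in the paper.
deps = {
    "Def 1":     [],
    "Def 2":     ["Def 1"],
    "Lemma 1":   ["Def 1"],
    "Theorem 1": ["Def 2", "Lemma 1"],
    "Theorem 2": ["Theorem 1"],
}

# static_order() yields one total order extending the partial order.
order = list(TopologicalSorter(deps).static_order())
print(order)
```

Any output of this respects the dependencies; picking the "best" among all valid orders is the interesting optimization problem.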

Aldous on probability and the real world

In April, David Aldous gave a talk on the top ten things that math probability says about the real world.

My favorite (i. e. the one I hadn't really thought of before) was that news headlines are only interesting in real time. If you woke up twenty-five years from now and decided to read the last 25 years' worth of news headlines, you probably wouldn't learn much. It's a good bet that at some point in that period the stock market had a really bad day -- but will it matter, in 2033, that on September 29, 2008 the Dow lost 777 points? The trends are more important, but you can't see them from individual headlines; you can only see them by counting how many headlines of each type there were.

Compare the fact that it's difficult to tell who's good at hitting in baseball just by watching games; the difference between a good hitter and a poor one might be, say, one extra hit every five games, which is too small to notice without actually counting how many times they do various things.
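The counting point can be made quantitative. Over a full season's worth of at-bats, the random noise in an observed batting average is about as large as the gap between a good hitter and a mediocre one (the .300/.275 figures below are mine, not from the talk):

```python
import math

at_bats = 500                      # roughly a full season
good, mediocre = 0.300, 0.275      # hypothetical true batting averages

# Standard deviation of the observed average over 500 Bernoulli trials
sd = math.sqrt(good * (1 - good) / at_bats)
gap = good - mediocre

print(f"one-season noise = {sd:.3f}, true gap = {gap:.3f}")
# The noise (~.020) is comparable to the gap (.025), so watching
# a handful of games tells you essentially nothing.
```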

28 September 2008

Erdos and grapefruits?

From Paul Graham, How to Start a Startup (and no, I'm not looking to):

People who don't want to get dragged into some kind of work often develop a protective incompetence at it. Paul Erdos was particularly good at this. By seeming unable even to cut a grapefruit in half (let alone go to the store and buy one), he forced other people to do such things for him, leaving all his time free for math.


I don't use this strategy. But I do use the strategy of cooking things in ridiculously large batches, which isn't much more work than small batches; then when I want something to eat I just fire up the microwave.

(I'm not sure if I can cut a grapefruit in half; to do that without struggling might require some special knife I don't have. I haven't tried, because I don't like grapefruit.)

25 September 2008

Best-seller impatiemment attendu

From "Quelques contributions au carrefour de la géométrie, de la combinatoire et des probabilités", by Nicolas Pouyanne:

"Dans leur best-seller impatiemment attendu [28], P. Flajolet et R. Sedgewick offrent un développement approfondi de la méthode symbolique en analyse combinatoire, auquel on pourra se référer."

That is, "In their impatiently awaited best-seller [28], P. Flajolet and R. Sedgewick offer a detailed development of the symbolic method in combinatorial analysis, to which one will be able to refer."

I don't laugh often while reading mathematics, but this made me laugh, because I am among those impatiently awaiting this book. I've seen citations to Flajolet and Sedgewick's book Analytic Combinatorics in things written as long ago as 1998 or so; Amazon.com says the book will be out on December 31, 2008, and the publisher, Cambridge University Press, says December 2008. It can be downloaded from Flajolet's web site.

Edited, January 22, 2009: my copy of this book arrived yesterday.

24 September 2008

Ranking graduate departments

On Some Random Webpage Full Of Spam, I came across what purports to be US News' 2009 ranking of graduate programs in mathematics. (I feel bad about linking to this, because it just helps the spammers, but I'm doing it anyway.)

If I remember correctly, this ranking is produced by surveying people in mathematics departments at various schools and asking them to rank other institutions. That's it.

It seems to me that a more sensible way of ranking mathematics departments would be to start from the assumption that a better department is, by definition, one whose students get hired by better departments. This could work, at least at the high end of the ranking, because the departments doing the hiring are often the same departments that have graduate students; I expect the process I'm alluding to would eventually converge on a ranking. I'm not sure what you'd do to rank "lesser" programs, where many students are not hired by departments which themselves have doctoral programs. This still just aggregates what people think about reputation, but in a way that's more principled than asking what some shadowy panel thinks.
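The convergence I'm hoping for is the same fixed-point idea behind PageRank: a department's score is the hiring-weighted sum of the scores of the departments that hire its graduates, iterated until it stabilizes. A toy power iteration, with an invented hiring matrix (not real data):

```python
# rows = hiring department, columns = degree-granting department;
# hires[i][j] = number of j's graduates hired by i (invented numbers).
hires = [
    [0, 3, 1],
    [1, 0, 2],
    [0, 1, 0],
]
n = len(hires)
score = [1.0 / n] * n   # start with all departments equal

for _ in range(100):
    # department j's new score: sum over hiring departments i of
    # (how many of j's graduates i hired) times i's current score
    new = [sum(hires[i][j] * score[i] for i in range(n)) for j in range(n)]
    total = sum(new)
    score = [s / total for s in new]

print(score)   # the stable scores give the ranking
```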

Of course, in the end these sorts of rankings are not particularly valuable, so I don't want to pursue this any further.

(Disclaimer: my memory of how these rankings work comes from standing in a bookstore four years ago and copying down that year's rankings on a scrap of paper. I think the statute of limitations has passed on that, so I'll admit to it.)

22 September 2008

In which I tie baseball to math, sort of

One of our first years: "Baseball is made by God. Other sports are made by man."
Me: "So baseball is like the integers?"

19 September 2008

A surprise from differential equations

A problem which circulated among the grad students here at Penn today, which came up while somebody was teaching differential equations:

Consider the differential equation dy/dx = (y-1)^2. Separate variables and integrate both sides as usual; you get y(x) = 1 - 1/(x+C), where C is determined by the initial condition. Take the limit as x goes to infinity, and you get lim_{x -> ∞} y(x) = 1, regardless of C.

But now notice that dy/dx = (y-1)^2 is positive for all y ≠ 1. (We're working over the reals here.) So if the initial condition is of the form y(x_0) = y_0 for some y_0 > 1, then we start above 1 and keep going upwards; how can the limit be 1?

Of course there's a mistake somewhere in here. But where is it? (I know the answer, but it took annoyingly long to figure out.)
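(One thing you can rule out numerically: the formula from separating variables really does satisfy the printed equation, so the mistake isn't in the integration. A quick central-difference check, which doesn't by itself resolve the paradox:)

```python
# Check dy/dx = (y-1)^2 for y(x) = 1 - 1/(x+C) at a few sample points.
def y(x, C):
    return 1 - 1 / (x + C)

h = 1e-6
for C in (2.0, -5.0):
    for x in (1.0, 10.0):          # points safely away from x = -C
        dydx = (y(x + h, C) - y(x - h, C)) / (2 * h)
        assert abs(dydx - (y(x, C) - 1) ** 2) < 1e-6
print("the formula satisfies the equation")
```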

edit: the differential equation was wrong.

17 September 2008

Predicting the next Mersenne prime: update

A few weeks ago I attempted to predict the size of the newly reported 45th Mersenne prime by extrapolating from the trends on previous Mersenne primes; I liked 14.5 million digits.

Brent Yorgey reports that it's actually about 11.2 million digits; the 46th prime (which hadn't been discovered when I made that post, but was reported to exist a few days later) has just under 13 million digits.
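The digit counts are easy to recompute: 2^p - 1 has floor(p·log10 2) + 1 decimal digits. Using the exponents as I understand the announcements (37156667 and 43112609):

```python
import math

def mersenne_digits(p):
    """Number of decimal digits of 2**p - 1."""
    return math.floor(p * math.log10(2)) + 1

print(mersenne_digits(37156667))   # 11185272 -- about 11.2 million digits
print(mersenne_digits(43112609))   # 12978189 -- just under 13 million
```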

Oh well, I was wrong. Good thing I didn't put any money on it.

The Journal Of Stuff I Like

Chad Orzel writes about The Journal of Stuff I Like. Basically, the idea is that you're reading papers anyway; just make a list of the ones you found worthwhile and put it where people can read.

This certainly seems like it could be a useful thing. If you liked one paper I liked, you might like other papers I like.

The dual, of course, would be to compile a list of "people who liked this paper". And from there it's a short step to "people who liked this paper also liked...", and if there's some centralized system keeping track of things, amazon.com-style recommendations, which have on occasion found me books that I liked but wouldn't have known about. (It's a short step conceptually; I'll admit that I don't know too much about how big of a step it is computationally.) But even without these extra steps in the previous paragraph, it could still be worthwhile.
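The conceptual step is small indeed; a minimal sketch with invented reading lists is just co-occurrence counting (the heavy lifting in a real system is doing this at scale):

```python
from collections import Counter

# Hypothetical "Journal of Stuff I Like" lists.
likes = {
    "alice": {"paperA", "paperB", "paperC"},
    "bob":   {"paperA", "paperC"},
    "carol": {"paperB", "paperD"},
}

def also_liked(paper):
    """People who liked `paper` also liked..."""
    counts = Counter()
    for papers in likes.values():
        if paper in papers:
            counts.update(papers - {paper})
    return counts.most_common()

print(also_liked("paperA"))  # paperC co-occurs twice, paperB once
```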

I might start a list. Like I said, it wouldn't be hard. I already have a list of papers I've read, which is reasonably complete over the time period I've been keeping it.

16 September 2008

Something interesting about libraries

At least two volumes of archival versions of Donald Knuth's papers, Selected Papers on Discrete Mathematics and Selected Papers on Analysis of Algorithms, are kept in Penn's Van Pelt library, which for the most part houses the humanities and social science collections. (This occurred to me while I was going to the library today to pick up the second of these.)

Of course, the journals in which these papers were originally published, and his other books, are for the most part in the Math, Physics, and Astronomy library, which isn't important enough for any rich benefactor to have the naming rights to. (Are you wondering why those three subjects share a library? I think it's because they share a building.) I know for sure that Mathematics for the Analysis of Algorithms, Concrete Mathematics, and The Art of Computer Programming live in the math library. (Okay, to be totally honest, the library's copy of Concrete Mathematics lives in my apartment.)

I suspect this is because books of "selected papers" are seen as historical documents; the books on mathematics, broadly defined, that are shelved in Van Pelt are books on the history of mathematics, mathematics education, the philosophy of mathematics, etc.

15 September 2008

Doogie Howser on the Copernican principle

"You never make plans with a girl for further in the future than you've been going out." -- Barney Stinson, the Neil Patrick Harris character on How I Met Your Mother. (Neil Patrick Harris played Doogie Howser on the show by that name.)

This is an example of the "Copernican principle", which I've written about before -- if you assume that there's nothing special about now, you're equally likely to be in the second half of the relationship as the first. And if you're in the second half of the relationship, and you violate this rule, you'll have broken up before the plans happen.
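The "equally likely" claim is distribution-free: observe a relationship at a uniformly random moment of its life, and the chance that less time remains than has already elapsed is exactly 1/2, no matter how total lengths are distributed. A quick simulation (the exponential lengths are an arbitrary choice):

```python
import random

random.seed(0)
trials = 100_000
in_second_half = 0
for _ in range(trials):
    total = random.expovariate(1.0)       # arbitrary length distribution
    elapsed = random.uniform(0, total)    # observe at a uniform moment
    if total - elapsed < elapsed:         # less remains than has elapsed
        in_second_half += 1

print(in_second_half / trials)   # ~0.5, whatever distribution you pick
```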

14 September 2008

Nautilus shells aren't golden spirals

You know how everybody says that the nautilus shell grows in a golden spiral, i. e. a logarithmic spiral that grows at a rate of φ = 1.618... per quarter turn? (Formally, it's the graph of r = φ^(2θ/π) in polar coordinates.)

Turns out it's not true. (This is also why I'm not worrying about saying who "everybody" is.)

I'm not surprised; a lot of the places where φ shows up in nature seem to be related to the fact that it's the "most irrational number" (i. e. the one that's hardest to approximate accurately by rational numbers, which follows from the fact that its continued fraction expansion consists entirely of 1s). This is useful for, say, growing leaves on a tree; if the angle between successive leaves is φ^(-1) revolutions, then leaves end up not directly on top of each other, and therefore not blocking the sun. I can't see any reason why a shell should grow like that, though I could be suffering from a failure of imagination, and of course I am not a biologist. A logarithmic spiral with any growth rate is self-similar, though, and self-similarity is everywhere in biology.
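The all-1s continued fraction is easy to check numerically (floating point only gets you the first dozen or so partial quotients before roundoff takes over):

```python
def continued_fraction(x, terms):
    """First `terms` partial quotients of x's continued fraction."""
    quotients = []
    for _ in range(terms):
        a = int(x)
        quotients.append(a)
        x = 1 / (x - a)   # recurse on the fractional part
    return quotients

phi = (1 + 5 ** 0.5) / 2
print(continued_fraction(phi, 10))     # [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
print(continued_fraction(2 ** 0.5, 5)) # [1, 2, 2, 2, 2], for comparison
```

The 2s in √2's expansion are what make it easier to approximate by rationals than φ is.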

Of course, technically I should go out and find my own shells and do my own measurements, and not just trust Some Guy On The Internet. But it's Sunday morning, and I really shouldn't even be awake yet, so cut me some slack.

Ivars Peterson at Science News wrote about this a few years ago, it seems. A more reputable source is "The Golden Ratio: A Contrary Viewpoint", Clement Falbo, The College Mathematics Journal, March 2005, pp. 123-134; among other things Falbo measured some actual shells, and the growth rate for a typical shell appears to be about 1.33 per quarter-turn. There's a pretty wide range, but the growth rate never seems to get as large as φ. The mathematical study of mollusk shells from the AMS web site might also be of interest.

11 September 2008

Oded Schramm obituary at NYT

Yesterday's New York Times included an obituary of Oded Schramm, who died on September 1, at the age of 46. I don't know Schramm's work, but I do know that the obituary did not have me screaming things like "That's not how mathematics actually works!". Mainstream media coverage of mathematics often has such an effect on me, so the obituary is at least decently written.

There's a memorial page; if you knew him (I didn't) you probably already knew that, though.

10 September 2008

The blackboard of anonymous questions

There are blackboards in the common room of my department. They don't see much use.

We were joking at tea today that people ought to write problems they want solved or questions they have on the blackboard, and other people could write the answers. Surely this has been tried somewhere. How did it work?

09 September 2008

Will McCain raise taxes?

Greg Mankiw calculates the probability that McCain will raise taxes, using data from the Intrade prediction markets, which have contracts for who will be elected president and for what tax rates will be in the future. It seems pretty likely.

Of course, the error on these markets is pretty high, and the calculation requires subtraction, which just amplifies these errors. But it's an interesting thought. (And I have to admit I've played around with trying to extract conditional probabilities from prediction markets myself.)
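The extraction itself is just conditional probability. If the market prices P(taxes go up) and P(Obama wins), and you assume Obama raises taxes for certain, then P(McCain raises taxes | McCain wins) falls out by subtraction. A sketch with invented prices (not Mankiw's actual Intrade numbers):

```python
# Invented market prices, not actual Intrade quotes.
p_tax_up = 0.85    # market: top tax rate goes up
p_obama  = 0.60    # market: Obama wins

# Assume taxes rise for certain under Obama; then
#   p_tax_up = p_obama * 1 + (1 - p_obama) * p_raise_given_mccain
p_raise_given_mccain = (p_tax_up - p_obama) / (1 - p_obama)
print(round(p_raise_given_mccain, 3))   # 0.625

# The subtraction is where the error amplification comes from: a few
# cents of noise in either price moves the answer by much more.
```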

08 September 2008

The oil painting metaphor

From Steele this morning, a metaphor for mathematics I hadn't heard before: mathematics is like an oil painting. Basically, people doing oil paintings start by making a very rough sketch of the painting and then progressively build up the details of the figures. (I've never painted in oil, so correct me if I'm wrong.)

Mathematics is similar. In research one only starts out with a vague idea of the result and then progressively refines it; in teaching one first gives a sketch of an argument and then comes back and fills in the details. Teaching was the context here; often in classes which depend on measure-theoretic probability, which this is, we first give a semi-formal proof of a result and only later come back and fill in the σ-fields, justify the magic words like "dominated convergence", and so on.

Compare perhaps Hackers and Painters by Paul Graham, which compares the two title groups.

Doubly mutilated chessboard problem: not the solution

On Friday I wrote about the mutilated chessboard problem.

I'll give the solution now. The reason this problem is usually phrased in terms of a "chessboard" is that a chessboard has its squares alternately colored white and black. (I'm going to call the colors white and black even if that's not actually what they are.) A chessboard has the same number of squares of each color, and any domino covers one white square and one black square. So if we remove two squares of the same color, such as the opposite corners, then tiling is impossible.

It turns out that if you remove any two squares of different colors, leaving 31 squares of each color, the board can be tiled with dominoes. Drini gave the solution: draw any closed tour that goes through all the squares exactly once and finishes where it began. Removing two squares of opposite colors breaks this into two subpaths of even length, which are easily tiled.

As for the other problem I asked, when you can remove four squares and have there be a tiling: clearly we need to begin by removing two squares of each color. There is another necessary condition for there to be a tiling -- the squares removed can't block off a single corner square from the rest of the board. (You might worry that they could disconnect the board in some other way, but the only other way possible under the tiling constraints is that they disconnect a domino-shaped region.) But the tour argument above doesn't work here; in particular, a tour is broken up into four subpaths of even length if and only if the removed squares occur on it in the order black-white-black-white. Furthermore, it appears that there are ways to remove four squares from the four-by-four board such that the remaining twelve squares are tileable, but no tour breaks them up into even-length subpaths. There aren't many tours of the four-by-four board, so it's easy to check this by brute force with a simple program; in fact there are six, according to Sloane's encyclopedia. (It says twelve there, but Sloane counts directed tours.)

However, that cuts down the number of cases one has to deal with, and in the end it looks like a four-by-four chessboard, with four squares removed, is tileable if and only if the removed squares are of opposite color and do not include both of the squares adjacent to any corner.

My method (which I realize I haven't fully explained here; if anybody wants to seriously tackle this problem for some reason I'll be happy to explain in more detail) requires a bunch of nasty case analysis, though. And in the case of the 8-by-8 board there are millions of tours, so it doesn't generalize well.
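For anyone who wants to experiment, a direct tileability check (plain backtracking over dominoes, not the tour method above) is only a few lines: cover the first free square in row-major order with a domino to its right or below, and recurse.

```python
def tileable(removed, n=4):
    """Can the n-by-n board, minus the squares in `removed`
    (a collection of (row, col) pairs), be tiled by dominoes?"""
    free = frozenset((r, c) for r in range(n) for c in range(n)) - set(removed)

    def solve(free):
        if not free:
            return True
        r, c = min(free)  # first uncovered square in row-major order
        # its domino partner must lie to the right or below it
        for partner in ((r, c + 1), (r + 1, c)):
            if partner in free and solve(free - {(r, c), partner}):
                return True
        return False

    return solve(free)

print(tileable([(0, 0), (3, 3)]))                  # False: same-color corners
print(tileable([(0, 0), (0, 3)]))                  # True: opposite colors
print(tileable([(0, 1), (1, 0), (2, 2), (3, 3)]))  # False: corner (0,0) cut off
```

Looping this over all balanced four-square removals is how one would test the characterization above.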

07 September 2008

First number not in Google?

Walking Randomly asks: what's the smallest positive integer that gives no Google hits?

I don't know exactly, but it's in the high eight digits. Why, you ask?

Well, I searched a series of numbers, starting with 1 and roughly doubling. This gave the following data:

Search string      Number of hits
13033319                      158
26066638                       37
52133277                       17
104266555                      12
208533110                       4

(Each number in the left column is either twice the previous one, or twice the previous one plus one.)

So here's what I'm thinking: numbers around 26 million have, on average, 37 Google hits. Since containing a given number is a Rare Event, I'm guessing that we can treat the number of hits of numbers around that magnitude as Poisson with parameter 37. The probability that a Poisson(37) random variable is equal to zero is exp(-37), or about one in 10^16. So the first integer with no Google hits is probably larger than 26 million.

But by the same argument, numbers around 52 million have probability exp(-17), or about one in 25 million, of having no Google hits. So we expect one between, say, 52 million and 77 million.

And by the time we get up near 100 million, we should be seeing these numbers at a frequency of one in exp(12), or six in a million; they should become commonplace.
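(The back-of-the-envelope numbers check out; P(Poisson(λ) = 0) = exp(-λ), with the λ's read off the table above:)

```python
import math

# P(Poisson(lam) = 0) = exp(-lam)
for lam in (37, 17, 12):
    p_zero = math.exp(-lam)
    print(f"mean {lam} hits -> P(no hits) = {p_zero:.2e} "
          f"(one in {1 / p_zero:,.0f})")
```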

To get a better model, I'd take more samples. The number of Google hits for a number seems to follow a power law; the number of hits for n is a constant times n^(-α), for some exponent α somewhere between 1 and 1.5. (There are issues around saying that things follow power laws, though; it's easy to see them even when they're not there.) And there are various complications -- for example, powers of ten and powers of two are more common than the numbers around them. And how do we know that the occurrence of a number on various web pages is actually independent? To be honest, we don't; if a number exists on a web page, it's there for a reason, and if one person has something to say about it, why not someone else?
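A least-squares fit of log(hits) against log(n) on the five points in the table gives an exponent in that range (a rough fit to five noisy points, so take it lightly):

```python
import math

# (number searched, Google hits) from the table above
data = [(13033319, 158), (26066638, 37), (52133277, 17),
        (104266555, 12), (208533110, 4)]

xs = [math.log(n) for n, _ in data]
ys = [math.log(h) for _, h in data]
mx, my = sum(xs) / len(xs), sum(ys) / len(ys)

# ordinary least-squares slope of log(hits) vs log(n)
slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
alpha = -slope
print(f"fitted exponent alpha = {alpha:.2f}")   # about 1.2
```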

06 September 2008

A lexicographical copout

Definitions of "set" (as a noun) in the Oxford English Dictionary include:

Math. Used variously, as defined by the individual author. Obs.


(This is set, n.2, definition 10b. Definition 10c is "An assemblage of distinct entities, either individually specified or which satisfy certain specified conditions", which is what mathematicians usually mean by "set". set, n.2 basically covers the definitions of "set" that mean "bunch of things", while set, n.1 has to do with things hardening, hanging, etc.)

I was led to this definition by a talk about the modern dictionary from the TED conference, in which Erin McKean, an editor for the OED, talks about how in the future dictionaries won't be like they are now. (A transcript of the talk is here.) She mentions that set has many definitions, one of which is "miscellaneous technical senses". I was actually hoping that this would include the mathematical sense, and I could write a post about how the compilers of the OED hate mathematicians. The numbered definition she was talking about, set, n.1 definition #30, has various lettered sub-definitions; usually the sub-definitions under a number are related, but this one isn't. McKean's theory is that it was Friday afternoon and the people writing it wanted to go down to the pub, and she called it a lexicographical copout.

But McKean also makes a more substantial point in the talk: it's not her job to enforce rules about the language, but rather to describe it. And nowhere is this more true than in mathematics, where we basically just define words, in our papers, to mean whatever we want them to mean. The definition I gave above may be obsolete -- we've pretty much converged on a meaning for "set" -- but the spirit lives on.

05 September 2008

LaTeX "errors"

Output from PDFLaTeX: "0 errors, 5 warnings, 33 badboxes".

If only that meant that there are zero mathematical errors in my paper.

Best Wikipedia article title ever

Mutilated chessboard problem.

The problem is the folkloric one: given a chessboard with two diagonally opposite corners removed, can you tile it with dominoes? A domino is a rectangle which is the size of two adjacent squares. (If you haven't seen it, think about it; the solution is in the article.)

I've known this problem for as long as I can remember, but I didn't realize it had a name. I didn't realize it needed a name. But apparently it's a standard example of a theorem which is hard for automated theorem-proving programs, so that community needed something to call it, because they can't refer to "that one with the dominoes on the chessboard where you get rid of two squares" in the title of their papers.

Harder question, again if you haven't seen it before: when can you remove two squares and have there be a tiling? (I know the answer. If you want to know it, leave a comment.)

Even harder question: when can you remove four squares and have there be a tiling? This might not be that difficult, but I don't know the answer. (It'll give me something to think about on the walk into school this morning, though.)

McCain's life span?

Here and here, people attempt to answer the question: what are the chances that John McCain will die in the next four or eight years?

A quick look at mortality tables says roughly 15% in four years, 30% in eight years, which are roughly ten times the corresponding figures for Obama --- although it gets more complicated than that pretty quickly. Obama smoked for a while, McCain had cancer. Obama's parents died relatively young, which seems bad for him-- but his father from an automobile accident and his mother from ovarian cancer, which Obama himself is obviously not at risk for. McCain's mother, on the other hand, is still alive at 96. The presidency is a very stressful job -- but it comes with great health care! (I don't actually know what sort of health care the president has, but somehow I don't see doctors turning away the president for inability to pay.) And so on.
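The four-year figure is consistent with an annual mortality rate of roughly 4% for a man in his early seventies; the rates below are round invented numbers in that ballpark, not actual table entries:

```python
# Hypothetical annual mortality rates for ages 72-75, roughly 4% each.
annual_q = [0.038, 0.041, 0.044, 0.048]

survive = 1.0
for q in annual_q:
    survive *= 1 - q   # survive each year in turn

print(f"P(death within four years) = {1 - survive:.1%}")   # about 16%
```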

And if we're talking about the probability that a president will survive his term, we also have to think about assassination. Four of the 43 US presidents so far have been assassinated; what are the probabilities that either of the nominees would be assassinated? I don't even know how one would begin to assess that.

Finally, as meep points out in the first post linked above, "The central limit theorem doesn't kick in at one person". We don't get to elect a president, branch off a large number of parallel worlds, and see in what proportion of those worlds he survives four or eight years. (Unless you subscribe to the many-worlds interpretation, that is.) We get one shot.

(Well, we Americans get one shot. My statistics have historically shown that about half my readers are reading from outside the US.)

03 September 2008

Change of name

After long consideration, I have recently decided to change my name. I will henceforth be Michael Lugo. (My e-mail address remains the same for now.)

I hope to still bring you the same, well, whatever it is I've been bringing you for some time to come.

02 September 2008

Randomness in football

California high school's offensive scheme adds randomness to football. I know very little about football (I think we've established pretty well here that I'm a baseball fan) but it's an interesting idea. Basically, this team runs a wider variety of plays than many teams, making them less predictable.

I'm not sure if they mean "randomness" in the sense that I would hope, though. First a team would determine a mixed strategy for what plays it should run, based on how they expect the opponents would be able to defend against those plays. Then they should just let a random number generator programmed with that strategy call the plays. If it works for diplomats and terrorists, why not football?

(And why not baseball, for that matter? In fact, there are fewer plays possible at any given juncture of a baseball game than at any given juncture of a football game, I think, so it would probably be easier to do this in baseball.)

01 September 2008

Who's Tremellius?

There's a web site called the Mathematics Genealogy Project, which traces the "family tree" of mathematicians. The relationship of "parent" in traditional family trees is replaced with "doctoral advisor".

You'd think, then, that everybody would have a unique 1st-level, 2nd-level, etc. ancestor, but they don't, since some people, especially before the advisor-advisee relationship formally came into existence, had two advisors. Leibniz is an example, as is Dirichlet.

Anyway, the natural thing to do, I think, is to keep following the links upwards; this process usually seemed to terminate at Erhard Weigel, who was a mathematics professor in the mid-1600s, and who taught Leibniz. But Leibniz had two advisors, Weigel and Huygens; if you follow the Huygens path back up into the 1500s you end up, eventually, at Immanuel Tremellius, who appears to have been known principally as a translator of Bibles. He advised Rudolph Snellius, who in turn advised his son Willebrord Snellius of Snell's law fame.

Tremellius has 56,128 descendants, which is the most anybody has, tied with Valentine Naibod (they both were advisors of the elder Snellius and nobody else, and so have the same set of descendants). I'm pretty sure some of those links weren't there before; the people at the MGP seem to have found some new data. But the database contains 125,583 people; it would not be reasonable to say that Tremellius is the ancestor of all, or even almost all, modern mathematicians.