31 August 2007

Patent for a five-sided dice [sic]

A patent for a "five sided dice" [sic].

The patent was filed by Louis Zocchi, inventor of the Zocchihedron, which is a 100-sided die; the Wikipedia article indicates that the Zocchihedron isn't a fair die.

Will this one be?

Previously I wrote about how one might design asymmetric dice; Mark Dominus claims that "the probability that the hexahedron will land on face F is not proportional to the area of F, but rather to the solid angle subtended by F from the hexahedron's center of gravity." I'm not sure if I believe this. It seems reasonable, because it captures how the die is likely to rotate in the air, but dice bounce when they hit the table, and I'm not convinced that the "bouncing" behavior isn't chaotic.

Anyway, the patent indicates that the die is basically a triangular prism (although with beveled edges), with 1 and 5 on the triangular faces and the pairs (2,3), (2,4), (3,4) on the rectangular faces (thus 2, 3, or 4 will appear "upwards" when the die comes to rest); by symmetry, 1 and 5 should occur with the same frequency, as should 2, 3, and 4. So there is such a die.

Part of the patent reads as follows:
The present invention has been tested for fairness wherein different sizes of dice were included in the test ranging from 13-18 milimeters in thickness.... During initial testing, it was felt that the 14 millimeter thickness was the closest size to providing equally random outcomes for each of the five faces so that each face would occur one-fifth of the time. Specifically, 0,63 rolls were made of the 14 millimeter thickness test dice which yielded 6,152 rolls in which a rectangular silhouette was seen and 4,011 rolls which yielded a triangular silhouette. This means that the two triangular faces came up 4,011/10,163=0.3947 of the time. If the dice was perfectly fair, those faces should come up exactly 04000 of the time. Given the number of rolls, the uncertainty (one standard deviation) was estimated to be 0.0070 which indicates that the experiment detected no significant deviation from fairness.
The actual standard deviation is more like √((10163)(.4)(.6)/10163 = 0.0049, meaning the results were a bit over one standard deviation from fairness; by the usual standards of statistics, though, it's still in a 95% confidence interval (i. e. within 1.96 standard deviations).

Eventually, it seems these will be manufactured at a thickness of 13.6 millimeters (which would prefer the triangular faces slightly more than the 14-millimeter thickness) but it is then stated that
It is believed that the dice may be ultimately manufactured in a range of size from 13 to 15 millimeters depending on the type of material they are to be used on.


It seems like a lot of trouble to have to have different dice for different purposes, which the inventor seems to think would be needed for fairness. (Perhaps this has something to do with the "bounciness".) There's a standard shape for a ten-sided die which could easily be used for this purpose (just label opposite sides with the same number), and from purely symmetrical grounds it's fair. I've been informed that rolling a ten-sided or twenty-sided (icosahedral) die and reducing mod 5 is standard among people who play role-playing games.

how maps are like mathematics, and maps of mathematics

Yesterday I wrote about different ways of drawing the Interstate highway system, and introduced this map by Nat Case that clearly shows it as a grid. Case had written about the map at Map Head, and says there that people accused him of copying this map by Chris Yates.


Apparently some people aren't happy about Case's map, because they're claiming that Case just ripped off Yates' idea. I don't know anything about intellectual property law, but it seems to me that there are a lot of maps of the same area that look very similar and convey almost the same information -- maps that are much more similar than those of Case and Yates. For example, two ordinary road maps of the same area should look very similar, because they'll depict the same roads! The impression I have is that the information (where the roads are) is freely available but the particular way in which it's drawn (if this isn't just a simple transcription of the information) is not. To quote one of the commenters here:
Copyright law expressly does NOT apply to "ideas", only to the execution of ideas. If this dude had copied the colors, font, or any other specific element from Chris' map, then it would be a different story (and "straight lines" isn't quite a specific element). If he had made his own map look like a subway map, even then it might have been a fuzzy area.

But, what seems to have happened was that he looked at Chris' map, then started from scratch creating his own expression of a similar idea. While it may have been courteous for him to credit Chris for the inspiration more obviously than he did, he certainly isn't legally required to do so. In fact, this very process of re-interpreting others' work is basically how art progresses in society in general. To use your analogy, I think this guy looked at a Picasso, said, "Hmm, cubism is cool," and painted his own cubist painting of the same model that Picasso used.


To this I would add that mathematics works the same way. Most proofs of the same mathematical fact, most expositions of the same concept, etc. will look the same in outline, simply because they have to respect how the ideas stand in logical relation to each other. But the choice of notation, of words to go between the equations explaining what's going on, and so on is unique to each author, and probably has a lot to do with what else is going on in a particular book or article. One might choose to emphasize certain parts of a proof -- doing some simple algebra in full instead of just saying "the reader can verify this" -- because the same calculation will come in handy later. Notation seems analogous to graphic design elements like color; a good map uses colors which are clearly different for things which are different, and good notation does the same. (I've had professors who used a and α, u and μ, or v and ν in the same problem, and had bad handwriting. No! I'd even go so far as to say that using letters which look similar in typed work is bad practice, because the conscientious reader will be copying your notation by hand in order to check things.) I doubt there are two proofs out there of, say, the Fundamental Theorem of Calculus (to take a result that's been reproduced in a zillion textbooks) that are word-for-word and symbol-for-symbol the same, just like there probably aren't two maps of Manhattan that are pixel-for-pixel the same.

The point here is that there's good exposition and bad exposition, which coincide roughly with good maps and bad maps. A good map, or a good writer, will help you navigate a tricky area; a bad map will just confused you.

And what would a map of mathematics itself look like? Dave Rusin has given it a shot, but each area of mathematics is just a circle to him, and it's not clear to me how they're connected to each other, or why his circles are arranged the way they are. He says at his A Gentle Introduction to the Mathematics Subject Classification Scheme that "[t]he welcome page for this site shows an image of the areas of mathematics which shows the relative numbers of recent papers in each area (arranged so as to illustrate the affinities among related areas)." It's a decent job -- nothing seems too far from where it should be, and I imagine that projecting the whole thing down into two dimensions makes it quite difficult! -- and it's not just based on Rusin's prejudices. It seems to come from correlations between classifications in that scheme -- two classifications are "close together" if there are papers which are classified in both of them. This somehow gives rise to a 61-dimensional space (the 61 dimensions presumably corresponding to the 61 two-digit classes in that scheme) of which this map is a two-dimensional projection. The vertical direction seems to work out with "discrete" mathematics at the top and "continuous" mathematics at the bottom; I'm not sure what the horizontal direction represents. (I think one could make an argument for "pure" on the left versus "applied" on the right, but it's a weak one.)


The same can be done with individual papers, or with mathematicians; see, for example, the papers of eight Fields medalists.

I sense that more could be done. Which areas of mathematics do you need to learn before learning certain others? Which areas historically grew out of each other? Within an area, how are the important results related to each other? And how could this be illustrated pictorially? I'm not sure what good this would be, though.



(At least two people seem to have done something similar for motorways in England); not surprisingly they both did their maps in the style of the London Underground map. It probably won't surprise you to learn that Great Britian's road numbering scheme isn't a grid, but rather divides England and Wales into zones emanating from London, and Scotland into zones emanating from Edinburgh.)

30 August 2007

ways of drawing the Interstate Highway System

Earlier this year, Chris Yates made available online a simplified map of the Interstate highway system, where the interstates are for the most part straight lines. This made its way around the Internet for a while, and it was frequently pointed out that it was wrong -- certain interstates just aren't on it. (The ones I noticed were I-83 (which runs between Harrisburg and Baltimore) and I-88 (Albany to Binghamton); a lot of other people pointed out that it also omits the entire state of Wisconsin.)

It occurred to me that it would be nice to see this map done correctly, and I'm trying to do it myself; unfortunately I don't have a really big piece of paper! It's probably possible to represent all the two-digit Interstates on a single piece of paper, though. But this map from Hedberg Maps looks like what I was thinking of -- it basically assumes that as many of the two-digit Interstates as possible actually form a grid and then fills in the rest, including the three-digit Interstates, accordingly. Too bad it's $40. They explain how the map works here; basically, the map's author, Nat Case, attempted to respect the topology of the system and align the main roads of the system with the grid, so the distance between, say, I-70 and I-80 is the same as the distance between I-80 and I-90. (Case claims inspiration from Yates' map, but seems to be a lot less visible on the Internet, probably because you can't view it online.)

A lot of people compare these to "subway maps", but there's one big difference. On a map of a subway system, the various subway lines are generally denoted by different colors; this makes it obvious when one has to transfer from one line to another. (See, for example, the standard Washington Metro map; there's a recolored version of the map to indicate the different service patterns they run on July 4th.) And this makes sense, because a change of trains has a substantial cost -- you've got to get off the train, often walk up or down some steps, and wait for another train to come. Roads don't work like that -- the cost to go from one highway to another is minimal.

I'd also like to see a similar map of, say, Philadelphia; the street signs indicate that each intersection is a certain distance west or east of one axis (Front Street or Germantown Avenue, depending on location) and north or south of another (Market Street), but the grid breaks down the further you get from the historic city. I actually live along a "seam" in this grid; as you walk down my street the numbers immediately go from the 500s to the 900s. How is the map distorted if you try to place locations not where they are, but where the addressing system implies they are?

edited: Nat Case talks about the map at mapHead; you can see a larger version of the map here. (1080 by 1080, which is large enough to get a very good idea what's going on. It's particularly interesting how some of the shapes of states get distorted here -- Florida is now much longer east-west than north-south, for example, and Arizona and New Mexico end up having roughly the shape of Chile, and Illinois is huge.

Personally, I think it's a little crowded in the northeast, which is something you might not necessarily expect from the description.

And now I don't have to make this map myself to see what it looks like, which is a relief because I'm not particularly good at this sort of thing.

profiles of interconnection networks

Flajolet and Sedgewick's Analytic Combinatorics (link goes to 12 MB PDF of the book, which is still in progress) is a most interesting book. I took a course based on the first three chapters of the book last fall; the purpose of the class was basically to teach us how to translate combinatorial structures into generating functions, in a fairly routine way. Unfortunately, we didn't get to the part of the text where we could then tell things about the asymptotics of those combinatorial structures, which is a tremendously useful thing to do.

So I've been working through parts of the book lately. In example V.8 Flajolet and sedgewick talk about a problem considered by Lagarias, Odlyzko and Zagier in their paper On The Capacity of Disjointly Shared Networks (although not quite in the same language). Flajolet and sedgewick ask: "There are 2n points on a line, with n point-to-point connections between pairs of points. What is the probable behavior of the width of such an interconnection network?" The width is defined in the following way: imagine the connections as circular arcs between the points on a horizontal line; let a vertical line sweep from left to right; then the width is the maximum number of arcs encountered by such a line.

For example, if we have n = 6 (and thus 12 points), we might make the connections 9-12, 5-10, 1-2, 6-8, 3-11, 4-7; as the line sweeps between 6 and 7 there are four open connections (and never are there more), so the width is 4. The "profile" of this network is the sequence given by the number of open connections just after the line sweeps past 0, past 1, past 2, and so on (call these a0, a1, a2, ...); this is the sequence

0, 1, 0, 1, 2, 3, 4, 3, 2, 3, 2, 1, 0

and in general this sequence increases to somewhere around n/2 and then decreases. Flajolet and sedgewick says that Louchard proved this, but I haven't looked at that paper. What surprised me, when I saw this problem, is that the profile actually looks something like a parabola, as you can see from the simulations at the top of p. 311 in Flajolet and sedgewick. Louchard proved that the profile "conforms asymptotically to a deterministic parabola 2nx(1-x)... to which are superimposed random fluctuations of O(√n)". So, for example, in such a network with one thousand nodes (n = 500), the profile goes as high as 250 (when x = 1/2) before going back down to zero.

Why should this be?

Imagine that you're putting together a random interconnection network. The profile sequence clearly has the property that each member ak is one more or one less than the last one. It'll be one more if k is the first member of one of the pairs, and one less if it's the second member. What's the probability that k is the first (i. e. smaller) member of its pair? If k is small, it'll be large; if k is near 2n then it'll be small. In fact, if we know k is a member of some pair, the other member of the pair is equally likely to be any integer from 1 to 2n except k; thus the probability that k is the smaller member of its pair is approximately 1 - k/2n, and the probability that it's the larger member is approximately k/2n.

Thus ak = ak-1 + 1 with probability 1 - k/2n, and ak = ak-1 - 1 the rest of the time. The expected value of ak - ak-1 is approximately (1-k/2n) - (k/2n), or 1-k/n. So, for example, one-fourth of the way through the process, when k = n/2, we expect ak to increase by about one-half each time we increase k by one.

Integrating with respect to n, we get

at ≈ ∫0t (1-k/n) dk = t - t2/2n

which is just a rescaling of Louchard's parabola. Flajolet and sedgewick's book is full of examples of things like this, where some apparently random structure turns out to have a lot of order when you "zoom out".

29 August 2007

non-nonstandard calculus, and the Carnival

Matt at The Everything Seminar talks about his experiences in teaching "non-nonstandard calculus". Formally, this is calculus done over the ring R[dx]/((dx)2=0); informally, this is calculus where dx is treated as a number whose square is zero. He writes:
By treating infinitesimals on the same footing as finite numbers, the approximation schemes of calculus become more intuitive. I think every mathematician has discovered this on their own, in their own private language. Why not make the language commensurable with the computations that we do?
I agree with this -- the reason we do whatever computations we do in a certain way is often because it's the easiest way we know. Why make the students suffer more than we have to?

As people have pointed out in the comments to Matt's post, this approach seems to have nothing to say about integration. But does the usual approach to teaching differentiation have anything to say about integration?

In general, it seems to me that mathematics and mathematics education are more separated from each other than they need to be. Obviously, some differences are necessary -- but why erect an artificial wall? (I feel this somewhat keenly because I am at the moment in my life -- two years into grad school -- where I am expected to jump over this wall.) John Armstrong has written about the Carnival of Mathematics and how it is becoming split between people who do lower-level mathematics and people who do higher mathematics, with the lower-level people winning. But it strikes me that they're claiming that the difference is one of kind, not one of degree. There should be a smooth continuum from 1+1=2 to, say, the Riemann hypothesis -- everything currently taught to students was research once -- but there isn't. It seems, though, that the Carnival of Mathematics welcomed submissions at all levels and it just turned out that there are more people writing at the lower levels.

This blog goes back and forth between levels quite often, so you'd expect me to think this. And that's intentional. One of my goals here is to make the point that mathematics -- well, at least some parts of it -- is not all that divorced from everyday experience.

(Oh, and by the way, yes I changed the layout. People have said for a while they didn't like it; the Adsense ads were making me only a pittance; and the old layout was orange and blue which are the Mets' colors.)

the lighter side of financial mathematics

Engraved Portraits of Gauss for sale, just 40 dollars!

These are in fact 10 Deutsche Mark notes. I have the feeling that the seller, Acme Klein Bottle, sold them at a lower price before 2002.

I'm kind of tempted to order one, but forty bucks is forty bucks, and I'm a grad student. It kind of seems like a nice conversation piece, though. Besides, for two dollars more I could get a Klein bottle. Or I could get a Klein bottle hat. I knew someone who tried to knit one of these once; I don't remember if she succeeded. Acme also sells Mobius band scarves, which I suspect would be quite annoying because I like my scarves to have ends. These would have novelty value and keep me warm.

(You might also consider these portraits of Euler, or any of the portraits from this gallery. Rather strangely, all the portraits have numbers on them.)

But let's say, hypothetically, I bought the portrait of Gauss. I could hang it on my wall and people would wonder why I hung money on my wall instead of spending it. It would kind of be like a Knuth reward check -- Knuth pays a bounty of $2.56 for each error people find in his books.

In 2002, Knuth said in this article in the notices of the AMS, when asked what would happen if all his reward checks were cashed:
There's one man who lives near Frankfurt who would probably have more than $1,000 if he cashed all the checks I've sent him. There's a man in Los Gatos, California, who I've never met, whom I've never met, who cashes a check for $2.56 about once a month, and that's been going on for some years now. Altogether I've written more than 2,000 checks over the years, and the average amount exceeds $8.00. Even if everybody cashed their checks, it would still be more than worth it to me to know that my books are getting better.
Knuth didn't answer the question directly, but I assume he'd be okay if they all were cashed -- I have a feeling he's got some money saved up.

On the contrary, Erdos once said that he would not be able to pay out all the rewards he had put on various problems, and compared this to that the strongest bank would not be able to survive if all its customers simultaneously wanted their money -- but believed the bank run to be more likely. I'm kind of curious if there's a list of Erdos problems out there. This article indicates that people usually did cash Erdos checks, perhaps because the amounts of money involved are greater -- and back then, you got a cancelled check back from the bank anyway. Ronald Graham estimates that the total outstanding bounties on Erdos problems are about $25,000, although it seems he's not sure because as of that writing there was no list of them. I was able to track down a list of a couple dozen or so.

And while I'm talking about money: I previously wondered about the density of money, and I concluded that the density of U.S. coinage -- if we make certain reasonable assumptions about how change is given -- is $28.58 per kilogram. This was inspired by the fact that I had some change I needed to cash in.

I cashed in my change recently; I had 95 quarters, 126 dimes, 76 nickels, and 339 pennies, for a total of $43.44. This is also 2,052 grams, for a money density of $21.17 per kilogram. As I said in my earlier post, I tend to use quarters for laundry. One expects dimes, nickels, and pennies to occur in a 2:1:5 ratio in randomly occuring change; my actual experience is not too far from that. In randomly occuring change, though, one expects three-fourths as many quarters as pennies; I didn't have nearly that many quarters.

28 August 2007

Indexed

Jessica Hagy's blog Indexed (features a wide variety of quasi-mathematical doodles on index cards. (via Freakonomics. She writes:
This site is a little project that lets me make fun of some things and sense of others. I use it to think a little more relationally without resorting to doing actual math.

Some of the mathematical things she uses:

Venn, or Euler diagrams: this diagram of scary stories told to kids (which include Aesop, the Brothers Grimm, and Leviticus, and her explanation of humor at Freakonomics, among many others (For the record, a Venn diagram specifically allows that all possible intersections of the sets in question and their complements, except for those containing both a set and its complement, are nonempty; an Euler diagram shows the actual intersections and inclusions between sets.

A visual joke about tangents. (This reminds me of a story: I was born at 11:41 in the morning. The first time I learned about Lie algebras, it was my birthday, and I had a lecture that morning to attend from eleven to twelve. I asked a question at about 11:39 or so. My professor went off on a tangent about Lie algebras, and I looked at my watch while he was talking and it was 11:41. I felt all special.)

Correlation does not imply causation. But sometimes correlated things are related by causation: if there's more food around, you get fat.

Optimization problems in music and in food.

Meaningless charts, which I think I've ranted about before.

The same diagrams get recycled with different labels -- for example, this post on money and status, this one on the evils of gym class, and this one on career choice. This is an example of the general phenomenon that lots of apparently different things are governed by the same mathematical structure.

And normal people get laid most following a normal distribution!

The complete graph on six vertices is good for something. Seven vertices, too, but she can't draw a regular heptagon. (Can you?)

dimensional analysis and modified Newtonian dynamics

I just finished reading Lee Smolin's book The Trouble With Physics: The Rise of String Theory, The Fall of a Science, and What Comes Next. It's really two books in one -- one about the history of modern physics (including, roughly speaking, how the string theorists have taken over) and one about how Smolin believes this is a Bad Idea. Apparently string theory doesn't look so promising these days, and yet the string theorists still dominate physics departments. He's currently at the Perimeter Institute, one of the aims of which is to break down this orthodoxy and give its researchers the freedom to pursue their interests.

Smolin, in examining possible research questions for the future, considers that perhaps it would be a good idea to try to find new interpretations of already surprising facts. For example, consider the length scale implied by the "cosmological constant", which is the length scale over which the universe is curved; this he calls R, which is about 10 billion light-years. What else happens on this scale? The obvious thing is, of course, the universe as a whole. But it turns out that c2/R also seems to be an intersting scale to look at. The constant c2/R (where c is the speed of light) has the units of acceleration, and is about 10-8 centimeters per second. It seems, actually, that Newton's law of gravitation might break down for very small accelerations; this is Mordehai Milgrom's theory of modified Newtonian dynamics (MOND). This can be seen by looking at stars orbiting the center of their galaxies; when the acceleration due to gravity is greater than this critical value, Newtonian gravity seems to work, but when it's less than this critical value it doesn't. When you get below the critical value, it appears that gravitational force falls off only as the inverse distance, not the inverse square distance.

When I read this, though, I was a bit skeptical, because Smolin says that when the acceleration is small, it varies as the square root of mass. This seems to violate everything I know; shouldn't forces be additive? If I'm reading Smolin's version correctly, it's saying that a galaxy four times as heavy should produce an acceleration only twice as large... but only if you view that galactic center as one large object, and not four smaller ones. A bit more ridiculously, consider a galactic center of mass M. It exerts a certain force F on some object far away. But now view that same center as k2 centers of mass M/k2. Then each of those centers exerts a force F/k on the object; the total force on the object is thus k2(F/k) = kF. How does the galactic center "know" whether it's one object or many?

I don't see this criticism anywhere, and it seems so obvious... but I'm not sure whether the issue is with MOND as a theory, with Smolin's summary of it, or with my own understanding of physics.

27 August 2007

some links, primarily about the "social graph"

Brad Fitzgerald, formerly of livejournal, gives his Thoughts on the Social Graph. The social graph is, roughly, the graph whose vertices are individuals and whose edges connect pairs of people who know each other (for some definition of "know"); Fitzgerald advocates making the social graph a "community asset" that would be available to anyone who wanted it.

I see the term "social graph" is a bit misleading because it suggests that there is only one type of edge, i. e. one kind of way in which one can know other people. (Indeed, this is a perennial problem on a lot of social networking sites; there's often no way to indicate the difference between, say, a person that one met once at a party and one's spouse.) Fitzgerald acknowledges this:
It's recognized that users don't always want to auto-sync their social networks. People use different sites in different ways, and a "friend" on one site has a very different meaning of a "friend" on another. The goal is to just provide sites and users the raw data, and they can use it to implement whatever policies they want.

The next step in the analysis of social networks might be to take into account the "strength" of relationships, as currently most social networking sites I know of don't acknowledge that I am more interested in some of my friends than others. Also, to what extent can the strength of relationships be guessed just from looking at the social graph without this strength information? A person who I know quite well is likely to be someone with whom I have many friends in common.

Also, the social graph is in some important ways directed (the link goes to Good Math, Bad Math); for example, if edges connect X to Y if X reads Y's LiveJournal. (It wouldn't surprise me to learn that Fitzgerald has this example in mind.) The somewhat addictive tool LJ Connect, which finds the shortest path between two individuals with LiveJournals via their friends, explicitly acknowledges this.

Finally, some shadowy figures seem to be aggregating the mathematics blogging community, and various other blogging communities as well. Their in-house mathematician gives yet another example of using the formula 1+2+...+n = n(n+1)/2.

And some links:

Paul Graham writes about Holding a program in one's head; a lot of his ideas seem to have obvious analogues to mathematical research, which isn't surprising, as programming and mathematics are quite similar. I'll probably have more to say about this later.

There is a graphical formalism for quantum mechanics, which its authors call "kindergarten quantum mechanics"; this is said to be a very considerable extension of Dirac's notation, and gives short derivations of deep results concerning teleportation, quantum mechanics, and so on. If this is true (I'm not familiar enough with the results concerned to say), it's a good example of what Graham has to say about choosing appropriate notation.

Vlorbik asks us to please lie more carefully. He writes:
We were supposed to test the claim that a certain population proportion was 10% against a sample proportion of 13%, based on n= 57 data points (at some stated confidence level that I’ve forgotten). But wait a minute. You can’t get 13% from a sample of 57 subjects: 7/57 ~ .122807 (i.e., 12%) and 8/57 ~ .14035 (i.e., 14%).
This particular one isn't a big problem, but it's symptomatic of something more general -- that the "applications" problems in a lot of mathematics textbooks bear very little resemblance to reality, which only seems to frustrate the students. (Another complaint I have about such textbooks is that the calculus texts seem to assume the student is familiar with physics, which is often not a reasonable assumption to make, and so the instructor ends up teaching physics instead of calculus.)

In Nature's Casino, by Michael Lewis, from the New York Times, August 26, about the insurance market for catastrophic events.

scalene triangles on Google

Would you believe that the #1 "hot trend" on Google Trends yesterday was scalene triangle?

I am not making this up.

The Google Trends page for yesterday is actually feeding me a fair bit of traffic right now, to this page in which I solved an unrelated puzzle about scalene triangles.

This would be mystifying to me, but there was a question on "Are You Smarter Than A 5th Grader" last night asking how many angles in a scalene triangle are the same. (Incidentally, I'm not entirely sure whether the answer is zero or one. All the angles are different. There was another question on that show that had two possible answers -- they asked for the world's longest river, whether the world's longest river is the Nile or the Amazon depends on how you measure.)

The #2 "hot trend" yesterday was "mars moons", and #8 was "feet in a mile"; "proper noun" is #21, and "worlds longest river" [sic] is #38. These also related to questions on that show.

The peak time for this search was 4 PM, which seems mystifying until you realize that Google uses Pacific time; this is 7 PM Eastern, which is the time the show aired in the East. Indeed, the four hours when "scalene triangle" was most searched were 4 PM, 7 PM, 5 PM, and 6 PM Pacific (in that order); these are, of course, 7 PM Eastern, Pacific, Central, and Mountain, respectively, which I suspect is the ranking of U. S. time zones from most to least populated. I suspect that it's possible to distinguish between trends that occur because of something on TV and trends that occur because of something that happens in the "real world" (i. e. a breaking news story) by looking at this; breaking news stories should show a single peak, while TV-inspired searches should show a large peak and then a small peak three hours later.

From what I can gather from about Google Trends, the quantity they're ranking is the number of searches done on a particular query in the day in question divided by the number of searches done on a "typical" day. My suspicion is that the various questions on the show were equally likely to be Googled, but more people Google for river lengths or grammatical terminology than scalene triangles on a normal day.

Scrabble, part two

After the post I made a few moments ago, I saw how to do the count.

The answer's 3,199,724, which apparently is the standard answer accepted in Google.

The method is pretty straightforward. First, find the number of "partial racks" (i. e. sets of zero to seven tiles) consisting of just A; there's one with no tiles, one with one tile, and so on up to seven tiles. (There are 9 As, so a rack full of them is possible.) Then find the number of "partial racks" containing A and B; there's one with no tiles (the empty rack), two with one tile (A and B), three with two tiles (AA, AB, BB), and three with each of three through seven tiles. (There are only two Bs.) Repeating this for each of the 27 tile types (26 letters plus the blank) gives the answer of 3199724, not far from my sampling-derived estimate of 3224068.

how many Scrabble racks are there?

I've been playing a lot of Scrabble on Facebook lately. While Googling around for some stuff on Scrabble strategy (I'm not the type to memorize words, mostly because that crosses some sort of invisible line between "fun" and "work", but knowing about things like how to try to keep a good mix of vowels and consonants on my rack makes the game more interesting just because I'm less likely to get stuck with a bad rack, which is frustrating), I found a couple things of interest.

First, this handout from the MIT Scrabble club. (I actually knew the guy who started it. He lived down the hall from me my senior year, when he was a freshman.) On that handout they quote Joel Sherman, who says:
Accept the idea that Scrabble is a math game just as much as it is a word game. All the strategic theory of the game is based on statistical analysis, probabilities, spatial relationships on the board, maximizing the value of small-numbered tiles by playing bingos (using all 7 tiles on your rack to earn the 50-point bonus), and large-numbered# tiles by causing them to interact with the colored premium squares on the board, or with other words on the board. (I.e.: you can score just as much for placing a 4-letter word in a place where it lies parallel to 4 other letters or whole words without hitting any premium squares as you might for the same word hitting a double or even triple word square but forming only one or two words on the play.)


Indeed, the conventional wisdom is that people who are good at math do well at Scrabble, and that having a large vocabulary is not particularly useful. (I suspect a lot of this is because large vocabularies tend to consist of long words, and words much longer than seven letters are quite rare in Scrabble.) The same thing is said to be true of crossword puzzles, although there it's not as obvious why there should be a correlation, and my belief is that it's not as pronounced.)

I also found at Scrabble on the Brain a question: "How many games of Scrabble would you have to play before you see every possible rack?"

How to answer this question is not obvious, because the racks one gets in a game aren't independent. I'll answer the easier question of how many games of Scrabble you'd have to play before you see every possible rack on the opening play. This is an instance of the "coupon collector's problem", which says: if we pick a sequence of elements from {1, 2, ..., n} at random, with replacement, how long will it take until we've picked each of the n integers at least once? The standard version of the problem has one picking uniformly at random, so this models, for example, the number of rolls of a standard six-sided die that would be necessary to have each number come up at least once. The answer in this case is n(1 + 1/2 + 1/3 + ... + 1/n), which is asymptotically equal to n(log n + γ), where γ = 0.5772156649... is the Euler-Mascheroni constant. I suspect that in the case where the distribution is non-uniform, as it is for Scrabble racks, it takes longer to observe all possiblilities;

and so the important thing is to compute the number of possible Scrabble racks. (If this has been done before, I can't find it!)

If Scrabble tiles could be differentiated (so instead of having nine tiles labeled A, we had tiles labeled A1 through A9), then the answer would just be the number of ways of picking seven tiles from 100, which is C(100,7) = 16,007,560,800. But they're not. I can't see a nice way to compute the actual number (although that doesn't mean it's not there, and I'd appreciate knowing it!) But the following sampling procedure should yield an estimate. Let X be a random variable defined as follows: pick a rack R of seven Scrabble tiles uniformly at random from all 7-sets of (distinguished) Scrabble tiles, and let X(R) be the number of different 7-sets of distinguished Scrabble tiles that collapse to the same set of non-distinguished tiles. So, for example, if we picked the tiles EXAMPLE, there are 12 E's, 1 X, 9 A's, 2 M's, 2 P's, and 4 L's in the tile set, so the number of possible sets of distinguished tiles reading EXAMPLE is C(12,2) C(1,1) C(9,1) C(2,1) C(2,1) C(4,1) = 9504. Then the number of possible racks of undistinguished tiles is the sum of 1/X(R) over all possible racks of distinguished tiles, or C(100,7) times the expectation of 1/X(R).

(Compare if we were trying to find the number of possible one-letter racks, that is, letters; we'd have, say, 9 terms which are each 1/9 corresponding to the labeled tiles A1 through A9, 2 terms each 1/2 corresponding to B1 and B2, and so on. The terms corresponding to each tile type sum to 1, so the sum is just the number of tile types, or 27 including the blank.)

And the expectation of 1/X(R) can be found by sampling. I generated a million racks at random and evaluated 1/X(R) (with some hacked-together Maple code); the expectation of 1/X(R) calculated from this sample is 0.0002014090957, and I approximate the number of possible Scrabble racks to be this quantity times C(10,7), or 3224068. (I'm not interested enough to try to calculate the standard error on this -- and I hope I can get an exact answer.) The number of games of Scrabble you'd have to play before you see every possible rack on the opening play is probably at least 3224068 log 3224068, or around fifty million.

24 August 2007

Some thoughts on the current journal system

From Ars Technica: The challenges of scientific integrity. (via Ars Mathematica). Apparently plagiarism has been breaking out on the arXiv -- for those of you who don't know, this is a central online repository for preprints in mathematics, physics, etc. What's amazing to me is that someone would be foolish enough to do this -- the plagiarizing papers which are showing up on the arXiv plagiarize other papers from the arXiv. That is, they plagiarize papers which are easily available online -- and, from the sound of it, by taking fairly long, verbatim excerpts. If you're going to plagiarize, don't take things from the first place where people will look! Take your results from some obscure journal that nobody reads anyway. This raises an interesting question, though -- what's the line between "plagiarism" and "research"? (One old joke has it that if you copy from one source, it's plagiarism, and if you copy from many sources, it's research, but these people copied from many sources.) Obviously, if you do a computation that somebody has done before (because you want your readers to see it), you'll use the standard notation and so on. A lot of the difference is just about citation -- the honest author says "I'm showing you how Erdos did this", whereas the dishonest author tries to pass off someone else's computation as their own. We all stand on the shoulders of giants. The shoulders of the giants hurt, and they want to be acknowledged. Can you blame them?

Peter, commenting at the Ars Mathematica post, says that plagiarism of admissions essays occurs fairly often, mainly from people from non-Western backgrounds. Is it possible that these people simply don't realize that what they're doing is wrong by our standards? However, since the admissions essay in question is a statement of what sort of research a person wants to do, and is written in the first person, this seems doubtful -- two people's research interests are very unlikely to line up exactly.

The comments at Not Even Wrong about this are also of interest. The general consensus seems to be that the current journal system needs some work, especially with regard to lax standards of peer review. That sounds plausible to me, but I don't have enough experience with the system to say for sure.

From Marginal Revolution, I learned about An Auction Market for Journal Articles, suggested by Jens Prufer and David Zetland; they suggest that papers would be posted to some central repository, and editors of journals would then bid for papers (in some sort of fake money they call "academic dollars"). The academic dollars they bid are then split among the authors, editors, and referees of the papers which this paper cites, thus giving editors incentive to pick papers that will be cited often. The authors claim that this system (which I have massively oversimplified) will encourage referees to do more work in improving papers, speed up publication times, and encourage authors to write better papers.

Of course, if you're going to do this, why have journals at all? Is there some logical reason why papers can't stand alone, independent of the journal system, other than "that's how it's always been?" The same Marginal Revolution post points to a note by Carl Bergstrom on "fantasy journals"; why couldn't these become the real journals? In fact, why can't a person just be their own journal? (Instead of trying to get published in prominent journals, one would try to get cited by prominent people. Who's prominent? The people who get cited a lot. It sounds circular, but that's how Google works and they seem to do a good job.) The reason that the journals exist is to solve the problem of distributing work to widely spread people; the Internet eliminates that problem.

23 August 2007

The solution to the triangle puzzle

Yesterday, I asked a question:

Given a scalene right triangle with sides of integer length and perimeter P, show that there is a corresponding scalene triangle with one angle which is 60 degrees and perimeter 1.5P.

For example, the 3-4-5 triangle is a right triangle with perimeter 12; the 3-7-8 triangle has a 60-degree angle (the one opposite the side of length 7) and has perimeter 18.


As promised, here's the solution.

First, how do we check that a triangle has a 60-degree angle? We use the law of cosines. When I originally did the analysis, I assumed that larger angles of a triangle had to be opposite longer sides. This is actually only guaranted to be true for triangles in which all angles are acute (in which case it follows from the law of sines). The sixty-degree angle in a scalene triangle (for those of you who don't know the word, it just means that all three sides have different lengths) must be the second largest angle. It can't be the largest angle, because then the sum of the angles would be less than 180 degrees; similarly it can't be the smallest angle, because then the sum of the angles would be greater than 180 degrees. In the following analysis I will assume that the side opposite the 60-degree angle is the second-longest side. It turns out that we can answer the problem under this assumption;

So we're looking for triples a < c < b of positive integers, with c2 = a2 + b2 - (2 cos 60o) ab. But cos 60o = 1/2, so we need to find triples for which c2 = a2 + b2 - ab. For example, the 3-7-8 triangle is one of these; we have a = 3, c = 7, b = 8, and 32 + 82 - 2(3)(8) = 72, as you can check.

To get an idea of what the correspondence might look like, I tried to figure out which triangles with 60-degree angles corresponded to the 5-12-13 and 7-24-25 right triangles. On a bit of a wild guess, I noticed that the shortest side of the 3-4-5 and 3-7-8 triangles was the same, so I assumed that the shortest side of the triangle corresponding to the 5-12-13 would be 5. The perimeter of the triangle we seek is 45, so it has sides 5, 40-b, b. Setting up the equation above, we have

(40-b)2 = 52 + b2 - 5b

which is actually a linear equation (the b2 terms cancel), with the single solution b = 21. This gives us a 5-19-21 triangle. The same procedure applied to the 7-24-25 right triangle gives 7-37-40.

At this point I have enough data that I can try to guess what the pattern is. I remember that all primitive Pythagorean triples-- that is, triples of integers (a,b,c) such that a2 + b2 = c2, and a, b, and c don't have a common multiple -- can be written in the form

(r2 - s2, 2rs, r2 + s2)

for some r and s, with r > s, where r and s relatively prime and not both odd. The perimeter of this right triangle is 2rs + 2r2, so I sought a corresponding triangle with a 60-degree angle and perimeter 3rs + 3r2. It seems reasonable to guess that the three side lengths will be linear combinations of r2, rs, and s2.

In the case of the 7-24-25 triangle, we have r = 4, s = 3. Thus r2 = 16, rs = 12, s2 = 9. I observe that 7 = 16-9, 37 = 16 + 12 + 9, 40 = 16 + 2(12). More generally, from the general triple listed above we get

(r2 - s2, r2 + rs + s2, r2 + 2rs).

It only remains to check that these three numbers sum to 3rs + 3r2 -- they do -- and this triple, if we call it (a, c, b), obeys the relation c2 = a2 + b2 - ab; it does, but I'll spare you the algebra.

I conclude by giving the primitive Pythagorean triples with hypotenuse less than 100, and their corresponding triangles with a sixty-degree angle and one and a half times the perimeter:


rsright-angled60-degree
213-4-53-7-8
325-12-135-19-21
4115-8-1715-21-24
437-24-257-37-40
5221-20-2921-39-45
549-40-419-61-65
6135-12-3735-43-48
6511-60-6111-91-96
7245-28-5345-67-77
7433-56-6533-93-105
7613-84-8513-127-133
8163-16-6563-73-80
8355-48-7355-97-112
8539-80-8939-129-144
9277-36-8577-103-117
9465-72-9765-133-153
If we're given a non-primitive Pythagorean triple, it's a constant multiple of a primitive one; for example, 6-8-10 is twice 3-4-5, so we get twice 3-7-8, i. e. 6-14-16. Alternatively, the triple 8-6-10 (in that order) comes from r=3, s=1; applying the formulas above gives 8-13-15. "Blinding Insight" gives a solution that's different from mine in the comments to yesterday's post, since it's based on a different parametrization; this solution gives different, but still valid, answers.

Note that this doesn't necessarily generate the triangles with a sixty-degree angle and integer sides in "primitive" form. For example, r = 5, s = 2 gives the triple (21, 39, 45) were we might hope for (7, 13, 15). Indeed, if s-r is divisible by 3, then all three sides of the sixty-degree triple will be divisible by three. Furthermore, the perimeter of the sixty-degree triple is always divisible by three, so we'll never get (7, 13, 15) in that form.

A puzzle

Given a scalene right triangle with sides of integer length and perimeter P, show that there is a corresponding scalene triangle with one angle which is 60 degrees and perimeter 1.5P.

For example, the 3-4-5 triangle is a right triangle with perimeter 12; the 3-7-8 triangle has a 60-degree angle (the one opposite the side of length 7) and has perimeter 18.

I'll post the solution tomorrow.

(The puzzle isn't mine; it's from the puzzle column in the March/April 2007 issue of Technology Review, which I was flipping through earlier today.)

Texas 30, Baltimore 3.

While at the Phillies game tonight (the Phillies lost, 15-3, to the Dodgers), I saw on one of the out-of-town scoreboard a score of Texas 26, Baltimore 3.

I figured it had to be an error, as did the other people in the stands who noticed it. Twenty-six runs? (At that moment, the other out-of-town scoreboard reported the score in that game as 16 to 3, because it can't show scores above 19. Those don't happen too often in baseball.)

In the end, Texas won that game -- the first game of a doubleheader, no less -- 30 to 3. MLB.com's account of the game is here. It's the most runs scored since the Chicago Colts (now Cubs) beat the Louisville Colonels (who haven't existed since 1899) by a score of 36 to 7, on June 29, 1897. Nobody in "modern" baseball (by convention, this is since 1901) has scored more than 29. Thirty runs is also an American League record. There have been eighteen 25-run games in modern baseball.

You'd think that a team scoring thirty runs would be likely to score in every inning, or at least in more innings than not. But Texas only scored in four innings -- 5 runs in the fourth, 9 in the sixth, 10 in the eighth and 6 in the ninth.

This reminded me of a record I once came across: teams which have scored in every inning of a nine-inning game. This is rarer than you'd think -- only seven times in major league history, all in the National League. (Note that it's almost always the visiting team that sets this particular record; a home team which has scored in the first eight innings will almost certainly not come up in the ninth.) Those seven teams won by scores of 19-8, 26-12, 20-10, 36-7 (the same 36-7 game as above), 22-8, 15-2, and 13-6. The losing teams in these efforts put up an average of 7.6 runs a game -- quite a bit higher than the average of four or five runs -- which lends a bit of credence to the idea that the two teams' scores in a game aren't independent.

And here's a mathematical question: if a team scores 30 runs, what is the probability they manage to do it while only scoring in four or less innings? We'll assume the team is a visiting team, and thus has all nine innings to work in. Furthermore, the number of runs the team scores in each inning is independent. The most recent data I could find for run distribution is from John Jarvis' collection, for the 2005 American League, so I'll use that. The distribution of the number of runs scored in a single inning, for the 2005 AL, is as follows:
Number of runs scored012345678910
Number of times14458307114426763251507822446
Probability.714.152.0713.0334.0161.00741.00385.00109.000198.000198.000297


Incidentally, under the assumption of independence, the probability of a team scoring in all nine innings in a game in the 2005 AL is (1-.769)9, or about one in 79,000; there are 2,430 games in a season currently (162 per team, times thirty teams, divided by two), so we expect one such game every thirty years or so.

We can thus take the probability generating function corresponding to this distribution:

f(z) = (14458 + 3071z + 1442z2 + 676z3 + 325z4 + 150z5 + 78z6 + 22z7 + 4z8 + 4z9 + 6z10)/20236.

The probability generating function corresponding to the sum of this distribution with itself is g(z)2; this is the sum of the number of runs scored in two independent innings, given that scoring takes place in both of those innings. A similar interpretation holds for higher powers. From this, we can easily find the probability that a team scores a total of thirty runs in the first k innings of a game, and none in the remaining 9-k innings; it's

[z30](f(z)-p)k p9-k

where p = 14458/20236 is the probability of not scoring in a given inning. The probability that a team scores a total of thirty runs in some k innings is this times the binomial coefficient C(9,k), which is the number of ways to pick the innings in which the scoring happens. Call this P(k). Then we get

k3456789
1010P(k)2.9174.85614.581670.551888.56908.64150.82
normalized P(k).00055.01409.11572.31455.35560.17108.02840


The normalized probabilities are just P(k) divided by the sum of the P(k) values; thus the value in column k is exactly the probability I sought originally. Rather surprisingly, to me, a team which scores thirty runs is most likely to score in seven of the nine innings, which is also the median of the distribution; innings with no score are so common that at least one happens in all but three percent of thirty-run games. To answer the original question, the probability that a team scoring thirty runs does so while scoring in four or less innings is about 1.5%. (Assuming, of course, that innings are independent and run distributions remain like the 2005 American League indefinitely.) And you can't even begin to suspect I'm wrong until, oh, a couple hundred thirty-run games have happened without this occurring again.

(Complete data of this form: a team scoring one run is of course most likely to score in one inning. A team scoring two or three runs is most likely to have scored in two innings; 4-7 runs, three innings; 8-12 runs, four innings; 13-19 runs, five innings; 20-28 runs, six innings; 29-40 runs, seven innings; 41-57 runs, eight innings; 58 or more runs, nine innings. I suspect at least the first few of these numbers could be checked against actual data.)

22 August 2007

They say nobody reads anymore

John Armstrong is begging me in a comment at at Concurring Opinions to comment on this article from the Associated Press, which makes the following claims:

  • One in four adults say they read no books at all in the past year

  • "The typical person claimed to have read four books in the last year -- half read more and half read fewer." -- so although they don't want to use the word "median", they're saying the median number of books read is four.

  • "Excluding those who hadn't read any, the usual number read was seven." Assuming that "usual number" means "median", this is saying that five-eights of people (the quarter who read no books, plus half of the rest) read seven or less books in the last year.


So what do we know about the distribution? One-quarter of people read no books; one-quarter read between one and four; one-eighth read between four and seven; three-eighths read more.


They claim a 3% margin of error, as well, which is standard for polls involving a thousand people (as this one was), but that margin of error only applies to the survey as a whole. The article includes a lot of claims of the form "Xs read more than Ys", but the number of Xs or Ys that were polled is less than a thousand, so the margin of error is greater.

It seems hard to get these numbers, though. The article claims that "In 2004, a National Endowment for the Arts report titled "Reading at Risk" found only 57 percent of American adults had read a book in 2002, a four percentage point drop in a decade. The study faulted television, movies and the Internet." (Emphasis mine.) If you interpret both of these claims at face value, 18 percent of non-readers have been converted to readers in the last five years. This seems unlikely.

I keep a list of the books I read; there are approximately one hundred and seventeen books on it. Yes, you read that right. That number's a bit inflated by the fact that about a third of those books were books I was rereading. But it's also a bit deflated because I have a tendency not to count textbooks, research monographs, and so on. (There's a pile of books up to my knee -- mostly library books, which is why they're in a pile, so I remember to return them -- which I read in the last year but aren't on this list.) I've also probably read a dozen or so books while sitting at the bookstore because I was too cheap to buy them, and some book-length online works...

Of course, I'm not typical.

A "book" doesn't seem like the right unit here, though. For one thing, some books are much longer than others. To take the two books on my shelf that I suspect have the most and fewest words, I estimate that Victor Hugo's Les Misérables has about 700,000 words, and Susanna Kaysen's Girl, Interrupted has about 40,000. For another thing, the person who reads a lot of books is going to engage a lot more deeply with some of them than others. There have been books that I've read in two hours and never gone back to; there have been books that I've returned to over and over again and find new insights every time. Most, of course, fall somewhere in between. And what if you don't finish a book? Does it still count?

I think the more useful metric would be "how much time have you spent reading books in the past year?" Because you can't ask people that, I suspect the Right Thing to do is to call people up and say "how much time have you spent reading books in the last week?", doing so at various times of year to control for the fact that reading is probably higher at some times of year than others, and averaging the results. But no one cares that much. (Actually, I suspect people in the publishing industry care very much, but they're not releasing their findings if they have any.) Or perhaps "how many words have you read in the past year?", but counting this seems almost impossible. (It wouldn't surprise me to learn that this number is rising, as a lot of online content is in written form.)

And as "Katie" commenting at Concurring Opinions points out, anyone at the high end of the spectrum probably underestimates. I don't know if this is true. I would have estimated I read "about two books a week" in the last year, for 104 books in the year; when I went and looked at the list, the actual count was 117. This might not matter all that much, because the pollster was asking about the median, and the people who are going to have trouble are the ones that are above the median. But I suspect that this poll is a lot like the recent polls about the number of sexual partners; people feel that they should lie about the number of books they read. I'm not sure in which direction they're likely to lie, though; I suspect it's correlated with education, and also with how many books one thinks one's friends read.

And what's so great about books, anyway? Why do we assume that reading books is automatically better than reading any other source of the written word? I suppose the argument is that a 100,000-word book requires more intellectual effort to read than, say, one hundred 1,000-word newspaper or magazine articles, because there is more interrelation among the ideas. But books come, for the most part, predigested. A lot of the real intellectual work is done in taking those clippings from various sources and making a book out of them. But this isn't a study of how intellectuals read, it's a study of how the person in the street reads. And even "study" is a bit too strong. They called up a thousand people and asked them some desultory questions. The AP did the poll itself. Let's face it, they're just trying to sell newspapers.

Baucus on free tuition

Senator Max Baucus, D-Mont., proposes free college tuition for students majoring in math and science. (Via slashdot.)

There would be a stipulation that these students would have to work or teach in the field for at least four years afterwards.

The sentiment here is noble. However, I'm not sure something like this would work.

First, the service requirement could backfire. There are two ways in which I see it backfiring:

It would inhibit the creation of startup companies. (Since all companies are startup companies at some point, by definition, this is even worse than it sounds.) Technology startups are likely started by people who are just out of school, for the simple reason that the young have greater risk tolerance. But who's going to verify that the people at a startup are actually working and not just using their startup as a way to get free tuition? (This becomes particularly problematic if computer science is included in the list of funded fields, because the activity that goes on in a startup based around software probably doesn't look all that different, on the surface, than my kitchen table does right now.) Something that requires one to "work in the field" seems to encourage people to get "traditional" 9-to-5 jobs simply for bureaucratic purposes.

Also, you'd get bad teachers. You make teaching one of the options, you're essentially guaranteeing you'll get teachers who don't want to be there. It's the same reason that a lot of military people don't want the draft. If Senator Baucus walked into any university department of math or science he'd see that there are a lot of people who don't want to teach, but do it anyway because they have a financial incentive to -- we call them "professors" -- and they don't seem to do a great job of it. They have an incentive to show up to their classes but not to make them good. And bad teaching will just make math and science harder for the next generation.

Second, jobs typically taken by just-out-of-college math and science majors will start paying less. Right now, a large number of such individuals have to pay off their student loans, and they know this when deciding what level of pay they are willing to accept. If they don't have student loans, they'll be willing to accept lower pay. I am not an economist, so I don't know how big this effect would be, or more importantly what effect it would have on the salaries of those who are past the four-year period. I suspect it might drag them down as well. In the end, this would make the technical fields less attractive to students who are choosing what to study in college.

Third, math is hard! This applies to the sciences as well. This isn't really a problem with Baucus' plan, but it bears mentioning anyway. To understand anything well requires sustained intellectual effort. And our society does not value sustained intellectual effort; we denigrate people who are willing to do it, for the most part. (This is part of why nerds are unpopular.) I suspect that the Right Thing to do is not to just throw money at the problem and hope it will go away, but to try to change the culture. But how do you change a culture? You can't tell people what to think. You can't make people value intellectual effort more than they do. I suspect that the one thing that can be done is to improve teaching -- a lot of the people I've talked to who said they're not good at math or don't like math can trace this to a single bad teacher. But as I said, I think the Baucus plan would hurt teaching, thus making the problem worse.

The prevailing trend actually seems to be in the other direction -- some schools are now raising tuition in some fields, including engineering. (The article about Baucus' proposal refers to "a full scholarship to any high school graduate majoring in math, engineering, science or technology.")

Philip Greenspun once wrote about Tuition-Free MIT, in which he proposed that MIT should be tuition-free. Since essentially all students at MIT major in "math, engineering, science, or technology", the two plans are similar, except Greenspun's version is restricted to a single elite school, which in fact makes it quite different. I'm not sure how I feel about Greenspun's plan. (Disclaimer: I did my undergraduate work at MIT. I owe nothing for it, because my parents paid for it; I am almost certain they would be happier with me having my MIT bachelor's degree and that $100,000+ in their pockets than they are with me just having that degree.)

mathematics-to-English translation

Some Words About Interpreting, from the NYT two weeks ago. (But I just found it today.)

Basically, as different countries become more interconnected, there's going to be a bigger market for translation between different natural languages.

This raises a question: as mathematics becomes more and more important in non-academic settings (and the general consensus seems to be that it will; it is not my purpose here to debate this question), will translation between mathematics and English (or whatever the natural language of the people in question is) become more important?

And what's the analog of "translator" in the mathematics-to-English domain, anyway? The obvious answer is "mathematics teacher", but that's not really correct; a mathematics teacher doesn't translate from mathematics to English, but rather attempts to teach people mathematics. A mathematics teacher is analogous to a foreign language teacher.

Perhaps the translator is the writer of expository books in mathematics for a populat audience. I'm thinking of books like, say, Duncan Watts' Six Degrees: The Science of a Connected Age or Steven Strogatz' Sync: The Emerging Science of Spontaneous Order, which treat mathematics that is obviously useful in real-world situations. On a different scale, whenever a mathematician (or other person doing work of a mathematical nature) struggles at a party to explain what it is they do, they're engaged in an act of translation.

But in any case, a translator attempts to explain a concept to a listener in the listener's language. The question here is one of efficiency: I choose not to learn, say, Chinese, because I do not need to deal with Chinese-speaking people sufficiently often for it to be worth my time to learn that language. (An ex-girlfriend of mine tried to learn Chinese in college. It's hard.) On those rare occasions when I have to deal with a monolingual speaker of Chinese, I hire a translator. More accurately, I never need to deal with such people directly. But I hear about China in the news a lot, and I presume someone somewhere is issuing statements in Chinese and someone else is translating on my behalf.

Will there evolve a market for a similar sort of translation from mathematics? For people who don't have the time to learn it, but want to know what mathematicians are saying?

21 August 2007

information-theoretic entropy in the weather

Right now, in Philadelphia, it's sixty-one degrees, with a light rain.

I have long maintained that this is "generic Philadelphia weather". By this I do not mean that it's always like this here. What I mean is that if I for some reason do not know what season it is, and I head outside and find it is sixty-one degrees with a light rain, this gives me very little information, because this sort of weather can happen any time of year. Another characteristic of such a day is that the high and low temperatures are very close together, say within ten degrees of each other; it's been between 60 and 64 since midnight and probably won't get out of the sixties (in either direction) all day.

Looking at the data from the last year, June 14 was pretty close to this, although it was just overcast, not rainy; January 6 might look that way on paper, except I remember it clearly and it was actually a freakishly warm and sunny day. I wore shorts. It was January. We had the New Year's Day parade that day. November 8, 13, and 14 fit as well; also October 11 and a few other October days; September 1. (I remember the weather on September 1 quite well, because I moved that day. The rain was light for most of the day and got heavier about an hour after my movers were done getting everything in.) I'm deliberately being vague about what constitute a day like this.

Not surprisingly, this sort of weather is most common in the spring and fall (mostly because I care about temperature) but it is possible in the winter or summer as well. And this gets me wondering -- in general, what is the information content of the weather? If it's 100 degrees and sunny, there might be a 2% chance that it's July 24; a 0.5% chance it's September 1; an 0.01% chance that it's November 1; and a one-in-a-million chance it's January 15. This sort of weather is very localized towards a certain time of year. One could imagine calculating the Shannon entropy corresponding to this distribution; it would be a lot smaller than the entropy you'd get from a similar distribution if you conditioned on sixty degrees and light rain.

Of course, in this formulation, the whole idea is kind of silly -- when am I not going to know what the date is? But looking at the information-theoretic entropy of weather seems like a potentially useful way to quantify how extreme the seasons in some place might be; it's possible but not normal to get the same weather in winter and summer in Philadelphia, say; routine in San Francisco; unheard of in Minneapolis. (I am picking these places without looking at actual statistics, so I might be wrong.) Why one would want to quantify that, though, I'm not sure.

links for 21 August

Terence Tao writes about the theorem that Danica McKellar, of Wonder Years fame, proved; she's been in the news a lot lately because of her book Math Doesn't Suck: How to Survive Middle-School Math Without Losing Your Mind or Breaking a Nail. A large part of the mainstream media said "she even proved a theorem! and it was published!" but nobody explained what the theorem was, because it takes a few pages of exposition. But Tao does a good job with it, getting in some nice statistical mechanics on the way.
(The paper itself, "“Percolation and Gibbs states multiplicity for ferromagnetic Ashkin-Teller models on Z2” " is available at McKellar's web site; I suspect this is the only instance of a mathematical theorem on a celebrity's web site, at least for suitable definitions of "celebrity". The theorem, by the way, says that the Curie temperature and the critical temperature for site percolation are equal in the Ashkin-Teller model on the square lattice.

Scott Morrison at Secret Blogging Seminar (which is becoming less secret every day) gives us grepping the Knot Atlas, which is about the Knot Atlas. The Knot Atlas is basically Wikipedia for knots. (The Knot Atlas, by the way, has no entry called "knot" or something similar that I could find. It doesn't define knots, it only tabulates them.) A knot is not exactly what the layperson would think it is; a mathematical knot is an embedding of a circle in 3-space, so the string has no ends. (Of course, one can represent any knot tied with an actual rope as a mathematical knot by splicing the ends together.) This is a continuation of the old Knot Atlas; it's a lot easier for anyone to edit a wiki. It is simultaneously a database of knot invariants, and the post I linked to is about how to extract that data.

One of the things that has always amused me about knot theory is Lord Kelvin's theory that atoms were knots; but psuedoscience often begets science. I once worked at the Department of Alchemy.

Another such wiki is qwiki for quantum physics. I don't know of any more but they seem like they could be useful, as references to various mathematical areas built by the community that actually uses them. Wikipedia plays this role to some extent, but its main function is to be a general-purpose encyclopedia so the technical details often don't make it in.

I find myself wondering if people have used wikis (with login, of course) to write mathematical papers with collaborators. (Or even alone! Since most wiki software stores old versions of pages, you could easily throw out arguments that you didn't like, confident that if they contained some seed of truth they could be recovered later.)

At Casting Out Nines, via Vlorbik: inside facts about calculators. The big one is that nobody in the "real world" actually needs a calculator; students generally have computers now, and the student versions of Maple and Mathematica are similarly priced to graphing calculators; therefore if it makes sense for students to use any sort of calculator, it should be what's usually called a "scientific" calculator. I agree. However, I have a ten-year-old TI-83 which I know pretty well, and I generally keep it around -- for when I'll be working in a coffee shop and don't feel like taking my laptop with me, or for when my officemate is using the computer. But unquestioning faith in the calculator is a bad thing. You can also read Eric Schechter's The Most Common Errors in Undergraduate Mathematics. Would it be possible to compile an analogous list for graduate students, or research mathematicians? Or do those of us at these levels make too many different errors for this to be possible?

20 August 2007

Shipman's 100-year rule

Joe Shipman writes, on the Foundations of Mathematics e-mail list:
I propose the thesis "any mathematics result more than a century old is suitable for undergraduate math majors".
This is something that I've heard before, in the weaker form "a good undergraduate math education gets you to around 1900". (Which raises the question: in 2100, will a good undergraduate math education still get you to 100 years from the present? Or will it only get you to, well, 1900 -- which will be two centuries in the past by that point? Will the words "undergraduate math education" even mean anything in a hundred years?) The only counterexample he saw was Dirichlet's theorem (which says that if a is prime to b, then there exists a prime congruent to a mod b).

Wayne Aitken, on the same mailing list, argues otherwise; so does Timothy Chow.

Shipman then suggests that this is actually a 100-year rule, to anticipate my earlier question. The reason for this is that although the current best-known proof of Fermat's last theorem, for example, isn't accessible to undergrads (or even most grad students!), in a hundred years we will have significantly shortened and simplified the proof, to the point where it could be explained to a seminar of intelligent, mathematically trained 21-year-olds.

Of course, we are creating more and more mathematics as time goes on, and the rate of growth is often stated to be exponential. It's been claimed that as much mathematics has been created in the last ten years as in all time before that, and even if it's not quite that fast it seems quite likely that mathematics is growing faster than world population. (Note that this trend can't continue forever.) I am inclined to attribute the claim of the exponential rate of growth to Andrew Odlyzko but I cannot find it.

The boiling down has some unintended consequences, though. Carl Sagan, in the chapter of The Demon-Haunted World: Science as a Candle in the Dark entitled "No Such Thing as a Dumb Question", wrote about the lure of psuedoscience and the fact that science is not emphasized in the schools. He quotes the philosopher John Passmore, on how science is generally presented in schools:
[science is often presented as a matter of learning principles and applying them by routine procedures. It is learned from textbooks, not by reading the works of great scientists or even the day-to-day contributions to the scientific literature... The beginning scientist, unlike the beginning humanist, does not have an immediate contact with genius. Indeed... school courses can attract quite the wrong sort of person into science -- unimaginative boys and girls who like routine. (p. 335)
I suspect that this is because in mathematics and science we are constantly doing this "boiling down", so the simplest way known to currently illustrate a result is very rarely the way it was originally shown. Furthermore, the standards of exposition are constantly changing, as (for example) new notations evolve, so it can be astonishingly difficult to read mathematical texts from a couple centuries ago, whereas the same evolution doesn't seem to have happened to literary texts, or at least not to the same degree. And even when a line of argument does stand the test of time, it's constantly reworded, tweaked, renotated... the argument is seen as something to be passed on, but the words which were used are not. The argument becomes a sort of communal property. Perhaps there is only one mathematician, working in many brains.

an interesting fact about a variant of Hex

I have long known about the game of Hex, which is played on a hexagonal lattice. The game is played on a rhombus-shaped section of a hexagonal lattice, with markers of two colors; the board is illustrated at left. One player places red markers, the other blue markers; they alternate in playing. The red player is attempting to create a chain of red hexagons extending from the top of the rhombus to the bottom (these are both shaded red in the screenshot); the blue player is simultaneously attempting to create a chain of blue hexagons from the left side to the right.

There's a nice theorem which states that a game of Hex can never end without a winner. Say you're playing red. If you have blocked off all possible horizontal chains that blue might make, you must have made your own red chain connecting the top and the bottom, and vice versa. (This isn't exactly a proof but it's the gist of any formal proof of the result.)

However, I learned recently from
Bollobas and Riordan's book Percolation that although this is true, an analogous result is not true if one were to play Hex on a square lattice. If we think of one color of hexagons as "open" and the other as "closed", then the earlier result about Hex says that in the rhombus-shaped triangular lattice of points (which is dual to the hexagonal lattice), there's either an open path from the top of the lattice to the bottom or a closed path from the left to the right.

On the square lattice, there is the question of the right way to define adjacency; are two squares which touch each other only at a corner adjacent to each other? It turns out that if we say they are (so each square has eight neighbors) then it's possible to color the squares of the board so that there's a red vertical path and a blue horizontal path. (This doesn't create a problem for a hypothetical game of Hex on a square lattice, though, because we just award the win to the player who completes their path first.) If we let each square have four neighbors, though, then ties are possible; one could have a situation where the entire board is filled in and there's no red vertical path or blue horizontal path. (An example for both of these occurs if we divide the board into four quadrants, and color the entire northeastern and southwestern quadrants red and the northwestern and southeastern quadrants blue.)

But Lemma 5.9 of Bollobas and Riordan tells us that
If we colour the squares of an n by m chessboard [blue] and [red] in an arbitrary manner, then either a rook can move from the left side ot the right passing only over [blue] squares, or a king can move from top to bottom using only [red] squares, but not both.

(I've changed the colors; Bollobas and Riordan use the more traditional black and white.)

However, there's a difference between the two cases, even so. If you color in all the hexagons in an n-by-n section of the hexagonal lattice uniformly at random -- so each is red or blue with probability 1/2, and each hexagon is independent -- then by a symmetry argument, the probability of having a red vertical path and a blue horizontal path are equal, and thus are both 1/2. This doesn't mean Hex is fair -- the first player is in fact guaranteed to win under perfect play. But in the square-lattice version of the game, I suspect the player who is allowed to consider squares which share a corner as adjacent has a very large advantage.

19 August 2007

climate phase lag trickery

Why does it get hot? Because of the sun.

But in temperate climates in the northern hemisphere, the most light from the sun comes around the summer solstice -- June 21 -- and yet the hottest part of the year is the second half of July. In certain places with a lot of maritime influence, the heat comes even later; the canonical northern-hemisphere example is probably San Francisco, where September is the warmest month. Similarly, the hottest time of the day isn't noon (even correcting for daylight savings time; solar noon is generally around 1 PM clock time under daylight savings time), but perhaps two to four hours later.

This phase lag comes about because the atmosphere holds some heat; the temperature as a function of the day of the year could be modeled by a differential equation, where the rate of change of temperature is some decreasing function of the temperature, plus a sinusoidal input. Thus the output should be similar to the input, but phase-lagged.

Furthermore, I am starting to believe (without evidence) that the temperature in my apartment building is lagged by maybe a couple weeks relative to the temperature outside during the day; I believe with somewhat more evidence that the hottest time of day in my apartment is generally around eight or nine in the evening. And so I find myself fantasizing about a world where that phase lag existed to such an extent that it's warm in my apartment at night and cold during the day. (Without using the air conditioner, that is.)

A bit more ridiculously, what if I could exploit that sort of seasonal phase lag to the point where it was cold in the summer and hot in the winter? But I'd need some sort of ridiculously large heat reservoir, something bigger than the ocean. Still, it's an interesting fantasy. The hottest time outside is a month after the time with the most solar radiation; the hottest time inside might be a few weeks after that; what if I had some sort of series of nested boxes each of which could shift the hottest time of year a few more weeks? If I nested enough of those boxes...

of course, it would be cheaper to just run the heat.

Seriously, though, there is some new construction that attempts to use the insulating properties of certain building materials in this way, although not so much to reverse the seasons as to smooth out seasonal variation. One particularly interesting one I heard about a few months ago depended on the thermal properties of a certain type of wood, which underwent a phase change around seventy degrees Fahrenheit; thermally it's very similar to keeping an ice cube in your house, except it's a magical seventy-degree ice cube instead of the thirty-two degrees of a normal ice cube.

more fun with sexual means

I, and about a zillion other people, responded to Gina Kolata's article in the NYT last week, in which medians and means were confused, in an article claiming that men had more sexual partners on average than women.

There is a follow-up article by Kolata in today's Times (Sunday).

There was blogospheric clamor for the full distribution of the number of sexual partners of men and women; the original report from the CDC, it turns out, doesn't have that, but groups people into four groups -- those who have had 0 or 1 sexual partners, 2 to 6, 7 to 14, and 15 or more. This strikes me as insufficient resolution. In particular, zero is much different than one, as any virgin could tell you. Two seems a lot different than six, as well; two sexual partners could be someone who had sex with their spouse and one other person, whereas six sexual partners in a lifetime, although not a lot, can't have such a simple story behind it.

In any case, I'll reproduce the table. (The numbers are percentages.)
Partners0-12-67-1415+
Men16.633.820.728.9
Women25.044.321.39.4

The claimed medians for the number of sexual partners for men and women are seven and four, respectively. But 50.4% of men have had six or less sexual partners, according to this data. My earlier claim that the data might support a two-peaked distribution for women seems unlikely, but can't be ruled out at this resolution. (But I've seen enough other distributions that would explain the difference in medians that I don't really believe my own theory any more.) You can't extract means from this data -- and in fact at this resolution, it's theoretically possible that all the men could have 0, 2, 7, or 15 sex partners, for a mean of 6.46, and all the women 1, 6, 14, or [large number] sex partners, for a mean of at least 7.3 (if [large number] is in fact 15), making the female mean actually higher than the male mean. It would be simple (though I won't do it) to tweak the numbers so that the two means came out exactly equal.

Kolata (who has a master's in math, according to Wikipedia), however, claims that the data is inconsistent, in that there's no way to make the means equal: "I got between 40 percent and 75 percent more male than female partners depending on how you guess the average on each interval." I wonder what she tried. Sure, I'm just showing that it's possible the means are equal, not that it's likely. But someone with mathematical training should know better.

18 August 2007

Ten thousand

The ten thousandth page load on this blog came today, at 5:31:47 AM (US Eastern Daylight Time, UTC-4), from someone located in or near Ithaca, New York, using Time Warner Cable's RoadRunner internet service; they viewed my post on the high school prom theorem (the article from the New York Times that inspired this, in which medians and means were confused, has been mentioned in quite a few blogs).

They came from this post at The Volokh Conspiracy; I'm not mentioned in the post, but John Armstrong provides a link to me in the comments.

They were using Firefox 2.0.0, wich is the browser used by slightly more than half of my readers. (Generally the statistics hover around 60% Firefox, 20% IE, 10% Safari, and 10% random other browsers, which is much different from the Internet as a whole.)

Ten thousand isn't a huge number, but it's enough to make me realize that people actually are willing to hear what I want to say, and I appreciate that. Thanks for reading.

Social interactions aren't random

When I graduated from high school six years ago, I received a prestigious national award. The people who received it were selected based on some combination of standardized test scores, academic ability, a bunch of essays we wrote, and some sort of correction for geography I never quite understood. About one hundred and forty people -- all high school seniors -- receive this award each year.

In the two years since I've graduated from college, I have met two other people who have received this award, and who I did not meet through the schools that I attended. (It doesn't seem quite fair to include people who went to the same schools as I did, or indeed who I met through any academic channels, because the award was supposedly selecting for academic ability.) One is an ex-girlfriend; the other is someone that I met the most recent time that I was looking for a place to live. (I live alone; after a string of potential roommates had "friends" materialize that were suddenly interested in their empty rooms, I realized that people were trying to tell me something.) Neither of them volunteered this particular piece of information (to be honest, it's basically irrelevant); in both cases I found it out by Googling them.

And actually, upon scanning the names of people who've gotten this award, I found a third one I know, the semi-invisible housemate of some friends of mine.

So, how likely is it that I would have met two (or three) such people in the last two years, out of all the people I've met in that time? The word "met" is a bit vague here, but since in both cases I found out this piece of information by Googling, in order to identify this fact I would have to have enough information to find them on Google. In particular, I'd have to know their last name.

How many people have I met over the last two years? I don't know an exact answer, but it seems like one person a day would be fairly accurate, if perhaps a bit high -- for a total of seven hundred and thirty people. This is a tricky number to estimate because on most days I meet no new people, but on some days I meet lots of new people. I suspect I'm not unique here.

Furthermore, I'm going to make the very crude assumption that everyone I meet is between one year younger than me and three years older than me -- and actually, as you'll see, irrelevant. This is surprisingly close to being the truth. This means that there are five years worth of potential award recipients for me to meet, or 700 people. (As I said previously, there are 140 recipients each year.) The total number of 17-year-olds in the country in 2000 is about four million. (There were about twenty million people between the ages of 15 and 19; I'm assuming one-fifth of them were seventeen.) So the fraction of people who were recipients of this award is 700 (the number of recipients in five years) in twenty million (the number of people that were the right age in those five years), or about one in thirty thousand. You'd expect me to have met about one-fortieth of one of these people. I've met three. Just how unlikely is this?

We can compute this using the binomial distribution. I'll let C(n,k) denote the binomial coefficient "n choose k", which is n!/(k! (n-k)!). (Incidentally, these are some of my favorite numbers. Yes, I have favorite numbers.) We can look at the probability that each person I've met is an award recipient, which is about one in thirty thousand; let p = 1/30000. Let q = 1-p be the probability that a given person is not an award recipient. (Note that in probability, q is almost always 1-p. This differs from the convention in number theory, where p is a prime, and q is a prime that isn't p.)

The assumption of independence is a bit sketchy, but each of the three people in question I've met through different people and in different ways, so it seems reasonable. This sort of thing isn't always reasonable; for example, one feature of my current life is that I know a lot of people who went to Moravian College, which seems noteworthy. But I met one of them and then she introduced me to the others, so it's not all that weird.

Anyway, the probability that I've met exactly k award recipients is

P(k) = C(730,k) pk q730-k

and we compute P(0) = 0.9760, P(1) = 0.0238, P(2) = 2.89 10-4, P(3) = 2.33 10-6.

The chances that I've met at least two award recipients are P(2) + P(3) + ... P(730); since all the other terms are ridiculously small in terms of P(2), we'll call it just P(2), and we see that the chances of meeting two award recipients -- assuming that I meet people at random -- is one in 3400. The chances that I meet at least three is very nearly P(3), or one in 420,000 or so.

What do I conclude from this? That I don't meet people randomly. Neither do you. We generally tend to meet people who are like us in terms of socioeconomic status, level of education, age, political leanings, and so on -- all things that are probably correlated with this award. (Yes, I said "political leanings" are correlated with an academic award. I invite you to contradict me.) This is also a problem that occurs in the small world phenomenon (sometimes more popularly known as "six degrees of separation") -- we generally know people that are like us, but somehow we're linked by short chains to people who have absolutely nothing in common with us. I expect I'll write about this in the future.

17 August 2007

some thoughts on numberpedia

Numberpedia: Store, share and search the world's statistics. From The Numbers Guy.

The goal of this site is nicely put at their overview:

Up to 25% of all news articles written in any given day are based around some kind of statistics. Many times when we read one of these articles we want to remember a particular statistic without bookmarking the entire page. Search results still require you click a link and read through paragraphs before we find the relevant number. In some instances, statistics are described in different way but are logically comparable. Search engines have a hard time returning all of the relevant statistics and projections because of the nuances of language and thought around forecasts and projections.

For the most part the numbers that are there are taken directly from news articles; what I'd like to see would be something that explains where a given statistic is really coming from, a sort of provenance for numbers. (Of course, one can presumably do this by tracing back to the original source.) This would be nice because then one could know when two statistics are comparable; I want to know if I'm comparing apples and apples, apples and oranges, apples and hamburgers, or apples and televisions. (Televisions are obviously very different from apples, because you can't eat them. I was going to say "apples and computers" but then saw the potential for misinterpretation.) Of course, this would also require a lot more work, and a lot of the time you just can't tell where people got their numbers from. It appears that the software includes the feature of "discussing" a given number, though, so this might naturally evolve if the site takes off as people begin to see numbers juxtaposed and wonder what to make of it.

The nuances of language as they apply to various questions of forecasting and projecting might be lost on people, too. It's interesting how logically equivalent questions can give much different results in political polls, for example.

more on communication and education

Vlorbik on Math Ed brings us the pencil rant, in which you expect him to say -- but he actually doesn't -- that math should be done in pencil, at least by students. Personally, I don't care what my students use, so long as their work is readable and not full of cross-outs. (It is surprisingly difficult to get them to do this; they seem to believe that there is no value in making one's work be presentable.) I use pen, myself, mostly because pens are lower-maintenance than pencils -- they don't need to be sharpened. Also, things I wrote in pen are still legible months or years later, while pencils smudge. Most importantly, if I erased everything I did that I thought was wrong but later turned out to be right, I'd never get anything done because I would constantly be retracing my steps! A teacher of mine in middle school once took off points because I took a quiz in pen. (If I remember correctly -- and I might not -- he had never explicitly said to use pencil.) My father was outraged when he learned about this; he said that the teacher ought to have given me extra points for being confident enough to write in something non-erasable.

Calculating the Word Spurt from MathTrek. Certain words are easier for children to learn than others (what makes a word easy to learn isn't entirely clear, but it seems to depend on a large number of factors, so the "difficulty" of learning a word is normally distributed). A child needs to hear an "easy" word less times than they need to hear a "hard" word in order to learn it. Thus a chhild will start out by learning a trickle of words, but when they get to the point when they've heard the medium-difficulty words enough times to learn them, that's when the flood comes. From what I can gather from the coverage I've read of this (I can't see the actual article), this particular theory applies only to little kids, and considers words independently of each other. But I would imagine that if you know certain words, it's easier to learn others. The obvious example is a lot of technical terminology which has explicit definitions which use other technical terminology, but non-techhnical natural language could be the same way. People talk about figuring out vocabulary by context. If there's one word in a sentence you don't know, you might be able to figure out what it means; if there are two words you don't know, probably not. (However, if you hear those two words in two sentences you might be able to figure it out; I'm seeing flickers of an analogy which identifies sentences with equations and unknown words with variables.) This last analogy has some interesting (and probably nearly trivial) ramifications for mathematics education, indeed education of all kinds -- try not to introduce too many new concepts at once. I have had professors who might have, say, ten things they want to say in a given lecture, and they cram them all into the first ten minutes, or the last ten minutes. By simply reordering what they say they could probably do a better job of facilitating the learning process. Similarly, giving a definition of the form "An X that has properties P1, P2, ... P10 is a Y", although logically sound, isn't cognitively sound -- it's better to break the definition up into chunks. "An X which has properties P1, P2, and P3 is said to be R. An X which has properties P4, P5, and P6 is said to be S. An X which has properties P7 through P10 is said to be T. Something which is R, S, and T is said to be Y." Those of us learning are not computers; we are humans, with human brains.

This sort of "chunking" probably comes about naturally if one does not talk from meticulously prepared notes, which is Vlorbik's suggestion at Jazz Math Ed. The human brain won't remember that list of ten things, but it will remember the chunked version, so the chunked version is what will come out. I believe that one of the worst sins of some mathematicians is to write everything as if it is reference material for those who already know it, therefore making it incredibly difficult to digest. (I almost called this blog "fuck Bourbaki", in fact, since that's a hallmark of the Bourbaki style, but having an obscenity in the title seemed unwise.)

16 August 2007

Fibonacci win points

The weekend after next, I'm going to the National Baseball Hall of Fame. While poking around on the internet, I found some references to Whatever Happened to the Hall of Fame by Bill James. James is well-known as one of the first people to apply statistical methods to baseball (though I must confess I've never read any of his work; I might get this book. If the Hall of Fame is at all competent, they'll have it at the gift shop.) To be frank, from what I've heard a lot of his work doesn't really have a sound mathematical basis, but it's inspired people to look at the numbers in order to judge players' performance, and a lot of people (like, say, the folks at Baseball Prospectus) have taken their inspiration from him.

Anyway, the Wikipedia article mentions various methods that James came up with to judge a player over the course of a career; this includes the intriguingly named "Fibonacci win score", but doesn't explain how this is calculated. Naturally, I was curious. Google turned up this thread at baseball fever, which says that the Fibonacci win score for a pitcher is the number of career wins, times the winning percentage, plus the number of "marginal wins" (i. e. wins minus losses). This is typical of James in that it doesn't make sense at first -- why would you multiply wins by winning percentage?

The reason it's called "Fibonacci" is because of the answer to the natural question -- how does the Fibonacci win score for a player compare to their actual number of wins? Say a pitcher's winning percentage is k, and he won W games in his career. Then he loses [(1-k)/k]W games, and his number of win points is kW + W - [(1-k)/k]W. For this to be equal to W, we have

k + 1 - (1-k)/k = 1

and this has one root with k between 0 and 1, namely k = (√5 - 1)/2;, or about .618; this is the limit of the ratio between consecutive Fibonacci numbers, hence the name. A pitcher with a better winning percentage than this will have a higher win score than his actual number of wins; a pitcher with a worse record than this will have a lower win score than his actual number of wins. (The highest win score in history is 511, by Cy Young, who won 511 games and lost 316 in his career; indeed, Young's win percentage was .618.) The purpose of this statistic is to reward pitchers that pitched well and penalize pitchers who were just mediocre over very long careers.

The other question that comes to mind is -- if a pitcher wins a game, or loses a game, what does this do to his number of win points? Let f(W,L) denote the win score of a pitcher with W wins and L losses; then we have

f(W,L) = W (W/(W+L)) + W - L = (2W2 - L2)/(W+L)

Incidentally, in this formula the numerator is negative for a pitcher whose winning percentage is less than 1/(1+√2), or .414. If we differentiate this with respect to W and simplify, we see that

fW(W,L) = 2 - [L/(W+L)]2

and thus an additional win gets a pitcher two win points, minus the square of his losing percentage. Similarly, we have

fL(W,L) = -1 - [W/(W+L)]2

and so a loss costs a pitcher one win point, plus the square of his winning percentage. It almost seems meaningless to say that, because there's no a priori reason why this particular arrangement of variables should mean anything -- though at least it's dimensionally consistent, and has units of wins; there are a lot of random-looking combinations of statistics that don't even do that!

IN UR QUANTUM BOX... MAYBE

IN UR QUANTUM BOX... MAYBE, perhaps the best lolcat ever. (But then again, I'm biased due to the affection I have for the whole Schrodinger's Cat problem, because I like quantum mechanics and I like cats.)

The comments include the following gems:
"Of corse, kitteh 2 big 2 b unserten in qwantum way witout bein reely reely reely cold. Like reely neer absuloot 0."

"Oh hai! I brought u a wayv funkshun. But i collapsded it."

"I wuz playin wit string theoriez, but now Iz restin."

"Kitteh partikalz givin off much cyootness energi. Round O’cheezburgerz fer evybodi!"

"umm, wen box closed and sealed from environment, Shrodingur’s kitteh in ’sooper-posishon’ of beeing bof alive an ded. Wen u open box, you interfear wiv da kitteh, da wayv funkshun kolaps and u see kitteh is eyver alive or ded."

and about four-fifths of the way down, a copy of this list of feline laws of physics.

(from I CAN HAZ CHEEZBURGER?)

On mathematical communication

A three-part blog post on how a theoretical physics paper gets made: inspiration, calculation, culmination. This tells the story through the example of a particular paper on cosmological inflation. (From Cocktail Party Physics.) The comments are probably worth reading, too.

Somewhat relatedly, although more about the mechanics of writing, Terence Tao on rapid prototyping of papers -- basically, sketch out the outline of the paper first, making the statements of the key intermediate results, and then fill in the gaps, rather than writing from beginning to end. I can't vouch for this working on the level of writing research papers for the simple reason that I have written none. (I hope this changes soon.) But it seems to work reasonably well for writing, say, solutions to rather involved homework problems that can take a few pages, and have three or four major intermediate results.

Also, Can Scientists be Great Communicators?, from The Accidental Scientist. I would say that regardless of whether or not we are (and I'm including mathematicians in this "we"), we have to be. This is true both in terms of communication among scientists (which is tremendously useful for driving along the whole scientific enterprise, because otherwise we'd all be reinventing the wheel) and in communication with the non-scientific public, which I think is quite important. For one thing, ultimately a lot of the money that funds science comes from taxes; if these people are in the end paying our salaries, don't we owe them some explanation what we're doing? But also, communicating complicated ideas in non-technical terms forces us to actually understand them. Feynman, when he was preparing his famous freshman physics lectures at Caltech, said that if he couldn't reduce something to the level where he could explain it to freshmen, it meant that he didn't really understand it. When you can't fall back on technical terms and convoluted equations you have to understand what you're doing. So communicating with the hypothetical "educated layman" perhaps pays dividends within science as well. I just wish that people didn't automatically glaze over when they heard I'm a mathematician, though...

Communicating with this person is becoming more and more feasible thanks to the web 2.0-ification of science. Write something. Google will find it. You'd be surprised to see how many hits I get from what looks like people trying to buy used furniture, for example. And although I offer no advice there on how much used furniture should cost, I feel like I'm doing something by just exposing them to the idea that perhaps mathematics can be used to figure out such things. It's a subtle propaganda campaign.

Another subtle propaganda campaign might be the sculptures at Bathsheba Sculpture (by Bathsheba Grossman), which are for the most part models of various mathematical objects, done via 3D printing in metal; she has both mathematical and artistic training. What other sorts of training might be useful for mathematicians?

10% of a constitution is still a lot

Venezuela head outlines changes, from the BBC:
President Chavez told the Assembly his proposals only affected 10% of the constitution.


A friend of mine said that there are certain things where a small change to the input doesn't necessarily create a small change in the output; his examples were genomes, software, and constitutions. I'd argue that genomes are basically a biological version of software. And is a constitution basically an operating system for a country? (I was able to find at least two example of this metaphor, from a constitutional law blog and at Wired, but I think I came up with it independently.) Change a few lines in the Constituion and you change a lot. The text of a lot of the amendments is quite short.

To this I would add mathematical proofs. I suspect that in most proofs, say, 90% of the thought is in 10% of the lines, the rest being (relatively) routine verification. A lot of mathematical formulas are also the same way -- change a single coefficient or a single sign and the formula doesn't work any more -- but those are sort of analysis to software, in that they're algorithms for solving some problem.

15 August 2007

visualizing higher dimensions

Jon Shock writes at SciTalks about visualizing things in four dimensions, although it's actually less about visualization than about how we don't need to visualize so long as we can calculate -- although he does provide a link to a youtube video showing how one can obtain the projection of a hypercube onto a plane.

I've never been good at visualizing things in more than three dimensions (actually, to tell the truth, I have trouble visualizing things in my head in more than two dimensions; to visualize anything successfully in three dimensions I have to use my hands and point at things in the empty space in front of me). But a guy I knew in college claimed to be able to visualize things in five dimensions fairly easily. The fourth dimension was time -- so a four-dimensional sphere, for example, can be thought of as a three-dimensional sphere which grows from a point to a large ball and then shrinks back again. The fifth dimension was color, although I never really understood this (he wasn't very good at explaining it); I think the idea was that, say, red would correspond to a large coordinate in the "color" dimension and blue would correspond to a small coordinate, similar to weather maps showing temperature. So like weather maps allow us to visualize a surface in three-dimensional space with a two-dimensional picture, you can get an extra dimension this way (or even more than one! -- the space of colors is usually thought of as three-dimensional -- although I can't imagine really taking advantage of that).

I suppose a lot of this visualization ability comes with practice, though, and I rarely have any reason to think in more than three dimensions. I can definitely see it coming with practice when I teach calculus students about solids of revolution; at first they just don't seem to get it but eventually they seem to figure it out. The difference there is that I can tell them to draw a picture of the solid in question, and they draw a two-dimensional picture of a three-dimensional thing, which is something we're all used to; drawing a two-dimensional picture of a four-dimensional thing that doesn't throw away all the useful information is a lot more difficult, to the point where perhaps you really have to understand what the thing looks like before you can draw the picture. So they're useful for explaining results to other people, but not so much for coming up with new results.

How common are prime powers?

The Mathematics Weblog at sixthform.info shows a cute little theorem: that the group table for any cyclic group can be arranged so that all the instances of the same element lie along a (possibly broken) diagonal, and that this isn't possible for noncyclic groups.

This reminds me of the addition tables one sees in the back of notebooks, which in base 10 look like

+0123456789
00123456789
112345678910
2234567891011
33456789101112
445678910111213
5567891011121314
66789101112131415
778910111213141516
8891011121314151617
99101112131415161718

If you just look at the last digits -- thus considering this as addition in the group Z10 -- then you see this pattern. All the numbers ending in, say, 5 -- the 5s and the 15s -- lie along a single broken diagonal.

You don't see the same pattern when you consider the analogous multiplication table, and so you think "hmm, the integers mod 10 under multiplication don't form a cyclic group?" But as it turns out, they do, and it's a cyclic group of order 4, because when we're considering the integers mod k under multiplication we only look at the invertible elements; these are 1, 3, 7, and 9, and we get the following little table:
×1397
11397
33971
99713
77139


Is this something special about 10, or is it common? Now, if we pick a random number k, what are the chances that the group of units (i. e. invertible elements) mod k is cyclic? From looking at the small cases, you'd think the probability is rather large: k = 2, 3, 4, 5, 6, 7, 9, 10, 11, 13, and 14 have this property. But in general, it's actually pretty rare. Given k, we can write k = paqbrc... where p, q, r... are primes, and consider the integers modulo each of the prime powers pa, qb, rc, ... separately. For example, the group of units mod 10 is the direct product of the group of units mod 2 (trivial) and mod 5 (cyclic of order 4). The group of units mod 12, in contrast, is the direct product of the group of units mod 4 (cyclic of order 2) and mod 3 (cyclic of order 2, again), which isn't cyclic.

It turns out (this is a fairly basic result in classical number theory) that the units mod k form a cyclic group if and only if k is 2, 4, a power of an odd prime, or twice a power of an odd prime. This is a quite restrictive condition, although it doesn't seem so from the list I made; I suspect this sequence is only a bit less sparse than the primes themselves.

However, I can't show this quickly. The number of odd prime powers up to N is a bit smaller than log3 N + log5 N + log7 N + ..., where the indices run over all the primes (the exact number is this with a floor function applied to each term); this is

(log N)(1/log 3 + 1/log 5 + 1/log 7 + ...)

and we can approximate the jth prime by j log j; thus this is approximately

(log N)(1/(2 log 2) + 1/(3 log 3) + 1/(4 log 4) + ...)

or at least has the same convergence behavior as this sum. But this sum can be approximated by the integral of 1/x log x from, say, 2 to infinity; that integral is log log x, so the sum diverges, although very slowly. As you may have guessed, I don't actually know analytic number theory, and I'm just fooling around here. I might actually be wrong; prime powers might end up being more common than primes for large enough numbers.

Are we living in a simulation?

The Simulation Argument is discussed at George Dvorsky's sentient developments, after being mentioned in an article yesterday by John Tierney in the New York Times. It's due to Nick Bostrom, whose original paper is available online.

The argument is as follows: "posthumans", that is, the people of the future who have much better computers than we do, will use their computers to run simulations of, well, more primitive humans. These simulations will be so detailed that they include a working virtual nervous system for all the people inside. And you have to figure that they're not going to run just one of these simulations; the future people run these simulations for fun! (This seems reasonable; I've spent way too much time playing SimCity to argue against this.) So we should expect that over the lifetime of humanity there are a very large number of such simulations being run, and that it is therefore very unlikely that we live in the "real world".

For me, this bears a superficial similarity to Pascal's Wager, although the mathematics is different; for one thing, Pascal's Wager involves an infinite payoff and there are no infinities here. But it's probably useful to think of there being effectively an infinite number of these simulations, in which case the probability that we're living in the "real world" turns out to be essentially zero. (Bostrom doesn't go this far; he says he feels there's about a 20 percent chance we're living in a computer simulation, which basically means he figures there's a 20 percent chance that civilization gets to the simulated-reality stage, if you neglect the probability that there are simulated realities but we're in the real one.) I suspect the real reason this reminds me of Pascal's Wager is because it seems natural to equate the runner of the simulation with "God".

What's especially strange, at first, is the idea that the simulations could have simulations within them. This reminds me of a cosmological theory that universes "evolve" by spouting black holes; in this theory, black holes are connections to other universes, where the various physical constants are slightly different than in the parent universe; thus there's a sort of Darwinian selection for universes, where the selective pressure is towards making lots of black holes. (Then why don't we live in a universe with lots of black holes?) As to the simulations within simulations -- if you carry this to the logical extreme, we are likely to live in some very deeply nested simulation. The problem is that infinite nesting probably isn't possible. And does the level of nesting even matter? My first instinct is to think that nested simulations would necessarily be of "lower fidelity" than first-level simulations, but since everything is digital this need not be true, as Bostrom himself points out. However, he also points out the disturbing fact that since a posthuman society would require more computing power to simulate -- you've got to simulate what all the computers are doing quite well -- if we head towards being posthuman we might be shut off! Personally I would like to think that the ethics of these simulations require the Simulator to not just shut us off. Presumably the person running the simulation could get on some sort of loudspeaker and let us know what was going on. (Although that might raise other ethical questions -- do simulated realities have some sort of Prime Directive, where you're not supposed to interfere with them?)

The mathematics of the argument seems so simple, though, that I'm almost inclined to throw it out on those means alone, along with other things likes Pascal's Wager and the Copernican principle. Surely proving the existence of God (and that's what this is, although people don't put it this way) can't be so easy!

14 August 2007

Primeshooter

Primeshooter, from 1729.com. You shoot numbers with their prime factors; when you hit a number with one of its prime factors (2, 3, 5, 7, 11, 13, 17, or 19) it is divided by that number. Once you reduce a number to a prime, you can hit it with a "prime" bullet that makes it go away; numbers also go away if you successfully get them down to 1. You lose a life if a number hits you; if a number reaches the bottom of the screen without getting down to 1 it reappears at the top. Sometimes (I don't know when) it appears in a bright red box, in which case you'll lose a life if it reaches the bottom. Sometimes you get an extra life.

I got a score of 834 (= 2 × 3 × 139, although that's irrelevant), which means I successfully shot at 834 prime numbers. I don't recommend trying to beat me, because even if you succeed you will lose about thirty-five minutes of your real life.

The rest of the web page looks kind of fun, too.

An exercise for the reader: why is it fitting that this is on 1729.com?

The "lights out" puzzle

Remember the puzzle Lights Out? You have a five by five array of buttons with lights under them. When you push one of the buttons, that button and its neighbors to the north, south, east, and west are toggled -- they turn on if they were off, and off if they were on. You start with some random configuration of lights, and the object is to turn them all off.

I remember being very frustrated by the physical version of this puzzle when I first encountered it, because I could never get it to work. It's easy to get most of the lights out, not surprisingly, but hard to get all of them.

It turns out there's a solution, but it's kind of frustrating in that it involves a lookup table. Basically, you look at the first row; whichever lights are on, you press the corresponding lights in the second row. Now, whichever lights are on in the second row, you press the corresponding lights in the third row, and so on. (This is known as "chasing the lights".) Now some lights in the bottom row are on. You then press some apparently randomly selected lights in the top row, according to the table, and repeat; when you get to the bottom this time, all the lights will be off.

This is very inefficient. For one thing, you can see that the order in which the lights are pressed doesn't matter (unlike, say, the Rubik's cube) -- what determines whether a light is on is just its starting position, and the parity of the number of times it and its neighbors have been pressed. Furthermore, if you press a light twice that's the same as not pressing it at all, and this solution in general leads to pressing a lot of lights twice.

In a 9-by-9 version of the puzzle, it turns out that there are only two possibilities for the final row if you do the analogous thing; either all the lights are off (and then you win) or the first, third, fifth, seventh, and ninth lights are on. Then you press the square in the upper right corner and repeat the whole procedure. It looks to me (although I don't have that many 9-by-9 examples of the puzzle, and so can't be sure) that pressing the upper left square toggles the puzzle between these two things, so there ought to be a solution of this form:
- press, or don't press, the top left square.
- proceed by the "chasing the light" method.

In either of these cases one can find a better (although not quite minimal) solution for any starting configuration, by doing the two-pass chasing-the-lights algorithm, noting which buttons one pressed, and then starting from the same original configuration, not pressing those buttons which were pressed twice the first time. For example, today's 9-by-9 lights out at logicgamesonline.com is solved in 58 moves by the crude two-pass method; it turns out that there are ten lights which are pushed on each pass, so eliminating the corresponding 20 presses, one can solve it in 38 moves. (The web site claims there's a 28-move solution, but I'm not bored enough to look for it.)
In the 5-by-5 case there are certain nice-looking patterns which don't do anything. This page describes them as being in the "null space of the adjacency matrix of Lights Out", which kind of makes sense. For more on this, see Mathworld's description of the puzzle; it turns out that the puzzle can be described as the solution of a system of 25 equations with mod 2 coefficients. What I wonder is if there is some way to find the minimal solution, or at least a solution that doesn't involve pressing the same button twice (we might call this an "efficient" solution), that doesn't require actually solving the system of equations, but can be done "by inspection". I suspect there isn't, but I may be wrong.

how many Asians are there in Philadelphia?

In today's Philadelphia Inquirer: Asian market interests banks, about how there are banks in Asian-American enclaves of Philadelphia (such as Chinatown) that specifically target that market. There are two major reasons why this market has been poorly served by traditional banks, according to the article: the language barrier, and the fact that many Asian-Americans don't have as much credit history or financial documentation as other people in similar financial situations, because they tend to deal more in cash.

The online version of the article doesn't include the following chart of ten-county population by race, attributed to the American Community Survey of the U. S. Census Bureau, which caught my interest:


20022005Percentage change
White392826340365972.76%
Black or African American111568311674514.64%
American Indian and Alaska Native881788480.35%
Asian20924625055219.74%
Native Hawaiian and Other Pacific Islander1262176840.10%
Hispanic or Latino31910137150216.42%
Total population566769057473531.41%


The point of the table, of course, is to illustrate that the Asian market has been growing in this area faster than the population as a whole, and that therefore it is logical that banks would be reaching out to this community.

Now, when I read this, something seemed suspicious. The population in every category except the neglible "American Indian and Alaska Native" one went up by at least 2.76%. But the total population went up by only 1.41%. Something is wrong here.

If you then add up the first six entries in each column, you get 5,582,372 and 5,836,718, respectively. For some strange reason, the sum of the racial numbers is 85,318 less than the total population in 2002, but 89,365 more than the total population in 2005. The obvious explanations are the following:

  • "Hispanic or Latino" is being treated differently than the other items here, which is actually fairly common in discussions of race in America; basically if your ancestors come from somewhere Spanish-speaking, then you're somehow outside of the usual "race" categories.

  • A lot more people have suddenly started to identify as multiracial between 2002 and 2005. A lot. Three percent of the total population. I'm aware that a lot more people have started to identify as multiracial recently instead of feeling like they have to pick one identity or the other, but it doesn't seem like it could happen that fast.



Usually on tables like this, one sees some note saying that the various pieces of population don't add up to the total population, due to certain minor categories not being listed, or due to people being included in more than one category. But there is no such note.

I had actually been anticipating something like Simpson's paradox, but it doesn't seem to be anything that complicated.

By the way, the ten-county area here is Bucks, Delaware, Chester, Montgomery, and Philadelphia counties in Pennsylvania; Camden, Burlington, Gloucester, and Atlantic counties in New Jersey; and New Castle county in Delaware. This strikes me as being slightly "off"; I would replace Atlantic with either Mercer or Salem. I suspect this doesn't make a huge difference in the data; however, it might be an example of "selective reporting", where one draws the boundaries in a nonstandard way to make some sort of point. The census bureau itself uses a nine-county area (the first eight above, plus Salem County, New Jersey); I'd say the most common definition of the Philadelphia area is as the first eight counties above.

links for 14 August

The "is the new" diagram attempts to document all instances of "X is the new Y", which is a fairly common phrase, with an arrow from X to Y. Not surprisingly, the node of highest indegree appears to be "black". You can try to make a partial order here, assuming that if X is the new Y and Y is the new Z, then X is the new Z; for example, local is the new kosher (local is the new organic, organic is the new kosher) -- but there aren't any really large components in this graph. It seems like most of the time, when people say "X is the new Y", nobody has ever made a statement "X is the new Z" or "W is the new Y" before. It's interesting to see if you can get why these things are true -- why is Ohio the new Florida? gadget bag the new man purse? individuality the new conformity? And isn't this last one impossible by definition? Is it really individuality if everyone's doing it? (Via information aesthetics.) It would be interesting to see how this network evolves with time; eventually people will start talking about things that have already been talked about and larger components will start to emerge.

[edit, 9:18 am: a particularly perplexing entry here is "Canada is the new Estonia". If you google this phrase, you get a lot of confused people, but eventually you find the beginning of a New York Times article from 2005, which says "America's next top model? she's [sic] likely from canada.Canada is the new Estonia - at least when it comes to modeling." I didn't realize Estonia was the old Estonia.]
Bad countries come with long names, from Social Science++, via Language Log. The theory is that a country with a good-sounding name like "Great Socialist People's Libyan Arab Jamahiriya" or "Democratic People's Republic of Korea" probably has something to hide; it seems to be borne out, although the United Kingdom of Great Britain and Northern Ireland would probably disagree. This is related to a pet theory of mine that restaurants advertising "authentic X cuisine" probably have something very far from authentic X cuisine. If you really were a democratic country, or you really served authentic food, you wouldn't go to the trouble of advertising it; building up a reputation would be enough. (But how is reputation built? This is a question I wouldn't mind studying from a mathematical perspective.)

Atle Selberg died, from Terence Tao; a mathematical obituary. Selberg and Erdos produced the first elementary proof of the prime number theorem; the story goes that Erdos got the lion's share of the credit, to the point where someone walked up to Selberg (not knowing that it was him) and said "did you hear that Erdos and some Scandinavian came up with an elementary proof of the Prime Number Theorem?" Then again, that seems like the sort of thing Erdos himself might have done; supposedly late in life he forgot who people were, even the people he collaborated with often.

The Quant Bloodbath, at Secret Blogging Seminar: how will the current crisis on Wall Street affect the job market for quants? Will Wall Street decide it doesn't need mathematicians after all (because the quantitative models don't seem to be working), or will it decide that it needs more mathematicians to make better quantitative models?

Ken Jennings (yes, that Ken Jennings) writes about Pic-Tac-Toe puzzles, in which nine images are arranged in a three-by-three grid, and the images in each of the three rows, three columns, and two diagonals have a common theme. An alternative is if you're given eight of them and you have to fill in the ninth; see this example from the 2003 MIT Mystery Hunt.

Science after Sunclipse has launched MathSciJournalWiki, which is what it sounds like -- a freely editable source of information about scientific journals. Also of interest is Scientific Publishing: A Mathematician's Viewpoint, by Joan Birman, from the July 2000 issue of the Notices of the AMS.

Yesterday I wrote about men having more sex than women?; over at Statistical Modeling, Causal Inference, and Social Science they're going on about the inaccuracies of the NYT article I mentioned there, which conflates the mean and the median. Not surprising, since they're statisticians over there; I still think that people lying about their sexual activities is the big thing, though.

13 August 2007

Life expectancy, walking, and that giant city a hundred miles north of me.

Why New Yorkers Last Longer, from New York magazine. (Thanks to Jess Haralson.)

The article mentions how life expectancy is calculated, and says:
The math works like this. Imagine that one man dies of AIDS at age 25. Since he was statistically supposed to live to 78.6 years, he’s died about 50 years too early, so he shaves 50 years off the city’s overall pool of life. If one Wall Street guy collapses of a heart attack at age 65, he shaves only ten years off. You’d have to have five Wall Streeters die at that age to equal the impact of one AIDS victim. By the same logic, one infant’s dying during childbirth—77.8 years too early—is equal to ten people’s succumbing to lung cancer at age 70. It is a very weird form of horse trading. The more you’re able to prevent young people—folks in their twenties and thirties—from dying, the more rapidly you boost a city’s overall life expectancy.

Although I'm not sure if the math works out exactly like that -- I've mentioned this before -- it certainly seems reasonable that it does. And a lot of the time, the "value" of a certain public health policy is computed in terms of number of years of potential life lost, not in terms of number of potential lives lost. This makes sense; younger people's lives are more valuable simply because what we're really valuing is the remainder of the life. And countries in sub-Saharan Africa that have been hit hardest by HIV do, in general, show the lowest life expectancies.

The article also makes the claim that New Yorkers live longer because they walk more. This seems reasonable to me. (And New Yorkers walk faster than the average person, too.) I don't live in New York -- I live in Philadelphia, which has a deep-seated inferiority complex in relation to New York. But we walk fast here, too -- I know this because people who move here from other places often say that we walk fast here, but rarely say that we walk slow. Walking fast is my own personal extreme sport, because I happen to walk faster than the average person (I have long legs) and so I actually end up weaving between people in order to go at my preferred speed. Am I healthier than average? I'm not sure.

Sometimes I wonder what my weaving through crowds would look like if time were made into a third spatial dimension -- or what a crowd of people in general would look like. People manage to not crash into each other. And when I'm walking, I will walk somewhere where someone else was a moment ago, where someone will be in a moment -- but we don't collide. Our world lines don't (indeed, can't) intersect. (World lines are usually thought of as positions in four-dimensional spacetime, but I'm assuming that my position is in two dimensions, thus a three-dimensional spacetime.) You'd see ordinary pedestrians' world-lines moving along in clumps. I go faster, so my world-line would be less steep (assuming time is the vertical dimensions). Clumps of tourists go slower and are probably more tightly clumped than ordinary people. (Don't believe me? Go down to Independence Hall. My theory is that tourists are afraid of being infiltrated by the natives.) Is it possible to identify tourists just by looking at their world-lines? And does a tourist alone be identified, or only a large clump of them? I would argue that the lone tourist doesn't exist, although it's difficult to articulate why -- perhaps because the lone tourist probably makes more of an effort to fit in with the crowd around them. All I know is that when I've been traveling alone in strange cities, people have actually come up to me and asked for directions. Somehow I manage to look like I belong.

The high-school prom theorem

The Myth, The Math, The Sex, in yesterday's New York Times. Gina Kolata talks about the fact that studies consistently report that on average, men have had more sexual partners than women. The article refers to, among other things, the study I wrote about in July. David Gale offers a proof of what he calls the "High School Prom" theorem -- which is that the total number of sexual partners of men and women should be the same, and therefore the population means should be the same. Apparently the sex researchers are aware of this, but cannot find a better way to collect the data.

Last time I wrote about this I claimed that the most likely thing was that women were either not having sex at all or having lots and lots of sex (the "madonna-whore complex"), and that that would create the observed difference in the medians. (The study reported medians, not means.) I acknowledged that self-reporting might create problems, but I phrased it in terms of people "forgetting" who they'd had sex with. In retrospect this is probably not the best way to phrase it; people lie about who they've had sex with.

I forgot to take that into account, because I wouldn't lie about such things. (Especially not in an anonymous survey!) But I tend to be more honest than other people, both because I have ethical objections to dishonesty and because I am not very good at keeping alternate versions of the world which correspond with my lies in my head. In short, I think it's wrong and I'm bad at it.

I suspect that mathematicians in general don't like lying, because it goes against everything that our profession stands for. (Also, there are persistent rumors that mathematicians are more likely than the average person to be autistic, and autistic people in general aren't good at lying because it requires having a "theory of other minds" -- that is, being able to understand what people are thinking, apart from what is actually true about the world.) But I imagine that being a mathematician is compatible with being good at lying, or at least with coming up with believable lies -- we do it all the time in proofs by contradiction. Collecting data on how likely people are to lie, though, is probably even harder than collecting data on people's sexual habits.

David Gale also says that "It is about time for mathematicians to set the record straight". To what extent is it the duty of mathematicians to point out when other people claim to be doing things which are mathematically impossible?

12 August 2007

fairness of playoff series

A seven-game playoff series is not significantly fairer than a five-game series when home-field advantage is considered, according to Brian Dean's paper in the arXiv, which I found while googling for something only tangentially related. ("Fairness" is defined as the probability of the team with home-field advantage winning the series.) It's not significantly fairer when home-field advantage isn't considered, either.

It is a little bit fairer, though (i. e. the better team is more likely to win in the longer series), as I wrote about back in June.

In case you read it: the cases the paper mentions where the seven-game series is "less fair" than the five-game series (i. e. the better team is less likely to win) are certain cases in which the team with home-field advantage is actually significantly worse than the team without it, even when playing at the first team's home. This doesn't happen for matchups that exist in practice in the playoffs, I don't think; for one thing, the team with the better record has home-field advantage in most such playoff series.

Also, the final section of the paper addresses the question of "morale"; if a team wins a game, are they more likely to win the next game? (Or less likely, because they start slacking off?) And how does this affect the probabilities? From what I remember, this sort of behavior doesn't show up in the regular season -- teams are as likely to win tomorrow if they win today as if they lose today -- but it might show up in the playoffs, although the sample sizes on playoff series are too small to say much with any confidence. But my guess is that pitching rotations are more important than morale.

first let's kill all the lawyers

Language Log writes about the semantics of pork, inspired by this New York Times article. A current health care bill includes statements like

"any hospital that is co-located in Marinette, Wis., and Menominee, Mich., is deemed to be located in Chicago"

and other similar references that sound innocuous -- they're just defining a metropolitan area -- until you think, wait a minute, is that anywhere near Chicago? (I checked a map; it's 259 miles away.) And how many hospitals like that could there be? (As it turns out, exactly one.) Apparently hospitals in metropolitan areas get bigger reimbursements from Medicare for various procedures, on the theory that the cost of living for their employees is greater than that for rural hospitals.

Language Log goes on to point out that is an example of how there are two ways to define any particular set: by listing its elements, or by specifying a set of constraints. A mathematical example that immediately comes to mind is "the set of even primes". Or, somewhat more innocuously, the statement "let p be an even prime". You don't see this that often, because it's easier to say "let p equal 2". But I have seen it, in proofs of the form: "Theorem: All primes p have property x. Proof: let p be an odd prime. Then (proof). Alternatively, let p be an even prime. Then (simpler proof)."

Of course, when one specifies a set by giving constraints, there's always the problem that the set might be empty. What would happen if the health care bill in question said, say, "$100,000 should be distributed evenly between all hospitals within a quarter-mile of Isabel's apartment?" There are none. Who gets the money? (I suspect there is some conventional interpretation for this sort of thing. I don't think you'd see it in this sort of bill, but I can imagine, for example, a doting grandparent writing in their will "my grandchildren shall equally split my [large sum of money]" and then the grandchildren tragically die before the grandparent.) I've heard the apocryphal story of a student who goes into his PhD defense and says that he will be presenting results on a certain sort of algebraic structure satisfying the following eight conditions. One member of the committee interrupts and says that he can prove there are no such groups. The student doesn't get the PhD. Proving things about something that doesn't exist is considered worthless, no matter how ingenious the proofs might be.

get your random numbers here! part two

I wrote about the Quantum Random Bit Generator a few weeks ago. Their signup page has a CAPTCHA -- you know, one of those things which says "to prove you're a human, read this word that's been distorted and colored funny". Except it's a math problem. And you have to know a bit of calculus to do it. The one it gave me randomly was hard enough that I didn't want to do it in my head; it was something like this one that Brad Fitzgerald mentioned. If you don't want to follow the link, it asked me to find something like

d/dx [ 7 sin 6x + 4 cos (7x + π/2) ]x=2π.

(In my defense, it's three in the morning here.) I'm not sure if it's generating these problems on the fly or if it has some repository of them it's working from; any ideas?

They're reasonable, though; they say "Note: If you do not know the answer to this question,
reload the page and you'll get another question. " Sometimes you get to differentiate something and evaluate the derivative somewhere. Sometimes you get to find the least real zero of a polynomial (of second or third degree, it looks, and the answer's always an integer). And if you refresh enough times you get something like "-4-7+6=?" Mercifully, they don't do the usual graphics tricks to distort the mathematical notation; it's just straight TeX output, I think.

Presumably in order to prove that one is human one must either be able to differentiate, or to at least recognize that one doesn't know how to differentiate. (Knowing what one does not know is often the hardest thing of all.) I suspect that most people who need really good random numbers know calculus, though.

11 August 2007

What is marriage for? (note: I will not answer this question.)

Wedded to Work, and in Dire Need of a Wife -- Shira Boss, in today's NYT.

From the headline, I thought the article would be about men who spent too much time at work and couldn't find a partner. No, it's about women who spend too much time at work to take care of things around the house that need taking care of, and their husbands won't take care of it either. As about a zillion studies have shown, women do more housework than men even when they work the same number of hours outside the home. (For the record, I live alone, am not in any sort of long-term relationship, and housework in my apartment just doesn't get done at all. My sink is currently not full of dishes, but that's rare.)

But I think the article has confused correlation and causation. Boss writes:
A more statistically rigorous analysis published in 2004, using the Minnesota Twins Registry, tried to isolate the effect of marriage on earnings. It found that holding education and genetics constant, married male twins made 26 percent more than their unmarried brothers.

Yes, but marriage doesn't happen at random. Are men earning more because they're married (as the NYT article would have us think?), or marrying because they're earning more? I don't know which, but the latter certainly seems possible. Weddings are expensive. And I can imagine someone not wanting to get married unti they could afford the wedding (or at least an engagement ring and whatever other parts of the whole shebang are traditionally paid for by one member or the other of the couple; I understand that traditionally parents put down a lot of the money). Somewhat more seriously, I can imagine people not wanting to get married until they feel they have some sense of "stability", which for a lot of people includes a steady job (or at least as close to that as we can get in the current economy). Maybe the unmarried brothers make less not because they don't have wives, but because they sit on their asses in their parents' basement, not making money and not meeting prospective marriage partners. Or perhaps the knowledge that you have someone other than yourself financially dependent on your work makes you work harder. (Especially if you've got kids. I think a valid comparison would take this into account, since married people are more likely to have kids than unmarried people.) Marriage and earnings are so confounded with each other that I'm immediately suspicious.

(Also, how often do you see the two consecutive words "Minnesota Twins" not referring to the baseball team? The Minnesota Twin Registry [this is the correct name] is part of the Minnesota Twin Family Study at the University of Minnesota-Twin Cities. The "Twin" in the name of the University and in the name of the baseball team are connected to each other; the baseball team is named for the Twin Cities, i. e. Minneapolis and Saint Paul.)

how should the electoral college be set up?

States Try to Alter How Presidents Are Elected, by Jennifer Steinhauer, today's New York Times.

Apparently factions in California and North Carolina are both considering apportioning their electoral votes by Congressional district instead of by state; this CNN article explains it a bit better. This is the same plan that's currently used in Nebraska and Maine, except that people just accept that as a strange quirk of the system because Nebraska and Maine are small, fairly homogeneous states, while California and North Carolina are not. The people in favor of these plans are pitching them as a matter of giving the states in question more representation. (More specifically, it's the minority party in both states that's pitching the plan that way, and so partisan politics clouds the issue; but I'll ignore that.)

But what is "representation", really? It seems to me like it comes down to this question: what is the probability that California can change the outcome of the election? In the current system, essentially zero; I suspect this election might be close, and a Democratic margin of victory will be slim enough that if California were to go Republican that would hand the presidency to the Republicans -- but California as a whole is not going to vote Republican. But in this new system, if it goes through? Now there's a little more of a chance for California to have some influence, because a few congressional districts might actually be in play. But the potential number of electoral votes that might actually be in play still is small; most districts are safe for one party or the other, thanks to gerrymandering. I will not attempt to compute the probability that I mentioned at the beginning of this paragraph -- any calculation I can do assumes that districts vote independently of each other, which clearly isn't the case.

I live in Pennsylvania. The presidential candidates in 2008 will pay attention to me, since Pennsylvania is seen as a "swing state". But I live in a district (the 2nd) in which the current Congressional representative, Chaka Fattah, routinely receives 85% or more of the vote. The last time this district elected a Republican was sixty years ago. If a plan like this were to go through in Pennsylvania, you can bet presidential candidates won't care about my vote any more. On the other hand, there are some districts in the Philadelphia suburbs which will probably be quite close; the people in those districts become even more influential.

(A superficially similar plan was attempted in Colorado in 2004, which would have alloted the state's electoral vote in proportion to the popular vote. This would clearly have been a loss for Colorado, since it meant that basically, instead of the people's votes deciding whether Colorado's electoral votes would be split 0-9 or 9-0, they were deciding whether they'd be split 4-5 or 5-4.)

It's not immediately obvious that this plan, should it pass in California, disenfranchises the state of California as a whole. Let's say (and I'm making this number up) that ten of California's 53 districts are actually in play. The people in those ten districts suddenly get some attention from presidential candidates. And perhaps a plan like this could actually work if all states had it. I can actually imagine that it might bring some measure of relevancy back to the politically heterogeneous parts of the state. Perhaps it even makes sense, on a state-by-state basis, for "safe" states to pass this sort of plan (since it gives some part of them influence) while "battleground" states don't (since it just shifts the influence around).

But I don't think it makes sense for the various states to be choosing how they'll allot their electors based on some probabilistic calculation. That would create the perception that the elections were unfair in one way or another, and the perception of fairness is as important as actual fairness. People need to believe that the political system has the interests of the people at heart. A complex pastiche in which some states apportion their votes by congressional district, some winner-take-all, some in proportion with the popular vote in that state, etc. -- even if each state chooses its system in such a way as to maximize the probability that its choices can flip the election -- isn't the way to go.

I don't support the idea, though. If it were to happen, I would want to see redistricting to eliminate gerrymandering, but even so you'd still end up with homogeneous districts in some places. And I suspect that a lot less people would live in "swing districts" than currently live in "swing states". The obvious way to make sure that everybody lives in a "swing district" is to have the "district" in question be the entire country; that is, to just count the national popular vote. And indeed, there's an ingenious hack of the Electoral College which would do just this without a Constitutional amendment. The goal of the National Popular Vote Project is to get bills passed in states representing 270 electoral votes that say that those states will give their electoral votes to the winner of the national popular vote -- but only if states representing 270 electoral votes have passed such a law. This seems like the fairest way to do things, to me; I understand that the purpose of the current system is to make sure that the concerns of all states are paid attention to and that a candidate can't just pile up votes in the big states and ignore the small ones, but that's not how it works right now.

A nationwide recount sounds scary, though...

10 August 2007

the traffic.com "Jam Factor"

Traffic.com has something they refer to as the "Jam Factor" to tell you how much traffic there is on various major roads in your area. It's an apparently arbitrary number from 0 to 10.

Now, if I see that the jam factor is 3.1 right now, what does that mean? How long is it going to take me to get where I'm going?

They claim:

The Traffic.com Jam Factor is like a "Richter scale" for traffic. It's an overall measure of the traffic conditions on a roadway, or on a section of a roadway. Because the Jam Factor calculation uses real-time speed and travel time measurements from our sensors and those of our partners, as well as our detailed accident, construction and congestion information, it's a comprehensive measure of the state of traffic on any roadway.

The Jam Factor is measured on a scale of 0-10, with 10 being the worst traffic conditions. It is designed to give you a quick, at-a-glance picture of conditions on the roadways or personal MyTraffic Drives you care about, whether you're on our web site, looking at an emailed Traffic Report, or listening to a Traffic Report call to your phone. If you see (or hear) a high Jam Factor, you can then delve into the detailed information in the Traffic Report or on the Traffic.com site to find out more.


If you click on the name of any segment of road, it'll tell you how long it takes to get by that piece of road without traffic, and how long it'll take Right Now -- that seems like more meaningful data to me.

If I had to sum up traffic conditions in a single number, it would be the ratio of the two numbers in the previous paragraph. I'd like to know that it'll take me 50% longer to get where I'm going than it would if the roads were clear. It wouldn't surprise me to learn that the "jam factor" is just this number disguised somehow.

I prefer what, say, Google Maps does, just overlaying colors (red/yellow/green) on the road; a number, especially one with two significant digits (the Jam Factor is reported to the nearest tenth), implies some sort of precision. I'd rather have no number than a number which has extra decimal places tacked onto it to seem "scientific". (Would I be less annoyed by this if the Jam Factor was reported on a 0-100 scale, to the nearest unit? Who knows?)

Also, if they have sensors of some sort -- why am I limited to knowing how long it'll take to get from a preset point A to a preset point B? Why can't I input the exit I actually get on and off at and have it tell me how long I should expect to spend on the highway? This is a simple matter; they have the speed that the road is moving at at any given moment, so just integrate the reciprocal of speed over the length of road in question. (I realize that it's not quite this trivial, because just because a segment of road is moving at 30 mph right now doesn't mean it'll be moving at that same speed when I get there. My method is sort of like how life expectancies are calculated, which doesn't actually tell YOU how long you're going to live. But forecasting how traffic on any given day will evolve, or how medical care will evolve, is a lot harder than just observing it.)

The Richter scale itself is logarithmic, as many of you probably know -- a difference of one unit on the Richter scale corresponds to a factor of ten in the amplitude of the seismic waves (not a factor of ten in the energy released, as is widely reported; although this isn't my field, it looks like the corresponding increase in energy released is a factor of 103/2, or about 32). Other logarithmic scales that you see fairly frequently are decibels (for measuring sound) and Google PageRank (on the 0 to 10 scale); a decibel corresponds to an increase in volume of 100.1, and I've read various things but one unit of PageRank seems to correspond to a factor of about five. (No one outside of Google knows for sure.) All of these situations have one thing in common -- most earthquakes, sounds, and web pages are relatively small, and some are much more prominent, so there's a good reason to spread out the small end of the scale and compress the large end. But in the case of the Jam Factor, this isn't necessary; if it takes me an hour on average to drive somewhere, on a really lucky day it might take forty minutes, and on a really unlucky day it might take three hours, but it's never going to take even as much as a hundred hours.

(Incidentally, as you may h ave guessed by now, my preferred mode of transportation is walking. A large part of this is that even though it's slow, I always know exactly how long it's going to take me. There is essentially no situation that slows down or speeds up my walking speed. And even if there were, it never takes me twice as long to walk somewhere as it ordinarily would, whereas driving times that are twice the average are routine. Unfortunately, it's not totally clear whether I save fossil fuel by walking, because I have to eat more food in order to walk.)

links for 10 August

Why Stuff Is Hard, at the Everything Seminar. I asked this in physics class in high school and never got a satisfactory answer. I suspect I'm not alone here. I was told it was some sort of electromagnetic repulsion between the outermost electrons, but apparently the Pauli exclusion principle is really doing most of the heavy lifting.

Mark Chu-Carroll at Good Math, Bad Math comments on the way math is taught at certain religious schools. If you don't want to bother reading the post, check out the course descriptions at one such school. They all start out "Students will examine the nature of God as they progress in their understanding of mathematics". The descriptions of the non-mathematics courses begin similarly. Today I was reading parts of Laplace's A Philosophical Essay on Probabilities (available in Hawking's anthology God Created the Integers: The Mathematical Breakthroughs That Changed History); among other things, Laplace mocks the idea of Pascal's wager. I couldn't help but thinking of what Laplace is said to have said to Napoleon when asked why he didn't mention God in his work on celestial mechanics: "I had no need of that hypothesis."

Compound interest isn't intuitive, from Adventures of BruteForce; if you invest a little money now that's like investing a lot of money later. People just aren't set up to understand exponential growth, which isn't surprising; unrestrained exponential growth isn't common in the situations for which we evolved. A population can't keep doubling every ten years without pretty quickly running out of space; a sum of money can. There's a persistent rumor that Ashkenazi Jews are actually better equipped for understanding this particular sort of abstraction than other classes of people, because of certain unique historical circumstances -- for quite some time they lived among Christians, who were forbidden to lend money for religious reasons, but these same Christians wanted to borrow money, and therefore turned to the Jews, who were not subject to those same religious laws. The Jews who were better at understanding this fact ended up with more money themselves, their kids didn't starve, and supposedly this explains why about a quarter of Nobel laureates are Jewish. I can't find exact numbers overall. One thing I can find is in this Wikipedia article which says that "Of American Nobel Prize winners, 37% have been Jewish Americans (19 times the percentage of Jews in the population) [...]" But I'm not sure who they count as "American". (Wikipedia used to have a list of Jewish Nobel laureates, but it's been deleted.)

It would be interesting if this were true, because it seems to imply that evolution can work ono the scale of a few hundred years. As humanity heads more and more towards working with its brains instead of its hands, will we get smarter? (On the other hand, at least at the present time in the United States, intelligence and number of children seem to be inversely correlated; it seems difficult for Darwinian evolution to work in a population when almost everyone survives long enough to have children.)

09 August 2007

another shot at Rubik's cube

On Monday I wrote that the Rubik's cube can be solved in 26 moves, where a move is defined as a half-turn or a quarter-turn of any face.

I'd forgotten that number, which I got from Wikipedia.

Then I came across this article at MathTrek, claiming 26 moves. The strategy was basically as follows: find the positions from which one could get to the starting position by making only half-turns, and no quarter-turns (call this set S); it turns out that from any of these, one can get to the starting position in 13 moves, and there are about 15,000 of them. Then the remaining items were put into sets such that the members of any such set only differed by half-turns; this roughly means that it's enough to reduce a single member of each set to some member of S. There are 1.4 trillion such sets, which sounds like a lot but is only one part in 30,000 of all the configurations of the cube. It turns out that a member of each such set can be reduced to a member of S in 16 moves, and 13+16 = 29. But the algorithm implied here solves most configurations in 26 moves or less; those that weren't, they solved by a brute-force computation.

Now, this sort of two-stage computation could probably be made better if the two stages were roughly equal in "size", that is, if 15,000 and 1.4 trillion (the product of which is roughly the number of configurations of the Rubik's cube) were equal. This is since the computation time is roughly proportional to the sum of these. If we have the constraint that the product of two numbers x and y is a known constant k, then we can minimize the sum of those numbers by taking x = y = k1/2. Of course, in the case of the Rubik's cube there might not be some natural class of configurations with size roughly the square root of the total number of configurations.

Similarly, if we had a problem with k possible configurations and a three-stage reduction, you'd ideally want to reduce the number of possible configurations by a factor of k1/3 at each stage.

But this is probably all useless, because if there were two intermediate subgroups you could reduce to, you'd probably want to take both of them even if you weren't lucky enough to have them at sizes k1/3 and k2/3 as would be ideal.

80-degree lows in Philadelphia

I find myself interested in the weather, both because my main means of transportation is walking, and because it's a complex system that enough other people are interested in that they try to predict it and which is a source of large data sets. (Surprisingly, I have no weather-forecasting ability. I figure I could learn how if I wanted to, but I'm not that curious, because I'd just be discouraged when I realized the professionals are better than me at it.) It hit 97 here in Philadelphia yesterday; more crazy-sounding is that Wednesday morning's low was 80 degrees. This has only happened 39 40 times in recorded Philadelphia weather history, i. e. since 1876.

I find myself wondering how the number of such days is distributed, but it's hard to come to any sort of conclusion. My instinct, though, is something like this: the number of clusters of 80-degree lows (that is, consecutive days with them) in a given summer is probably Poisson-distributed.

Of course, counting these "clusters" is a silly thing to do, because it seems to imply that, say, the lows on August 16, 2002 and August 18, 2002 (both 80 degrees) are independent events, just because it didn't get up to eighty on August 17. (The low that day was 77.) If you look at the data, there have been 109 summers with no 80-degree lows, 20 summers with one cluster of them, one summer with two clusters, and two summers with three clusters; that last one makes me suspicious, but one of those is 2002 (July 30, August 16, August 18) and one is 1995 (July 15, 26, 29).

Second, how long is each individual cluster? There are 19 clusters of length 1, 7 of length 2, and 2 of length 3; I'm inclined to suggest a geometric distribution. Once there's an 80-degree low, the probability of having an 80-degree low on the next day is a constant, approximately 1/4. It seems reasonable that this probability would be less than 1/2 but not too much less; an 80-degree low in Philadelphia is very unusual, so the next day is expected to be a bit cooler. I've heard that the simplest weather-forecasting rule is that "tomorrow will be like today". I suspect that a slightly better rule is "tomorrow will be like today, but a little less so" -- that is, it will regress to the mean a bit. But to apply this rule one has to have some idea what the mean is. On a day with a high of 70 in Philadelphia in July I'd want this rule to predict the next day would be warmer; on the same day in November I'd want it to predict the next day would be colder.

So the number of 80-degree days in a given summer, I expect, is the sum of some number of geometrically distributed variables with p = 3/4, the number of such variables being given by a Poisson distribution with mean about .21 (there were 28 observed clusters in 132 years). Determining what this says in terms of actual probabilities is left as an exercise for the reader, mostly because I don't trust these numbers enough. It seems like a reasonable first stab at a model, though. It only has any chance of working for extreme temperatures, though; if I replaced 80 with 70 (the average low in Philly this time of year) then the model wouldn't work so well, because it depends on this clustering phenomenon. (The highest-recorded low temperature ever in Philadelphia is 82, so you can see that 80 is extreme.)

While I'm on the subject of weather, check out weatherbonk.com, which displays the weather, live, on a map, showing the temperature at various volunteer-run weather stations. I'm never sure how much to trust any of these stations, though, because you hear that sometimes they're in an asphalt-paved parking lot next to the place where the heat comes out for a central AC unit. I suspect it would be more interesting in a place like San Francisco where the microclimates are pronounced enough that you can see them even over that noise.

08 August 2007

Walkscore.com -- how walkable is your neighborhood?

WalkScore.com will tell you how walkable a neighborhood is. Put in an address you're considering living at, and it'll tell you which grocery stores, restaurants, coffee shops, bars, movie theaters, schools, parks, libraries, bookstores, "fitness", drugs stores, hardware stores, and clothing and music stores are within walking distance. (They don't seem to have a firm cutoff for "walking distance"; I suspect something counts more towards walking distance if it's closer.) It uses this information to generate a score out of 100 telling how walkable the neighborhood is; it seems to correlate pretty well with my subjective impressions of places I've lived or am otherwise familiar with, although most places I'm familiar with are towards either end of their scale and I'd like some more data from the middle.

(Incidentally, it is possible to get a score of 100, though I'd doubted it for a while; Rittenhouse Square in Philadelphia is an example.)

The biggest problem is that they use "as the crow flies" distances; this was noticeable for one place I've lived which happened to be near a river (the Charles) but midway between two of the bridges across it (the Harvard and Longfellow). In most cases this doesn't seem that serious, because of a fact about development patterns -- the places with "inefficient" street patterns (where as-the-crow-flies distances tend to be the worst understimates) are places with low population density anyway.

There's an interesting picture illustrating the efficiency of grids of streets, showing all the places within a one-mile walk (via streets) from a point in downtown Seattle and similarly from a point in the Seattle suburb of Bellevue. The area (in the plane) filled in in the case of the grid is much larger. It makes me wonder -- how much gasoline could we save by designing in such a way that there weren't so many damn cul-de-sacs?

I recently came across Michael Bluejay's claim that walking is less efficient than driving, in terms of fuel consumption, if you eat like a typical American (i. e. lots of meat). This is because walkers need more food than drivers. It looks like a vegetarian walking is more fuel-efficient than a typical car, and that cyclists, whether meat-eating or vegetarian, are more fuel-efficient than the typical car.

However, as Bluejay points out, this is only true on the level of the individual person who has to make a trip to a certain place. On the level of a whole society, if we develop our communities in such a way that they're friendly to walking, people will travel less miles -- because our communities will be more compact, so one can get to the same number of different places with less travel. In fact, car-unfriendly communities are necessarily more compact, simply because less space is taken up by roads and parking lots! I suspect there's a "critical density" of some sort below which a vicious cycle starts: enough people have cars that businesses feel they need to offer free parking, which lowers the population density, which means even more people get cars, which means more free parking, and so on.

07 August 2007

why singularities are bad

What happens when you divide by zero. [edit: apparently the image isn't there any more; I didn't realize that would happen. The image was in the style of the Successories motivational posters or parodies thereof. A man stands wading in a body of water, not from an apparently infinitely deep whirlpool. Underneath, the text says:

YOU SON OF A BITCH
You divided by zero again, didn't you?

It's possible I don't have this exactly right; I'm doing it from memory.]

[edit again: Mark Dominus pointed me to another source for the image, which is linked to above; I think they allow hotlinking, but in case they don't, see http://macrochan.org/get.py?sha1=QOMEULLMJLINOMIA4L2FUAEQNHRMCC7O. Several dozen images on the same theme are available here.]



One can only imagine the more serious consequences if one, say, takes square roots or logarithms.

It's quite fortunate that doing incorrect mathematics does not actually have disastrous consequences in the physical world. Otherwise the area around any calculus class would implode.

books, companions to books, and varying levels of detail

When I was taking my first graduate algebra class, I was desperate for any sort of supplementary material. One of our textbooks was Lang (we also used Hungerford) and so naturally I tried to find anything on the Internet that would explain what Lang was saying; Lang's textbook can be terse at points. (Incidentally, there's a note on the back of my copy of Hungerford, quoting from a review of the text, which says something like "the book is sufficiently understandable that an averagely prepared graduate student can read it and understand most of it." I found this to be true, at least at those times when I was actually willing to sit down with the book instead of desperately flipping through it to find the theorem that would magically make my homework work out, and I remember being quite amused by the fact that this was sufficient praise that Springer would actually print it on the book.

At one point during my searching I found George Bergman's A Companion to Lang's Algebra, which is basically what it sounds like -- a couple hundred pages of supplementary material, which Bergman wrote in his attempts at teaching the basic graduate algebra course at Berkeley. This material includes proofs of things that Lang doesn't bother giving a proof of, notes about other things that one might like to think about when reading a certain passage, alternate notations, some supplementary exercises, and so on. (For example, at the beginning of the fourth section there's an explanation of where matrix multiplication comes from; no graduate textbook would include this, because it's pretty much standard, but it's good to see how it fits into the new framework one is learning.)

In general mathematicians have a habit of leaving out some details when writing -- some of which are crucial, and some of which are not. This is usually explained by saying that the reader should be able to fill in the details. While ideally this is true, sometimes one just doesn't have the time to fill in those details, or -- more importantly -- sometimes one just doesn't have the ability, and seeing some sort of example of a proof or a calculation would make it easier to fill in similar details later. Now, it would be unnecessarily pedantic to fill in all the details of every calculation. But I suspect that in the case of a book like Lang which is widely used, all the details have been worked out by someone. All the interpretive notes that one might want are in someone's mind. Wouldn't it be nice if, when you came to a sentence that said "it is easy to see" and you don't see it, you could just click and see an explanation -- either written by Lang, or by someone else who has either struggled with the same point or watched their students struggle with it? (And yes, I know Lang is dead, so we can't ask him to do this.) In general, having a web site full of various explanations of such classic texts would be useful. A lot of this material is already out there on the Internet; people have written it up for the use of their classes. But having it all in one place would be useful. These various lecture notes, supplemental handouts, and so on could be threaded together into a sort of virtual textbook. Indeed, one could probably learn from the notes people have written on the Internet on any classic textbook without having the textbook itself.

And not just for books; research papers, too, could benefit from this. It's often written in a paper that "the reader can see that"; but you know that the author of the paper has actually done the calculation, and wouldn't it be nice if you could just click on the paper and have it appear? Sometimes we shoot too much for terseness, at the expense of clarity. And sometimes terseness is good; one doesn't want to be disrupted by the unnecessary details.

This is different from the idea of, say, a textbook which is in the form of a Wiki, which anyone can edit. What I'm proposing is that the original text is inviolable, not least because in many of these cases there may be a lot of printed copies out there, but that anybody who wants to -- or at least some reasonably large set of people -- could write comments on the text, instead of everyone sitting around and sweating through the details on their own. Something like having a paper which lots of people have scribbled notes in the margin with lots of different colors of pen, except you could actually read the notes, you could decide that you only wanted to read certain people's notes (the ones who seemed to struggle at the same points you did, or who were particularly good at explaining things, and so on). I'm not sure to what extent the mathematical culture would like this, because we don't place much value on mathematical exposition, and so exposition of exposition might be thought even less valuable. But it's interesting to think about.

(Although I think that this is a new idea -- I don't recall seeing anything like this before, which combines multiple people's notes on a text -- I am not under the delusion that nobody's thought of this before; I'd appreciate pointers.)

more exotic dice

Mark Dominus at The Universe of Discourse shows that there exists a pair of six-sided dice not labeled 1, 2, 3, 4, 5, 6 (or some trivial transformation thereof) that gives the same probability of throwing any sum from 2 to 12 as a pair of standard six-sided dice.. In particular, if one die has faces {1,2,2,3,3,4} and the other has faces {1,3,4,5,6,8} the distribution comes out right. This results from the factorization

(x6 + x5 + x4 + x3 + x2 + x)2 = (x4 + 2x3 + 2x2 + x) (x8 + x6 + x5 + x4 + x3 + x)

and if it isn't clear to you how the fact about dice follows from this, you should read Mark's entry. But the interesting thing is not the result so much as how he got it; I've seen an ad hoc derivation of this alternate pair of dice, but Mark gets is by factoring the left side and rearranging it to get the right side.

In particular, this indicates that you couldn't make, say, a pair of dice that have the same sums as two d7s (i. e. seven-sided dice). That would require us to find two polynomials P(x), Q(x) such that

(x7 + x6 + x5 + x4 + x3 + x2 + x)2 = P(x) Q(x)

but the thing we're squaring on the left-hand side only factors as x(x6 + x5 + x4 + x3 + x2 + x + 1). So we could have

P(x) = (x6 + x5 + x4 + x3 + x2 + x + 1), Q(x) = x2(x6 + x5 + x4 + x3 + x2 + x + 1)

which corresponds to dice with sides {0, 1, 2, 3, 4, 5, 6} and {2, 3, 4, 5, 6, 7, 8}; this clearly isn't the intent of the problem. There's no other way to rearrange the factors. In the case of the six-sided dice, we had

x6 + x5 + x4 + x3 + x2 + x = x(x2 + x + 1)(x2 - x + 1)(x+1)

which allowed for a lot more rearrangement.

So, to take a similar problem, do there exist a pair of eight-sided dice with the analogous property? This time we're looking for a refactoring of

(x8 + x7 + x6 + x5 + x4 + x3 + x2 + x)2 = P(x) Q(x).

There are certain other conditions this refactoring has to satisfy, as Mark points out; we must have P(1) = Q(1) = 8 (which corresponds to the dice having eight sides) and none of the coefficients of P or Q can be negative; furthermore neither one can have a nonzero constant term (this corresponds to having faces labeled zero). The left-hand side factors nicely, as x(x+1)(x2+1)(x4+1). So we have

x2(x+1)sup>2(x2+1)sup>2(x4+1)sup>2 = P(x) Q(x)

and one of the factors x must be assigned to P(x), while the other must be assigned to Q(x). Now, at x=1 this becomes

12(1+1)2(1+1)2(1+1)2 = P(1) Q(1)

and we remember we needed P(1) = 8, Q(1) = 8; each of the six remaining factors on the left-hand side is a factor of 2, so we must have three of these making up P(x), and three making up Q(x). The possible factorizations, then, are

P(x) = x(x+1)2(x2+1), Q(x) = x(x2+1)(x^4+1)2
P(x) = x(x+1)2(x^4+1), Q(x) = x(x2+1)2(x^4+1)
P(x) = x(x+1)(x2+1)2, Q(x) = x(x+1)(x4+1)2

in addition to the original one; these correspond to pairs of dice labeled {1, 2, 2, 3, 3, 4, 4, 5} and {1, 3, 5, 5, 7, 7, 9, 11}; {1, 2, 2, 3, 5, 6, 6, 7} and {1, 3, 3, 5, 5, 7, 7, 9}; {1, 2, 3, 3, 4, 4, 5, 6} and {1, 2, 5, 5, 6, 6, 9, 10}. The proliferation of these pairs here seems to stem from the simple factorization, which in turn is a consequence of the fact that 8 is a power of 2. Alternatively, we can imagine replacing each d8 with three coins, labeled {0, 1}, {0, 2}, and {0, 4}; rolling a d8 corresponds to flipping these three coins and taking 1 plus the sum of their labels. The "alternative" dice correspond to the different ways in which we can split up this collection of six coins into two triplets; there are a lot of ways to rearrange these coins. Essentially, the d8 "factors" into these component coins. In the d6 case I don't see the dice being able to be factored in this way into smaller dice (a coin is just a two-sided die); this is because factoring the corresponding polynomials gives polynomials with negative coefficients, which has no "physical" analogue.

06 August 2007

the design of asymmetric dice

Mark Dominus says that "For some reason I've been trying to construct a die whose faces come up with probabilities 1/21, 2/21, 3/21, 4/21, 5/21, and 6/21 respectively." He has not been able to do this exactly, although it wouldn't be too hard to come up with an approximate solution given his claim that "the probability that the hexahedron will land on face F is not proportional to the area of F, but rather to the solid angle subtended by F from the hexahedron's center of gravity."

This sounds right -- if you throw a die in the air then it'll rotate around its center of gravity, and you expect that whichever face is pointing "down" when it hits a table for the first time is the face which will eventually settle as "down". (Note that because the faces of a general polyhedron are not necessarily parallel, I'm considering the "down" face, not the "up" face.) But I'm not entirely sure if it's true, because if the die comes close to landing on an edge it might take a funny bounce; I suspect the system is a little more chaotic than Mark's representation of it, and the only real way to solve the problem is experimentally. Unfortunately, the "experimental" part would require tossing a die a very large number of times, and would be incredibly boring. Stereotypically, one gets one's grad students to do that. But I'm a grad student and I wouldn't do it.

What makes Mark's problem difficult is the lack of symmetry; each face has to be different. Let's say, hypothetically, that I wanted to make a die which had two faces which come up with one frequency p and four with another frequency q; clearly 2p + 4q = 1. If q = 0 this is just a coin; if q = 1/4 it's some kind of four-sided stick; my guess is that we can smoothly vary q from 0 to 1/4 by taking a wide variety of rectangular boxes with two of the three dimensions the same.

Similarly, if we want a die whose sides come up with frequencies p, p, q, q, r, r we should be able to do it by taking rectangular boxes with all three dimensions different.

If we're willing to go away from the rectangular box, but have a shape which is still topologically a cube (by this I mean all six sides are quadrilaterals, meeting three at each vertex) I think we can get dice where four faces have the same frequency and each of the other two has a different frequency by using a truncated square pyramid; clearly the four trapezoidal sides each come up with the same frequency, by symmetry. There are two degrees of freedom in constructing such a truncated pyramid (say the ratio between the height and the side length of the base, and the ratio between the side length of the top and the side length of the base) so there's no obvious "dimensionality" reason why this wouldn't work. (A degenerate case of this has five of the six faces having the same frequency.)

And it's probably possible to get three faces with one frequency and three with another by "stretching" the three faces around a single corner, although I'm having trouble picturing how exactly one would do that.

But all of these constructions take into account the symmetry of the cube; I'm assuming that faces that "look the same" are landed on with the same frequency. Mark can't take advantage of that symmetry in his problem.

I also suspect that designing exotic dice like these is difficult, or perhaps impossible, because the probabilities might not be "stable"; they might depend on how hard one throws the die, which defeats the purpose of dice, which is to be a source of randomness that can't be controlled by the person throwing them. Wikipedia offers some oddly shaped dice, though, and seems to claim there exists a fair seven-sided die in the shape of a pentagonal prism, which suggests that stability isn't as much of an issue as I suspect.

13th roots and Rubik's cube

In the comments here a few days ago, commenter Michael S. pointed to this article telling us that the record for finding the 13th root of a 200-digit number has been broken by Alexis Lemaire. It is mentioned that the record for the 13th root of a 100-digit number was 23 minutes in 1970, and now Lemaire can do it in under four seconds. This seems a bit suspicious to me, because it seems like it would take more than four seconds just to read the problem and write down the answer, but presumably it's legitimate. I found what looks like a set of rules here.

What I'm curious about is how much of this improvement is an improvement in the existing methods, and how much of it is coming up with new methods. I suspect that some part of the method takes advantage of certain strange properties that only someone who was looking really hard could find. For example, in the case of cube roots it turns out that the cubes of 0, 1, 2, ..., 9 each end with a different digit (0, 1, 8, 7, 4, 5, 6, 3, 2, 9). So if, for example, you have a number that you know is a perfect cube and it ends in 3, it must be the cube of something that ends in 7. There's a similar trick for fifth roots -- n5 and n always end in the same digit.

From what I can gather the methods for the 13th root are a little less ad hoc; at least one person's method basically works by taking the logarithm of the 100-digit number and dividing it by thirteen to get the most significant digits of the 13th root of a 100-digit number; then looking at the last few digits of the 100-digit number gives you the last few digits of its 13th root. One occasionally sees large numbers given in this way, where the first few digits can be found by logarithms, the last few digits by modular arithmetic, but the middle is a lot harder to compute. I can't find descriptions of the method which these mental calculators use for finding 13th roots of 200-digit numbers; one might expect it to be similar, but it doesn't necessarily have to be.

An analogy comes to mind with solving the Rubik's cube. There are people who can solve it very quickly. But there are two metrics for the speed of solving it -- how many "moves" do you have to make to get it back to the starting position, and how many seconds does that take? The first of these is the more mathematically interesting question. A simple combinatorial argument says that there must be positions which are 18 moves from the initial position. The number of positions the Rubik's cube can take is (8! × 38) (12! × 212)/12 = 43,252,003,274,489,856,000. To see this, note that to specify a position of the Rubik's cube, one must specify:
  • the positions of the eight corner pieces, in 8! ways

  • how each of these is "twisted", in 38 ways

  • the positions of the twelve edge pieces, in 12! ways

  • how each of these is "flipped", in 212 ways

and the factor of 12 arises because it turns out not to be possible to get every possible combination of flips and twists. (Douglas Hofstadter referred to these as "conservation laws" in one of his Metamagical Themas columns, because they reminded him of certain laws about conservation of spin or color in particle physics.) Now, let's say you're randomly scrambling a Rubik's cube. There are twelve "moves" you can make from the initial position -- you can twist each of the six faces in either direction. After that, at each juncture there are eleven moves you can make -- it would be redundant to turn a face clockwise and then counterclockwise on the next move. So the number of positions that you can reach in exactly n moves is 12 × 11n; summing these up for 1, 2, ..., 17 gives a number which is less than the number of possible positions of the Rubik's cube. So there are positions which are at least 18 moves from the start. It turns out that there are positions that need 26 such moves ("quarter turns"), and an algorithm is known that can solve any position of the Rubik's cube in 35 quarter turns; see this Wikipedia article, or this paper for an idea of the methods. (They use a different metric, in which turning a face by 180 degrees counts as one move, not two; this lets one replace 18, 26 and 35 above by 15, 20 and 26.)

But the people that solve the cube very quickly are interested in the actual amount of time, which means first that they would not want a method that required lots of standing around thinking, and second that there are considerations of manual dexterity that just don't come into play if you're trying to minimize the number of moves. You can imagine a Rubik's cube competition where people are challenged to unscramble the cube in a minimal number of moves, as opposed to the smallest amount of time; however it would be a much different game.

Unfortunately, the analogy to the 13th-root problem breaks down. That's because in the Rubik's cube case, you can see what the players are doing. So it's possible to count the number of moves they make. You can't count the number of elementary arithmetic operations that someone does in their head when trying to extract a 13th root, and furthermore there seems to be an element of synaesthesia in some of their calculations, so hat they see as an elementary arithmetic operation might not be how a "normal" person would define it. What model of computation one uses is tremendously important here.

05 August 2007

links for 5 August

The Fermi Paradox is back, via Slashdot. The Fermi paradox, for those of you who don't know, is basically the following question: if there are so many examples of extraterrestrial intelligence, as a lot of people believe, how come none of them have contacted us yet? This ties in to one of my favorite overmathematizations, the Drake equation, which computes the number of extraterrestrial civilizations in our galaxy by multiplying seven factors, most of which we have no good idea of. The result is a number with a ridiculously huge margin of error; depending on who you ask, the number of extraterrestrial civilizations that we might be able to here could be anywhere between zero and a million or so. Good expositions of the Drake equation usually point out that we have no way of predicting, for example, the average lifetime of a civilization. One particularly interesting resolution I've seen of the Fermi paradox is that other civilizations decide that they just don't care about talking to other species and spend all their time looking at the local equivalents of Internet pornography and reality television. I'm not saying I believe this, just that I've heard it. A bit more plausible, I think, is the idea that civilizations evolve so quickly that a civilization that was where we'll be in the year 3000 (if we don't kill ourselves first) wouldn't be interested in talking to us. (If you think 3000 is too soon, substitute some year further in the future.) I think it would be interesting to talk to a civilization that was where we were a thousand years ago, but a lot of people believe that the evolution of civilization is accelerating; Ray Kurzweil is probably the best-known exponent of this idea, called the Singularity. I'm a bit suspicious of it because a lot of the arguments seem to rely on the fact that we remember what happened in the recent past much better than what happened in the distant past.

What autistic girls are made of, by Emily Bazelon in today's New York Times. Disorders on the autistic spectrum are usually thought of has being uniquely the province of boys, but they happen to girls too. There are researchers who think of autism as being an "extreme male brain", and if that's true it kind of makes sense that it would be more common among males than females. Also, apparently it's harder to be an autistic woman than an autistic man because women are expected to understand social networks better than men; I'm kind of curious if this has always been true or if it's a historical accident. Vaguely relatedly, Who's a Nerd, Anyway? by Benjamin Nugent from last Sunday's NYT; people who are considered nerds are "hyperwhite", according to the linguist Mary Bucholtz. (This is "white" in a cultural sense, as in the way white Americans tend to act; I don't think the author intends to say that there's anything genetic about being a nerd.) What I find interesting is that this same tendency towards oversystemization can be called either hyperwhite or hypermale, despite the fact that we usually think of sex and race as being orthogonal to each other. Finally, Mark Liberman comments at Language Log on reactions to Nugent's article, and how in general non-scientific bloggers blogging about science, and non-scientific journalists writing newspaper articles about science, make fools of themselves.

The Probabilistic Method by Noah Snyder at Secret Blogging Seminar. I love when people find out that the probabilistic method exists. For those of you who aren't familiar with it, the probabilistic method is a method used to prove that a collection of objects contains some object with a certain property not by actually finding the thing but by just proving that if you pick an object from the collection, it has probability greater than zero of being the thing you're looking for. It's kind of a mindfuck, because many of its applications are in combinatorics and people expect there to be explicit constructions of things in combinatorics. It's possible to have a group of forty-two people such that there's no five of them who all know each other and no five who don't know each other. But I can't explicitly tell you which people in that group know each other and which don't. (This is an example of a Ramsey number.)

partial orderings and power outages

Yesterday my power went out for a couple hours.

I did what I usually do when my power goes out (it doesn't happen that often) -- I went for a walk and hoped it would be back when I got back. That didn't work, so then I remembered some people I sort of knew were having a party (and fortunately I remembered the address!) and went there; when I came back again the power was back.

But at one point the following question occurred to me: when my service from Comcast (which provides my TV and Internet) goes out I get angry. Yet at first it seems like a power outage must cause more inconvenience than a Comcast outage, because when the power goes out, not only can I not use the Internet or TV, but I also can't use the lights (not terribly inconvenient in this case -- the power came back on before dark), refrigerator (you're not supposed to open the fridge during a power outage because otherwise all the cold air gets out), or most importantly the air conditioner. (At one point I tried to turn on a fan. I forgot that fans ran on electricity, too.)

You can imagine a partial ordering of "collections of things that could be wrong with my apartment". Clearly if one collection contains the other, then the latter is worse. Everything that is wrong in a Comcast outage is also wrong in a power outage, so clearly the power outage is worse, right? (Note that it's also possible to compare some collections that don't contain each other. For example, I think I'd prefer the gas being out to the Internet being out, at least in the summer.)

But there are two things that are part of such a collection that I didn't notice at first. The first is the expected length of the problem. The second is the identity of the people that are going to have to fix the problem. These are of course connected; some companies have better customer service than others. But it's not just the time. I never talked to anyone at PECO, my electric company, unless you count their automated hotline that you're supposed to call to report an outage. I never waited on hold. With Comcast I would have waited on hold, had to explain my problem to someone at a call center, and probably had to schedule an appointment to have someone come out and take a look at it; Comcast seems to assume that whatever the problem is, it's on the customer's property (or their landlord's, in my case). So the identity of the problem-fixers is important not just for telling me how long the problem will last, but how much of my own time I'm going to have to spend dealing with it.

And PECO's power outage hotline works by using Caller ID to identify who you are, then says "for your privacy, only the first four digits of the building number where you have service will be repeated". (There was a button to press if the number you're calling from wasn't where you have service.) That's nice of them -- but kind of silly, since almost all buildings in their service area have four digits or less in their number. And I can imagine a situation where this would create confusion for someone who had a five-digit building number -- say I live at 12024 John Street (yes, this name of a street is a bad joke) and I own the house next door, 12026; I rent out that house to tenants and I pay the electric bills there, so the phone number on the electric bill is my phone number, because I don't wish to go through the trouble of changing the phone number every time there are new tenants. (This is reasonable, because people have cell phones and land line numbers are more portable than they used to be.) I call, it says "1202" at this point. Clearly they should use the last four digits; it's a lot more likely that I have some interest in 12024 and 12026 (which are next door to each other!) than that I have some interest in 2024 and 12024. Of course, the real privacy issue here is that it doesn't say what street the building is on.

Also, PECO's power outage hotline pretends to tell you an estimated time when your service will be back. I say "pretends" because every time I've ever called (a few times during each of the two outages I've had here) it gives a time between two and two and a half hours in the future. If you call them at 6, they'll tell you 8:15; if you then call at 7, they'll tell you 9:15; then it'll come back on at 7:45. It's possible that this might just be the average time to restore service and they're concentrating more on restoring service than telling people when it'll be back, which is probably a Good Thing. (I bet they figured nobody would ever figure that out.)

04 August 2007

geographical random walks

Let's say you have a map of the streets in an area. Could you guess, from looking at that map, which intersections the locals consider "most important"?

You could probably make a decent first guess by assuming that the importance of an intersection goes up with the number of roads at the intersection.

Why? Imagine you are randomly walking around a town. Each time you get to an intersection you choose uniformly at random from each of the possible ways you could go. (For the sake of simplicity, I'll assume that this includes going back the way you came.) It's a well-known result that this sort of random walk on a graph leads to a certain stationary distribution; that is, if a bunch of people walk around randomly you'll get to a point where the number of people at any given intersection is roughly constant. And that result is that number of people at any intersection is proportional to the number of roads coming into it. (I'll count the two directions of a road coming into the same intersection as two distinct roads.)

This has implications if, say, you want to start a business. If you're located in the middle of a block, there are two roads coming in. If you're at a "T"-shaped intersection, there are three. A normal grid intersection, four. An intersection like 48th and Baltimore or 23rd and South, five -- both of these have two streets that go through and a third that starts there. (Most of the intersections on Philadelphia's Baltimore Avenue are five-pointed, because the street grid has a sort of discontinuity there; this may explain why it's sort of the main street of its part of the city.) An intersection like Broad and McKean, which has three streets going through it, six. You want to be where there are more roads, so that more people will walk by. In Washington they have intersections like Dupont Circle where five roads go through, for a total of ten "arms". (The counting gets a bit tricky, because some of the roads don't actually go "through".) In Paris there's an intersection with eleven arms, fittingly called L'Etoile. (Much more than that seems impractical.)

Of course, this neglects a huge number of facts. Not all roads are equal. For example, I'm ascribing Baltimore Avenue's primacy in its part of Philadelphia to the fact that it's got five-pointed intersections; but probably more important is that, historically, it was the road that went to Baltimore. (Hence the name.) Also, a trolley runs down Baltimore Avenue and has for over one hundred years; but why is it there, and not somewhere else? Because people already were living or working on or near that street. And they were doing that because other people were. Once a slight imbalance is created -- say by the random-walk model I alluded to above -- people are naturally going to gravitate, even if it's very slightly, towards the place where people already are. The roads that lead towards the important nodes will become important in their own right. And street networks aren't static -- they can change. (River networks can't, though -- and a lot of cities are located where two rivers come together to form a third, the most famous U. S. example probably being Pittsburgh.) But the initial imbalance has to come from somewhere, and this is as good a place as any, especially in places where the roads mostly form a grid and hence their arrangement is determined well in advance of their actually being built.

Incidentally, this is my theory on why Boston and Philadelphia -- two cities which are of similar size and population density, at least if you consider them as the center of their respective metropolitan areas, and the two cities I am most familiar with -- differ in their transit system. Boston is able to have subways because for the most part it's obvious where the subway stops should be. Radical Cartography has a map of Boston as a series of squares; the original transit system was basically built around people getting from one square to the next. Philadelphia would have trouble supporting a subway system more extensive than its current two lines, because it's hard to point to a place that a huge number of people go which isn't served by the existing system; the population is more diffuse, because for the most part Philadelphia's streets form a grid and no node is any different from any other. If I wanted to I could draw, say, six or eight subway lines that I'd like to exist in Philadelphia -- but none of them seem essential to me, at least at the price that it costs to build a subway. (Some Philadelphians will point out that there's a long-standing plan to put a subway down Roosevelt Boulevard; to them, I will say that I have very rarely had any reason to be in Northeast Philly, so I forget about the Boulevard. I'm sorry.) Manhattan has subways -- even though it's for the most part a grid -- because it's so dense. (But I think I heard somewhere that Times Square is the busiest subway stop -- and it's under Broadway, which cuts through the grid diagonally.)

Who gets credit for quadratic reciprocity?

In today's New York Times crossword, there's a clue "Discoverer of the law of quadratic reciprocity." The correct answer is, according to the crossword, this guy. I originally put in this guy instead. It turns out, according to the Wikipedia article on quadratic reciprocity, that "The theorem was conjectured by [first guy] and Legendre and first satisfactorily proven by [second guy]. [Second guy] called it the 'golden theorem' and was so fond of it that he went on to provide eight separate proofs over his lifetime." (I am deliberately obscuring the links because you might still want to do the crossword.)

Now, who should get the credit for "discovering" a mathematical result? The one who first suspected it might be true, or the one who proved it? I'm of the opinion that in this case they should share the credit (along with Legendre), mostly motivated by the fact that both of the people involved are Really Big Names. There are some examples in which First Guy discovered something and it's not named after him, but Erdos (and I don't think I'm spoiling anything by admitting that Erdos is neither First Guy nor Second Guy) once said that Goldbach's conjecture should be named for Goldbach, not First Guy, because "[First Guy] is so rich and Goldbach is so poor, it would be like taking candy from a baby." I don't know the history in the case of quadratic reciprocity. But I'm motivated here mostly by the fact that it's a lot easier to prove something if you already have reason to suspect it's true. For one thing, if you suspect something is true it is often on the basis of data, and you can look at that data and see how you might generalize the patterns you can see in it. (Quadratic reciprocity is almost certainly such a case, since data is easy to generate.) Secondly, there's a tremendous psychological boost to be gained from knowing that someone whose judgment you trust thinks something is true. Mathematicians are trained to think that only proofs matter, and I suspect there's an extreme strain of this that thinks that we really don't have any idea whether something is true until we've proven it or not; but someone who has given copious hints in the right direction surely deserves some of the credit. The tendency seems to be to give the credit to the person who put the last link in place; the highest-profile example is of course Wiles' proof of Fermat's last theorem. But what Wiles really showed was a special case of the Taniyama-Shimura conjecture; Ribet had already shown that Fermat would follow from this special case. So it seems to me that Ribet definitely deserves some of the credit (for establishing that as the right target for anyone wishing to prove FLT) and probably Taniyama and Shimura as well.

In the case of this particular theorem I realize that there are a very large number of people involved in one way or another, and it's hard to know who exactly to assign credit to, or -- and this is getting really silly -- how much credit to assign to them. Fermat proved that x3 + y3 = z3 has no trivial solutions. What should this count for? On the one hand, it's the first case. On the other hand, there are infinitely many cases. Should Pythagoras get some credit for coming up with his theorem, which inspired the whole thing? (Incidentally, any reasonable scheme of assigning "credit" to every mathematical result ever to all the mathematicians who were in some way involved has a pretty good chance of putting Pythagoras on top -- or at least of putting the Pythagorean theorem on top among theorems.) But even in the case of less famous results there are clearly a lot of people who did something towards them -- the people who first conjectured them, the people who proved some special case, the people who disproved some other special case (thereby helping to establish the boundaries of the result), and so on. Fortunately we don't need to find a way to quantify these contributions; decisions of who to hire or who to give prizes to can be made without them. (I'm not saying that the current hiring system is perfect, but it seems to work well enough.) And do you really want mathematicians in charge of some scheme that assigns a number to their total amount of mathematical contributions? Because let's face it, if you made up such a scheme mathematicians would find a way to beat it.

Except if it involved arithmetic. Mathematicians aren't any good at arithmetic.

(Oh, and this was going to be a post on how mathematically oriented people are good at crosswords. But it's not! Oh well. I'm sure I'll get around to writing that eventually.)

03 August 2007

When is number 756?

You can put money at newsfutures.com on Barry Bonds hitting 21 or more home runs this season. (He came into the season with 734, and 734 + 21 = 755, so you're betting on him at least tying Hank Aaron's career record. He's currently at 20 for the season, 754 for the career.)

Not surprisingly, the contract (which pays $100 if he does, and nothing if he doesn't) currently trades for $98.

But I found the following interesting: the page claims "The likelihood of this prediction is computed in real time by a prediction market." Note the word computed. Is what a prediction market does really "computation"? ("Right now" is the middle of the fifth of tonight's Giants-Padres game.) I suppose you could make an argument that it is, but when I hear "computation" I expect to see something like Clay Davenport's prediction -- on May 11, he predicted that Bonds had an 89% chance of hitting #756 by now, basically by guessing how many plate appearances Bonds was likely to get and how likely he is to hit a home run in each plate apparance -- although that builds in an explicit and somewhat ad hoc allowance for injuries, so I'm not sure how accurate it is. But when I hear "compute" I expect the work to be going on in a single silicon-based entity, not a distributed mess of carbon-based ones.

Incidentally, it was this prediction by Davenport that inspired my predictions about the Phillies' ten thousandth loss, which came on July 15, a few days earlier than I expected it.

At the freakonomics blog, you can pick the pitcher you think will give up Bonds' 756th home run. If you guess right you get a signed copy of Freakonomics. Most of the plausible pitchers (i. e. anybody on the pitching staffs of the teams the Giants are playing in the near future) are already taken, though.

a thought on teaching

From today's Philadelphia Inquirer: Masters of "spring" theory, about how physics is now being taught in some high schools by a "modeling" approach, in which students basically try to figure out the principles of physics for themselves, by teacher-facilitated experimentation. This requires the teachers to go through some intensive training themselves in order to teach this way.

To what extent is it possible to teach mathematics this way? I've heard of the Moore method, which seems like an analogue of this, but from what I've read that sounds like it's been most successful at the post-secondary levels. I also don't know if this particular page is the best explanation of the Moore method, because I've never experienced it myself.

(Also, apparently "string theory" is well-enough known that a mainstream news paper feels like they can make puns on it in their headlines. I'm not sure what to make of this.)

information theory for architects

Paul Bourke asks a question about the intersection of three cylinders (via Anarchaia):
Question. Is a 3D object uniquely described by a plan, front and side view?
Answer: No!
In particular, consider a sphere at the center of a Cartesian coordinate system. It can be inscribed in a cylinder with central axis the x-axis, or the y-axis, or the z-axis. The intersection of these three cylinders has the same plan, front, and side view (i. e. projections onto the three coordinate planes) as the sphere, but it's not a sphere. If you want to see what it looks like, there's a picture at the post, of one made from wood. Bourke writes:
I originally designed this object because Architecture students would ask me why the computer couldn't scan their plans and elevations and automatically build a 3D model. This was the simplest demonstration I could come up that there isn't enough information about a 3D object in its plane projections.
Thinking about how much "information" is contained in something is in general a useful idea. My first instinct is that these three projections, being plane curves, contain together about as much information as, say, three cross-sections of the object; you would never expect to be able to recreate an object from three of its cross-sections. I'm not sure if I would have been able to come up with this particular counterexample, but the previous argument would have reassured me that one probably exists.

But this sort of intuition can be misleading, especially when one is doing continuous (as opposed to discrete) mathematics. This line of reasoning would also lead you to think that Fourier series don't exist, because how can we encode uncountably many real numbers (the values of a function at every point in some real number) by only countably many real numbers (its Fourier coefficients)? The correct way to interpret this might be that functions which have Fourier series are somehow extremely rare among all functions, which is true because they've got to be piecewise smooth and continuous. I once asked a foolish question along these lines at a colloquium on the famous problem "Can you hear the shape of a drum?" The problem is basically what it sounds like. There's a membrane stretched in a certain shape, and when you hit it its motion is governed by a certain partial differential equation. The eigenvalues of the PDE, with boundary conditions governed by the shape of the drum are given to you; these correspond to the various frequencies one would hear in the sound if the drum were hit, hence the question. Can you (armed with an oscilloscope and a big fancy computer) tell what the drum looks like, if you don't get to see it? (The answer is "sort of".) It seemed to me for a moment that there's no way you could recover the whole shape from a countable sequence of numbers... but then I was reminded that Fourier series exist.

I prefer to only use this sort of intuition in discrete mathematics, because then usually one can count the things in question (at least in principle) since there are only finitely many of them. For example: it's easy to prove that any algorithm for sorting n items, where we're only allowed to compare them pairwise, must have runtime at least O(n log n). Basically, we want to determine which permutation of Sn is associated with the original list (in layman's terms, what order they started out in). From an information-theoretic point of view, each comparison gives us one bit of information; the total amount of information in the original permutation is log(n!), which is O(n log n) by Stirling's approximation.

a bunch of links, and a joke

Ranking is a trap, from Statistical Modeling, Causal Inference, and Social Science, originally from junk charts.

There's a plot that illustrates how common various first names are, and have been in the past, but instead of plotting the frequency with which the names occur against time, they plot the rank of the name in a list of all names. At least there are enough different names that this shouldn't obscure trends too much; if I remember correctly frequencies of first names follow a power law, with the nth most common name having a frequency proportional to n for some constant α > 1. The problem is worse when considering ranks in smaller populations, because then the deviation from some "ideal" distribution is likely to be a lot worse.

(Incidentally, do last names behave differently than first names? I would suspect they do, because people don't choose their children's last names -- or, if they do, they choose them from a pool of two -- while they choose their children's first names from a larger pool. So "undesirable" last names should stick around longer.)

Sum Divergent Series, III, from The Everything Seminar; this is the sequel to the post I mentioned when I was talking about generating functions a few days ago, and shows how to do some more strange sums (for example, 1 + 2 + 3 + 4 + ... = -1/12). It's followed by an interesting post about Mellin transforms. The idea on which the Mellin transform is based is stated as follows:
Generating functions are useful when you can break down Objects into constituent elements whose Size adds to give you the original Size and zeta functions are useful when you can break down Objects into constituent elements whose Size multiplies to give you the original Size.<

Since the integers have both additive and multiplicative structure, I can see how having a tool for going back and forth between those structures -- the Mellin transform, as it turns out -- would be quite useful.

Bivariate baseball score plots is the blog of bivariate baseball score plots, which features plots of the final scores in baseball games; unfortunately it's difficult to get any great insights just from staring at the plots. (This may in itself be a great insight -- namely that the good teams and the bad teams aren't that different.) In other baseball-related news, Strange Maps features The United Countries of Baseball, a map in which various parts of the country are painted with the colors of various baseball teams. There are other such maps -- common census's map is based on actually asking people; Geographer Dan at Baseball Think Factory has a map of which teams are blacked out on MLB.TV in various locations (although these territories overlap) and somewhere (I can't find it) there's another map of his which just shows which is the closest team to each point.

Finally, a joke I made by accident yesterday:
my friend: I have baby bok choi that I really need to use soon. What should I do with them?
me: Wait for it to grow up. Then do whatever you do with regular bok choi.
my friend's girlfriend: That is such a mathematician answer.

of course, I have no idea what one does with regular bok choi. I am not a fan of leaf vegetables.

02 August 2007

you can't predict the climate, either

Julie Rehmeyer, of MathTrek, writes about how predicting the weather -- or climate -- is basically impossible. The article begins:
Climate models may never produce predictions that agree with one another, even with dramatic improvements in their ability to imitate the physics and chemistry of the atmosphere and oceans. That's the conclusion of a report by James McWilliams, an applied mathematician and earth scientist at the University of California, Los Angeles. The mathematics of complex models guarantees that they will differ from one another, he argues. Therefore, says McWilliams, climate modelers need to change their approach to making predictions.

I had been under the impression that this was already known. It's known as "sensitive dependence on initial conditions"; if we know the weather to within a certain precision ε right now, then after one day we know the weather to within kε, after two days we know it to within k2ε, and so on, where k is some constant larger than 1 . More technically, the Lyapunov exponent of the weather is positive.

I've probably read several dozen versions of the story that is usually told about Edward Lorenz's toy model of the weather. (It would be interesting to see a web page that gives the various ways this particular story has been told, something like this page which gives over a hundred versions of the story of the young Gauss summing 1 + 2 + ... + 100.) The story, if you're not familiar with it, goes like this: Lorenz had a toy model of the weather in his computer, a system of differential equations. (I want to say it was a system of three equations, but I might be confusing it with the Lorenz attractor. Then again, they may actually be the same system.) It was the sixties, so computers were slow. Lorenz had his computer print out the position of the system in phase space at time 0, 1, 2, ...; one day he was looking at one of these printouts and saw a pattern he wanted to investigate. He fired up the computer again and typed in a line from the printout and told it to evolve the system from that point. The system evolved differently in the second run than the first; Lorenz thought it was a mistake, but eventually realized that the figures on the printout were rounded versions of the actual numbers in the computer, so he was introducing a small error by doing this, which was quickly amplified.

The MathTrek article is about climate, not weather, though, and it addresses this point (even anticipating my complaint about Lorenz!). Still, my instinct would have been -- even before reading this -- that the climate is a complex system just like the weather. The Lyapunov time -- the reciprocal of the Lyapunov exponent -- is much larger for climate than for weather. (I am quite confident saying that the average high temperature in Philadelphia next August will be about eighty-three degrees, and I am confident enough in this that when I take my air conditioner down when the summer ends, I will store it in my closet, instead of selling it. But I have a much worse idea what the weather will be on August 2, 2008.) Roughly speaking, climate is the average of weather, and averages change much less quickly than the things being averaged. I am confident that the Phillies will win about 29 of their remaining 55 games and just miss the playoffs, which is something I've gotten quite used to. (I'd be pleasantly surprised if they prove me wrong.) I have no idea whether they'll win tomorrow. (I would have said "I have no idea whether they'll win today," but they're up by four runs right now. It's only the fourth inning, though, so they have time to fall apart.)

The actual study is available here. Apparently the state of the art in climate and weather forecasting is to run a variety of different models on the same initial data; if they end up giving similar results then you can be fairly confident in the correctness of the forecast, while if they vary widely you know the forecast isn't so good. This is an experimental way of determining how sensitive the forecast is to the assumptions of the model. Although I'm not a meteorologist, it would be kind of interesting to see this on weather forecasts. I'm not sure how useful it would be for temperature; would I take a forecast high of "94, plus or minus 3" any differently than a forecast high of just "94"? Probably not. But for, say, snowfall estimates it could be incredibly useful. I don't care so much if they say there will be "two inches of snow". What I really want to know is if there's a chance of having some amount of snow that will seriously inconvenience me (say, over six inches). But I doubt you'll hear this on the TV news, because "we don't really know what the weather is going to be" kills the ratings -- even though "everyone knows" that the TV weather people don't really know what the weather is going to be. I've been known to actually use something like this "ensemble forecasting" myself -- I go to a bunch of different weather forecasts and see what they say. I'm not sure if it actually helps me, but it makes me feel better, usually because when I'm checking multiple weather web sites it means I'm procrastinating.

sex and the smart kids

A story that's recently been making the rounds has been this post mentioned in the Gene Expression blog in April about how "smart kids don't have sex". So far I've also seen it at slashdot, livejournal's "mathsex" community, and the Dilbert blog; it was also mentioned in Marginal Revolution back in April.

The Gene Expression post reproduces the plot which shows that "teens with IQs ranging from 75 to 90 had the lowest probability of virginity (the authors note this is also the same IQ range where propensity towards crime peaks)." "Teens" here means grades seven through twelve, so technically, at least in a lot of states, having sex is a crime for these people. And even if you ignore age of consent laws (as I think most teenagers are likely to do), there's still a strong message from parents and other authority figures that they shouldn't be having sex, and it's hard to find a place to do it. My suspicion is that both of these are tied in with some willingness to break rules. Why a willingness to break rules is correlated with a slightly below-average IQ, I don't know.

This post then goes on to consider college students, and to cite an article from the November 2001 issue of Counterpoint, a joint MIT-Wellesley publication, which has some charts purporting to show that likelihood of being a virgin is correlated with, um, scientificalness of one's major? (The choice of the word "scientificalness" is deliberate here, in that it's not a real word.) I'm inclined not to trust their results here because of sample size; they polled 287 Wellesley students and 236 MIT students, and break up the 287 students into 18 majors and the 236 MIT students into 15 majors. This means than, on average, they polled sixteen students in each major, and are then comparing those numbers.

The headline that grabbed me was the livejournal headline, "83% of undergraduate [...] students in math are virgins". 83%? Sounds like "five out of six" to me, or "ten out of twelve". Furthermore, 83% of Wellesley math majors were virgins; 29% of MIT math majors were virgins. Wellesley's a women's school; most MIT math majors are male. Similarly, they claim to be able to say which dormitories have more virgins than others; however they only mention eight of MIT's ten undergraduate dorms, and five of its several dozen Greek organizations. I can only assume that they got nobody from those places, which leads me to think that their samples from the dorms they did get a sample from weren't too big either.

And it's notoriously difficult to get information about people's sexual histories, even in an anonymous survey. One way I've heard of getting decent aggregate information is the following: instruct the subject to go into a closed room, and flip a coin. If it comes up heads, they are to answer the question (let's say it's "do you masturbate regularly?") truthfully. If it comes up tails, they are to flip the coin again, and answer the question "did the coin come up heads on the second toss"? From the results one can get a good estimate of the percentage of people who would have answered the first question "yes", but nobody feels incriminated.

I'm done criticizing the "study" in Counterpoint. (I remember criticizing it when I was an undergrad; I started at MIT in September of '01, and even after two months it was clear to me that something was fishy here.) But if "smart people" are having less sex, why? First, note that it's not "smart people" so much as "smart kids". There might be some correlation between "being smart" and "following rules", at least if you're using getting into a selective college as a proxy for intelligence. And as I said before, following the rules of society as a whole means not having sex outside a "committed relationship", whatever that is. But on the other hand, I think a lot of intelligence comes from knowing which rules to follow and which rules to break. MIT prides itself on creating the people who will lead the scientific and technological worlds in the future, and to get to that point people are going to have to break the rules, to ask the questions which haven't been asked, to do the things that the old farts have already said are impossible. What makes you think these kids aren't going to have sex?

Then again, "peer pressure" always seemed to me to be weak to nonexistent at MIT. So if college students are only having sex because they feel they "should" be having sex, then they're not going to.

As to whether mathematicians are more or less sexual than the general population, I can't really comment on this. I know I haven't had sex in a while, but other than that I can't really say; most of the people who I feel comfortable talking about sexual matters with are not mathematicians. There seems to be a tendency towards social awkwardness in the "smart kids" in high school, and less so in college, but I think after college it goes away; once people get out in the larger world they tend to find the people around whom they are comfortable. In the absence of the rule-breaking behavior I mentioned before, I would guess that the people having the most sex would be the ones right in the middle of the intelligence distribution, because people tend to have sex with people like them, and there are more people like the average people -- that's what being average means. (I am deliberately saying "average" instead of "mean", "median", or "mode", because intelligence is approximately normally distributed so these are all the same.) This is probably true for a lot of other things as well. The people having the most sex aren't the rich, beautiful people. The people having the most sex are probably of average financial status, average looks, average intelligence, ..., because finding a partner is easiest if you're Just Like Everyone Else.

01 August 2007

another way to estimate π

The Atlantic City Method for computing Pi, from Marginalia at mathpages.com, via today's installment of Anarchaia.

The method is as follows:
Here's the slowest, most inefficient method known for computing p: Choose an arbitrary initial value x (real) and iterate the mapping x -> (2-x)/(3-4x) for a total of M times. Count the number of iterates that fall within a small interval around zero. For example, let N denote the number of iterates that fall between -u/2 and +u/2 for some small u. Then the limiting value of the ratio uM/N is [π] (for sufficiently large M as u goes to zero).
I'm reminded of Buffon's needle, another experimental method for computing π. If we have a floor with parallel lines drawn on it at regular intervals, and we toss a needle on the floor at random, the average number of lines on the floor which are crossed by the needle is related to π. In particular, it's 2l/(tπ) where l is the length of the needle and t is the spacing of the lines. (For those who have seen this problem before, there's often a restriction that the needle has to be shorter than the distance between the lines; then the average number of lines crossed is just the probability that a line is crossed on any given toss of the needle.) It's possible to solve this for π. It's also possible to cheat and say you did the experiment and got a very good approximation to π, by taking into account the well-known approximation 355/113 for π and setting up the probability appropriately. This is described at the Wikipedia article. I'm not sure why anyone would bother to do this.

Another way to approximate π is to pick points uniformly at random in a square of side 2, in which is inscribed a circle of radius 1; the proportion of the points which are in the circle is π/4. I doubt that anybody can pick points uniformly at random from a square with sufficient accuracy for this to work. A computer isn't bad at picking points at random, though, and this leads to the following method of approximating π, which I've heard of before -- pick X and Y uniformly at random, and independently, from [-1, 1]. (This corresponds to picking a point in the square.) If X2+Y2 ≤ 1 (and thus the point is in the circle) increment a counter. Doing twenty such trials ten thousand times I get the following numbers of hits:

7788, 7792, 7806, 7806, 7814, 7825, 7833, 7834, 7841, 7842, 7846, 7850, 7851, 7851, 7870, 7877, 7897, 7914, 7929, 7934

which correspond to estimates of π/4 obtained by just placing a decimal point in front of each of these; we really have π/4 = .7853981634... It's not that surprising that these estimates are a bit spread out. The number of hits is given by a binomial distribution with n = 10000, p = π/4; thus its mean is 2500π (about 7854) and its standard deviation (10000 (π/4) (1-π/4))1/2, or about 41.

However, I can try out the "Atlantic City Method" in the comfort of my living room. (I could try the Buffon's needle method here, since I have the right sort of floor, but I'm not that bored.) Picking a number uniformly at random from [0,1], and carrying out the "Atlantic City Method" as described above with u=.01, N=10000, I get the following values for M:

29 (four times), 30 (seven times), 32 (four times), 33 (two times), 36 (three times)

These each correspond to approximations of 1/π -- just put a decimal point in front. In reality we have 1/π = .3183.

If instead I use u = .1, the situation seems a bit better; I now get

315 (once), 316 (twice), 317 (twice), 318 (three times), 319 (six times), 320 (five times)

and again these correspond to approximations of 1/π if you put a decimal point in front; not bad! It seems surprising that increasing u should do this -- we're told that we should let u approach zero -- but we're able to reduce the ratio of the standard deviation of the number of hits to the mean, and thus make the approximations better, by increasing u.

However, if we increase u even more this doesn't hold up; for example if u = 1 then M = 3524 or 3525. The random aspect of the problem seems to go away but the constant we get is not the one we were looking for! (And if I didn't know the value of π already, then I wouldn't be able to assess this.)

Now, why should this method work? The map f: x -> (2-x)/(3-4x) is an example of a Möbius transformation, also known as a fractional linear transformation. This insight is important, because Mobius transformations can be viewed as isometries of the Riemann sphere; it's clear that this particular transformation takes real numbers to real numbers. My first instinct was that it had something to do with the magnitude of the rotation represented by this particular Mobius transformation, but that doesn't seem to be anything nice. However, near zero the projection taking the complex plane to the Riemann sphere very nearly doubles lengths. (See the picture at the Wikipedia article on the Riemann sphere.) So the length of interval u around zero is mapped to an arc of interval just slightly smaller than 2u on the "real circle" -- that is, the projection of the real axis, plus a point at infinity, onto the Riemann sphere. The sequence of iterates x, f(x), f(f(x)), ... is what one might call a "pseudo-random" sequence of points (it's certainly not random, but I'll think of it as if it were) and it lies in the arc in question a proportion N/M of the time; however it also lies in the arc a proportion just slightly less than 2u/(2π) of the time if you believe the psuedo-randomness, since the arc has length slightly less than u and the circle is a circle of radius 1. As u approaches 0 the error in this approximation goes to zero. So we get N/M = u/π, or π = uM/N. (Of course, this is a heuristic argument, and certainly not a proof, but I just wanted to convince myself that it made sense.) What's more, the actual identity of the linear transformation appears to be a red herring; I suspect this works for a rather large class of Mobius transformations. (Not all, though; for example, for some choices of f the sequence x, f(x), f(f(x)), ... approaches one of the fixed points of f.) I'm having trouble identifying other Mobius transformations that actually have this property, though, so I'm not totally confident in saying this.

turning our backs on Bourbaki

From yesterday's New York Times (July 31): In Games, an Insight Into The Rules of Evolution, a profile of Martin Nowak, a mathematical biologist. His interests lie in trying to understand cooperation, which is important in evolution; he has been coming up with mathematical models for it; one example that's given is descendants of the Prisoner's dilemma where various members of a population interact preferentially with certain other members, instead of randomly. (The members that interact with each other more frequently are the ones that are "near" each other, either geographically or in some more abstract sense.) Cooperation turns out to emerge under conditions with are rather simple to identify.

I think that it's important for mathematicians to collaborate with people outside of mathematics, and to be exposed to ideas that at first glance don't necessarily appear mathematical. We're always going on and on to our students about how mathematics can be applied to large parts of everyday life, and I believe that's true. But at the same time, the mathematical community seems to look down on those who actually try to do so, instead lionizing people like Wiles and Perelman who have solved problems that have apparently no relevance to the real world.

The linguist Steven Pinker says that “Martin has a passion for taking informal ideas that people like me find theoretically important and framing them as mathematical models... He allows our intuitions about what leads to what to be put to a test.” I believe that this is one of the most important services we can render to the world. What mathematics is good at is stripping away the parts of a problem that are irrelevant and reducing it to its essence; seeing that that essence has something in common with many other apparently different essences; and finally using that knowledge to solve the problem. (It's like the "applications" exercises in most calculus textbooks, except not stupid. In such textbooks the translation from a real-world problem to mathematics is usually so simple as to just be an annoyance.)

Now, I believe that mathematical research that apparently has no use in the "real world" should be allowed to continue, and be funded with tax dollars, because we don't know what bits of mathematics that we're coming up with now will turn out to be "useful" a couple hundred years from now. (The canonical example here, I think, is that number theory has turned out to be very important for cryptography.) But at the same time, it seems to me that the mathematical community turns its backs on those who dare to actually think about those practical applications.

I hope I'm wrong.

how to break up time and space

Squaring the hexagon, from Strange Maps. This map illustrates a proposal for dividing France into perfectly rectangular (well, as rectangular has something on a sphere can be -- but France is small enough that one needn't worry too much about this) regions, of which there were 80 or so. (As you may note, France is not a rectangle.)

This proposal was made around the time of the French Revolution, and it has the ring of proposals from that period; these are the same people that invented the French Republican calendar and the metric system. The metric system has turned out to be a good idea (although some people will argue that many of its units don't correspond to anything on a human scale). The Republican calendar, for those who aren't familiar with it, broke up the year into twelve thirty-day months, of three ten-day "decades" each (these are analogous to weeks); there seems to be no a priori reason why this doesn't work, except that we're used to seven-day weeks. It's rather inconvenient that seven is prime, in fact; it would be useful to have a unit of the "half-week".

If you look at a Major League Baseball schedule some time, you'll see that that's actually the fundamental unit of baseball scheduling; teams generally play two series a week, one of them running from Monday to Wednesday, Tuesday to Thursday, or Monday to Thursday, and the other from Thursday or Friday to Sunday. You might notice that Thursdays are kind of ambiguous in this scheme; sometimes Thursday is the end of a series, sometimes it's the beginning, and sometimes it's neither. Teams often have off on Mondays or Thursdays. The weirdness of Thursday in baseball scheduling is a direct consequence of the fact that seven is odd.) In fact, I'd argue that the fact that the week has seven days is a lot more inconvenient than a thirteen-month year would be; the fact that thirteen is prime, and that therefore no simple fraction of a hypothetical 13-month year is a whole number of months, is pretty much irrelevant. When was the last time you ever did something that took exactly three months (one quarter) or six months (half a year)?

Returning to geography, drawing straight lines as boundaries often seems to be a bad idea; lines that are drawn without regard for population centers inevitably, if you draw enough of them, pass through some population center and divide it up. This causes the people in charge on either side of the line to ignore the other side in government, making the region less able to function as a whole. Surprisingly, it's hard to think of population centers in the U.S. that lie along a state border that's just a line on the map; most of the big population centers near state borders are along rivers, such as Philadelphia, New York, or Washington (note that even if the District of Columbia didn't exist, Washington would be on the Maryland-Virginia border). This isn't surprising; rivers are natural dividing lines. But I suspect that the situation would be different in more densely populated France. And drawing lines on a map without regard for settlement patterns is a cause of the chaos in the Middle East.

The difference between the U. S. and Europe is that in Europe the lines have evolved along with the historical population patterns (which haven't changed that much even in centuries), whereas in the U. S. a lot of the lines between states were drawn before the states in question were settled. One can't expect the people making the maps to forecast in advance where people are going to choose to live, especially in modern times when the population centers grow up along roads (which people can build) and not rivers (which are where they are).

A lot more thoughts on how territory is often divided up -- countries into states, states into counties, etc. -- can be found in Ed Stephan's book The Division of Territory in Society, text available online. His hypothesis is that there's a relationship between the size of a state, county, etc. and its population density; smaller states within a country, counties within a state, and so on tend to be denser. This can be derived from the assumption that "social structures evolve in such a way as to minimize the time expended in their operation", although it only gives a relation between density and county area. It doesn't make forecasts about shape, and it seeems to assume that densities are locally uniform when this is very far from the case.