31 August 2007

Patent for a five-sided dice [sic]

A patent for a "five sided dice" [sic].

The patent was filed by Louis Zocchi, inventor of the Zocchihedron, which is a 100-sided die; the Wikipedia article indicates that the Zocchihedron isn't a fair die.

Will this one be?

Previously I wrote about how one might design asymmetric dice; Mark Dominus claims that "the probability that the hexahedron will land on face F is not proportional to the area of F, but rather to the solid angle subtended by F from the hexahedron's center of gravity." I'm not sure if I believe this. It seems reasonable, because it captures how the die is likely to rotate in the air, but dice bounce when they hit the table, and I'm not convinced that the "bouncing" behavior isn't chaotic.

Anyway, the patent indicates that the die is basically a triangular prism (although with beveled edges), with 1 and 5 on the triangular faces and the pairs (2,3), (2,4), (3,4) on the rectangular faces (thus 2, 3, or 4 will appear "upwards" when the die comes to rest); by symmetry, 1 and 5 should occur with the same frequency, as should 2, 3, and 4. So a fair such die exists: one only has to tune the thickness until the triangular faces come up two-fifths of the time.

Part of the patent reads as follows:
The present invention has been tested for fairness wherein different sizes of dice were included in the test ranging from 13 to 18 millimeters in thickness.... During initial testing, it was felt that the 14 millimeter thickness was the closest size to providing equally random outcomes for each of the five faces so that each face would occur one-fifth of the time. Specifically, 10,163 rolls were made of the 14 millimeter thickness test dice which yielded 6,152 rolls in which a rectangular silhouette was seen and 4,011 rolls which yielded a triangular silhouette. This means that the two triangular faces came up 4,011/10,163 = 0.3947 of the time. If the dice was perfectly fair, those faces should come up exactly 0.4000 of the time. Given the number of rolls, the uncertainty (one standard deviation) was estimated to be 0.0070 which indicates that the experiment detected no significant deviation from fairness.
The actual standard deviation is more like √((10163)(.4)(.6))/10163 = 0.0049, meaning the results were a bit over one standard deviation from fairness; by the usual standards of statistics, though, it's still within a 95% confidence interval (i. e. within 1.96 standard deviations).
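A quick recomputation of that estimate, using only the figures from the patent excerpt above:

```python
from math import sqrt

# Fairness check from the patent's data: 10,163 total rolls,
# 4,011 showing a triangular silhouette, against a fair-die
# expectation of p = 2/5.
n, triangular = 10_163, 4_011
p_hat = triangular / n      # observed proportion
p = 0.4                     # expected proportion for a fair die

sd = sqrt(p * (1 - p) / n)  # one standard deviation of the proportion
z = (p_hat - p) / sd        # how many standard deviations off

print(f"observed: {p_hat:.4f}")  # ~0.3947
print(f"sd:       {sd:.4f}")     # ~0.0049
print(f"z-score:  {z:.2f}")      # ~ -1.10, inside the usual 1.96 cutoff
```

The patent's 0.0070 looks like an overestimate of the standard deviation; the binomial formula gives about 0.0049 for these counts.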

Eventually, it seems, these will be manufactured at a thickness of 13.6 millimeters (which would favor the triangular faces slightly more than the 14-millimeter thickness does), but the patent then states that
It is believed that the dice may be ultimately manufactured in a range of size from 13 to 15 millimeters depending on the type of material they are to be used on.

It seems like a lot of trouble to have different dice for different purposes, which the inventor seems to think would be needed for fairness. (Perhaps this has something to do with the "bounciness".) There's a standard shape for a ten-sided die which could easily be used for this purpose (just label opposite sides with the same number), and on purely symmetrical grounds it's fair. I've been informed that rolling a ten-sided or twenty-sided (icosahedral) die and reducing mod 5 is standard among people who play role-playing games.

how maps are like mathematics, and maps of mathematics

Yesterday I wrote about different ways of drawing the Interstate highway system, and introduced this map by Nat Case that clearly shows it as a grid. Case had written about the map at Map Head, and says there that people accused him of copying this map by Chris Yates.

Apparently some people aren't happy about Case's map, because they're claiming that Case just ripped off Yates' idea. I don't know anything about intellectual property law, but it seems to me that there are a lot of maps of the same area that look very similar and convey almost the same information -- maps that are much more similar than those of Case and Yates. For example, two ordinary road maps of the same area should look very similar, because they'll depict the same roads! The impression I have is that the information (where the roads are) is freely available but the particular way in which it's drawn (if this isn't just a simple transcription of the information) is not. To quote one of the commenters here:
Copyright law expressly does NOT apply to "ideas", only to the execution of ideas. If this dude had copied the colors, font, or any other specific element from Chris' map, then it would be a different story (and "straight lines" isn't quite a specific element). If he had made his own map look like a subway map, even then it might have been a fuzzy area.

But, what seems to have happened was that he looked at Chris' map, then started from scratch creating his own expression of a similar idea. While it may have been courteous for him to credit Chris for the inspiration more obviously than he did, he certainly isn't legally required to do so. In fact, this very process of re-interpreting others' work is basically how art progresses in society in general. To use your analogy, I think this guy looked at a Picasso, said, "Hmm, cubism is cool," and painted his own cubist painting of the same model that Picasso used.

To this I would add that mathematics works the same way. Most proofs of the same mathematical fact, most expositions of the same concept, etc. will look the same in outline, simply because they have to respect how the ideas stand in logical relation to each other. But the choice of notation, of words to go between the equations explaining what's going on, and so on is unique to each author, and probably has a lot to do with what else is going on in a particular book or article. One might choose to emphasize certain parts of a proof -- doing some simple algebra in full instead of just saying "the reader can verify this" -- because the same calculation will come in handy later. Notation seems analogous to graphic design elements like color; a good map uses colors which are clearly different for things which are different, and good notation does the same. (I've had professors who used a and α, u and μ, or v and ν in the same problem, and had bad handwriting. No! I'd even go so far as to say that using letters which look similar in typed work is bad practice, because the conscientious reader will be copying your notation by hand in order to check things.) I doubt there are two proofs out there of, say, the Fundamental Theorem of Calculus (to take a result that's been reproduced in a zillion textbooks) that are word-for-word and symbol-for-symbol the same, just like there probably aren't two maps of Manhattan that are pixel-for-pixel the same.

The point here is that there's good exposition and bad exposition, which coincide roughly with good maps and bad maps. A good map, or a good writer, will help you navigate a tricky area; a bad map will just confuse you.

And what would a map of mathematics itself look like? Dave Rusin has given it a shot, but each area of mathematics is just a circle to him, and it's not clear to me how they're connected to each other, or why his circles are arranged the way they are. He says at his A Gentle Introduction to the Mathematics Subject Classification Scheme that "[t]he welcome page for this site shows an image of the areas of mathematics which shows the relative numbers of recent papers in each area (arranged so as to illustrate the affinities among related areas)." It's a decent job -- nothing seems too far from where it should be, and I imagine that projecting the whole thing down into two dimensions makes it quite difficult! -- and it's not just based on Rusin's prejudices. It seems to come from correlations between classifications in that scheme -- two classifications are "close together" if there are papers which are classified in both of them. This somehow gives rise to a 61-dimensional space (the 61 dimensions presumably corresponding to the 61 two-digit classes in that scheme) of which this map is a two-dimensional projection. The vertical direction seems to work out with "discrete" mathematics at the top and "continuous" mathematics at the bottom; I'm not sure what the horizontal direction represents. (I think one could make an argument for "pure" on the left versus "applied" on the right, but it's a weak one.)
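Rusin doesn't spell out how his projection was computed, but one standard way to collapse this kind of co-classification data into two dimensions is classical multidimensional scaling. Here's a sketch on entirely made-up counts for five subject classes (the real version would use all 61 two-digit classes):

```python
import numpy as np

# Toy co-classification counts between five subject classes (invented
# for illustration): entry (i, j) = papers carrying both classifications.
classes = ["05 Combinatorics", "11 Number theory", "60 Probability",
           "62 Statistics", "68 Computer science"]
co = np.array([[50, 12,  8,  2, 15],
               [12, 40,  3,  1,  6],
               [ 8,  3, 45, 20,  5],
               [ 2,  1, 20, 35,  4],
               [15,  6,  5,  4, 30]], dtype=float)

# Turn co-occurrence into a dissimilarity: frequently co-classified
# classes should end up close together on the map.
sim = co / np.sqrt(np.outer(co.diagonal(), co.diagonal()))
dist = 1.0 - sim

# Classical MDS: double-center the squared distances, then use the top
# two eigenvectors (scaled by the root eigenvalues) as 2-D coordinates.
m = len(classes)
J = np.eye(m) - np.ones((m, m)) / m
B = -0.5 * J @ (dist ** 2) @ J
vals, vecs = np.linalg.eigh(B)
order = np.argsort(vals)[::-1]
coords = vecs[:, order[:2]] * np.sqrt(np.maximum(vals[order[:2]], 0))

for name, (x, y) in zip(classes, coords):
    print(f"{name:22s} ({x:+.3f}, {y:+.3f})")
```

The choice of dissimilarity here is my own guess at the kind of thing Rusin did; the point is only that a 61-dimensional co-classification space projects down to a plane mechanically.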

The same can be done with individual papers, or with mathematicians; see, for example, the papers of eight Fields medalists.

I sense that more could be done. Which areas of mathematics do you need to learn before learning certain others? Which areas historically grew out of each other? Within an area, how are the important results related to each other? And how could this be illustrated pictorially? I'm not sure what good this would be, though.

(At least two people seem to have done something similar for motorways in England; not surprisingly, they both did their maps in the style of the London Underground map. It probably won't surprise you to learn that Great Britain's road numbering scheme isn't a grid, but rather divides England and Wales into zones emanating from London, and Scotland into zones emanating from Edinburgh.)

30 August 2007

ways of drawing the Interstate Highway System

Earlier this year, Chris Yates made available online a simplified map of the Interstate highway system, where the interstates are for the most part straight lines. This made its way around the Internet for a while, and it was frequently pointed out that it was wrong -- certain interstates just aren't on it. (The ones I noticed were I-83 (which runs between Harrisburg and Baltimore) and I-88 (Albany to Binghamton); a lot of other people pointed out that it also omits the entire state of Wisconsin.)

It occurred to me that it would be nice to see this map done correctly, and I'm trying to do it myself; unfortunately I don't have a really big piece of paper! It's probably possible to represent all the two-digit Interstates on a single piece of paper, though. But this map from Hedberg Maps looks like what I was thinking of -- it basically assumes that as many of the two-digit Interstates as possible actually form a grid and then fills in the rest, including the three-digit Interstates, accordingly. Too bad it's $40. They explain how the map works here; basically, the map's author, Nat Case, attempted to respect the topology of the system and align the main roads of the system with the grid, so the distance between, say, I-70 and I-80 is the same as the distance between I-80 and I-90. (Case claims inspiration from Yates' map, but his map seems to be a lot less visible on the Internet, probably because you can't view it online.)

A lot of people compare these to "subway maps", but there's one big difference. On a map of a subway system, the various subway lines are generally denoted by different colors; this makes it obvious when one has to transfer from one line to another. (See, for example, the standard Washington Metro map; there's a recolored version of the map to indicate the different service patterns they run on July 4th.) And this makes sense, because a change of trains has a substantial cost -- you've got to get off the train, often walk up or down some steps, and wait for another train to come. Roads don't work like that -- the cost to go from one highway to another is minimal.

I'd also like to see a similar map of, say, Philadelphia; the street signs indicate that each intersection is a certain distance west or east of one axis (Front Street or Germantown Avenue, depending on location) and north or south of another (Market Street), but the grid breaks down the further you get from the historic city. I actually live along a "seam" in this grid; as you walk down my street the numbers immediately go from the 500s to the 900s. How is the map distorted if you try to place locations not where they are, but where the addressing system implies they are?

edited: Nat Case talks about the map at mapHead; you can see a larger version of the map here (1080 by 1080, which is large enough to get a very good idea what's going on). It's particularly interesting how some of the shapes of states get distorted here -- Florida is now much longer east-west than north-south, for example, Arizona and New Mexico end up having roughly the shape of Chile, and Illinois is huge.

Personally, I think it's a little crowded in the northeast, which is something you might not necessarily expect from the description.

And now I don't have to make this map myself to see what it looks like, which is a relief because I'm not particularly good at this sort of thing.

profiles of interconnection networks

Flajolet and Sedgewick's Analytic Combinatorics (link goes to 12 MB PDF of the book, which is still in progress) is a most interesting book. I took a course based on the first three chapters of the book last fall; the purpose of the class was basically to teach us how to translate combinatorial structures into generating functions, in a fairly routine way. Unfortunately, we didn't get to the part of the text where we could then tell things about the asymptotics of those combinatorial structures, which is a tremendously useful thing to do.

So I've been working through parts of the book lately. In example V.8 Flajolet and Sedgewick talk about a problem considered by Lagarias, Odlyzko and Zagier in their paper On The Capacity of Disjointly Shared Networks (although not quite in the same language). Flajolet and Sedgewick ask: "There are 2n points on a line, with n point-to-point connections between pairs of points. What is the probable behavior of the width of such an interconnection network?" The width is defined in the following way: imagine the connections as circular arcs between the points on a horizontal line; let a vertical line sweep from left to right; then the width is the maximum number of arcs encountered by such a line.

For example, if we have n = 6 (and thus 12 points), we might make the connections 9-12, 5-10, 1-2, 6-8, 3-11, 4-7; as the line sweeps between 6 and 7 there are four open connections (and never are there more), so the width is 4. The "profile" of this network is the sequence given by the number of open connections just after the line sweeps past 0, past 1, past 2, and so on (call these a₀, a₁, a₂, ...); this is the sequence

0, 1, 0, 1, 2, 3, 4, 3, 2, 3, 2, 1, 0

and in general this sequence increases to somewhere around n/2 and then decreases. Flajolet and Sedgewick say that Louchard proved this, but I haven't looked at that paper. What surprised me, when I saw this problem, is that the profile actually looks something like a parabola, as you can see from the simulations at the top of p. 311 in Flajolet and Sedgewick. Louchard proved that the profile "conforms asymptotically to a deterministic parabola 2nx(1-x)... to which are superimposed random fluctuations of O(√n)". So, for example, in such a network with one thousand nodes (n = 500), the profile goes as high as 250 (when x = 1/2) before going back down to zero.

Why should this be?

Imagine that you're putting together a random interconnection network. The profile sequence clearly has the property that each member aₖ is one more or one less than the previous one. It'll be one more if k is the first member of one of the pairs, and one less if it's the second member. What's the probability that k is the first (i. e. smaller) member of its pair? If k is small, it'll be large; if k is near 2n then it'll be small. In fact, if we know k is a member of some pair, the other member of the pair is equally likely to be any integer from 1 to 2n except k; thus the probability that k is the smaller member of its pair is approximately 1 - k/2n, and the probability that it's the larger member is approximately k/2n.

Thus aₖ = aₖ₋₁ + 1 with probability 1 - k/2n, and aₖ = aₖ₋₁ - 1 the rest of the time. The expected value of aₖ - aₖ₋₁ is approximately (1 - k/2n) - (k/2n), or 1 - k/n. So, for example, one-fourth of the way through the process, when k = n/2, we expect aₖ to increase by about one-half each time we increase k by one.

Integrating with respect to k, we get

aₜ ≈ ∫₀ᵗ (1 - k/n) dk = t - t²/(2n)

which is just a rescaling of Louchard's parabola. Flajolet and Sedgewick's book is full of examples of things like this, where some apparently random structure turns out to have a lot of order when you "zoom out".
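The whole story is easy to check by simulation. Here's a sketch (the function and variable names are mine) that averages the profile over many random matchings and compares it to Louchard's parabola 2nx(1-x):

```python
import random

def profile(n):
    """Profile of a random matching of 2n points on a line: entry k is
    the number of arcs still open after sweeping past the k-th point."""
    points = list(range(2 * n))
    random.shuffle(points)
    partner = {}
    for i in range(0, 2 * n, 2):      # pair up consecutive shuffled points
        a, b = points[i], points[i + 1]
        partner[a], partner[b] = b, a
    open_arcs, prof = 0, [0]
    for k in range(2 * n):
        open_arcs += 1 if partner[k] > k else -1
        prof.append(open_arcs)
    return prof

n, trials = 500, 200
avg = [0.0] * (2 * n + 1)
for _ in range(trials):
    for k, v in enumerate(profile(n)):
        avg[k] += v / trials

# Compare the averaged profile to the parabola 2n * x * (1 - x).
for x in (0.25, 0.5, 0.75):
    k = int(2 * n * x)
    print(f"x={x}: simulated {avg[k]:6.1f}, parabola {2 * n * x * (1 - x):6.1f}")
```

With n = 500 the averaged profile sits within a few units of 187.5, 250, 187.5 at the quarter points, just as the parabola predicts.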

29 August 2007

non-nonstandard calculus, and the Carnival

Matt at The Everything Seminar talks about his experiences in teaching "non-nonstandard calculus". Formally, this is calculus done over the ring R[dx]/((dx)² = 0); informally, this is calculus where dx is treated as a number whose square is zero. He writes:
By treating infinitesimals on the same footing as finite numbers, the approximation schemes of calculus become more intuitive. I think every mathematician has discovered this on their own, in their own private language. Why not make the language commensurable with the computations that we do?
I agree with this -- the reason we do whatever computations we do in a certain way is often because it's the easiest way we know. Why make the students suffer more than we have to?
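The ring R[dx]/((dx)² = 0) is what's usually called the dual numbers, and it's concrete enough to implement directly. This sketch (class and function names are my own) differentiates a polynomial with no limits in sight:

```python
class Dual:
    """Numbers a + b·dx with dx² = 0; the dx-coefficient of f(x + dx)
    is automatically the derivative f'(x)."""
    def __init__(self, a, b=0.0):
        self.a, self.b = a, b   # real part, dx-coefficient

    def __add__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.a + o.a, self.b + o.b)
    __radd__ = __add__

    def __sub__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.a - o.a, self.b - o.b)

    def __mul__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        # (a + b·dx)(c + d·dx) = ac + (ad + bc)·dx, since dx² = 0
        return Dual(self.a * o.a, self.a * o.b + self.b * o.a)
    __rmul__ = __mul__

def derivative(f, x):
    """Evaluate f at x + dx and read off the dx-coefficient."""
    return f(Dual(x, 1.0)).b

# d/dx of x³ - 2x at x = 3 is 3x² - 2 = 25, by pure algebra.
print(derivative(lambda x: x * x * x - 2 * x, 3.0))  # 25.0
```

This is essentially forward-mode automatic differentiation; extending it to division and transcendental functions just means declaring each function's action on the dx-coefficient.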

As people have pointed out in the comments to Matt's post, this approach seems to have nothing to say about integration. But does the usual approach to teaching differentiation have anything to say about integration?

In general, it seems to me that mathematics and mathematics education are more separated from each other than they need to be. Obviously, some differences are necessary -- but why erect an artificial wall? (I feel this somewhat keenly because I am at the moment in my life -- two years into grad school -- where I am expected to jump over this wall.) John Armstrong has written about the Carnival of Mathematics and how it is becoming split between people who do lower-level mathematics and people who do higher mathematics, with the lower-level people winning. But it strikes me that they're claiming that the difference is one of kind, not one of degree. There should be a smooth continuum from 1+1=2 to, say, the Riemann hypothesis -- everything currently taught to students was research once -- but there isn't. It seems, though, that the Carnival of Mathematics welcomed submissions at all levels and it just turned out that there are more people writing at the lower levels.

This blog goes back and forth between levels quite often, so you'd expect me to think this. And that's intentional. One of my goals here is to make the point that mathematics -- well, at least some parts of it -- is not all that divorced from everyday experience.

(Oh, and by the way, yes, I changed the layout. People have said for a while they didn't like it; the Adsense ads were making me only a pittance; and the old layout was orange and blue, which are the Mets' colors.)

28 August 2007

the lighter side of financial mathematics

Engraved Portraits of Gauss for sale, just 40 dollars!

These are in fact 10 Deutsche Mark notes. I have the feeling that the seller, Acme Klein Bottle, sold them at a lower price before 2002.

I'm kind of tempted to order one, but forty bucks is forty bucks, and I'm a grad student. It kind of seems like a nice conversation piece, though. Besides, for two dollars more I could get a Klein bottle. Or I could get a Klein bottle hat. I knew someone who tried to knit one of these once; I don't remember if she succeeded. Acme also sells Mobius band scarves, which I suspect would be quite annoying because I like my scarves to have ends. These would have novelty value and keep me warm.

(You might also consider these portraits of Euler, or any of the portraits from this gallery. Rather strangely, all the portraits have numbers on them.)

But let's say, hypothetically, I bought the portrait of Gauss. I could hang it on my wall and people would wonder why I hung money on my wall instead of spending it. It would kind of be like a Knuth reward check -- Knuth pays a bounty of $2.56 for each error people find in his books.

In 2002, Knuth said in this article in the Notices of the AMS, when asked what would happen if all his reward checks were cashed:
There's one man who lives near Frankfurt who would probably have more than $1,000 if he cashed all the checks I've sent him. There's a man in Los Gatos, California, whom I've never met, who cashes a check for $2.56 about once a month, and that's been going on for some years now. Altogether I've written more than 2,000 checks over the years, and the average amount exceeds $8.00. Even if everybody cashed their checks, it would still be more than worth it to me to know that my books are getting better.
Knuth didn't answer the question directly, but I assume he'd be okay if they all were cashed -- I have a feeling he's got some money saved up.

By contrast, Erdos once said that he would not be able to pay out all the rewards he had offered on various problems, comparing the situation to a run on the strongest bank, which could not survive if all its customers simultaneously demanded their money -- though he thought the bank run the more likely event. I'm kind of curious if there's a list of Erdos problems out there. This article indicates that people usually did cash Erdos checks, perhaps because the amounts of money involved were greater -- and back then, you got a cancelled check back from the bank anyway. Ronald Graham estimates that the total outstanding bounties on Erdos problems are about $25,000, although it seems he's not sure, because as of that writing there was no list of them. I was able to track down a list of a couple dozen or so.

And while I'm talking about money: I previously wondered about the density of money, and I concluded that the density of U.S. coinage -- if we make certain reasonable assumptions about how change is given -- is $28.58 per kilogram. This was inspired by the fact that I had some change I needed to cash in.

I cashed in my change recently; I had 95 quarters, 126 dimes, 76 nickels, and 339 pennies, for a total of $43.54. This is also 2,052 grams, for a money density of $21.22 per kilogram. As I said in my earlier post, I tend to use quarters for laundry. One expects dimes, nickels, and pennies to occur in a 2:1:5 ratio in randomly occurring change; my actual experience is not too far from that. In randomly occurring change, though, one expects three-fourths as many quarters as pennies; I didn't have nearly that many quarters.
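Redoing the arithmetic from the coin counts, assuming the standard US Mint coin masses:

```python
# Recomputing the money-density figure from the coin counts, assuming
# standard US Mint masses (quarter 5.670 g, dime 2.268 g,
# nickel 5.000 g, penny 2.500 g).
coins = {            # count, value in cents, mass in grams
    "quarter": (95, 25, 5.670),
    "dime":    (126, 10, 2.268),
    "nickel":  (76,   5, 5.000),
    "penny":   (339,  1, 2.500),
}

cents = sum(n * v for n, v, _ in coins.values())
grams = sum(n * m for n, v, m in coins.values())

print(f"total:   ${cents / 100:.2f}")                          # $43.54
print(f"mass:    {grams:.0f} g")                               # 2052 g
print(f"density: ${cents / 100 / (grams / 1000):.2f} per kg")  # ~$21.22
```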


Jessica Hagy's blog Indexed features a wide variety of quasi-mathematical doodles on index cards (via Freakonomics). She writes:
This site is a little project that lets me make fun of some things and sense of others. I use it to think a little more relationally without resorting to doing actual math.

Some of the mathematical things she uses:

Venn, or Euler, diagrams: this diagram of scary stories told to kids (which include Aesop, the Brothers Grimm, and Leviticus), and her explanation of humor at Freakonomics, among many others. (For the record, a Venn diagram specifically requires that all possible intersections of the sets in question and their complements, except for those containing both a set and its complement, be nonempty; an Euler diagram shows the actual intersections and inclusions between sets.)

A visual joke about tangents. (This reminds me of a story: I was born at 11:41 in the morning. The first time I learned about Lie algebras, it was my birthday, and I had a lecture that morning to attend from eleven to twelve. I asked a question at about 11:39 or so. My professor went off on a tangent about Lie algebras, and I looked at my watch while he was talking and it was 11:41. I felt all special.)

Correlation does not imply causation. But sometimes correlated things are related by causation: if there's more food around, you get fat.

Optimization problems in music and in food.

Meaningless charts, which I think I've ranted about before.

The same diagrams get recycled with different labels -- for example, this post on money and status, this one on the evils of gym class, and this one on career choice. This is an example of the general phenomenon that lots of apparently different things are governed by the same mathematical structure.

And normal people get laid most following a normal distribution!

The complete graph on six vertices is good for something. Seven vertices, too, but she can't draw a regular heptagon. (Can you?)

dimensional analysis and modified Newtonian dynamics

I just finished reading Lee Smolin's book The Trouble With Physics: The Rise of String Theory, The Fall of a Science, and What Comes Next. It's really two books in one -- one about the history of modern physics (including, roughly speaking, how the string theorists have taken over) and one about how Smolin believes this is a Bad Idea. Apparently string theory doesn't look so promising these days, and yet the string theorists still dominate physics departments. He's currently at the Perimeter Institute, one of the aims of which is to break down this orthodoxy and give its researchers the freedom to pursue their interests.

Smolin, in examining possible research questions for the future, considers that perhaps it would be a good idea to try to find new interpretations of already surprising facts. For example, consider the length scale implied by the "cosmological constant", which is the length scale over which the universe is curved; this he calls R, which is about 10 billion light-years. What else happens on this scale? The obvious thing is, of course, the universe as a whole. But it turns out that c²/R also seems to be an interesting scale to look at. The constant c²/R (where c is the speed of light) has the units of acceleration, and is about 10⁻⁸ centimeters per second squared. It seems, actually, that Newton's law of gravitation might break down for very small accelerations; this is Mordehai Milgrom's theory of modified Newtonian dynamics (MOND). This can be seen by looking at stars orbiting the centers of their galaxies; when the acceleration due to gravity is greater than this critical value, Newtonian gravity seems to work, but when it's less than this critical value it doesn't. When you get below the critical value, it appears that gravitational force falls off only as the inverse distance, not the inverse square distance.

When I read this, though, I was a bit skeptical, because Smolin says that when the acceleration is small, it varies as the square root of mass. This seems to violate everything I know; shouldn't forces be additive? If I'm reading Smolin's version correctly, it's saying that a galaxy four times as heavy should produce an acceleration only twice as large... but only if you view that galactic center as one large object, and not four smaller ones. A bit more ridiculously, consider a galactic center of mass M. It exerts a certain force F on some object far away. But now view that same center as k² centers of mass M/k². Then each of those centers exerts a force F/k on the object; the total force on the object is thus k²(F/k) = kF. How does the galactic center "know" whether it's one object or many?
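To put numbers on the objection: in the deep-MOND regime the acceleration toward a point mass M is usually written a = √(GMa₀)/r, which indeed scales as √M. The sketch below (the galactic mass and distance are made-up round numbers) shows that splitting one mass into four pieces doubles the naively summed pull:

```python
from math import sqrt

G  = 6.674e-11   # m^3 kg^-1 s^-2
a0 = 1.2e-10     # m/s^2, Milgrom's acceleration scale

def deep_mond_accel(M, r):
    """Acceleration toward a point mass M at distance r in the
    deep-MOND regime: a = sqrt(G*M*a0)/r, which scales as sqrt(M)."""
    return sqrt(G * M * a0) / r

M = 1e41   # a galactic-center-ish mass, kg (illustrative)
r = 1e21   # roughly 30 kpc, m (illustrative)
k = 2

whole = deep_mond_accel(M, r)
split = k**2 * deep_mond_accel(M / k**2, r)  # same mass as k² = 4 pieces

print(f"one mass M:      {whole:.3e} m/s^2")
print(f"four masses M/4: {split:.3e} m/s^2")
print(f"ratio: {split / whole:.1f}")  # k = 2: the pieces sum to twice as much
```

The resolution in actual MOND is that the theory modifies the acceleration law (or the Poisson equation) rather than assigning each source an independently additive force, but the code captures why the naive square-root-of-mass reading looks inconsistent.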

I don't see this criticism anywhere, and it seems so obvious... but I'm not sure whether the issue is with MOND as a theory, with Smolin's summary of it, or with my own understanding of physics.

27 August 2007

some links, primarily about the "social graph"

Brad Fitzpatrick, formerly of LiveJournal, gives his Thoughts on the Social Graph. The social graph is, roughly, the graph whose vertices are individuals and whose edges connect pairs of people who know each other (for some definition of "know"); Fitzpatrick advocates making the social graph a "community asset" that would be available to anyone who wanted it.

The term "social graph" seems to me a bit misleading because it suggests that there is only one type of edge, i. e. one kind of way in which one can know other people. (Indeed, this is a perennial problem on a lot of social networking sites; there's often no way to indicate the difference between, say, a person that one met once at a party and one's spouse.) Fitzpatrick acknowledges this:
It's recognized that users don't always want to auto-sync their social networks. People use different sites in different ways, and a "friend" on one site has a very different meaning of a "friend" on another. The goal is to just provide sites and users the raw data, and they can use it to implement whatever policies they want.

The next step in the analysis of social networks might be to take into account the "strength" of relationships, as currently most social networking sites I know of don't acknowledge that I am more interested in some of my friends than others. Also, to what extent can the strength of relationships be guessed just from looking at the social graph without this strength information? A person who I know quite well is likely to be someone with whom I have many friends in common.

Also, the social graph is in some important ways directed (the link goes to Good Math, Bad Math); for example, an edge might run from X to Y if X reads Y's LiveJournal. (It wouldn't surprise me to learn that Fitzpatrick had this example in mind.) The somewhat addictive tool LJ Connect, which finds the shortest path between two individuals with LiveJournals via their friends, explicitly acknowledges this.
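An LJ Connect-style search is just breadth-first search on a directed graph. Here's a minimal sketch, with a made-up "reads" graph and made-up usernames:

```python
from collections import deque

def shortest_path(graph, start, goal):
    """Breadth-first search for the shortest chain of directed
    "reads" links from start to goal; returns None if there is none."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

# Toy directed "X reads Y's journal" graph.
reads = {
    "alice": ["bob", "carol"],
    "bob":   ["dave"],
    "carol": ["dave", "erin"],
    "dave":  ["alice"],
    "erin":  [],
}
print(shortest_path(reads, "alice", "erin"))  # ['alice', 'carol', 'erin']
print(shortest_path(reads, "erin", "alice"))  # None: the links are directed
```

Note the asymmetry in the two queries: direction matters, which is exactly the point about the social graph being directed.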

Finally, some shadowy figures seem to be aggregating the mathematics blogging community, and various other blogging communities as well. Their in-house mathematician gives yet another example of using the formula 1+2+...+n = n(n+1)/2.

And some links:

Paul Graham writes about Holding a program in one's head; a lot of his ideas seem to have obvious analogues to mathematical research, which isn't surprising, as programming and mathematics are quite similar. I'll probably have more to say about this later.

There is a graphical formalism for quantum mechanics, which its authors call "kindergarten quantum mechanics"; this is said to be a very considerable extension of Dirac's notation, and gives short derivations of deep results concerning teleportation and similar quantum protocols. If this is true (I'm not familiar enough with the results concerned to say), it's a good example of what Graham has to say about choosing appropriate notation.

Vlorbik asks us to please lie more carefully. He writes:
We were supposed to test the claim that a certain population proportion was 10% against a sample proportion of 13%, based on n= 57 data points (at some stated confidence level that I’ve forgotten). But wait a minute. You can’t get 13% from a sample of 57 subjects: 7/57 ~ .122807 (i.e., 12%) and 8/57 ~ .14035 (i.e., 14%).
This particular one isn't a big problem, but it's symptomatic of something more general -- that the "applications" problems in a lot of mathematics textbooks bear very little resemblance to reality, which only seems to frustrate the students. (Another complaint I have about such textbooks is that the calculus texts seem to assume the student is familiar with physics, which is often not a reasonable assumption to make, and so the instructor ends up teaching physics instead of calculus.)
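Vlorbik's complaint can be checked mechanically: with n = 57, no count of successes rounds to 13%.

```python
# Which whole-number percentages can actually arise as a sample
# proportion k/57, rounded to the nearest percent?
n = 57
achievable = sorted({round(100 * k / n) for k in range(n + 1)})
print(achievable)
print(13 in achievable)  # False: 7/57 ≈ 12.3% and 8/57 ≈ 14.0% bracket it
```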

In Nature's Casino, by Michael Lewis, from the New York Times, August 26, about the insurance market for catastrophic events.

scalene triangles on Google

Would you believe that the #1 "hot trend" on Google Trends yesterday was scalene triangle?

I am not making this up.

The Google Trends page for yesterday is actually feeding me a fair bit of traffic right now, to this page in which I solved an unrelated puzzle about scalene triangles.

This would be mystifying to me, except that there was a question on "Are You Smarter Than A 5th Grader" last night asking how many angles in a scalene triangle are the same. (Incidentally, I'm not entirely sure whether the answer is zero or one. All the angles are different. There was another question on that show that had two possible answers -- they asked for the world's longest river, and whether the world's longest river is the Nile or the Amazon depends on how you measure.)

The #2 "hot trend" yesterday was "mars moons", and #8 was "feet in a mile"; "proper noun" is #21, and "worlds longest river" [sic] is #38. These also related to questions on that show.

The peak time for this search was 4 PM, which seems mystifying until you realize that Google uses Pacific time; this is 7 PM Eastern, which is the time the show aired in the East. Indeed, the four hours when "scalene triangle" was most searched were 4 PM, 7 PM, 5 PM, and 6 PM Pacific (in that order); these are, of course, 7 PM Eastern, Pacific, Central, and Mountain, respectively, which I suspect is the ranking of U. S. time zones from most to least populated. I suspect that it's possible to distinguish between trends that occur because of something on TV and trends that occur because of something that happens in the "real world" (i. e. a breaking news story) by looking at this; breaking news stories should show a single peak, while TV-inspired searches should show a large peak and then a small peak three hours later.

From what I can gather about Google Trends, the quantity they're ranking is the number of searches done on a particular query in the day in question divided by the number of searches done on a "typical" day. My suspicion is that the various questions on the show were equally likely to be Googled, but more people Google for river lengths or grammatical terminology than scalene triangles on a normal day.

26 August 2007

Scrabble, part two

After the post I made a few moments ago, I saw how to do the count.

The answer's 3,199,724, which apparently is the standard answer one finds via Google.

The method is pretty straightforward. First, find the number of "partial racks" (i. e. sets of zero to seven tiles) consisting of just A; there's one with no tiles, one with one tile, and so on up to seven tiles. (There are 9 As, so a rack full of them is possible.) Then find the number of "partial racks" containing A and B; there's one with no tiles (the empty rack), two with one tile (A and B), three with two tiles (AA, AB, BB), and three with each of three through seven tiles. (There are only two Bs.) Repeating this for each of the 27 tile types (26 letters plus the blank) gives the answer of 3,199,724, not far from my sampling-derived estimate of 3,224,068.
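The convolution described above is easy to mechanize. Here's one way to do it in Python (a sketch, using the standard English Scrabble tile distribution; not the bookkeeping I actually did):

```python
# Count distinct Scrabble racks by the convolution described above:
# process one tile type at a time, tracking the number of distinct
# partial racks of each size from 0 to 7.
TILE_COUNTS = {
    'A': 9, 'B': 2, 'C': 2, 'D': 4, 'E': 12, 'F': 2, 'G': 3, 'H': 2,
    'I': 9, 'J': 1, 'K': 1, 'L': 4, 'M': 2, 'N': 6, 'O': 8, 'P': 2,
    'Q': 1, 'R': 6, 'S': 4, 'T': 6, 'U': 4, 'V': 2, 'W': 2, 'X': 1,
    'Y': 2, 'Z': 1, '_': 2,  # '_' is the blank; 100 tiles in all
}

def count_racks(rack_size=7):
    # ways[k] = number of distinct partial racks of size k using the
    # tile types processed so far; initially only the empty rack
    ways = [1] + [0] * rack_size
    for supply in TILE_COUNTS.values():
        new_ways = [0] * (rack_size + 1)
        for k in range(rack_size + 1):
            # add j copies of the current tile type, limited by its supply
            for j in range(min(supply, k) + 1):
                new_ways[k] += ways[k - j]
        ways = new_ways
    return ways[rack_size]

print(count_racks())  # 3199724
```

Each pass convolves the running tally with the choices for one tile type, so the whole count takes 27 small convolutions.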

how many Scrabble racks are there?

I've been playing a lot of Scrabble on Facebook lately. While Googling around for some stuff on Scrabble strategy (I'm not the type to memorize words, mostly because that crosses some sort of invisible line between "fun" and "work", but knowing about things like how to try to keep a good mix of vowels and consonants on my rack makes the game more interesting just because I'm less likely to get stuck with a bad rack, which is frustrating), I found a couple things of interest.

First, this handout from the MIT Scrabble club. (I actually knew the guy who started it. He lived down the hall from me my senior year, when he was a freshman.) On that handout they quote Joel Sherman, who says:
Accept the idea that Scrabble is a math game just as much as it is a word game. All the strategic theory of the game is based on statistical analysis, probabilities, spatial relationships on the board, maximizing the value of small-numbered tiles by playing bingos (using all 7 tiles on your rack to earn the 50-point bonus), and large-numbered tiles by causing them to interact with the colored premium squares on the board, or with other words on the board. (I.e.: you can score just as much for placing a 4-letter word in a place where it lies parallel to 4 other letters or whole words without hitting any premium squares as you might for the same word hitting a double or even triple word square but forming only one or two words on the play.)

Indeed, the conventional wisdom is that people who are good at math do well at Scrabble, and that having a large vocabulary is not particularly useful. (I suspect a lot of this is because large vocabularies tend to consist of long words, and words much longer than seven letters are quite rare in Scrabble.) The same thing is said to be true of crossword puzzles, although there it's not as obvious why there should be a correlation, and my belief is that it's not as pronounced.

I also found at Scrabble on the Brain a question: "How many games of Scrabble would you have to play before you see every possible rack?"

How to answer this question is not obvious, because the racks one gets in a game aren't independent. I'll answer the easier question of how many games of Scrabble you'd have to play before you see every possible rack on the opening play. This is an instance of the "coupon collector's problem", which asks: if we pick a sequence of elements from {1, 2, ..., n} at random, with replacement, how long will it take until we've picked each of the n integers at least once? The standard version of the problem has one picking uniformly at random, so this models, for example, the number of rolls of a standard six-sided die that would be necessary to have each number come up at least once. The answer in this case is n(1 + 1/2 + 1/3 + ... + 1/n), which is asymptotically equal to n(log n + γ), where γ = 0.5772156649... is the Euler-Mascheroni constant. I suspect that in the case where the distribution is non-uniform, as it is for Scrabble racks, it takes longer to observe all possibilities; and so the important thing is to compute the number of possible Scrabble racks. (If this has been done before, I can't find it!)
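For the uniform case, the coupon-collector formula is easy to evaluate; for the six-sided die, for instance (my example, a quick sketch):

```python
from math import log

# Expected number of draws to see all n equally likely outcomes at least once
# (the coupon collector's problem): n * (1 + 1/2 + ... + 1/n).
def coupon_collector(n):
    return n * sum(1.0 / k for k in range(1, n + 1))

print(coupon_collector(6))   # about 14.7 expected rolls for a six-sided die

# The asymptotic approximation n * (log n + gamma) is already close at n = 6:
gamma = 0.5772156649
print(6 * (log(6) + gamma))  # about 14.2
```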

If Scrabble tiles could be differentiated (so instead of having nine tiles labeled A, we had tiles labeled A1 through A9), then the answer would just be the number of ways of picking seven tiles from 100, which is C(100,7) = 16,007,560,800. But they're not. I can't see a nice way to compute the actual number (although that doesn't mean it's not there, and I'd appreciate knowing it!). But the following sampling procedure should yield an estimate. Let X be a random variable defined as follows: pick a rack R of seven Scrabble tiles uniformly at random from all 7-sets of (distinguished) Scrabble tiles, and let X(R) be the number of different 7-sets of distinguished Scrabble tiles that collapse to the same set of non-distinguished tiles. So, for example, if we picked the tiles EXAMPLE, there are 12 E's, 1 X, 9 A's, 2 M's, 2 P's, and 4 L's in the tile set, so the number of possible sets of distinguished tiles reading EXAMPLE is C(12,2) C(1,1) C(9,1) C(2,1) C(2,1) C(4,1) = 9504. Then the number of possible racks of undistinguished tiles is the sum of 1/X(R) over all possible racks of distinguished tiles, or C(100,7) times the expectation of 1/X(R).

(Compare if we were trying to find the number of possible one-letter racks, that is, letters; we'd have, say, 9 terms which are each 1/9 corresponding to the labeled tiles A1 through A9, 2 terms each 1/2 corresponding to B1 and B2, and so on. The terms corresponding to each tile type sum to 1, so the sum is just the number of tile types, or 27 including the blank.)

And the expectation of 1/X(R) can be found by sampling. I generated a million racks at random and evaluated 1/X(R) (with some hacked-together Maple code); the expectation of 1/X(R) calculated from this sample is 0.0002014090957, and I approximate the number of possible Scrabble racks to be this quantity times C(100,7), or 3,224,068. (I'm not interested enough to try to calculate the standard error on this -- and I hope I can get an exact answer.) The number of games of Scrabble you'd have to play before you see every possible rack on the opening play is probably at least 3224068 log 3224068, or around fifty million.
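The sampling procedure can be sketched in a few lines (Python here rather than Maple; the seed and sample size are arbitrary):

```python
import random
from math import comb

# Standard English Scrabble tile distribution; '_' is the blank (100 tiles).
TILE_COUNTS = {
    'A': 9, 'B': 2, 'C': 2, 'D': 4, 'E': 12, 'F': 2, 'G': 3, 'H': 2,
    'I': 9, 'J': 1, 'K': 1, 'L': 4, 'M': 2, 'N': 6, 'O': 8, 'P': 2,
    'Q': 1, 'R': 6, 'S': 4, 'T': 6, 'U': 4, 'V': 2, 'W': 2, 'X': 1,
    'Y': 2, 'Z': 1, '_': 2,
}
TILES = [t for t, n in TILE_COUNTS.items() for _ in range(n)]

def estimate_racks(samples=200_000, seed=0):
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        rack = rng.sample(TILES, 7)  # uniform 7-set of distinguished tiles
        # X(R): how many distinguished 7-sets collapse to this multiset
        x = 1
        for letter in set(rack):
            x *= comb(TILE_COUNTS[letter], rack.count(letter))
        total += 1.0 / x
    # C(100,7) times the sample mean of 1/X estimates the number of racks
    return comb(100, 7) * total / samples

print(round(estimate_racks()))  # somewhere near 3.2 million
```

Since `TILES` lists each physical tile as a separate position, `rng.sample` picks a uniform 7-subset of the 100 distinguished tiles, which is exactly the sampling distribution described above.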

24 August 2007

Some thoughts on the current journal system

From Ars Technica: The challenges of scientific integrity (via Ars Mathematica). Apparently plagiarism has been breaking out on the arXiv -- for those of you who don't know, this is a central online repository for preprints in mathematics, physics, etc.

What's amazing to me is that someone would be foolish enough to do this -- the plagiarizing papers which are showing up on the arXiv plagiarize other papers from the arXiv. That is, they plagiarize papers which are easily available online -- and, from the sound of it, by taking fairly long, verbatim excerpts. If you're going to plagiarize, don't take things from the first place where people will look! Take your results from some obscure journal that nobody reads anyway.

This raises an interesting question, though -- what's the line between "plagiarism" and "research"? (One old joke has it that if you copy from one source, it's plagiarism, and if you copy from many sources, it's research, but these people copied from many sources.) Obviously, if you do a computation that somebody has done before (because you want your readers to see it), you'll use the standard notation and so on. A lot of the difference is just about citation -- the honest author says "I'm showing you how Erdos did this", whereas the dishonest author tries to pass off someone else's computation as their own. We all stand on the shoulders of giants. The shoulders of the giants hurt, and they want to be acknowledged. Can you blame them?

Peter, commenting at the Ars Mathematica post, says that plagiarism of admissions essays occurs fairly often, mainly from people from non-Western backgrounds. Is it possible that these people simply don't realize that what they're doing is wrong by our standards? However, since the admissions essay in question is a statement of what sort of research a person wants to do, and is written in the first person, this seems doubtful -- two people's research interests are very unlikely to line up exactly.

The comments at Not Even Wrong about this are also of interest. The general consensus seems to be that the current journal system needs some work, especially with regard to lax standards of peer review. That sounds plausible to me, but I don't have enough experience with the system to say for sure.

From Marginal Revolution, I learned about An Auction Market for Journal Articles, suggested by Jens Prufer and David Zetland; they suggest that papers would be posted to some central repository, and editors of journals would then bid for papers (in some sort of fake money they call "academic dollars"). The academic dollars they bid are then split among the authors, editors, and referees of the papers which this paper cites, thus giving editors incentive to pick papers that will be cited often. The authors claim that this system (which I have massively oversimplified) will encourage referees to do more work in improving papers, speed up publication times, and encourage authors to write better papers.

Of course, if you're going to do this, why have journals at all? Is there some logical reason why papers can't stand alone, independent of the journal system, other than "that's how it's always been?" The same Marginal Revolution post points to a note by Carl Bergstrom on "fantasy journals"; why couldn't these become the real journals? In fact, why can't a person just be their own journal? (Instead of trying to get published in prominent journals, one would try to get cited by prominent people. Who's prominent? The people who get cited a lot. It sounds circular, but that's how Google works and they seem to do a good job.) The reason that the journals exist is to solve the problem of distributing work to widely spread people; the Internet eliminates that problem.

23 August 2007

The solution to the triangle puzzle

Yesterday, I asked a question:

Given a scalene right triangle with sides of integer length and perimeter P, show that there is a corresponding scalene triangle with one angle which is 60 degrees and perimeter 1.5P.

For example, the 3-4-5 triangle is a right triangle with perimeter 12; the 3-7-8 triangle has a 60-degree angle (the one opposite the side of length 7) and has perimeter 18.

As promised, here's the solution.

First, how do we check that a triangle has a 60-degree angle? We use the law of cosines. When I originally did the analysis, I assumed that larger angles of a triangle had to be opposite longer sides; this is in fact guaranteed in any triangle (it's Euclid I.18 and I.19, and when all the angles are acute it also follows from the law of sines). The sixty-degree angle in a scalene triangle (for those of you who don't know the word, it just means that all three sides have different lengths) must be the second largest angle: it can't be the largest angle, because then the other two angles would each be less than 60 degrees and the sum of the angles would be less than 180 degrees; similarly it can't be the smallest angle, because then the sum of the angles would be greater than 180 degrees. So in the following analysis the side opposite the 60-degree angle is the second-longest side.

So we're looking for triples a < c < b of positive integers, with c² = a² + b² − (2 cos 60°)ab. But cos 60° = 1/2, so we need to find triples for which c² = a² + b² − ab. For example, the 3-7-8 triangle is one of these; we have a = 3, c = 7, b = 8, and 3² + 8² − (3)(8) = 7², as you can check.

To get an idea of what the correspondence might look like, I tried to figure out which triangles with 60-degree angles corresponded to the 5-12-13 and 7-24-25 right triangles. On a bit of a wild guess, I noticed that the shortest side of the 3-4-5 and 3-7-8 triangles was the same, so I assumed that the shortest side of the triangle corresponding to the 5-12-13 would be 5. The perimeter of the triangle we seek is 45, so it has sides 5, 40-b, b. Setting up the equation above, we have

(40 − b)² = 5² + b² − 5b

which is actually a linear equation (the b² terms cancel), with the single solution b = 21. This gives us a 5-19-21 triangle. The same procedure applied to the 7-24-25 right triangle gives 7-37-40.

At this point I have enough data that I can try to guess what the pattern is. I remember that all primitive Pythagorean triples -- that is, triples of integers (a,b,c) such that a² + b² = c², where a, b, and c have no common factor -- can be written in the form

(r² − s², 2rs, r² + s²)

for some r and s, with r > s, where r and s are relatively prime and not both odd. The perimeter of this right triangle is 2rs + 2r², so I sought a corresponding triangle with a 60-degree angle and perimeter 3rs + 3r². It seems reasonable to guess that the three side lengths will be linear combinations of r², rs, and s².

In the case of the 7-24-25 triangle, we have r = 4, s = 3. Thus r² = 16, rs = 12, s² = 9. I observe that 7 = 16 − 9, 37 = 16 + 12 + 9, 40 = 16 + 2(12). More generally, from the general triple listed above we get

(r² − s², r² + rs + s², r² + 2rs).

It only remains to check that these three numbers sum to 3rs + 3r² -- they do -- and that this triple, if we call it (a, c, b), obeys the relation c² = a² + b² − ab; it does, but I'll spare you the algebra.

I conclude by giving the primitive Pythagorean triples with hypotenuse less than 100, and their corresponding triangles with a sixty-degree angle and one and a half times the perimeter:

If we're given a non-primitive Pythagorean triple, it's a constant multiple of a primitive one; for example, 6-8-10 is twice 3-4-5, so we get twice 3-7-8, i. e. 6-14-16. Alternatively, the triple 8-6-10 (in that order) comes from r=3, s=1; applying the formulas above gives 8-13-15. "Blinding Insight" gives a solution that's different from mine in the comments to yesterday's post, since it's based on a different parametrization; this solution gives different, but still valid, answers.

Note that this doesn't necessarily generate the triangles with a sixty-degree angle and integer sides in "primitive" form. For example, r = 5, s = 2 gives the triple (21, 39, 45) where we might hope for (7, 13, 15). Indeed, if r − s is divisible by 3, then all three sides of the sixty-degree triple will be divisible by three. Furthermore, the perimeter of the sixty-degree triple is always divisible by three, so we'll never get (7, 13, 15) in that form.
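A few lines of code can generate the whole correspondence for hypotenuse less than 100, and check the algebra I spared you along the way (a sketch, using the parametrization above):

```python
from math import gcd

def corresponding_triples(limit=100):
    """Primitive Pythagorean triples with hypotenuse < limit, paired with
    the 60-degree triples (r^2-s^2, r^2+rs+s^2, r^2+2rs) derived above."""
    pairs = []
    for r in range(2, 10):          # r^2 + s^2 < 100 forces r <= 9
        for s in range(1, r):
            if gcd(r, s) != 1 or (r % 2 == 1 and s % 2 == 1):
                continue            # primitivity conditions on (r, s)
            if r * r + s * s >= limit:
                continue
            right = sorted((r*r - s*s, 2*r*s, r*r + s*s))
            a, c, b = r*r - s*s, r*r + r*s + s*s, r*r + 2*r*s
            # the 60-degree angle is opposite c, the second-longest side
            assert c * c == a * a + b * b - a * b
            # the perimeter is 1.5 times the right triangle's perimeter
            assert 2 * (a + b + c) == 3 * sum(right)
            pairs.append((tuple(right), (a, c, b)))
    return pairs

for right, sixty in corresponding_triples():
    print(right, '->', sixty)
# e.g. (3, 4, 5) -> (3, 7, 8) and (5, 12, 13) -> (5, 19, 21)
```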

A puzzle

Given a scalene right triangle with sides of integer length and perimeter P, show that there is a corresponding scalene triangle with one angle which is 60 degrees and perimeter 1.5P.

For example, the 3-4-5 triangle is a right triangle with perimeter 12; the 3-7-8 triangle has a 60-degree angle (the one opposite the side of length 7) and has perimeter 18.

I'll post the solution tomorrow.

(The puzzle isn't mine; it's from the puzzle column in the March/April 2007 issue of Technology Review, which I was flipping through earlier today.)

22 August 2007

Texas 30, Baltimore 3.

While at the Phillies game tonight (the Phillies lost, 15-3, to the Dodgers), I saw on one of the out-of-town scoreboards a score of Texas 26, Baltimore 3.

I figured it had to be an error, as did the other people in the stands who noticed it. Twenty-six runs? (At that moment, the other out-of-town scoreboard reported the score in that game as 16 to 3, because it can't show scores above 19. Those don't happen too often in baseball.)

In the end, Texas won that game -- the first game of a doubleheader, no less -- 30 to 3. MLB.com's account of the game is here. It's the most runs scored since the Chicago Colts (now Cubs) beat the Louisville Colonels (who haven't existed since 1899) by a score of 36 to 7, on June 29, 1897. Nobody in "modern" baseball (by convention, this is since 1901) has scored more than 29. Thirty runs is also an American League record. There have been eighteen 25-run games in modern baseball.

You'd think that a team scoring thirty runs would be likely to score in every inning, or at least in more innings than not. But Texas only scored in four innings -- 5 runs in the fourth, 9 in the sixth, 10 in the eighth and 6 in the ninth.

This reminded me of a record I once came across: teams which have scored in every inning of a nine-inning game. This is rarer than you'd think -- only seven times in major league history, all in the National League. (Note that it's almost always the visiting team that sets this particular record; a home team which has scored in the first eight innings will almost certainly not come up in the ninth.) Those seven teams won by scores of 19-8, 26-12, 20-10, 36-7 (the same 36-7 game as above), 22-8, 15-2, and 13-6. The losing teams in these efforts put up an average of 7.6 runs a game -- quite a bit higher than the average of four or five runs -- which lends a bit of credence to the idea that the two teams' scores in a game aren't independent.

And here's a mathematical question: if a team scores 30 runs, what is the probability that they manage to do it while scoring in only four or fewer innings? We'll assume the team is a visiting team, and thus has all nine innings to work in; furthermore, assume that the numbers of runs the team scores in different innings are independent. The most recent data I could find for run distribution is from John Jarvis' collection, for the 2005 American League, so I'll use that. The distribution of the number of runs scored in a single inning, for the 2005 AL, is as follows:
Number of runs scored:      0     1     2    3    4    5   6   7  8  9  10
Number of times:        14458  3071  1442  676  325  150  78  22  4  4   6

Incidentally, under the assumption of independence, the probability of a team scoring in all nine innings of a game in the 2005 AL is (1 − 14458/20236)⁹ = 0.2855⁹, or about one in 79,000; there are 2,430 games in a season currently (162 per team, times thirty teams, divided by two), so we expect one such game every thirty years or so.
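As a quick check, under the same independence assumption:

```python
# Probability of scoring in every inning of a nine-inning game, with innings
# independent and distributed like the 2005 AL data above.
p_scoreless = 14458 / 20236            # frequency of a scoreless inning
p_all_nine = (1 - p_scoreless) ** 9    # score in all nine innings
print(round(1 / p_all_nine))           # about one game in 79,000
print(1 / p_all_nine / 2430)           # seasons' worth of games between occurrences
```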

We can thus take the probability generating function corresponding to this distribution:

f(z) = (14458 + 3071z + 1442z² + 676z³ + 325z⁴ + 150z⁵ + 78z⁶ + 22z⁷ + 4z⁸ + 4z⁹ + 6z¹⁰)/20236.

Write g(z) = f(z) − p, where p = f(0) = 14458/20236 is the probability of a scoreless inning. Then g(z)² generates the number of runs scored in two independent innings in which scoring takes place in both, weighted by the probability of that event; a similar interpretation holds for higher powers. From this, we can easily find the probability that a team scores a total of thirty runs in the first k innings of a game, and none in the remaining 9−k innings; it's

[z³⁰] (f(z) − p)^k p^(9−k)

where p = 14458/20236 is the probability of not scoring in a given inning. The probability that a team scores a total of thirty runs in some k innings is this times the binomial coefficient C(9,k), which is the number of ways to pick the innings in which the scoring happens. Call this P(k). Then we get

k (innings with scoring):       3       4       5       6       7       8       9
normalized P(k):           .00055  .01409  .11572  .31455  .35560  .17108  .02840

The normalized probabilities are just P(k) divided by the sum of the P(k) values; thus the value in column k is exactly the probability I sought originally. Rather surprisingly, to me, a team which scores thirty runs is most likely to score in seven of the nine innings, which is also the median of the distribution; innings with no score are so common that at least one happens in all but three percent of thirty-run games. To answer the original question, the probability that a team scoring thirty runs does so while scoring in four or fewer innings is about 1.5%. (Assuming, of course, that innings are independent and run distributions remain like the 2005 American League's indefinitely.) And you can't even begin to suspect I'm wrong until, oh, a couple hundred thirty-run games have happened without this occurring again.

(Complete data of this form: a team scoring one run is of course most likely to score in one inning. A team scoring two or three runs is most likely to have scored in two innings; 4-7 runs, three innings; 8-12 runs, four innings; 13-19 runs, five innings; 20-28 runs, six innings; 29-40 runs, seven innings; 41-57 runs, eight innings; 58 or more runs, nine innings. I suspect at least the first few of these numbers could be checked against actual data.)
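The whole computation fits in a few lines, with the polynomial arithmetic done directly on coefficient lists (a sketch in Python, using the 2005 AL counts above):

```python
from math import comb

# Runs-per-inning counts, 2005 AL (index = runs scored in the inning).
times = [14458, 3071, 1442, 676, 325, 150, 78, 22, 4, 4, 6]
total = sum(times)                    # 20236 innings
p = times[0] / total                  # probability of a scoreless inning

def poly_mul(u, v, deg=30):
    # multiply coefficient lists, discarding terms above z^deg
    out = [0.0] * (deg + 1)
    for i, ui in enumerate(u):
        for j, vj in enumerate(v):
            if i + j <= deg:
                out[i + j] += ui * vj
    return out

# f(z) - p: the generating function with the scoreless term removed
f_minus_p = [0.0] + [t / total for t in times[1:]]

raw = {}
for k in range(3, 10):                # at most 10 runs/inning, so k >= 3
    power = [1.0]
    for _ in range(k):
        power = poly_mul(power, f_minus_p)
    # P(k) = [z^30] (f(z)-p)^k * p^(9-k) * C(9,k)
    raw[k] = power[30] * p ** (9 - k) * comb(9, k)

norm = sum(raw.values())
for k in sorted(raw):
    print(k, round(raw[k] / norm, 5))
```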

They say nobody reads anymore

John Armstrong is begging me in a comment at Concurring Opinions to comment on this article from the Associated Press, which makes the following claims:

  • One in four adults say they read no books at all in the past year

  • "The typical person claimed to have read four books in the last year -- half read more and half read fewer." -- so although they don't want to use the word "median", they're saying the median number of books read is four.

  • "Excluding those who hadn't read any, the usual number read was seven." Assuming that "usual number" means "median", this is saying that five-eights of people (the quarter who read no books, plus half of the rest) read seven or less books in the last year.

So what do we know about the distribution? One-quarter of people read no books; one-quarter read between one and four; one-eighth read between four and seven; three-eighths read more.

They claim a 3% margin of error, as well, which is standard for polls involving a thousand people (as this one was), but that margin of error only applies to the survey as a whole. The article includes a lot of claims of the form "Xs read more than Ys", but the number of Xs or Ys that were polled is less than a thousand, so the margin of error is greater.
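That 3% figure is just the usual conservative margin of error for an estimated proportion at 95 percent confidence, which is roughly 1/√n; spelled out (my arithmetic, not the pollster's):

```python
from math import sqrt

# Conservative 95% margin of error for an estimated proportion:
# 1.96 * sqrt(p*(1-p)/n) is largest at p = 1/2, giving 1.96 * sqrt(0.25/n),
# which is close to 1/sqrt(n).
def margin_of_error(n):
    return 1.96 * sqrt(0.25 / n)

print(round(margin_of_error(1000), 3))  # 0.031 -- the "3 percent" for n = 1000
print(round(margin_of_error(250), 3))   # 0.062 for a quarter-sized subgroup
```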

It seems hard to get these numbers, though. The article claims that "In 2004, a National Endowment for the Arts report titled "Reading at Risk" found only 57 percent of American adults had read a book in 2002, a four percentage point drop in a decade. The study faulted television, movies and the Internet." (Emphasis mine.) If you take both claims at face value, the share of American adults reading at least one book a year has jumped from 57 percent to 75 percent -- 18 percentage points -- in the last five years. This seems unlikely.

I keep a list of the books I read; there are approximately one hundred and seventeen books on it. Yes, you read that right. That number's a bit inflated by the fact that about a third of those books were books I was rereading. But it's also a bit deflated because I have a tendency not to count textbooks, research monographs, and so on. (There's a pile of books up to my knee -- mostly library books, which is why they're in a pile, so I remember to return them -- which I read in the last year but aren't on this list.) I've also probably read a dozen or so books while sitting at the bookstore because I was too cheap to buy them, and some book-length online works...

Of course, I'm not typical.

A "book" doesn't seem like the right unit here, though. For one thing, some books are much longer than others. To take the two books on my shelf that I suspect have the most and fewest words, I estimate that Victor Hugo's Les Misérables has about 700,000 words, and Susanna Kaysen's Girl, Interrupted has about 40,000. For another thing, the person who reads a lot of books is going to engage a lot more deeply with some of them than others. There have been books that I've read in two hours and never gone back to; there have been books that I've returned to over and over again and find new insights every time. Most, of course, fall somewhere in between. And what if you don't finish a book? Does it still count?

I think the more useful metric would be "how much time have you spent reading books in the past year?" Because you can't ask people that, I suspect the Right Thing to do is to call people up and say "how much time have you spent reading books in the last week?", doing so at various times of year to control for the fact that reading is probably higher at some times of year than others, and averaging the results. But no one cares that much. (Actually, I suspect people in the publishing industry care very much, but they're not releasing their findings if they have any.) Or perhaps "how many words have you read in the past year?", but counting this seems almost impossible. (It wouldn't surprise me to learn that this number is rising, as a lot of online content is in written form.)

And as "Katie" commenting at Concurring Opinions points out, anyone at the high end of the spectrum probably underestimates. I don't know if this is true. I would have estimated I read "about two books a week" in the last year, for 104 books in the year; when I went and looked at the list, the actual count was 117. This might not matter all that much, because the pollster was asking about the median, and the people who are going to have trouble are the ones that are above the median. But I suspect that this poll is a lot like the recent polls about the number of sexual partners; people feel that they should lie about the number of books they read. I'm not sure in which direction they're likely to lie, though; I suspect it's correlated with education, and also with how many books one thinks one's friends read.

And what's so great about books, anyway? Why do we assume that reading books is automatically better than reading any other source of the written word? I suppose the argument is that a 100,000-word book requires more intellectual effort to read than, say, one hundred 1,000-word newspaper or magazine articles, because there is more interrelation among the ideas. But books come, for the most part, predigested. A lot of the real intellectual work is done in taking those clippings from various sources and making a book out of them. But this isn't a study of how intellectuals read, it's a study of how the person in the street reads. And even "study" is a bit too strong. They called up a thousand people and asked them some desultory questions. The AP did the poll itself. Let's face it, they're just trying to sell newspapers.

Baucus on free tuition

Senator Max Baucus, D-Mont., proposes free college tuition for students majoring in math and science. (Via slashdot.)

There would be a stipulation that these students would have to work or teach in the field for at least four years afterwards.

The sentiment here is noble. However, I'm not sure something like this would work.

First, the service requirement could backfire. There are two ways in which I see it backfiring:

It would inhibit the creation of startup companies. (Since all companies are startup companies at some point, by definition, this is even worse than it sounds.) Technology startups are typically started by people who are just out of school, for the simple reason that the young have greater risk tolerance. But who's going to verify that the people at a startup are actually working and not just using their startup as a way to get free tuition? (This becomes particularly problematic if computer science is included in the list of funded fields, because the activity that goes on in a startup based around software probably doesn't look all that different, on the surface, than my kitchen table does right now.) Something that requires one to "work in the field" seems to encourage people to get "traditional" 9-to-5 jobs simply for bureaucratic purposes.

Also, you'd get bad teachers. If you make teaching one of the options, you're essentially guaranteeing you'll get teachers who don't want to be there. It's the same reason that a lot of military people don't want the draft. If Senator Baucus walked into any university department of math or science he'd see that there are a lot of people who don't want to teach, but do it anyway because they have a financial incentive to -- we call them "professors" -- and they don't seem to do a great job of it. They have an incentive to show up to their classes but not to make them good. And bad teaching will just make math and science harder for the next generation.

Second, jobs typically taken by just-out-of-college math and science majors will start paying less. Right now, a large number of such individuals have to pay off their student loans, and they know this when deciding what level of pay they are willing to accept. If they don't have student loans, they'll be willing to accept lower pay. I am not an economist, so I don't know how big this effect would be, or more importantly what effect it would have on the salaries of those who are past the four-year period. I suspect it might drag them down as well. In the end, this would make the technical fields less attractive to students who are choosing what to study in college.

Third, math is hard! This applies to the sciences as well. This isn't really a problem with Baucus' plan, but it bears mentioning anyway. To understand anything well requires sustained intellectual effort. And our society does not value sustained intellectual effort; we denigrate people who are willing to do it, for the most part. (This is part of why nerds are unpopular.) I suspect that the Right Thing to do is not to just throw money at the problem and hope it will go away, but to try to change the culture. But how do you change a culture? You can't tell people what to think. You can't make people value intellectual effort more than they do. I suspect that the one thing that can be done is to improve teaching -- a lot of the people I've talked to who said they're not good at math or don't like math can trace this to a single bad teacher. But as I said, I think the Baucus plan would hurt teaching, thus making the problem worse.

The prevailing trend actually seems to be in the other direction -- some schools are now raising tuition in some fields, including engineering. (The article about Baucus' proposal refers to "a full scholarship to any high school graduate majoring in math, engineering, science or technology.")

Philip Greenspun once wrote about Tuition-Free MIT, in which he proposed that MIT should be tuition-free. Since essentially all students at MIT major in "math, engineering, science, or technology", the two plans are similar, except Greenspun's version is restricted to a single elite school, which in fact makes it quite different. I'm not sure how I feel about Greenspun's plan. (Disclaimer: I did my undergraduate work at MIT. I owe nothing for it, because my parents paid for it; I am almost certain they would be happier with me having my MIT bachelor's degree and that $100,000+ in their pockets than they are with me just having that degree.)

21 August 2007

mathematics-to-English translation

Some Words About Interpreting, from the NYT two weeks ago. (But I just found it today.)

Basically, as different countries become more interconnected, there's going to be a bigger market for translation between different natural languages.

This raises a question: as mathematics becomes more and more important in non-academic settings (and the general consensus seems to be that it will; it is not my purpose here to debate this question), will translation between mathematics and English (or whatever the natural language of the people in question is) become more important?

And what's the analog of "translator" in the mathematics-to-English domain, anyway? The obvious answer is "mathematics teacher", but that's not really correct; a mathematics teacher doesn't translate from mathematics to English, but rather attempts to teach people mathematics. A mathematics teacher is analogous to a foreign language teacher.

Perhaps the translator is the writer of expository books in mathematics for a popular audience. I'm thinking of books like, say, Duncan Watts' Six Degrees: The Science of a Connected Age or Steven Strogatz' Sync: The Emerging Science of Spontaneous Order, which treat mathematics that is obviously useful in real-world situations. On a different scale, whenever a mathematician (or other person doing work of a mathematical nature) struggles at a party to explain what it is they do, they're engaged in an act of translation.

But in any case, a translator attempts to explain a concept to a listener in the listener's language. The question here is one of efficiency: I choose not to learn, say, Chinese, because I do not need to deal with Chinese-speaking people sufficiently often for it to be worth my time to learn that language. (An ex-girlfriend of mine tried to learn Chinese in college. It's hard.) On those rare occasions when I have to deal with a monolingual speaker of Chinese, I hire a translator. More accurately, I never need to deal with such people directly. But I hear about China in the news a lot, and I presume someone somewhere is issuing statements in Chinese and someone else is translating on my behalf.

Will there evolve a market for a similar sort of translation from mathematics? For people who don't have the time to learn it, but want to know what mathematicians are saying?

information-theoretic entropy in the weather

Right now, in Philadelphia, it's sixty-one degrees, with a light rain.

I have long maintained that this is "generic Philadelphia weather". By this I do not mean that it's always like this here. What I mean is that if I for some reason do not know what season it is, and I head outside and find it is sixty-one degrees with a light rain, this gives me very little information, because this sort of weather can happen any time of year. Another characteristic of such a day is that the high and low temperatures are very close together, say within ten degrees of each other; it's been between 60 and 64 since midnight and probably won't get out of the sixties (in either direction) all day.

Looking at the data from the last year, June 14 was pretty close to this, although it was just overcast, not rainy; January 6 might look that way on paper, except I remember it clearly and it was actually a freakishly warm and sunny day. I wore shorts. It was January. We had the New Year's Day parade that day. November 8, 13, and 14 fit as well; also October 11 and a few other October days; September 1. (I remember the weather on September 1 quite well, because I moved that day. The rain was light for most of the day and got heavier about an hour after my movers were done getting everything in.) I'm deliberately being vague about what constitutes a day like this.

Not surprisingly, this sort of weather is most common in the spring and fall (mostly because I care about temperature) but it is possible in the winter or summer as well. And this gets me wondering -- in general, what is the information content of the weather? If it's 100 degrees and sunny, there might be a 2% chance that it's July 24; a 0.5% chance it's September 1; an 0.01% chance that it's November 1; and a one-in-a-million chance it's January 15. This sort of weather is very localized towards a certain time of year. One could imagine calculating the Shannon entropy corresponding to this distribution; it would be a lot smaller than the entropy you'd get from a similar distribution if you conditioned on sixty degrees and light rain.
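To make this concrete, here's a small sketch in Python (the distributions are invented toy numbers, not real climate data) computing the Shannon entropy H = -Σ p log2 p of a hypothetical distribution over dates conditioned on the weather:

```python
import math

def shannon_entropy(probs):
    """Shannon entropy, in bits, of a (possibly unnormalized) distribution."""
    total = sum(probs)
    return -sum(p / total * math.log2(p / total) for p in probs if p > 0)

# Hypothetical P(date | weather): "100 degrees and sunny" is concentrated
# around midsummer, while "61 degrees and light rain" is spread over much
# of the year.  These weights are made up purely for illustration.
hot_sunny = [0.02] * 40 + [0.005] * 30 + [0.0001] * 20  # concentrated
mild_rainy = [1 / 250] * 250                            # near-uniform over 250 days

print(shannon_entropy(hot_sunny))   # fewer bits: the weather tells you a lot
print(shannon_entropy(mild_rainy))  # close to log2(250), about 7.97 bits
```

A sharply localized kind of weather gives a low-entropy distribution over dates (the weather tells you a lot about the season); the generic weather gives something close to the maximum, log2 of the number of candidate days.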

Of course, in this formulation, the whole idea is kind of silly -- when am I not going to know what the date is? But looking at the information-theoretic entropy of weather seems like a potentially useful way to quantify how extreme the seasons in some place might be; it's possible but not normal to get the same weather in winter and summer in Philadelphia, say; routine in San Francisco; unheard of in Minneapolis. (I am picking these places without looking at actual statistics, so I might be wrong.) Why one would want to quantify that, though, I'm not sure.

links for 21 August

Terence Tao writes about the theorem that Danica McKellar, of Wonder Years fame, proved; she's been in the news a lot lately because of her book Math Doesn't Suck: How to Survive Middle-School Math Without Losing Your Mind or Breaking a Nail. A large part of the mainstream media said "she even proved a theorem! and it was published!" but nobody explained what the theorem was, because it takes a few pages of exposition. But Tao does a good job with it, getting in some nice statistical mechanics on the way.
(The paper itself, "Percolation and Gibbs states multiplicity for ferromagnetic Ashkin-Teller models on Z2", is available at McKellar's web site; I suspect this is the only instance of a mathematical theorem on a celebrity's web site, at least for suitable definitions of "celebrity". The theorem, by the way, says that the Curie temperature and the critical temperature for site percolation are equal in the Ashkin-Teller model on the square lattice.)

Scott Morrison at Secret Blogging Seminar (which is becoming less secret every day) gives us grepping the Knot Atlas. The Knot Atlas is basically Wikipedia for knots; it's a continuation of the old Knot Atlas, and it's a lot easier for anyone to edit a wiki. (The Knot Atlas, by the way, has no entry called "knot" or anything similar that I could find. It doesn't define knots; it only tabulates them.) A knot is not exactly what the layperson would think it is; a mathematical knot is an embedding of a circle in 3-space, so the string has no ends. (Of course, one can represent any knot tied with an actual rope as a mathematical knot by splicing the ends together.) The site is simultaneously a database of knot invariants, and the post I linked to is about how to extract that data.

One of the things that has always amused me about knot theory is Lord Kelvin's theory that atoms were knots; but pseudoscience often begets science. I once worked at the Department of Alchemy.

Another such wiki is qwiki for quantum physics. I don't know of any more but they seem like they could be useful, as references to various mathematical areas built by the community that actually uses them. Wikipedia plays this role to some extent, but its main function is to be a general-purpose encyclopedia so the technical details often don't make it in.

I find myself wondering if people have used wikis (with login, of course) to write mathematical papers with collaborators. (Or even alone! Since most wiki software stores old versions of pages, you could easily throw out arguments that you didn't like, confident that if they contained some seed of truth they could be recovered later.)

At Casting Out Nines, via Vlorbik: inside facts about calculators. The big one is that nobody in the "real world" actually needs a calculator; students generally have computers now, and the student versions of Maple and Mathematica are similarly priced to graphing calculators; therefore if it makes sense for students to use any sort of calculator, it should be what's usually called a "scientific" calculator. I agree. However, I have a ten-year-old TI-83 which I know pretty well, and I generally keep it around -- for when I'll be working in a coffee shop and don't feel like taking my laptop with me, or for when my officemate is using the computer. But unquestioning faith in the calculator is a bad thing. You can also read Eric Schechter's The Most Common Errors in Undergraduate Mathematics. Would it be possible to compile an analogous list for graduate students, or research mathematicians? Or do those of us at these levels make too many different errors for this to be possible?

20 August 2007

Shipman's 100-year rule

Joe Shipman writes, on the Foundations of Mathematics e-mail list:
I propose the thesis "any mathematics result more than a century old is suitable for undergraduate math majors".
This is something that I've heard before, in the weaker form "a good undergraduate math education gets you to around 1900". (Which raises the question: in 2100, will a good undergraduate math education still get you to 100 years from the present? Or will it only get you to, well, 1900 -- which will be two centuries in the past by that point? Will the words "undergraduate math education" even mean anything in a hundred years?) The only counterexample he saw was Dirichlet's theorem (which says that if a is relatively prime to b, then there are infinitely many primes congruent to a mod b).

Wayne Aitken, on the same mailing list, argues otherwise; so does Timothy Chow.

Shipman then suggests that this is actually a 100-year rule, to anticipate my earlier question. The reason for this is that although the current best-known proof of Fermat's last theorem, for example, isn't accessible to undergrads (or even most grad students!), in a hundred years we will have significantly shortened and simplified the proof, to the point where it could be explained to a seminar of intelligent, mathematically trained 21-year-olds.

Of course, we are creating more and more mathematics as time goes on, and the rate of growth is often stated to be exponential. It's been claimed that as much mathematics has been created in the last ten years as in all time before that, and even if it's not quite that fast it seems quite likely that mathematics is growing faster than world population. (Note that this trend can't continue forever.) I am inclined to attribute the claim of the exponential rate of growth to Andrew Odlyzko but I cannot find it.

The boiling down has some unintended consequences, though. Carl Sagan, in the chapter of The Demon-Haunted World: Science as a Candle in the Dark entitled "No Such Thing as a Dumb Question", wrote about the lure of pseudoscience and the fact that science is not emphasized in the schools. He quotes the philosopher John Passmore, on how science is generally presented in schools:
[Science] is often presented as a matter of learning principles and applying them by routine procedures. It is learned from textbooks, not by reading the works of great scientists or even the day-to-day contributions to the scientific literature... The beginning scientist, unlike the beginning humanist, does not have an immediate contact with genius. Indeed... school courses can attract quite the wrong sort of person into science -- unimaginative boys and girls who like routine. (p. 335)
I suspect that this is because in mathematics and science we are constantly doing this "boiling down", so the simplest way known to currently illustrate a result is very rarely the way it was originally shown. Furthermore, the standards of exposition are constantly changing, as (for example) new notations evolve, so it can be astonishingly difficult to read mathematical texts from a couple centuries ago, whereas the same evolution doesn't seem to have happened to literary texts, or at least not to the same degree. And even when a line of argument does stand the test of time, it's constantly reworded, tweaked, renotated... the argument is seen as something to be passed on, but the words which were used are not. The argument becomes a sort of communal property. Perhaps there is only one mathematician, working in many brains.

an interesting fact about a variant of Hex

I have long known about the game of Hex. The game is played on a rhombus-shaped section of a hexagonal lattice, with markers of two colors; the board is illustrated at left. One player places red markers, the other blue markers; they alternate in playing. The red player is attempting to create a chain of red hexagons extending from the top of the rhombus to the bottom (these edges are both shaded red in the screenshot); the blue player is simultaneously attempting to create a chain of blue hexagons from the left side to the right.

There's a nice theorem which states that a game of Hex can never end without a winner. Say you're playing red. If you have blocked off all possible horizontal chains that blue might make, you must have made your own red chain connecting the top and the bottom, and vice versa. (This isn't exactly a proof but it's the gist of any formal proof of the result.)

However, I learned recently from Bollobas and Riordan's book Percolation that although this is true, an analogous result is not true if one were to play Hex on a square lattice. If we think of one color of hexagons as "open" and the other as "closed", then the earlier result about Hex says that in the rhombus-shaped triangular lattice of points (which is dual to the hexagonal lattice), there's either an open path from the top of the lattice to the bottom or a closed path from the left to the right.

On the square lattice, there is the question of the right way to define adjacency; are two squares which touch each other only at a corner adjacent to each other? It turns out that if we say they are (so each square has eight neighbors) then it's possible to color the squares of the board so that there's a red vertical path and a blue horizontal path. (This doesn't create a problem for a hypothetical game of Hex on a square lattice, though, because we just award the win to the player who completes their path first.) If we let each square have four neighbors, though, then ties are possible; one could have a situation where the entire board is filled in and there's no red vertical path or blue horizontal path. (An example for both of these occurs if we divide the board into four quadrants, and color the entire northeastern and southwestern quadrants red and the northwestern and southeastern quadrants blue.)
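The quadrant example is easy to check by brute force. Here's a minimal sketch in Python (the board size, color labels, and function names are my own choices): a breadth-first search tests for a red top-to-bottom path and a blue left-to-right path under each adjacency rule.

```python
from collections import deque

def has_path(board, color, neighbors, vertical):
    """BFS: does `color` connect top-to-bottom (vertical) or left-to-right?"""
    n = len(board)
    if vertical:
        frontier = [(0, j) for j in range(n) if board[0][j] == color]
        done = lambda i, j: i == n - 1
    else:
        frontier = [(i, 0) for i in range(n) if board[i][0] == color]
        done = lambda i, j: j == n - 1
    seen, queue = set(frontier), deque(frontier)
    while queue:
        i, j = queue.popleft()
        if done(i, j):
            return True
        for di, dj in neighbors:
            ni, nj = i + di, j + dj
            if 0 <= ni < n and 0 <= nj < n and (ni, nj) not in seen \
                    and board[ni][nj] == color:
                seen.add((ni, nj))
                queue.append((ni, nj))
    return False

FOUR = [(-1, 0), (1, 0), (0, -1), (0, 1)]
EIGHT = FOUR + [(-1, -1), (-1, 1), (1, -1), (1, 1)]

# Quadrant coloring: northeast and southwest quadrants red ('R'),
# northwest and southeast quadrants blue ('B').
n = 8
board = [['R' if (i < n // 2) == (j >= n // 2) else 'B' for j in range(n)]
         for i in range(n)]

# Eight neighbors: both players complete a path (they meet at the center corners).
print(has_path(board, 'R', EIGHT, vertical=True),
      has_path(board, 'B', EIGHT, vertical=False))  # True True
# Four neighbors: neither does -- a tie.
print(has_path(board, 'R', FOUR, vertical=True),
      has_path(board, 'B', FOUR, vertical=False))   # False False
```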

But Lemma 5.9 of Bollobas and Riordan tells us that
If we colour the squares of an n by m chessboard [blue] and [red] in an arbitrary manner, then either a rook can move from the left side to the right passing only over [blue] squares, or a king can move from top to bottom using only [red] squares, but not both.

(I've changed the colors; Bollobas and Riordan use the more traditional black and white.)
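The lemma is also easy to check empirically. Below is a sketch in Python (the board dimensions, colors, and random seed are my own choices): a rook path over blue squares is just a 4-neighbor path, a king path over red squares an 8-neighbor path, and the lemma says exactly one of the two exists for any coloring.

```python
import random
from collections import deque

FOUR = [(-1, 0), (1, 0), (0, -1), (0, 1)]
EIGHT = FOUR + [(-1, -1), (-1, 1), (1, -1), (1, 1)]

def has_path(board, color, moves, vertical):
    """BFS over same-colored squares using the given move set."""
    n, m = len(board), len(board[0])
    if vertical:
        frontier = [(0, j) for j in range(m) if board[0][j] == color]
        done = lambda i, j: i == n - 1
    else:
        frontier = [(i, 0) for i in range(n) if board[i][0] == color]
        done = lambda i, j: j == m - 1
    seen, queue = set(frontier), deque(frontier)
    while queue:
        i, j = queue.popleft()
        if done(i, j):
            return True
        for di, dj in moves:
            ni, nj = i + di, j + dj
            if 0 <= ni < n and 0 <= nj < m and (ni, nj) not in seen \
                    and board[ni][nj] == color:
                seen.add((ni, nj))
                queue.append((ni, nj))
    return False

random.seed(0)
for _ in range(200):
    board = [[random.choice('RB') for _ in range(7)] for _ in range(6)]
    rook_blue = has_path(board, 'B', FOUR, vertical=False)  # rook = 4 directions
    king_red = has_path(board, 'R', EIGHT, vertical=True)   # king = 8 directions
    assert rook_blue != king_red  # exactly one: never both, never neither
print("lemma holds on 200 random boards")
```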

However, there's a difference between the two cases, even so. If you color in all the hexagons in an n-by-n section of the hexagonal lattice uniformly at random -- so each is red or blue with probability 1/2, independently of the others -- then by a symmetry argument the probability of a red vertical path equals the probability of a blue horizontal path, and since exactly one of the two always occurs, each probability is 1/2. This doesn't mean Hex is fair -- the first player is in fact guaranteed to win under perfect play. But in the square-lattice version of the game, I suspect the player who is allowed to consider squares which share a corner as adjacent has a very large advantage.
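Both the no-tie theorem and the symmetry claim can be checked by simulation. Here's a sketch in Python (board size, seed, and trial count are my own choices) on the rhombus-shaped hex board, where cell (i, j) is adjacent to (i±1, j), (i, j±1), (i-1, j+1), and (i+1, j-1):

```python
import random
from collections import deque

HEX = [(-1, 0), (1, 0), (0, -1), (0, 1), (-1, 1), (1, -1)]

def crosses(board, color, vertical):
    """BFS on the rhombus hex board: does `color` span it?"""
    n = len(board)
    if vertical:
        frontier = [(0, j) for j in range(n) if board[0][j] == color]
        done = lambda i, j: i == n - 1
    else:
        frontier = [(i, 0) for i in range(n) if board[i][0] == color]
        done = lambda i, j: j == n - 1
    seen, queue = set(frontier), deque(frontier)
    while queue:
        i, j = queue.popleft()
        if done(i, j):
            return True
        for di, dj in HEX:
            ni, nj = i + di, j + dj
            if 0 <= ni < n and 0 <= nj < n and (ni, nj) not in seen \
                    and board[ni][nj] == color:
                seen.add((ni, nj))
                queue.append((ni, nj))
    return False

random.seed(1)
n, trials, red_wins = 7, 2000, 0
for _ in range(trials):
    board = [[random.choice('RB') for _ in range(n)] for _ in range(n)]
    red = crosses(board, 'R', vertical=True)
    blue = crosses(board, 'B', vertical=False)
    assert red != blue  # the no-tie theorem: exactly one winner
    red_wins += red
print(red_wins / trials)  # should be close to 1/2 by symmetry
```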

19 August 2007

climate phase lag trickery

Why does it get hot? Because of the sun.

But in temperate climates in the northern hemisphere, the most light from the sun comes around the summer solstice -- June 21 -- and yet the hottest part of the year is the second half of July. In certain places with a lot of maritime influence, the heat comes even later; the canonical northern-hemisphere example is probably San Francisco, where September is the warmest month. Similarly, the hottest time of the day isn't noon (even correcting for daylight savings time; solar noon is generally around 1 PM clock time under daylight savings time), but perhaps two to four hours later.

This phase lag comes about because the atmosphere holds some heat; the temperature as a function of the day of the year could be modeled by a differential equation, where the rate of change of temperature is some decreasing function of the temperature, plus a sinusoidal input. Thus the output should be similar to the input, but phase-lagged.
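This toy model can be sketched with a forward-Euler simulation in Python (the relaxation rate k, the units, and all parameters are invented for illustration): with forcing sin(ωt), the steady-state response lags the forcing by arctan(ω/k)/ω.

```python
import math

# Toy model of the seasonal lag: relaxation toward a sinusoidal forcing,
#   dT/dt = k * (sin(omega * t) - T).
# In steady state, T peaks arctan(omega/k)/omega days after the forcing does.
k = 0.05                   # relaxation rate per day (made-up value)
omega = 2 * math.pi / 365  # one cycle per year

def simulate(days=365 * 10, dt=0.05):
    """Forward-Euler integration; returns a list of (time, temperature)."""
    t, T, history = 0.0, 0.0, []
    while t < days:
        T += dt * k * (math.sin(omega * t) - T)
        t += dt
        history.append((t, T))
    return history

history = simulate()
last_year = [(t, T) for t, T in history if t > 365 * 9]  # transient has died out
peak_t = max(last_year, key=lambda p: p[1])[0]
forcing_peak = 365 * 9 + 365 / 4  # sin(omega*t) peaks a quarter period in
lag = (peak_t - forcing_peak) % 365
predicted = math.atan(omega / k) / omega
print(f"observed lag: {lag:.1f} days, predicted: {predicted:.1f} days")
```

With these made-up numbers the lag is around three weeks, which is roughly the observed solstice-to-hottest-day gap; a smaller k (more thermal inertia) pushes the peak later and also damps the amplitude of the response.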

Furthermore, I am starting to believe (without evidence) that the temperature in my apartment building is lagged by maybe a couple weeks relative to the temperature outside during the day; I believe with somewhat more evidence that the hottest time of day in my apartment is generally around eight or nine in the evening. And so I find myself fantasizing about a world where that phase lag existed to such an extent that it's warm in my apartment at night and cold during the day. (Without using the air conditioner, that is.)

A bit more ridiculously, what if I could exploit that sort of seasonal phase lag to the point where it was cold in the summer and hot in the winter? But I'd need some sort of ridiculously large heat reservoir, something bigger than the ocean. Still, it's an interesting fantasy. The hottest time outside is a month after the time with the most solar radiation; the hottest time inside might be a few weeks after that; what if I had some sort of series of nested boxes each of which could shift the hottest time of year a few more weeks? If I nested enough of those boxes...

Of course, it would be cheaper to just run the heat.

Seriously, though, there is some new construction that attempts to use the insulating properties of certain building materials in this way, although not so much to reverse the seasons as to smooth out seasonal variation. One particularly interesting example I heard about a few months ago depended on the thermal properties of a certain type of wood, which undergoes a phase change around seventy degrees Fahrenheit; thermally it's very similar to keeping an ice cube in your house, except it's a magical seventy-degree ice cube instead of a normal thirty-two-degree one.