God Plays Dice: overmathematization


15 August 2007

10% of a constitution is still a lot

Venezuela head outlines changes, from the BBC:

President Chavez told the Assembly his proposals only affected 10% of the constitution.

A friend of mine said that there are certain things where a small change to the input doesn't necessarily create a small change in the output; his examples were genomes, software, and constitutions. I'd argue that genomes are basically a biological version of software. And is a constitution basically an operating system for a country? (I was able to find at least two examples of this metaphor, from a constitutional law blog and at Wired, but I think I came up with it independently.) Change a few lines in the Constitution and you change a lot. The text of a lot of the amendments is quite short.

To this I would add mathematical proofs. I suspect that in most proofs, say, 90% of the thought is in 10% of the lines, the rest being (relatively) routine verification. A lot of mathematical formulas are also the same way -- change a single coefficient or a single sign and the formula doesn't work any more -- but those are sort of analogous to software, in that they're algorithms for solving some problem.

10 August 2007

the traffic.com "Jam Factor"

Traffic.com has something they refer to as the "Jam Factor" to tell you how much traffic there is on various major roads in your area. It's an apparently arbitrary number from 0 to 10.

Now, if I see that the jam factor is 3.1 right now, what does that mean? How long is it going to take me to get where I'm going?

They claim:

The Traffic.com Jam Factor is like a "Richter scale" for traffic. It's an overall measure of the traffic conditions on a roadway, or on a section of a roadway. Because the Jam Factor calculation uses real-time speed and travel time measurements from our sensors and those of our partners, as well as our detailed accident, construction and congestion information, it's a comprehensive measure of the state of traffic on any roadway.

The Jam Factor is measured on a scale of 0-10, with 10 being the worst traffic conditions. It is designed to give you a quick, at-a-glance picture of conditions on the roadways or personal MyTraffic Drives you care about, whether you're on our web site, looking at an emailed Traffic Report, or listening to a Traffic Report call to your phone. If you see (or hear) a high Jam Factor, you can then delve into the detailed information in the Traffic Report or on the Traffic.com site to find out more.

If you click on the name of any segment of road, it'll tell you how long it takes to get by that piece of road without traffic, and how long it'll take Right Now -- that seems like more meaningful data to me.

If I had to sum up traffic conditions in a single number, it would be the ratio of the two numbers in the previous paragraph. I'd like to know that it'll take me 50% longer to get where I'm going than it would if the roads were clear. It wouldn't surprise me to learn that the "jam factor" is just this number disguised somehow.

I prefer what, say, Google Maps does, just overlaying colors (red/yellow/green) on the road; a number, especially one with two significant digits (the Jam Factor is reported to the nearest tenth), implies some sort of precision. I'd rather have no number than a number which has extra decimal places tacked onto it to seem "scientific". (Would I be less annoyed by this if the Jam Factor was reported on a 0-100 scale, to the nearest unit? Who knows?)

Also, if they have sensors of some sort -- why am I limited to knowing how long it'll take to get from a preset point A to a preset point B? Why can't I input the exit I actually get on and off at and have it tell me how long I should expect to spend on the highway? This is a simple matter; they have the speed that the road is moving at at any given moment, so just integrate the reciprocal of speed over the length of road in question. (I realize that it's not quite this trivial, because just because a segment of road is moving at 30 mph right now doesn't mean it'll be moving at that same speed when I get there. My method is sort of like how life expectancies are calculated, which doesn't actually tell YOU how long you're going to live. But forecasting how traffic on any given day will evolve, or how medical care will evolve, is a lot harder than just observing it.)
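As a rough sketch of the calculation described above: if the road is split into segments with a measured speed on each, the integral of the reciprocal of speed becomes a sum of length divided by speed. The segment data, the free-flow speeds, and the function name below are all made up for illustration; nothing here reflects how Traffic.com actually computes anything.

```python
# Hypothetical sensor readings, invented for illustration:
# (segment length in miles, current speed in mph, free-flow speed in mph)
segments = [
    (2.0, 60.0, 60.0),
    (1.5, 30.0, 60.0),
    (3.0, 45.0, 60.0),
]

def travel_time_hours(segments, use_free_flow=False):
    """Approximate the integral of 1/speed over distance as a sum over segments."""
    total = 0.0
    for length, current, free_flow in segments:
        speed = free_flow if use_free_flow else current
        total += length / speed
    return total

now = travel_time_hours(segments)                         # time at current speeds
clear = travel_time_hours(segments, use_free_flow=True)   # time on clear roads
ratio = now / clear                                       # the single number I'd like to see

print(f"{now * 60:.1f} min now, {clear * 60:.1f} min clear, ratio {ratio:.2f}")
```

With these invented numbers the trip takes about 38% longer than it would on clear roads. The same caveat applies as in the text: this uses a snapshot of current speeds, not a forecast.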

The Richter scale itself is logarithmic, as many of you probably know -- a difference of one unit on the Richter scale corresponds to a factor of ten in the amplitude of the seismic waves (not a factor of ten in the energy released, as is widely reported; although this isn't my field, it looks like the corresponding increase in energy released is a factor of 10^(3/2), or about 32). Other logarithmic scales that you see fairly frequently are decibels (for measuring sound) and Google PageRank (on the 0 to 10 scale); a decibel corresponds to an increase in sound power by a factor of 10^0.1 (about 1.26), and I've read various things but one unit of PageRank seems to correspond to a factor of about five. (No one outside of Google knows for sure.) All of these situations have one thing in common -- most earthquakes, sounds, and web pages are relatively small, and some are much more prominent, so there's a good reason to spread out the small end of the scale and compress the large end. But in the case of the Jam Factor, this isn't necessary; if it takes me an hour on average to drive somewhere, on a really lucky day it might take forty minutes, and on a really unlucky day it might take three hours, but it's never going to take even as much as a hundred hours.
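To make the factors above concrete, here is a small sketch that computes them, along with how many orders of magnitude the driving example spans. The variable names are mine, and the forty-minute and three-hour figures are just the ones from the paragraph above.

```python
import math

# One Richter unit: factor of ten in amplitude, 10^(3/2) in energy.
amplitude_factor = 10 ** 1
energy_factor = 10 ** 1.5

# One decibel: factor of 10^0.1 in power, so ten decibels is a factor of ten.
decibel_factor = 10 ** 0.1
ten_decibels = decibel_factor ** 10

# Driving times from the text: 40 minutes on a lucky day, 180 on an unlucky one.
# The whole range covers well under one order of magnitude.
orders_of_magnitude = math.log10(180 / 40)

print(f"energy factor per Richter unit: {energy_factor:.1f}")
print(f"power factor per decibel: {decibel_factor:.3f}")
print(f"driving-time range in orders of magnitude: {orders_of_magnitude:.2f}")
```

A quantity whose whole range fits inside a factor of 4.5 gains nothing from a logarithmic scale, which is the point of the comparison.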

(Incidentally, as you may have guessed by now, my preferred mode of transportation is walking. A large part of this is that even though it's slow, I always know exactly how long it's going to take me. There is essentially no situation that slows down or speeds up my walking speed. And even if there were, it never takes me twice as long to walk somewhere as it ordinarily would, whereas driving times that are twice the average are routine. Unfortunately, it's not totally clear whether I save fossil fuel by walking, because I have to eat more food in order to walk.)

05 August 2007

links for 5 August

The Fermi Paradox is back, via Slashdot. The Fermi paradox, for those of you who don't know, is basically the following question: if there are so many examples of extraterrestrial intelligence, as a lot of people believe, how come none of them have contacted us yet? This ties in to one of my favorite overmathematizations, the Drake equation, which computes the number of extraterrestrial civilizations in our galaxy by multiplying seven factors, most of which we have no good idea of. The result is a number with a ridiculously huge margin of error; depending on who you ask, the number of extraterrestrial civilizations that we might be able to hear could be anywhere between zero and a million or so. Good expositions of the Drake equation usually point out that we have no way of predicting, for example, the average lifetime of a civilization. One particularly interesting resolution I've seen of the Fermi paradox is that other civilizations decide that they just don't care about talking to other species and spend all their time looking at the local equivalents of Internet pornography and reality television. I'm not saying I believe this, just that I've heard it. A bit more plausible, I think, is the idea that civilizations evolve so quickly that a civilization that was where we'll be in the year 3000 (if we don't kill ourselves first) wouldn't be interested in talking to us. (If you think 3000 is too soon, substitute some year further in the future.) I think it would be interesting to talk to a civilization that was where we were a thousand years ago, but a lot of people believe that the evolution of civilization is accelerating; Ray Kurzweil is probably the best-known exponent of this idea, called the Singularity. I'm a bit suspicious of it because a lot of the arguments seem to rely on the fact that we remember what happened in the recent past much better than what happened in the distant past.
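The margin-of-error problem with the Drake equation mentioned above can be sketched in a few lines. The ranges below are purely hypothetical placeholders, not real estimates of any factor: the only assumption is that each of the seven factors is known to within a factor of ten.

```python
import math

# Purely hypothetical (low, high) bounds for seven factors; NOT real estimates.
# The assumption is just that each factor is pinned down to within a factor of ten.
factor_ranges = [(1.0, 10.0)] * 7

low = math.prod(lo for lo, hi in factor_ranges)    # product of all the low ends
high = math.prod(hi for lo, hi in factor_ranges)   # product of all the high ends
spread = high / low                                # how far apart the two answers are

print(f"low = {low:g}, high = {high:g}, spread = {spread:g}")
```

Under that assumption the two ends of the answer differ by a factor of ten million, since uncertainties multiply along with the factors. If some factors are known even less well than that, the spread only gets worse.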

What autistic girls are made of, by Emily Bazelon in today's New York Times. Disorders on the autistic spectrum are usually thought of as being uniquely the province of boys, but they happen to girls too. There are researchers who think of autism as being an "extreme male brain", and if that's true it kind of makes sense that it would be more common among males than females. Also, apparently it's harder to be an autistic woman than an autistic man because women are expected to understand social networks better than men; I'm kind of curious if this has always been true or if it's a historical accident. Vaguely relatedly, Who's a Nerd, Anyway? by Benjamin Nugent from last Sunday's NYT; people who are considered nerds are "hyperwhite", according to the linguist Mary Bucholtz. (This is "white" in a cultural sense, as in the way white Americans tend to act; I don't think the author intends to say that there's anything genetic about being a nerd.) What I find interesting is that this same tendency towards oversystemization can be called either hyperwhite or hypermale, despite the fact that we usually think of sex and race as being orthogonal to each other. Finally, Mark Liberman comments at Language Log on reactions to Nugent's article, and how in general non-scientific bloggers blogging about science, and non-scientific journalists writing newspaper articles about science, make fools of themselves.

The Probabilistic Method by Noah Snyder at Secret Blogging Seminar. I love when people find out that the probabilistic method exists. For those of you who aren't familiar with it, the probabilistic method is a method used to prove that a collection of objects contains some object with a certain property not by actually finding the thing but by just proving that if you pick an object from the collection, it has probability greater than zero of being the thing you're looking for. It's kind of a mindfuck, because many of its applications are in combinatorics and people expect there to be explicit constructions of things in combinatorics. It's possible to have a group of forty-two people such that there's no five of them who all know each other and no five who don't know each other. But I can't explicitly tell you which people in that group know each other and which don't. (This is an example of a Ramsey number.)
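Here is a sketch of the simplest version of the argument, in the acquaintance setting described above. Suppose each pair of people independently know each other with probability 1/2. If the expected number of groups of k people who all know each other or all don't is less than 1, then some arrangement must have no such group at all. The function names are mine, and this is only the most basic form of the method, not the argument behind the forty-two figure.

```python
from math import comb

def expected_bad_groups(n, k):
    """Expected number of k-person groups, among n people, in which everyone
    knows everyone or nobody knows anybody, when each pair independently
    know each other with probability 1/2."""
    return comb(n, k) * 2 * 0.5 ** comb(k, 2)

def largest_n_guaranteed(k):
    """Largest n for which the expectation is below 1, so that some arrangement
    of acquaintances among n people has no bad group of size k."""
    n = k - 1  # with fewer than k people there is trivially no group of size k
    while expected_bad_groups(n + 1, k) < 1:
        n += 1
    return n

print(largest_n_guaranteed(5))   # prints 11
print(expected_bad_groups(11, 5))  # prints 0.90234375
```

This basic version only reaches 11 people for groups of five, far short of forty-two, so it shows the style of reasoning and not the best known bound. What it shares with the stronger results is that it proves a good arrangement exists without ever writing one down.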

04 August 2007

Who gets credit for quadratic reciprocity?

In today's New York Times crossword, there's a clue "Discoverer of the law of quadratic reciprocity." The correct answer is, according to the crossword, this guy. I originally put in this guy instead. It turns out, according to the Wikipedia article on quadratic reciprocity, that "The theorem was conjectured by [first guy] and Legendre and first satisfactorily proven by [second guy]. [Second guy] called it the 'golden theorem' and was so fond of it that he went on to provide eight separate proofs over his lifetime." (I am deliberately obscuring the links because you might still want to do the crossword.)

Now, who should get the credit for "discovering" a mathematical result? The one who first suspected it might be true, or the one who proved it? I'm of the opinion that in this case they should share the credit (along with Legendre), mostly motivated by the fact that both of the people involved are Really Big Names. There are some examples in which First Guy discovered something and it's not named after him, but Erdős (and I don't think I'm spoiling anything by admitting that Erdős is neither First Guy nor Second Guy) once said that Goldbach's conjecture should be named for Goldbach, not First Guy, because "[First Guy] is so rich and Goldbach is so poor, it would be like taking candy from a baby." I don't know the history in the case of quadratic reciprocity. But I'm motivated here mostly by the fact that it's a lot easier to prove something if you already have reason to suspect it's true. For one thing, if you suspect something is true it is often on the basis of data, and you can look at that data and see how you might generalize the patterns you can see in it. (Quadratic reciprocity is almost certainly such a case, since data is easy to generate.) Secondly, there's a tremendous psychological boost to be gained from knowing that someone whose judgment you trust thinks something is true. Mathematicians are trained to think that only proofs matter, and I suspect there's an extreme strain of this that thinks that we really don't have any idea whether something is true until we've proven it; but someone who has given copious hints in the right direction surely deserves some of the credit. The tendency seems to be to give the credit to the person who put the last link in place; the highest-profile example is of course Wiles' proof of Fermat's last theorem. But what Wiles really showed was a special case of the Taniyama-Shimura conjecture; Ribet had already shown that Fermat would follow from this special case. So it seems to me that Ribet definitely deserves some of the credit (for establishing that as the right target for anyone wishing to prove FLT) and probably Taniyama and Shimura as well.

In the case of this particular theorem I realize that there are a very large number of people involved in one way or another, and it's hard to know who exactly to assign credit to, or -- and this is getting really silly -- how much credit to assign to them. Fermat proved that x³ + y³ = z³ has no nontrivial solutions. What should this count for? On the one hand, it's the first case. On the other hand, there are infinitely many cases. Should Pythagoras get some credit for coming up with his theorem, which inspired the whole thing? (Incidentally, any reasonable scheme of assigning "credit" to every mathematical result ever to all the mathematicians who were in some way involved has a pretty good chance of putting Pythagoras on top -- or at least of putting the Pythagorean theorem on top among theorems.) But even in the case of less famous results there are clearly a lot of people who did something towards them -- the people who first conjectured them, the people who proved some special case, the people who disproved some other special case (thereby helping to establish the boundaries of the result), and so on. Fortunately we don't need to find a way to quantify these contributions; decisions of who to hire or who to give prizes to can be made without them. (I'm not saying that the current hiring system is perfect, but it seems to work well enough.) And do you really want mathematicians in charge of some scheme that assigns a number to their total amount of mathematical contributions? Because let's face it, if you made up such a scheme mathematicians would find a way to beat it.

Except if it involved arithmetic. Mathematicians aren't any good at arithmetic.

(Oh, and this was going to be a post on how mathematically oriented people are good at crosswords. But it's not! Oh well. I'm sure I'll get around to writing that eventually.)

17 July 2007

more thoughts on calorie counting

Last night I wrote about commercials which "overmathematize" eating.

This morning, in the New York Times, I read the article Calorie Labels May Clarify Options, Not Actions. New York City apparently has a new law requiring calorie counts to be posted in certain restaurants, and other places are considering similar laws.

The argument that various experts are making is that a lot of restaurant food is worse for you than people think. I said yesterday that I believe the human body will naturally behave in such a way as to have people eating the amount they should. Of course it should! Millions of years of evolution can't be wrong.

But there weren't restaurants when evolution did most of its work. And as far as I know, there has been no evolutionary tendency away from eating everything that's put in front of us because we don't know when there will be food again. Different people have differing appetites, which makes me suspect that given enough time living in a society like the one we have right now where high-calorie food is readily available, we'll evolve to not want more than the amount of food we need to live. But that's only true given enough time. Evolution is slow.

So maybe here the posting of calorie counts is a good idea.

Maybe.

But then say that someone knows they "should" be eating 2000 calories a day, given their current weight, age, gender, and lifestyle, in order to maintain that current weight. And they get to a restaurant for lunch and they see the 400-calorie burger, the 300-calorie fries, and the 200-calorie soda. (I'm making these numbers deliberately low, by the way.) None of those numbers are that large compared to 2000, so they think it's okay. But those numbers add up to 900, which is nearly half of 2000. Does this person know they're eating nearly half the number of calories they need?
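The arithmetic in question, spelled out with the numbers from the paragraph above (the variable names are mine):

```python
daily_target = 2000
meal = {"burger": 400, "fries": 300, "soda": 200}

total = sum(meal.values())                                # calories in the whole lunch
share = total / daily_target                              # fraction of the daily target
largest_item_share = max(meal.values()) / daily_target    # the biggest single item

print(f"total = {total}, share of day = {share:.0%}, largest item = {largest_item_share:.0%}")
```

No single item is more than 20% of the day's target, but the lunch as a whole is 45% of it.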

In short, I suspect that though well-intended, the usefulness of this will be thwarted by the fact that a substantial number of Americans can't do basic arithmetic on three-digit numbers. (This probably explains the reason why the Weight Watchers "points" scheme is so popular -- it lowers the size of the numbers people have to keep track of.)

I believe this also explains a lot of the current state of the housing market -- people signed up for loans without realizing that the advertised payment wasn't even going to cover the interest -- but that's a different story.

16 July 2007

not everything needs to be counted

"One eight-ounce glass counts as almost twenty-five percent of your fruit and vegetable servings." -- from a commercial for Florida orange juice I just saw on television.

Something about this "counts as" construction bothers me. It sounds to me like they're saying "it's not really fruit" or something like that, like eating is some sort of game.

Similarly, there's a commercial for the cereal Special K that, if I remember correctly, has a couple of really skinny girls deciding not to skip breakfast; the blogs "if you write it" and "kingdom living", among others, have complained that this commercial feeds our national obsession with being thin. A calorie count is given at some point (200, I think?) and they show a bowl of cereal which certainly has more than 200 calories. (Take a look at the serving size written on your cereal box sometime. It's probably a lot less than what you eat when you eat cereal. They also seem to imply that this little bowl of cereal will tide someone over until lunch, which just isn't going to happen.)

In general, I feel that the human body knows what it wants to eat, and that counting calories is kind of a silly idea. I would call this overmathematization, and I think it's something that various parts of our society are succumbing to -- the fact that box office numbers are becoming a big part of news broadcasts, for example, when really only the people who work in the movie business should care about those. Not everything needs to have a number.

(edit, 7:23pm: Wow, Google is fast! I wondered if "overmathematization" was a word people had used, so I googled it. There are nine hits. One of them is this entry, which I posted twenty-three minutes ago.)
