Showing posts with label Strogatz. Show all posts

16 April 2008

Edward Lorenz dies

Edward Lorenz, father of chaos theory and butterfly effect, dies at 90. (Link goes to MIT press release; I found out from Greg Laden's blog.)

He traditionally gave a lecture to the course on chaos, and he did when I took that class in 2003. I wish I remembered what he said! I suspect it was something interesting. Steven Strogatz wrote, in his book Sync,
Every time I taught my chaos course, we'd go through the same ritual each year, and I'd come to look forward to it. I'd call up Professor Lorenz and invite him to give a guest lecture to the class. He'd say, with genuine puzzlement, as if it were an open question, "what should I talk about?" And I'd say, How about the Lorenz equations? "Oh, that little model?" And then, as predictable as the seasons, he'd show his face to my awestruck class, and tell us not about the Lorenz equations but about whatever he was working on then. It didn't matter. We were all there to catch a glimpse of the man who'd started the modern field of chaos theory.

Strogatz wasn't at MIT when I was there -- Dan Rothman taught the class using Strogatz's text -- but Lorenz still gave an annual lecture to that class at least as late as fall of 2006. I'd like to think they did the same silly little dance.

30 March 2008

It's almost baseball season again...

A Journey To Baseball's Alternate Universe, in today's New York Times, by Samuel Arbesman and Steven Strogatz.

Arbesman and Strogatz ask the question: how likely is it that some major league baseball player, at some point, would have had a 56-game hitting streak, as Joe DiMaggio did in 1941?

I've seen attempts to determine this before, but they're usually handwaving things that start out by saying "assume everybody bats .266 and gets 3.83 at-bats a game" (actual averages for the 2007 National League), and then compute the probability that such a player has a 56-game hitting streak in any given sequence of 56 games. In this case that's easy; the average player has a probability (1 - .266)^3.83 = 0.306 of not getting a hit in any given game, thus a probability 0.694 of getting a hit in any given game; raising this to the 56th power tells you that the average player has a probability of 1.31 in a billion of getting hits in, say, the 56 games starting tomorrow and ending sometime in early June.
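
The arithmetic in that paragraph can be checked with a few lines of Python. This is only a sketch of the handwaving model: it assumes at-bats are independent, allows a fractional number of at-bats per game, and the function names are made up for illustration.

```python
# League-wide averages quoted in the post (2007 National League).
BATTING_AVERAGE = 0.266
AT_BATS_PER_GAME = 3.83
STREAK_LENGTH = 56


def hit_probability(batting_average, at_bats_per_game):
    """Chance of at least one hit in a game, treating at-bats as independent."""
    return 1.0 - (1.0 - batting_average) ** at_bats_per_game


def streak_probability(batting_average, at_bats_per_game, games=STREAK_LENGTH):
    """Chance of a hit in each of `games` specific consecutive games."""
    return hit_probability(batting_average, at_bats_per_game) ** games


p_game = hit_probability(BATTING_AVERAGE, AT_BATS_PER_GAME)
p_streak = streak_probability(BATTING_AVERAGE, AT_BATS_PER_GAME)
print(f"P(hit in a given game)      = {p_game:.3f}")
print(f"P(hits in 56 given games)   = {p_streak:.3g}")
```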

So what's the expected number of 56-game hitting streaks this season, according to this model? There are 107 ways any given player could get a 56-game streak in a 162-game season -- starting in game 1, 2, ..., 107. So the expected number of 56-game streaks for Joe Qankee (yes, I'm reviving the Qankees) is this probability times 107, or 1.41 × 10^-7. Now, assume there are eight Qankees that play every day. (The Qankees are an extraordinarily healthy National League team. The fact that their name rhymes with that of the American League team that DiMaggio actually had his streak with is purely coincidence.) The expected number of 56-game hitting streaks by Qankees this season is thus 1.13 × 10^-6. (Note that this is not the probability that one of them has such a streak. A 57-game streak would get counted twice here, a 58-game streak three times, and so on. However, it is an upper bound for the probability of a Qankee having a streak of at least 56 games.)

Now, there have been something less than three thousand "team seasons" in Major League Baseball (one team playing for one season). So the expected number of 56-game streaks is bounded above by (1.13 × 10^-6) × 3000 = 0.00338, or about one in 300.
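
The chain of multiplications from one 56-game window up to all of baseball history can be laid out as a short script. It is a sketch under the same assumptions as the text (a 162-game season, eight everyday players per team, 3000 team seasons as a round upper estimate); the variable names are illustrative.

```python
# Inputs taken from the post; 3000 team seasons is the post's round upper estimate.
BATTING_AVERAGE = 0.266
AT_BATS_PER_GAME = 3.83
GAMES_PER_SEASON = 162
STREAK_LENGTH = 56
EVERYDAY_PLAYERS = 8
TEAM_SEASONS = 3000

# Probability of a hit in every game of one specific 56-game window.
p_game = 1.0 - (1.0 - BATTING_AVERAGE) ** AT_BATS_PER_GAME
p_window = p_game ** STREAK_LENGTH

# Number of possible starting games for a 56-game window in one season.
windows = GAMES_PER_SEASON - STREAK_LENGTH + 1

# Expected counts add up linearly, even though the windows overlap.
expected_per_player = p_window * windows
expected_per_team = expected_per_player * EVERYDAY_PLAYERS
expected_all_history = expected_per_team * TEAM_SEASONS

print(f"windows per season:        {windows}")
print(f"expected per player:       {expected_per_player:.3g}")
print(f"expected per team season:  {expected_per_team:.3g}")
print(f"expected over all history: {expected_all_history:.3g}")
```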

But we've had one. That seems like a lot.

What's the problem here? Well, the average player isn't the one that's going to have that streak. A .280 hitter will put together a streak in 7.40 56-game frames out of every billion. A .300 hitter, in 69 out of every billion. A .320 hitter, in 498 out of every billion. (And I'm still assuming such a player only gets 3.83 at-bats a game; that's probably not true, because the player who hits well will lead his team as a whole to have more at bats.) But an equally bad hitter doesn't drag down the expectation nearly as much. I've ignored batting order (which Arbesman and Strogatz did take into account, implicitly; their inputs for each player are the total number of hits, number of games played, and number of plate appearances, and number of plate appearances varies with position in the batting order).
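
How sharply the streak probability climbs with batting average is easy to tabulate. The sketch below keeps the 3.83 at-bats a game fixed for every hitter, as the paragraph above does, and should land close to the per-billion figures quoted there; the helper name is hypothetical.

```python
AT_BATS_PER_GAME = 3.83  # held fixed for every hitter, as in the text
STREAK_LENGTH = 56


def streaks_per_billion(batting_average):
    """Expected number of fully successful 56-game windows per billion windows."""
    p_game = 1.0 - (1.0 - batting_average) ** AT_BATS_PER_GAME
    return 1e9 * p_game ** STREAK_LENGTH


for average in (0.266, 0.280, 0.300, 0.320):
    print(f"{average:.3f} hitter: {streaks_per_billion(average):8.2f} per billion")
```

The asymmetry is the point: going from .266 up to .320 multiplies the chance by a factor of several hundred, which no equally bad hitter can offset, since his contribution cannot fall below zero.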

Rather than making some assumptions on how batting averages are distributed (which would probably be wrong, and even if they were right in the peak of the distribution would still be wrong because what really matters is the tails), I'll defer to Arbesman and Strogatz. Their method is to simulate the entire history of baseball 10,000 times, which is enough to get a nice basically-smooth curve for the distribution of the length of the longest streak. The median length of the longest streak, in their simulations, is 53 games.
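
To give a feel for the simulation approach, here is a toy Monte Carlo sketch. It is not Arbesman and Strogatz's code and does not use their data: the roster and its per-game hit probabilities are invented for illustration, and it covers one team season rather than the whole history of baseball.

```python
import random
from statistics import median


def longest_streak(results):
    """Length of the longest run of True values in an iterable of booleans."""
    best = current = 0
    for got_hit in results:
        current = current + 1 if got_hit else 0
        best = max(best, current)
    return best


def simulate_season(p_hit, games, rng):
    """Longest hitting streak for one player over one simulated season."""
    return longest_streak(rng.random() < p_hit for _ in range(games))


def simulate_longest(p_hits, games, trials, seed=0):
    """For each trial, the longest streak by any player on the roster."""
    rng = random.Random(seed)
    return [
        max(simulate_season(p, games, rng) for p in p_hits)
        for _ in range(trials)
    ]


# Hypothetical roster: six average hitters and two good ones (made-up numbers).
toy_roster = [0.69] * 6 + [0.75, 0.78]
longest = simulate_longest(toy_roster, games=162, trials=1000)
print("median longest streak on the toy roster:", median(longest))
```

Scaling this up to every player-season on record, with each player's own hit probability, is the part that needs real data.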

Simulation might not be entirely necessary, though. It's routine to calculate the distribution of the length of the longest streaks in sequences of biased coin flips; aggregating that information together is a little harder. But I don't care enough to do it, so I'll stop here.
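
The routine part, the chance of at least one streak of a given length in a fixed number of biased coin flips, can be sketched as a small dynamic program over the current run length. This covers a single player with a constant per-game hit probability only; the function name is illustrative, and the aggregation across players is not attempted here.

```python
def prob_streak_at_least(n_games, streak, p_hit):
    """Probability of at least one run of `streak` hits in `n_games` games.

    state[j] is the probability that no qualifying streak has happened yet
    and the current run of consecutive hits has length j (0 <= j < streak).
    """
    state = [1.0] + [0.0] * (streak - 1)
    achieved = 0.0  # probability mass that has already completed a streak
    for _ in range(n_games):
        # A hit while sitting at run length streak-1 completes the streak.
        achieved += state[streak - 1] * p_hit
        new_state = [0.0] * streak
        # A hitless game resets the run, whatever its length was.
        new_state[0] = sum(state) * (1.0 - p_hit)
        # A hit extends a run of length j-1 to length j.
        for j in range(1, streak):
            new_state[j] = state[j - 1] * p_hit
        state = new_state
    return achieved


# The average hitter from the text over one 162-game season.
print(prob_streak_at_least(162, 56, 0.694))
```

Because each season counts at most once here, however long the streak runs, the result sits below the expected count of 107 times the single-window probability, which is the upper bound used earlier.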