28 July 2007

baseball commentators say silly things

Heard just now on FOX, which is airing the Braves-Diamondbacks game:

First, the TV commentator claims that the Diamondbacks are a very streaky team this year, because they've had three separate five-game losing streaks and have won their last seven.

In fact, the Diamondbacks have won 57 games out of 105, for a winning "percentage" of 0.543; thus their probability of losing any given five straight games, treating games as independent, is (1 - 57/105)^5 = 0.0200. They've played one hundred and five games so far, so there are 101 games in which they could have started a five-game losing streak; thus their expected number of five-game losing streaks is something like (0.0200)(101) = 2.02. (The figure 0.0200 is rounded; the unrounded product is about 2.016.) So it's not all that surprising that they've had three such streaks.

Similarly, the expected number of seven-game winning streaks is 99(57/105)^7 = 1.38; the fact that the Diamondbacks have had one such streak is not at all surprising. (If I had to guess, I'd say that 1 is actually the most likely number of such streaks, but I'm not interested enough to do the analysis.)

Of course, the games aren't really independent. A more sophisticated analysis would take into account which teams were playing, and so on. An even more sophisticated analysis of streaks in baseball ought to take into account the pitching rotation; the existence of a pitching rotation reduces the likelihood of streaks. Let's say your team wins one-half of its games; then the probability of winning five straight games is (0.5)^5 = 0.03125. But now say you have five starting pitchers, and your team wins 70%, 60%, 50%, 40%, and 30% of their respective starts. If each pitcher pitches every fifth game, then any five consecutive games include exactly one start by each pitcher, so the probability of winning five consecutive games is now (0.7)(0.6)(0.5)(0.4)(0.3) = 0.0252, which is smaller than 0.03125.
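Here is a small Python sketch of that rotation example. The helper `p_win_streak` is hypothetical, and it assumes a strict five-man rotation with no off days or skipped starts. It shows that the answer is the same no matter which pitcher starts the run.

```python
from math import prod

# Win probability when each of the five starters pitches (figures from the text)
rotation = [0.7, 0.6, 0.5, 0.4, 0.3]


def p_win_streak(rotation, start, length):
    """Probability of winning `length` straight games beginning with
    the pitcher at index `start`, cycling through a strict rotation."""
    n = len(rotation)
    return prod(rotation[(start + i) % n] for i in range(length))


no_rotation = 0.5 ** 5  # every game a fair coin flip
with_rotation = [p_win_streak(rotation, s, 5) for s in range(5)]

print(f"no rotation: {no_rotation:.5f}")  # 0.03125
for s, p in enumerate(with_rotation):
    print(f"run starting with pitcher {s + 1}: {p:.4f}")  # 0.0252 each time
```

The team still wins half its games on average, but the uneven win probabilities make five straight wins less likely.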

See The Hot Hand in Sports for more of this sort of analysis.

Second, the Braves are, according to the television guy, "exactly one percentage point" behind the Phillies. The Braves are 54-50 going into today's play; the Phillies are 53-49. Baseball winning "percentages" are conventionally reported to three decimal places; the Braves are at .519, the Phillies at .520. For those of you who don't know, it's conventional to say that one team is ahead of the other by "percentage points" in a situation such as this where both teams have the same difference between their number of wins and number of losses; in this case both teams have won four more games than they've lost. But what bothers me is the "exactly one" here; of course those figures are rounded. As it turns out, the Braves' winning percentage is 0.519230...; the Phillies' is 0.519608...; the difference is 0.000377..., or not even half a point. If baseball truncated winning percentages, instead of rounding them, the two teams would be "tied".
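The rounding-versus-truncating point is easy to check with a few lines of Python. This is just a sketch; the helper names are made up, and I use exact fractions so that floating-point rounding doesn't muddy the comparison.

```python
from fractions import Fraction
from math import floor


def winning_fraction(wins, losses):
    """Exact winning 'percentage' as a fraction of games played."""
    return Fraction(wins, wins + losses)


def rounded_points(frac):
    """Thousandths, rounded half up (the three-decimal figure quoted on TV)."""
    return floor(frac * 1000 + Fraction(1, 2))


def truncated_points(frac):
    """Thousandths, with everything past the third decimal dropped."""
    return floor(frac * 1000)


braves = winning_fraction(54, 50)
phillies = winning_fraction(53, 49)

print(rounded_points(braves), rounded_points(phillies))      # 519 520
print(truncated_points(braves), truncated_points(phillies))  # 519 519
print(float((phillies - braves) * 1000))  # true gap in points, about 0.377
```

The exact gap works out to 1/2652, well under half of one "percentage point".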

The first of these things -- the streakiness comment -- is the one that bothers me more, though. The "percentage points" comment is just a matter of a convention that disagrees with the one the rest of the world uses. (Why doesn't baseball report winning percentages to just two decimal places? Because that wouldn't be enough accuracy; baseball teams play 162 games a season.) But the streakiness comment is the sort of thing that shows that people don't understand the nature of randomness; people read something into "streaks" that is really just chance.
