27 June 2007

the 10,000th Phillies loss will come on the West Coast

Walking around this morning, I saw the Philadelphia Weekly's cover story: Losing proposition. This is an article about how the Phillies are very close to having ten thousand losses. The New York Times made fun of us a couple weeks ago (but the Times mocks anything involving Philadelphia). There are sites like Countdown to 10000 and Celebrate 10000 in honor of it. They sell T-shirts. Some people claim the 10,000th loss was in June of 2005, against the Red Sox -- but this is only true if you count the Worcester Worcesters of 1880-1882 as being the Phillies. They're not.
(Yes, the Worcester Worcesters. Some sources call them the Brown Stockings, but I like calling them the Worcesters because it shows even less ingenuity in naming than the name "Phillies" does.)
There are three Facebook groups. (I wonder if there's a Myspace group; the link goes to a paper that's been circulating about the class differences between Facebook and Myspace.)
Then I remembered that I have Phillies tickets for their game against the Cardinals on July 13th, the first game after the All-Star break.
I got to thinking -- what are the chances that I'd see the Phillies' ten thousandth loss? They've lost 9,991 games so far; they've got nine more to go.
Surely the 10,000th loss is a historic moment in all of professional sports. No team has lost this many games. (The San Francisco (formerly New York) Giants have won 10,000.)
It's not so hard to compute this. What I needed to know was the probability that the Phillies lose each particular game. This can be found via what is, for some cryptic reason, called the "log5 method". I learned about it from this article from Diamond Mind, which computed the probabilities that each of the 2002 playoff teams would win the World Series. The method is as follows: if team A wins a fraction pA of its games, and team B wins a fraction pB of its games, then the probability that team A wins any given game against team B is
pA(1-pB) / (pA(1-pB) + pB(1-pA)).
The best justification for this formula is that it works when you test it on actual data. (Actual baseball data, that is; I'm not sure if it's good for other sports.) But an intuitive justification for it is as follows: you have two coins, coin A and coin B. Each coin has "win" on one side and "loss" on the other. Coin A comes up "win" with probability pA, and coin B comes up "win" with probability pB. To simulate a game, flip the two coins. If one comes up "win" and one comes up "loss", that gives you the outcome of the game; if they both come up the same, flip again. Notice that the formula passes a couple of sanity checks. If pA = 0, then it always gives 0 -- that is, if a team never wins, then its probability of winning against any opponent is zero. If pB = 1/2, then it just gives pA -- so a team which is playing against average teams performs how it usually performs.
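The formula and its two sanity checks are easy to try out in code. Here is a minimal Python sketch; the function name log5 and the sample winning fractions are my own choices for illustration, not anything taken from the Diamond Mind article.

```python
def log5(p_a: float, p_b: float) -> float:
    """Chance that team A beats team B, given each team's overall winning fraction.

    Undefined (division by zero) when both teams always win or both never win.
    """
    a_only = p_a * (1 - p_b)  # coin A shows "win", coin B shows "loss"
    b_only = p_b * (1 - p_a)  # coin B shows "win", coin A shows "loss"
    # Flips where both coins agree are thrown away, so only these two cases count.
    return a_only / (a_only + b_only)


# Sanity check 1: a team that never wins has no chance against anyone.
print(log5(0.0, 0.6))

# Sanity check 2: against an exactly average opponent, a team plays to its record.
print(log5(0.55, 0.5))

# The two teams' chances in the same game add up to 1.
print(log5(0.6, 0.45) + log5(0.45, 0.6))
```

The last line checks one more property that is not mentioned above but follows directly from the formula: swapping A and B gives the complementary probability.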
To adjust for home field advantage, I added 0.02 to the home team's winning percentage and subtracted 0.02 from the visiting team's winning percentage; this is the method used at Baseball Prospectus' postseason odds simulation, which I'll have more to say about later.
So, for example, the Phillies play the Reds tonight, in Philadelphia. The Reds have won 29 games and lost 48, so their winning percentage is .377; we replace this with .357 since the Reds will be playing on the road. The Phillies have won 40 and lost 36, so their winning percentage is .526; we replace this with .546 since they're playing at home. The formula tells us that the Reds' chance of winning tonight is
(.357)(1-.546) / ((.357)(1-.546) + (.546)(1-.357))
which is 0.315. This is the Phillies' chance of losing, which is what I'm interested in.
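For anyone who wants to check the arithmetic, here is a short Python sketch of tonight's calculation. The function names and the HOME_EDGE constant are mine; the records and the 0.02 adjustment are the ones quoted above. It uses the unrounded winning percentages, which is an assumption about how the spreadsheet did it.

```python
HOME_EDGE = 0.02  # added to the home team's percentage, subtracted from the visitor's


def log5(p_a, p_b):
    """Chance that team A beats team B, given their winning fractions."""
    return p_a * (1 - p_b) / (p_a * (1 - p_b) + p_b * (1 - p_a))


def loss_probability(wins, losses, opp_wins, opp_losses, at_home):
    """Chance that our team loses one game, with the home-field adjustment applied."""
    ours = wins / (wins + losses)
    theirs = opp_wins / (opp_wins + opp_losses)
    if at_home:
        ours, theirs = ours + HOME_EDGE, theirs - HOME_EDGE
    else:
        ours, theirs = ours - HOME_EDGE, theirs + HOME_EDGE
    # Our chance of losing is the opponent's chance of winning.
    return log5(theirs, ours)


# Phillies (40-36) at home against the Reds (29-48).
p_loss = loss_probability(40, 36, 29, 48, at_home=True)
print(round(p_loss, 3))  # about 0.315, the figure quoted above
```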
So after tonight, the Phillies will have eight losses to go with probability 0.315; they'll have nine losses to go with probability 1-0.315, or 0.685.
They'll play the Reds again tomorrow night. After that game, they have seven losses to go with probability (0.315)^2 = 0.099; they have eight losses to go with probability (.315)(.685)+(.685)(.315) = .432; they have nine losses to go with probability (0.685)^2 = 0.469.
Thus, I set up a spreadsheet which calculates the probability that after each game, they have 9, 8, 7, ..., 1 losses to go. The probability of the Phillies getting their ten-thousandth loss on a certain day is the probability that they have 9,999 losses before that day ("1 loss to go"), times the probability of losing that day.
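The spreadsheet itself isn't reproduced here, but the bookkeeping it does can be sketched in a few lines of Python. This is my own reconstruction of the idea, not the actual spreadsheet; the function name is made up, and the example at the bottom uses the two Reds games from above with only two losses needed, just to keep the numbers checkable by hand.

```python
def milestone_distribution(loss_probs, losses_needed=9):
    """For each game, the chance that the final needed loss happens in that game.

    loss_probs: per-game loss probabilities, in schedule order.
    """
    # to_go[k] = chance that exactly k more losses are needed before the next game.
    to_go = [0.0] * (losses_needed + 1)
    to_go[losses_needed] = 1.0
    result = []
    for p in loss_probs:
        # The milestone falls today if one loss remains and the team loses.
        result.append(to_go[1] * p)
        updated = [0.0] * (losses_needed + 1)
        for k in range(1, losses_needed + 1):
            updated[k] += to_go[k] * (1 - p)    # win: still k losses to go
            if k > 1:
                updated[k - 1] += to_go[k] * p  # loss: one fewer to go
        to_go = updated
    return result


# Two games at 0.315 each, needing two losses: the second entry is (0.315)^2.
print(milestone_distribution([0.315, 0.315], losses_needed=2))
```

To get something like the table, one would feed in the 55 or so per-game loss probabilities from the log5 step, in schedule order, with losses_needed left at 9.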
The results are as follows. The rows in red are home games, following the same color scheme as the sorted schedule. The winning percentages are from mlb.com standings as of June 27.

Date     Opponent        Chance of 10,000th loss
Jun 27   v. Reds         0.000000
Jun 28   v. Reds         0.000000
Jun 29   v. Mets         0.000000
Jun 29   v. Mets         0.000000
Jun 30   v. Mets         0.000000
Jul 01   v. Mets         0.000000
Jul 02   @ Astros        0.000000
Jul 03   @ Astros        0.000000
Jul 04   @ Astros        0.000467
Jul 06   @ Rockies       0.002946
Jul 07   @ Rockies       0.009603
Jul 08   @ Rockies       0.021746
Jul 13   v. Cardinals    0.030071
Jul 14   v. Cardinals    0.041621
Jul 15   v. Cardinals    0.052571
Jul 16   @ Dodgers       0.091757
Jul 17   @ Dodgers       0.106722
Jul 18   @ Dodgers       0.112506
Jul 19   @ Padres        0.108077
Jul 20   @ Padres        0.097745
Jul 21   @ Padres        0.083264
Jul 22   @ Padres        0.067340
Jul 24   v. Nationals    0.031618
Jul 25   v. Nationals    0.026675
Jul 26   v. Nationals    0.022282
Jul 27   v. Pirates      0.018717
Jul 28   v. Pirates      0.015312
Jul 29   v. Pirates      0.012425
Jul 30   @ Cubs          0.014016
Jul 31   @ Cubs          0.010085
Aug 01   @ Cubs          0.007142
Aug 02   @ Cubs          0.004984
Aug 03   @ Brewers       0.004103
Aug 04   @ Brewers       0.002533
Aug 05   @ Brewers       0.001533
Aug 07   v. Marlins      0.000613
Aug 08   v. Marlins      0.000441
Aug 09   v. Marlins      0.000317
Aug 10   v. Braves       0.000251
Aug 11   v. Braves       0.000171
Aug 12   v. Braves       0.000115
Aug 14   @ Nationals     0.000074
Aug 15   @ Nationals     0.000051
Aug 16   @ Nationals     0.000034
Aug 17   @ Pirates       0.000024
Aug 18   @ Pirates       0.000016
Aug 19   @ Pirates       0.000011
Aug 21   v. Padres       0.000008
Aug 22   v. Padres       0.000005
Aug 23   v. Padres       0.000003
Aug 24   v. Dodgers      0.000002
Aug 25   v. Dodgers      0.000001
Aug 26   v. Dodgers      0.000001
Aug 27   v. Mets         0.000000

So it appears most likely that the Phillies will have their ten-thousandth loss on the West Coast, between July 16 and July 22; there's a 66.7% chance of it happening in those seven games. This is where you'd "naturally" expect things to peak anyway -- since the team loses about half the time, you'd expect it to take them 18 games in order to lose 9. That road trip is the 16th through 22nd games if we start counting from today. Plus, they'll be on the road, and the Dodgers and Padres are both good teams. It actually surprised me to see that the 10,000th loss is nearly twice as likely in the first game of that road trip (July 16 @ Dodgers) as in the last game of the preceding homestand (July 15 v. Cardinals), and similarly for the last game of the road trip (July 22) and the first game back (July 24). The soonest it can happen, as of this writing, is July 4, if they lose the next nine -- that would seem somehow appropriate, given what happened in Philadelphia on a long-ago July 4. The tail of the distribution is long -- there's always that slim chance that the Phillies could get ridiculously hot and stretch this out for thirty games or more. I wouldn't bet on it, though.
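The two back-of-the-envelope numbers in that paragraph can be checked directly. In this small Python sketch the seven probabilities are copied from the table rows for July 16 through July 22; the variable names are mine, and the "expected games" line just uses the rough assumption that every game is a coin flip.

```python
# Per-game chances of the 10,000th loss on the West Coast trip, from the table.
road_trip = {
    "Jul 16 @ Dodgers": 0.091757,
    "Jul 17 @ Dodgers": 0.106722,
    "Jul 18 @ Dodgers": 0.112506,
    "Jul 19 @ Padres": 0.108077,
    "Jul 20 @ Padres": 0.097745,
    "Jul 21 @ Padres": 0.083264,
    "Jul 22 @ Padres": 0.067340,
}

# The milestone can only happen once, so the per-game chances simply add.
total = sum(road_trip.values())
print(f"{total:.3f}")  # 0.667

# If each game is lost with probability 1/2, the average wait for the
# ninth loss is 9 / (1/2) games.
losses_needed = 9
p_loss = 0.5
expected_games = losses_needed / p_loss
print(expected_games)  # 18.0
```

So the rounded table entries put about two thirds of the probability on that one road trip.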

And I've only got a three percent chance of seeing this historic moment on the 13th of July. I hope I don't see it, because that would mean the Phillies would only win four out of their next thirteen.

edit (Friday, 2:39 pm): Frank at hot dogs and beer features a similar analysis.

