## 22 August 2009

### Two baseball games the same?

The Yankees and Red Sox have played each other two thousand and something times, as the good folks on Fox told us, so I started to wonder: which baseball teams have played each other the most times? If I had to guess, I'd say Giants-Dodgers; Wikipedia agrees with me but gives no source. Apparently Cardinals-Cubs, Cardinals-Pirates, and Cubs-Pirates are all a close second. These are the pairings you'd expect once you take into account when teams changed divisions, etc.

Anyway, while trying to find this out I found a blog post entitled "Have Two Baseball Games Ever Played Out Identically?" The answer is no, but "identically" is defined a bit too strictly: a groundout to second and a groundout to shortstop, say, are counted as different. And the metric that the author uses for the similarity of two games A and B is, I think, the number of values of n for which the nth plate appearances in games A and B had the same outcome. Intuitively I think you'd want to line up innings with each other; two "most similar" games should at least have similar-looking line scores. I think what one wants is some notion of "edit distance" between games, and defining that is hardly trivial.
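To see why position-by-position matching can mislead, here's a minimal sketch comparing it with plain Levenshtein edit distance on sequences of plate-appearance outcomes. The outcome codes (`"K"`, `"GO"`, `"1B"`, and so on) and the two toy "games" are made up for illustration, and Levenshtein distance is only a stand-in: it is not the metric from that blog post, and it knows nothing about innings or line scores.

```python
def positional_matches(a, b):
    """Count positions n where the nth plate appearances have the same outcome."""
    return sum(x == y for x, y in zip(a, b))


def edit_distance(a, b):
    """Levenshtein distance: fewest insertions, deletions, and substitutions
    of single plate appearances needed to turn sequence a into sequence b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty sequence takes i deletions
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


# Hypothetical outcome codes: K = strikeout, GO = groundout, FO = flyout,
# 1B = single, BB = walk. game_b is game_a with one extra leadoff walk.
game_a = ["K", "GO", "1B", "FO", "BB", "K"]
game_b = ["BB", "K", "GO", "1B", "FO", "BB", "K"]

print(positional_matches(game_a, game_b))  # 0
print(edit_distance(game_a, game_b))       # 1
```

One extra leadoff walk shifts everything by a slot, so the positional count sees two games with nothing in common while the edit distance sees two games that differ by a single plate appearance. A serious version would still have to decide how to weight different outcomes and how to respect inning boundaries, which is where it stops being trivial.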

I've sort of poked at this before: in 2007, in connection with a promotion that MLB did for that year's All-Star Game, I asked what the most common line score is.

There's a nice combinatorial/probabilistic question hiding here; I've seen results on, say, the probability that two randomly chosen permutations of [n] have the same cycle type, or the probability that two binary trees with n labelled nodes have the same shape. Baseball games are combinatorial structures, and I'm not just saying that to justify the fact that I'm probably going to spend six hours today watching baseball. (The Yankees and Red Sox are on TV now, the Phillies and Mets later.)
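For the permutation version of the question, small cases can be worked out exactly. The number of permutations of [n] with cycle type λ is n!/z_λ, where z_λ is the product over cycle lengths i of i^(m_i) · m_i! and m_i is the number of cycles of length i. So a uniformly random permutation has cycle type λ with probability 1/z_λ, and two independent ones share a cycle type with probability equal to the sum over partitions λ of n of 1/z_λ². Here's a rough sketch of that computation; the function names are my own.

```python
from collections import Counter
from fractions import Fraction
from math import factorial


def partitions(n, largest=None):
    """Yield the partitions of n as non-increasing tuples of positive integers."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for part in range(min(n, largest), 0, -1):
        for rest in partitions(n - part, part):
            yield (part,) + rest


def z(cycle_type):
    """z_lambda: product over cycle lengths i of i**m_i * m_i!."""
    result = 1
    for length, mult in Counter(cycle_type).items():
        result *= length ** mult * factorial(mult)
    return result


def same_cycle_type_probability(n):
    """Exact probability that two independent uniform permutations of [n]
    have the same cycle type."""
    return sum(Fraction(1, z(lam)) ** 2 for lam in partitions(n))


for n in range(1, 5):
    print(n, same_cycle_type_probability(n))
# 1 1
# 2 1/2
# 3 7/18
# 4 73/288
```

As a sanity check, S_3 has one identity, three transpositions, and two 3-cycles, so the probability is (1 + 9 + 4)/36 = 7/18. Doing the same thing for baseball games would first require deciding what the "cycle type" of a game is, which is the edit-distance problem all over again.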