31 December 2007

Some football probabilities

Back in September I answered a question about probabilities in Major League Baseball's wild-card race.

John Armstrong (of The Unapologetic Mathematician fame) asked me about NFL football: assuming all teams are evenly matched, what is the probability that a division sends three teams to the playoffs? For those of you who aren't familiar with the NFL's playoff structure, there are two conferences of sixteen teams each (the NFC and the AFC); each conference is divided into four divisions of four. The winner of each division gets into the playoffs, and the two best teams among non-division winners in each conference get in as "wild cards". (For an example, see this year's final standings at nfl.com.)

This is equivalent to the following question: what is the probability that both wild cards in a conference come from the same division? I suspect that this question is inspired by the fact that this actually happened in both conferences this year, in the AFC South and NFC East respectively. (This must be especially frustrating for fans of the fourth-place team in each of those divisions. The fourth-place team in the NFC East was the Philadelphia Eagles. I think I'll avoid Eagles fans today. But that might be hard.)

Anyway, the probability that this event happens in any given conference in any given year is 2/11. Let the teams play out their season. Consider the twelve non-division winners in a given conference, and rank them in order from 1 to 12. (You might think: in the NFL each team only plays sixteen games, so won't there be ties? Perhaps, but the NFL's wild card tiebreakers don't care what division a team is in.) The top two teams get in. The divisions of the twelve teams (from top to bottom) form a word with three N's, three E's, three S's, and three W's, picked uniformly at random. Say without loss of generality that the word starts with an N; then the remainder of the word is picked uniformly at random from permutations of NNEEESSSWWW. Two of those eleven remaining letters are N's, so the chance that the remainder also starts with an N is 2/11.
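The word argument can be checked numerically. The sketch below takes the uniform-word assumption as given (it does not simulate any games), and the function names are mine, chosen for illustration: one function applies the "fix the first letter" step exactly, the other shuffles the twelve letters many times and counts how often the first two agree.

```python
import random
from fractions import Fraction


def exact_same_start(per_division=3, divisions=4):
    """P(first two letters match) for a uniformly random word that has
    `per_division` copies of each of `divisions` letters."""
    total = per_division * divisions
    # Once the first letter is fixed, per_division - 1 of the remaining
    # total - 1 letters match it, and each is equally likely to come next.
    return Fraction(per_division - 1, total - 1)


def simulate_same_start(trials=100_000, seed=0):
    """Monte Carlo estimate under the same uniform-word assumption."""
    rng = random.Random(seed)
    word = list("NNNEEESSSWWW")
    hits = 0
    for _ in range(trials):
        rng.shuffle(word)             # uniformly random arrangement
        hits += word[0] == word[1]    # both wild cards from one division?
    return hits / trials


if __name__ == "__main__":
    print(exact_same_start())      # prints 2/11
    print(simulate_same_start())   # an estimate close to 0.18
```

This only confirms the arithmetic of the word model; whether the twelve labels really are a uniformly random word is a separate modelling question.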

Alternatively, there are ${12 \choose 3,3,3,3} = 369600$ words which contain three each of N, E, S, and W. The number of these that begin with the same letter twice is $4{10 \choose 1,3,3,3} = 67200$ -- pick the letter that appears twice at the beginning in one of four ways, then arrange the ten remaining letters however you like. So the probability of the two wild cards coming from the same division is 67200/369600, or 2/11. I don't like this approach quite as much, though, because it involves big numbers that cancel out; it seems cleaner to me to not invoke the big numbers in the first place.
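For anyone who wants to see the big numbers cancel, here is a small sketch of the count. The helper `multinomial` is my own illustration, not a library function; it just divides a factorial by the factorials of the part sizes.

```python
from fractions import Fraction
from math import factorial


def multinomial(parts):
    """Number of distinct words whose letter multiplicities are `parts`."""
    count = factorial(sum(parts))
    for p in parts:
        count //= factorial(p)   # each partial quotient is an exact integer
    return count


all_words = multinomial([3, 3, 3, 3])     # three each of N, E, S, W
doubled = 4 * multinomial([1, 3, 3, 3])   # choose the doubled letter, arrange the other ten
probability = Fraction(doubled, all_words)

print(all_words, doubled, probability)    # prints 369600 67200 2/11
```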

By the way, the NFL has had its current structure since 2002, and the years and conferences in which this event has actually occurred are the 2007 AFC South, 2007 NFC East, and 2006 NFC East, making three out of twelve (six seasons times two conferences).

The question I asked in September was not the analogue of this one (MLB only has one wild card, so there is no analogue); there I found the probability that the four MLB teams making the playoffs from any league were actually the four teams with the best records. But I don't have a general method to find the solution for some league with arbitrary structure, and the NFL would require slogging through more cases than MLB (two wild cards instead of one) so I won't bother for now. (Or probably ever -- I don't particularly care about that number.)

12 comments:

John Armstrong said...

Actually, I'd read the question slightly differently and get 40/121. This is (using your answer) the probability that *a* division sends two wildcards. It could be from one conference (2/11), could be from the other (2/11), or it could happen in both (4/121). Then the probability that it happens at all is 2/11+2/11-4/121 by inclusion-exclusion.
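The inclusion-exclusion step is easy to check with exact fractions. This sketch assumes, as the comment implicitly does, that the two conferences behave independently; the function name is hypothetical.

```python
from fractions import Fraction


def at_least_one_conference(p, conferences=2):
    """Chance the event happens in at least one conference,
    assuming the conferences are independent."""
    return 1 - (1 - p) ** conferences


p = Fraction(2, 11)
by_inclusion_exclusion = p + p - p * p        # AFC + NFC - both
by_complement = at_least_one_conference(p)    # 1 - P(neither)

print(by_inclusion_exclusion, by_complement)  # prints 40/121 40/121
```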

dfan said...

Hmm, there's some simplification going on here that I'm not sure is warranted. The NFL schedule is "unbalanced"; each team plays the other teams in its division twice. This should make it harder for multiple teams from a single division to stand out, since the intradivisional games in that division are going to account for 12 losses alone, or an average of 3 losses apiece, just because someone's got to lose those games.

I'm sure this would be a big issue if there were three wild card spots and you wanted to know the probability of everyone in a division going to the playoffs. My intuition is that it will still have an effect in the 2-wild-card-team case, although of course not as much.
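One way to get a feel for the size of this effect is to simulate seasons. The sketch below is a deliberately crude model, not the real NFL schedule: every game is a fair coin flip, ties in the standings are broken at random, and a team's ten games outside its division are treated as independent coin flips against unmodelled opponents. The `head_to_head` switch (my own parameter) turns the six division games per team into actual games between division rivals, so one team's win is the other's loss.

```python
import random
from itertools import combinations

DIVISIONS = 4
TEAMS_PER_DIVISION = 4
GAMES = 16  # games per team in a season


def wild_cards_share_division(rng, head_to_head):
    """Simulate one conference-season; True if both wild cards share a division."""
    n = DIVISIONS * TEAMS_PER_DIVISION
    wins = [0] * n
    coin_flip_games = GAMES
    if head_to_head:
        # Each pair of division rivals meets twice; exactly one of them wins.
        coin_flip_games = GAMES - 2 * (TEAMS_PER_DIVISION - 1)
        for d in range(DIVISIONS):
            members = range(d * TEAMS_PER_DIVISION, (d + 1) * TEAMS_PER_DIVISION)
            for a, b in combinations(members, 2):
                for _ in range(2):
                    wins[a if rng.random() < 0.5 else b] += 1
    # Assumption: all remaining games are independent fair coin flips.
    for team in range(n):
        wins[team] += sum(rng.random() < 0.5 for _ in range(coin_flip_games))
    # Rank by wins, breaking ties uniformly at random.
    order = sorted(range(n), key=lambda t: (wins[t], rng.random()), reverse=True)
    divisions_seen = set()
    non_winners = []
    for team in order:
        d = team // TEAMS_PER_DIVISION
        if d in divisions_seen:
            non_winners.append(team)   # not the best team in its division
        else:
            divisions_seen.add(d)      # first team seen from a division wins it
    first, second = non_winners[:2]    # the two wild cards
    return first // TEAMS_PER_DIVISION == second // TEAMS_PER_DIVISION


def estimate(head_to_head, trials=20_000, seed=0):
    """Fraction of simulated seasons in which both wild cards share a division."""
    rng = random.Random(seed)
    hits = sum(wild_cards_share_division(rng, head_to_head) for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    print("independent games:", estimate(head_to_head=False))
    print("head-to-head division games:", estimate(head_to_head=True))
```

The head-to-head estimate is best compared with the independent-games estimate rather than with 2/11. In the independent mode every ordering of the sixteen teams is equally likely, and an order-statistics calculation for that case gives 1387/5005, about 0.277. That is above 2/11, which suggests that once the division winners are removed, the labels of the remaining twelve teams are not quite a uniformly random word.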

Anonymous said...

"average of 3 losses apiece, just because someone's got to lose those games."

There can be a tie game if no one has scored after 15 minutes of sudden death. I'd have to ask my brother-in-law whether there has been a tie, as I don't follow football.

Unknown said...

Dear Isabel,

I believe you are oversimplifying: you are not considering the fact that for this to happen you actually need to have a third team that is very good. Taking this into account, I think the result is going to be less than 2/11. In fact, for a third team to get into the playoffs it must have played in a division with two better teams, hence it has a worse "environment" than the second team from other divisions (I assume that any team plays more often against teams from its own division). My point is that for these calculations to be accurate one should consider all the games played by all the teams in the conference. Am I missing something here? I probably am.
Best greetings from Spain,
Sebastian.

Anonymous said...

A slightly more difficult problem would be to calculate the probability of both wild cards coming from the same division given the 2001 composition of the league. At that time there were 31 teams, five divisions of five teams and a sixth division with six teams.

Anonymous said...

How good is a team in a division? This can be done at any point in the season, and stops evolving at season's end.

First, estimate how good the team is by the number of games they've won. Normalize by the number of games they've played. Rank the teams by this normalized 1-goodness.

Second, it's better to beat a good team than a bad team. So weight each win by the rank of the team beaten. Normalize. This is the 2-goodness.

Iterate this. In the limit, the infinity-goodness of a team is its entry in the principal eigenvector of the matrix of scores of each team against each other team they've played.

This is crudely phrased, but standard in tournaments and weighted digraphs.
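A minimal sketch of that iteration, under some assumptions of mine: `wins[i][j]` holds the number of times team i beat team j, each win is weighted by the current goodness score of the beaten team (rather than its ordinal rank), and the scores are rescaled to sum to 1 after every pass. The function name and the three-team example are illustrative only.

```python
def iterate_goodness(wins, rounds=100):
    """Repeatedly reweight wins by the goodness of the teams beaten.

    wins[i][j] is the number of times team i beat team j.
    Returns a list of scores that sum to 1 (if anyone has won a game).
    """
    n = len(wins)
    # 1-goodness: wins divided by games played.
    games = [sum(wins[i][j] + wins[j][i] for j in range(n)) for i in range(n)]
    score = [sum(wins[i]) / games[i] if games[i] else 0.0 for i in range(n)]
    for _ in range(rounds):
        # (k+1)-goodness: each win counts in proportion to the loser's k-goodness.
        new = [sum(wins[i][j] * score[j] for j in range(n)) for i in range(n)]
        total = sum(new)
        if total == 0:
            break                      # nothing left to weight
        score = [s / total for s in new]
    return score


# Hypothetical results: A beat B twice and C once, B beat C twice, C beat A once.
example = [
    [0, 2, 1],
    [0, 0, 2],
    [1, 0, 0],
]
scores = iterate_goodness(example)
print(scores)   # A scores highest, then B, then C
```

This is plain power iteration, so it settles down only when the "who beat whom" graph is connected enough (every team can be reached from every other through a chain of wins); with teams that never won or never lost, some extra smoothing would be needed.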

Now, that's the baseline against which the strange systems of NFL and BCS and MLB are to be compared.

I'm close enough to see the blimp over the Rose Bowl right now...

May the best team win. Modulo best.

Happy New Year.

-- Jonathan Vos Post

Anonymous said...

Let's not forget that the original premise is that all teams are equal. This means that if a 7-3 team plays a 3-7 team, the probability of each team winning is still 0.5. I agree that an answer of 2/11 does involve a simplification regarding the previously mentioned scheduling considerations. However, I expect that this would make only a slight difference, since not only must there be more certain losses due to games within the division, there must also be more certain wins. If, rather than the two wildcards coming from the same division, we asked if the teams ranked 16th and 17th (median teams) both came from the same division, I expect the answer would be exactly 2/11.

Anonymous said...

I meant 8th and 9th ranked teams in my last post as there are only 16 teams in each conference. The probability should be 2/11 that the 8th and 9th ranked teams in a conference come from the same division.

Unknown said...

Dear Isabel,

I believe you are oversimplifying: you are not considering the fact that for this to happen you actually need to have a third team that is very good. Taking this into account, I think the result is going to be less than 2/11. In fact, for a third team to get into the playoffs it must have played in a division with two better teams, hence it has a worse "environment" than the second team from other divisions (I assume that any team plays more often against teams from its own division). My point is that for these calculations to be accurate one should consider all the games played by all the teams in the conference. Am I missing something here? I probably am.
Best greetings from Spain,
Sebastian.

Unknown said...

Dear Isabel,

I believe you are oversimplifying: you are not considering the fact that for this to happen you actually need to have a third team that is very good. Taking this into account, I think the result is going to be less than 2/11. In fact, for a third team to get into the playoffs it must have played in a division with two better teams, hence it has a worse "environment" than the second team from other divisions (I assume that any team plays more often against teams from its own division). My point is that for these calculations to be accurate one should consider all the games played by all the teams in the conference. Am I missing something here? I probably am.
Best greetings from Spain,
Sebastian.

John Armstrong said...

Sebastian, that's the third time you've said the exact same thing. As has been pointed out, we're making the simplifying assumption that all teams are evenly matched, since that's the question as it was originally posed to me.

Anonymous said...

You have to express your opinion more to attract more readers, because just a video or plain text without any personal approach is not that valuable. But that is just from my point of view.