Anyway, the Wikipedia article mentions various methods that James came up with to judge a player over the course of a career; these include the intriguingly named "Fibonacci win score", but the article doesn't explain how it is calculated. Naturally, I was curious. Google turned up this thread at Baseball Fever, which says that the Fibonacci win score for a pitcher is the number of career wins, times the winning percentage, plus the number of "marginal wins" (i.e., wins minus losses). This is typical of James in that it doesn't make sense at first -- why would you multiply wins by winning percentage?
It's called "Fibonacci" because of the answer to the natural question -- how does the Fibonacci win score for a player compare to his actual number of wins? Say a pitcher's winning percentage is k, and he won W games in his career. Then he lost [(1-k)/k]W games, and his number of win points is kW + W - [(1-k)/k]W. For this to be equal to W, we divide through by W and get
k + 1 - (1-k)/k = 1
and this simplifies to k² + k - 1 = 0, which has one root with k between 0 and 1, namely k = (√5 - 1)/2, or about .618; this is the limit of the ratio between consecutive Fibonacci numbers, hence the name. A pitcher with a better winning percentage than this will have a higher win score than his actual number of wins; a pitcher with a worse record than this will have a lower win score than his actual number of wins. (The highest win score in history is 511, by Cy Young, who won 511 games and lost 316 in his career; indeed, Young's win percentage was .618.) The purpose of this statistic is to reward pitchers who pitched well and penalize pitchers who were just mediocre over very long careers.
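To make the break-even point concrete, here is a minimal Python sketch of the formula as described in the thread. The function name fibonacci_win_score is my own, and returning 0 for a pitcher with no decisions is an assumption of mine rather than anything James specified.

```python
from math import sqrt


def fibonacci_win_score(wins, losses):
    """Wins times winning percentage, plus marginal wins (wins minus losses)."""
    decisions = wins + losses
    if decisions == 0:
        # Assumption: a pitcher with no decisions gets a score of zero.
        return 0.0
    return wins * (wins / decisions) + (wins - losses)


# Break-even winning percentage: the root of k^2 + k - 1 = 0 in (0, 1).
break_even = (sqrt(5) - 1) / 2

# Cy Young's career record as quoted above: 511 wins, 316 losses.
cy_young = fibonacci_win_score(511, 316)

print(round(break_even, 3))  # 0.618
print(round(cy_young, 1))    # 510.7
```

Young's score comes out a hair under 511 because his winning percentage (511/827) is just a shade below the break-even value.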
The other question that comes to mind is -- if a pitcher wins a game, or loses a game, what does this do to his number of win points? Let f(W,L) denote the win score of a pitcher with W wins and L losses; then we have
f(W,L) = W (W/(W+L)) + W - L = (2W² - L²)/(W+L)
Incidentally, in this formula the numerator (and hence the win score) is negative for a pitcher whose winning percentage is less than 1/(1+√2), or about .414. If we differentiate this with respect to W and simplify, we see that
f_W(W,L) = 2 - [L/(W+L)]²
and thus an additional win gets a pitcher two win points, minus the square of his losing percentage. Similarly, we have
f_L(W,L) = -1 - [W/(W+L)]²
and so a loss costs a pitcher one win point, plus the square of his winning percentage. It almost seems meaningless to say that, because there's no a priori reason why this particular arrangement of variables should mean anything -- though at least it's dimensionally consistent, and has units of wins; there are a lot of random-looking combinations of statistics that don't even do that!
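As a sanity check on the two derivatives, here is a short Python sketch that compares them with finite differences of the closed form. The function names, the step size h, and the choice of Cy Young's record as the test point are all mine, for illustration only.

```python
def win_score(w, l):
    """Closed form of the win score: (2W^2 - L^2) / (W + L)."""
    return (2 * w ** 2 - l ** 2) / (w + l)


def d_win(w, l):
    """Partial derivative in W: two, minus the square of the losing percentage."""
    return 2 - (l / (w + l)) ** 2


def d_loss(w, l):
    """Partial derivative in L: minus one, minus the square of the winning percentage."""
    return -1 - (w / (w + l)) ** 2


w, l = 511, 316  # Cy Young's career record, used here only as a test point
h = 1e-4         # step size for the central differences (an arbitrary small value)

numeric_dw = (win_score(w + h, l) - win_score(w - h, l)) / (2 * h)
numeric_dl = (win_score(w, l + h) - win_score(w, l - h)) / (2 * h)

# What one more whole win or loss actually does to the score.
one_more_win = win_score(w + 1, l) - win_score(w, l)
one_more_loss = win_score(w, l + 1) - win_score(w, l)

print(d_win(w, l), numeric_dw, one_more_win)
print(d_loss(w, l), numeric_dl, one_more_loss)
```

Over a long career the derivative and the effect of one whole extra game agree closely, since a single decision barely moves the winning percentage.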
1 comment:
You are pretty much correct about James's mathematical sophistication. What he did have was an immense amount of common sense, and even though he didn't have the mathematical tools that sabermetricians 20 years later take for granted, it was always a pleasure to watch him dissect nonsensical arguments and cut to the heart of the issues he analyzed, largely because his arguments were so easy to follow and never required much statistical knowledge at all. Every once in a while you could see him struggle because he didn't quite have the tools to do what he wanted, but it was amazing what he did with effectively 8th grade math.