
08 December 2009

Distribution of Putnam scores

The distributions of Putnam exam scores are interesting. See, for example, the 2001 distribution. It takes a bit of number-crunching to get an actual distribution of scores from the data; they report the "rank" of the people getting each score. The rank corresponding to a given score is, I assume, A+(B+1)/2 where A is the number of people scoring higher than that score and B is the number of people scoring that particular score. For example, in 2001 -- which happens to be one of the years in which I took the Putnam -- the table begins

Score:     101   100    86    80    79    77    73    72    71    70    69    68
Rank:        1     2     3   4.5     6   7.5     9    11    14  16.5    19  23.5
Number:      1     1     1     2     1     2     1     3     3     2     3     6

where the first two rows are provided by the organizers, and the third row can be worked out from left to right. For example, once we know that 17 people got 70 or better, the fact that the score 69 corresponds to rank 19 means that the people scoring 69 must have been the 18th-, 19th-, and 20th-best; so there were three of them. (Incidentally, most increasing sequences of half-integers, when interpreted as sequences of ranks, don't appear to correspond to legitimate score distributions; the number of people getting certain scores ends up negative if you're not careful.)
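
Inverting the rank formula makes this bookkeeping mechanical: if A is the running count of people with strictly higher scores, then rank = A + (B+1)/2 gives B = 2(rank - A) - 1. Here is a minimal Python sketch of that computation (my own, not part of the original report), using the 2001 figures quoted above; it reproduces the third row of the table.

```python
# Recover the number of people at each score from the reported ranks,
# using rank = A + (B + 1) / 2, i.e. B = 2 * (rank - A) - 1,
# where A is the number of people with strictly higher scores.

scores = [101, 100, 86, 80, 79, 77, 73, 72, 71, 70, 69, 68]
ranks  = [1, 2, 3, 4.5, 6, 7.5, 9, 11, 14, 16.5, 19, 23.5]

higher = 0  # running count of people scoring strictly higher
for score, rank in zip(scores, ranks):
    count = int(2 * (rank - higher) - 1)  # assumes the reported ranks are consistent
    print(score, count)
    higher += count
```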

Anyway, if you crunch the numbers on a typical Putnam score distribution, you observe two things:
- the scores follow, roughly, a power law: the number of people scoring 10n decays like some power of n, for integer n.
- once you remove this decay (which I haven't actually done; I've just eyeballed it), there are "spikes" at multiples of 10. For example, the numbers of people scoring 18, 19, 20, 21, 22, and 23 in 2001 were 8, 23, 99, 60, 39, and 11. Twenty-four people scored 50; seven scored each of 49 and 51.

I can't explain the first one (and it may just be an artifact of the way I'm doing the plotting; lots of things look close to linear when plotted on a logarithmic scale). But the second one is easy to explain: Putnam problems are worth ten points each, and most scores on an individual problem are 0 or 10, with a smattering of 1, 2, 8, or 9; scores between 3 and 7 on a problem are exceedingly rare. So to get a score of, say, 55, one has to solve five problems outright and make a bit of progress on three to five more, which is less likely than outright solving five or six problems (for 50 or 60, respectively).
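
As a quick sanity check on this explanation (not in the original post), here is a small simulation that draws twelve per-problem scores from a made-up distribution concentrated on 0 and 10, with a little mass on 1, 2, 8, and 9, and tallies the totals; the counts pile up at multiples of 10, just as in the real data.

```python
import random
from collections import Counter

# Hypothetical per-problem score distribution: mostly 0 or 10, a smattering
# of 1, 2, 8, 9, and nothing in between.  The exact weights are made up.
values  = [0, 1, 2, 8, 9, 10]
weights = [0.55, 0.05, 0.03, 0.03, 0.04, 0.30]

def total_score(num_problems=12):
    # A Putnam paper has twelve problems; sum twelve independent draws.
    return sum(random.choices(values, weights, k=num_problems))

totals = Counter(total_score() for _ in range(100_000))
for t in range(15, 26):
    print(t, totals[t])  # a visible spike at 20, as at other multiples of 10
```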

Incidentally, I haven't looked at the problems from the 2009 Putnam, because I have work to do.

07 December 2008

2008 Putnam problems

This year's Putnam exam problems, via 360.

I haven't thought about these, but I might as a break from writing things up over the next few days.

13 December 2007

Problem A3 from the Putnam

Here is problem A3 from the 2007 Putnam exam:
Let k be a positive integer. Suppose that the integers 1, 2, 3, ..., 3k+1 are written down in random order. What is the probability that at no time during this process, the sum of the integers that have been written up to that time is a positive integer divisible by 3?


I'm going to give a solution that probably wouldn't get full credit on the Putnam; I'm leaving out some details, and a lot of credit gets lost in the details.

First, instead of considering these numbers, just consider everything modulo 3, writing the residue 2 as -1; among 1, 2, ..., 3k+1 there are k+1 numbers congruent to 1, k congruent to -1, and k congruent to 0. In particular, we want to know the probability that if we arrange k+1 +1s, k -1s, and k 0s in random order, no partial sum will be divisible by 3. (The number of permutations of {1, ..., 3k+1} which map to a given sequence modulo 3 is the same for any such sequence, so this is legitimate.) The number of ways to arrange k+1 +1s, k -1s, and k 0s is, of course, ${3k+1 \choose k, k, k+1}$.
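
That equidistribution claim can be checked by brute force for small k. The sketch below (mine, not the post's) tallies how often each mod-3 pattern arises among permutations of {1, ..., 3k+1}, and checks that the number of distinct patterns is the multinomial coefficient and that every pattern occurs equally often.

```python
from collections import Counter
from itertools import permutations
from math import factorial

# For small k, check that every mod-3 pattern of a permutation of {1,...,3k+1}
# occurs equally often, and that the number of distinct patterns equals the
# multinomial coefficient (3k+1)! / (k! k! (k+1)!).
for k in (1, 2):
    n = 3 * k + 1
    patterns = Counter(tuple(x % 3 for x in perm)
                       for perm in permutations(range(1, n + 1)))
    multinomial = factorial(n) // (factorial(k) ** 2 * factorial(k + 1))
    print(k, len(patterns) == multinomial, len(set(patterns.values())) == 1)
```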

Now we consider the simpler problem of arranging k+1 +1s and k -1s so that no partial sum is divisible by 3. Clearly the last partial sum (i. e. the full sum) is 1, and the sequence of partial sums moves upwards or downwards by 1 at each step. Such an arrangement must begin with a +1; if it began with a -1 then the first partial sum would be -1, the last partial sum would be 1, so some intermediate partial sum would have to be zero. This +1 must be followed with another +1 to avoid having the second partial sum be 0; then a -1 so that the third partial sum is not 3; and so on. Thus there is exactly one such arrangement, namely

+1, +1, -1, +1, -1, +1, -1, ..., +1, -1

where the partial sums are

1, 2, 1, 2, 1, 2, 1, ..., 2, 1.

The probability of obtaining an arrangement with no partial sums divisible by three if we pick at random from all arrangements of k+1 +1s and k -1s is thus $1/{2k+1 \choose k}$.
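
This is easy to confirm by brute force for small k; the sketch below (my addition, not part of the solution) counts the "good" arrangements of k+1 +1s and k -1s and prints the total ${2k+1 \choose k}$ alongside.

```python
from itertools import combinations
from math import comb

# For small k, verify that exactly one arrangement of (k+1) +1s and k -1s
# has no partial sum divisible by 3, out of C(2k+1, k) arrangements in all.
for k in range(1, 8):
    n = 2 * k + 1
    good = 0
    for minus_positions in combinations(range(n), k):  # where the -1s go
        partial, ok = 0, True
        for i in range(n):
            partial += -1 if i in minus_positions else +1
            if partial % 3 == 0:
                ok = False
                break
        good += ok
    print(k, good, comb(n, k))  # expect: k, 1, C(2k+1, k)
```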

Now, to obtain an arrangement of k+1 +1s, k -1s, and k 0s uniformly at random, we can first place the 0s at random, then the +1s and -1s. There are  ${3k+1 \choose k}$ ways in which the 0s can be placed. Interspersing 0s among some arrangement of +1s and -1s doesn't change the set of partial sums which occur -- unless a 0 precedes all the +1s and -1s. So the probability of placing the 0s such that a "good" arrangement is still possible is ${3k \choose k}/{3k+1 \choose k}$. There is exactly one way to place the +1s and -1s once we've done this that actually gives a "good" arrangement. So the final probability is
$f(k) = {{3k \choose k} \over {3k+1 \choose k}} \cdot {1 \over {2k+1 \choose k}} = {k! \, (k+1)! \over (2k)! \, (3k+1)}$.


For example, f(1) = 1/4 = 6/24; this corresponds to the six "good" arrangements of {1, 2, 3, 4}, namely 1342, 1432, 1423, 4312, 4132, 4123. The function f(k) is asymptotic to $\sqrt{\pi k}/(3 \cdot 4^k)$ from Stirling's approximation. I like to interpret the 4 there as meaning that if we are building up such an arrangement of +1s, -1s, and 0s, we have a probability of roughly $1/4^{1/3}$ of making an acceptable choice at each step; the "1/3" comes from the fact that we actually make 3k choices in building up such an arrangement. This constant is about 0.63, and seems about right; it should be around two-thirds, since at each step except the first, exactly two of {0, +1, -1} are valid ways to extend the sequence.
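
For what it's worth, the closed form can also be checked directly against the original problem by exhaustive enumeration for small k (again my own sketch, not part of the solution); k = 3 already means 10! = 3,628,800 permutations, so I stop at k = 2.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

def f(k):
    # f(k) = k! (k+1)! / ((2k)! (3k+1))
    return Fraction(factorial(k) * factorial(k + 1),
                    factorial(2 * k) * (3 * k + 1))

for k in (1, 2):
    n = 3 * k + 1
    good = 0
    for perm in permutations(range(1, n + 1)):
        s, ok = 0, True
        for x in perm:
            s += x
            if s % 3 == 0:  # partial sums of positive integers are positive
                ok = False
                break
        good += ok
    print(k, Fraction(good, factorial(n)), f(k))  # the two fractions agree
```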

The simplicity of the problem seems to hinge on the fact that the partial sums have to fluctuate within a very narrow band, moving by only ±1 at each step. If we instead required that no partial sum be divisible by, say, 4, so that the partial sums could move around in {1, 2, 3}, this would be trickier. But I think the best way to work that problem would still be to work modulo 3, so that each summand is either +1, -1, or 0. The crucial observation I made early on was that if the partial sum was -1 at some point and +1 later, it had to go through zero; this is a discrete analogue of the Intermediate Value Theorem, and you don't get that sort of "continuity" in any modulus other than 3.

(This is basically the same as the solution given at the UNL Putnam archive, but I came up with it independently.)