God Plays Dice: linear algebra

I'm applying for jobs. Many jobs, because that's how academic job searches work these days. So I have a spreadsheet (in OpenOffice) to keep track of them.

Among the things that I track for each job, there is a column with 0, 1, or 2 in it. 0 means that I haven't submitted anything; 1 means I've submitted something, but not everything that was asked for; 2 means the application is complete. Averaging these numbers and dividing by 2 tells me what proportion of the search is complete.

But I also wanted to know how many 0s, 1s, and 2s there were. And as far as I know the built-in functions in OpenOffice won't do that.

What they will do, however, is this. I have a column consisting of 0s, 1s, 2s, and empty cells. By doing

COUNT(J8:J1000)
SUM(J8:J1000)
SUMPRODUCT(J8:J1000;J8:J1000)

I get the number of cells in that column which are nonempty; their sum; and the sum of their squares. (The SUMPRODUCT function takes 2 arrays of the same shape and returns the sum of the products of corresponding cells.) "8" is the row that contains the first job on the list, and "1000" is just a number that is comfortably more than the number of jobs I am applying for. Call these a, b, and c respectively. Let n₀, n₁, and n₂ be the number of entries which are 0, 1, and 2 respectively. Then I have

a = n₀ + n₁ + n₂
b = n₁ + 2n₂
c = n₁ + 4n₂

which is a three-by-three linear system, and can be solved for n₀, n₁, n₂, giving

n₀ = a - 3b/2 + c/2, n₁ = 2b-c, n₂ = (c-b)/2

and so I can recover the number of applications with status 0, 1, or 2 from this. From the sums of the 0th, 1st, and 2nd powers I can recover the distribution of the values themselves. (The actual code is slightly different, but of course equivalent, because I solved the system "by inspection" and never actually explicitly wrote it out until just now.)

Believe it or not, I actually use this trick in a preprint, "The number of cycles of specified normalized length in permutations", to do some actual mathematics! There I find the expectation of X⁰, X¹, X², ..., X^k where X is a certain random variable known to take on the values 0, 1, ..., k, namely the number of cycles of length in the interval [γ n, δ n] in a permutation of [n] chosen uniformly at random where γ and δ are constants. k is the greatest integer less than or equal to 1/γ ; for example, if we're looking at cycles of length at least 0.15n in permutations of n, there can't be more than six of them. This gives a linear system like the one above which gives the probability that X takes on each value 0, 1, ..., k.

God Plays Dice

31 December 2009

A hack I'm disturbingly proud of, and its connection to some real math

Blog Archive

Contributors

Other blogs