God Plays Dice: algebraic geometry

Charlie Siegel, whose office is just down the hall from mine, had for a while the following statement on the blackboard in his office (paraphrased and renotated a bit): Let g(n) be the number of lines on the generic hypersurface of degree 2n-1 in complex projective (n+1)-space. Then

$g(n) = [z^n] (1-z) \prod_{j=0}^{2n-1} (2n-1-j+jz)$

where the notation

means "the coefficient of zⁿ in f(z)". For example, g(2) is the coefficient of z² in (1-z)(3)(2+z)(1+2z)(3z), which turns out to be 27; this means that the number of lines on the generic hypersurface of degree 3 in complex projective 3-space is 27. Informally, "there are 27 lines on the cubic". Want to know why? For the cubic case, see this post by Charlie or go to his talk last week.

The first few values of g(n) are 1, 27, 2875, 698005, 305093061, 210480374951, 210776836330775, 289139638632755625... -- so this sequence grows pretty quickly. How quickly? I couldn't tell at first, but it kept nagging at me every time I was in Charlie's office. The sequence is A027363 in the Encyclopedia of Integer Sequences. It turns out that it's in an appendix of the paper that the Encyclopedia links to (arXiv:math/0610286, specifically its appendix by Zagier) but the derivation there is long and involved and it was more fun with the formula myself. There are a few gaps, but here it is.

We can deal easily with the 1-z factor out front, and we want

$g(n) = [z^n] f_n(z) - [z^{n-1}] f_n(z)$

where

$f_n(z) = \prod_{j=0}^{2n-1} (2n-1 - j + jz)$

We can already see it's going to be kind of tricky; the coefficients of f_n(z) are probably going to be pretty big and not that far apart.

Now, we can pull out a 2n-1 from each factor in the expression for f_n(z), and we get

$f_n(z) = (2n-1)^{2n} \prod_{j=0}^{2n-1} {(2n-1 - j + jz) \over 2n-1 }$

Call the product p_n(z). Now this is where, out of nowhere, I start having Fun With Probability. (It's kind of amusing because there is nothing probabilistic about the original question.) The term corresponding to j in that product is the probability generating function of a random variable which is 1 with probability (j/(2n-1)) and 0 otherwise. The product is thus the p.g.f. of the sum of such random variables for j = 0, 1, ..., 2n-1.

Now, this sum has mean which is the sum of the means of those random variables, that is,

$\sum_{j=0}^{2n-1} {j \over 2n-1} = n$

and variance which is the sum of their variances,

$\sum_{j=0}^{n-1} {j \over 2n-1} \left( 1 - {j \over 2n-1} \right) = {2n(n-1) \over 3(2n-1)}$

which is approximately n²/3. Furthermore, since the sum is a sum of a bunch of small contributions, it's approximately normal. So p_n(z) is the probability generating function of a random variable which is roughly normal with mean n and variance n²/3, but integer-valued, and its kth coefficient is the probability that such a random variable is equal to k.

Therefore [z^k] p_n(z) is approximately

$q_n(k) = {3 \over \sqrt{2\pi n}} \exp \left( {-(k-n)^2 \over 2n/3} \right)$

which is the density of such a normal random variable. We want to know [zⁿ] p_n(z) - [zⁿ] p_n-1}(z), and this is approximately q_n(n) - q_n(n-1). You'd think this would be q_n'(n), but q_n(n) is actually zero -- the fact that the coefficients were "close to each other" is even more troublesome than you'd expect. Still, we can make a Taylor series for q_n at n, and the linear term is zero, so we get

$q_n(n) - q_n(n-1) \sim {q_n^{\prime\prime}(n) \over 2} = \sqrt{27 \over 8\pi n^3}$

And g(n) = [zⁿ] f_n(z) - [zⁿ] f_n-1(z) = (2n-1)²ⁿ ([zⁿ] p_n(z) - [zⁿ] p_n-1(z)); using the approximation we get

$g(n) \sim \sqrt{27 \over 8\pi n^3} (2n-1)^{2n}$

and now note that

$(2n-1)^{2n} = (2n)^{2n} \left(1-{1 \over 2n}\right)^{2n} \sim {(2n)^{2n} \over e}.$

Therefore

$g(n) \sim \sqrt{27 \over 8\pi e^2 n^3} (2n)^{2n}$

which matches (with a bit of computation) the result given by Zagier in the arXiv, and the terms reported in the OEIS. I wouldn't have trusted it otherwise; that "take the second derivative" step is particularly sketchy, although it can probably be justified as there are results on the rate of convergence of the central limit theorem. But asymptotic analysis is nice in this way; if it's easy to compute the terms of some sequence, then we can often be confident in results like this even without

Besides, I'd never seen the trick of using the second derivative to approximate a difference before. At least not that I can remember.

From The Princeton Companion to Mathematics, specifically the article "Algebraic Geometry", by János Kollár:

Finally, if we marry a scheme to an orbifold, the outcome is a stack. The study of stacks is strongly recommended to people who would have been flagellants in earlier times.

I feel this way about most of algebraic geometry, but that's only because Penn has a high enough concentration of algebraic geometers that I get tired of hearing people walking around and talking about it all the time.

Also, I find myself screaming at the book quite often. But I do it in a good way; it's often some crucial insight that makes me think "why didn't anybody tell me that before?" (Of course, it's possible they did and I wasn't listening.) And sometimes it's "oh, damn, I thought I came up with that myself". For example, I just read the article on computational number theory, by Carl Pomerance, in which he explains a heuristic reason why Fermat's last theorem is true. I won't give it in full here, but it's basically the following. First, Euler showed it for n = 3. Second, consider all the positive integers which are nth powers for some n ≥ 4. The "probability" that a number m is in this set is about m^-3/4. So replace the set of fourth-or-higher powers with a random set S, which contains m with probability m^-3/4. Then the probability that a givennumber n can be written as a sum of two such elements of this set S is proportional to n^-1/2; independently, it has probability n^-3/4 of being in S. So the probability that n can be written as a sum of two elements of S and is also in S itself is proportional to n^-5/4, and so we expect finitely many examples. This isn't quite true, because the set of fourth-or-higher powers has some nontrivial structure. But it also took a couple hundred words, instead of a couple hundred pages like the real proof.

But I figured out something like this when I was in college, and I was so proud of myself! So it saddens me to learn I'm not the only one who thought of it. (It also makes me happy, though, because the idea is due to Erdos and Ulam, and there are worse people to be imitating.)

God Plays Dice

26 November 2008

The asymptotics of "27 lines on a cubic"

23 November 2008

A couple amusements from the Princeton Companion

Blog Archive

Contributors

Other blogs