08 October 2007

Choosing partitions at random

There is such a thing as a partition of a positive integer; it's a way to write that positive integer as a sum of positive integers, where order is irrelevant. So, for example, the partitions of 7 are

7, 6+1, 5+2, 5+1+1, 4+3, 4+2+1, 4+1+1+1, 3+3+1, 3+2+2, 3+2+1+1, 3+1+1+1+1, 2+2+2+1, 2+2+1+1+1, 2+1+1+1+1+1, 1+1+1+1+1+1+1

and there are fifteen of them. We can define a function p, where p(n) is the number of partitions of n; thus p(7) = 15.

So how might one pick a partition of n "at random"? There are at least three ways I know of. The first is to pick one uniformly, so each one is chosen with probability 1/15.

The second is as follows: begin with the partition 1. Then generate a partition of 2 by augmenting some part of the partition by 1; we can either augment the first part (giving 2) or the second part (there is no second part, so this corresponds to adding a new part 1, giving 1+1). Then do the same thing to obtain a partition of 3. If you had 2 at the second step, then you get 3 or 2+1 at the third step with equal probability; if you had 1+1 at the second step, then you get 2+1 or 1+1+1 with equal probability. But you can't have 1+2 -- we make the rule that partitions have to be written in decreasing order. So at the third step, you have 3 with probability 1/4, 2+1 with probability 1/2, and 1+1+1 with probability 1/4. At the fourth step, 3 becomes 4 or 3+1 with equal probability; 2+1 becomes 3+1, 2+2, or 2+1+1 with equal probability, and 1+1+1 becomes 2+1+1 or 1+1+1+1 with equal probability. Thus you have the partitions 4 or 1+1+1+1 with probability 1/8 each; the partitions 3+1 or 2+1+1 with probability 7/24 each; the partition 2+2 with probability 1/4.

As you can see, this process doesn't generate partitions uniformly at random. But in a way it's natural, because at each step of the process one is doing something uniform. There's a natural way to draw a picture of a partition as a Ferrers diagram, where a partition of n will have a diagram with n dots; this corresponds to putting down n dots "at random" subject to the constraint that at each step the diagram you obtain is a valid Ferrers diagram. It turns out that the Ferrers diagram obtained in this way tend, up to scaling, to a certain shape as they grow large. As do partitions chosen uniformly at random -- but it's a different shape.

And there's a third way of choosing partitions at random. This is called the "Plancherel measure". We pick a partition λ of n with probability (dim λ2)/n!, where dim λ is the "dimension" of a partition. The dimension of a partition is just the number of ways in which it can be obtained as the nth term of a sequence of partitions, starting with the empty partition and at each step adding one to some part. For example, the partition 5 = 3+2 is the last element of

0, 1, 2, 3, 3+1, 3+2
0, 1, 2, 2+1, (3+1 or 2+2), 3+2
0, 1, 1+1, 2+1, (3+1 or 2+2), 3+2

so there are five such sequences, and dim (3+2) = 5. Thus in the Plancherel measure we choose the partition 3+2 with probability 52 / 5! = 25/120. The dimension of a partition is also the number of standard Young tableaux corresponding to it.

This just seems so artificial, though -- if I wanted to actually pick a partition according to this distribution, how would I do it?

There are n! permutations of n objects. Picking one of these uniformly at random seems like a natural thing to do. (For example, it has a physical instantiation in the shuffling of a deck of cards.) It turns out that there is a correspondence called the Robinson-Schensted correspondence. (Sometimes Knuth's name is in here as well.) This correspondence associates with a permutation of [n] a pair of standard Young tableaux, each containing n squares, and of the same shape. The shape of those Young tableaux gives a partition, then; if we started by picking a permutation uniformly then we've now obtained a partition chosen with the distribution induced by the Plancherel measure.

Suddenly this distribution seems natural. I'm much more comfortable with the idea of picking something from a random distribution if I can imagine how I would actually go and choose it if I had to.

2 comments:

Anonymous said...

So, what do any of these methods (if any) have to do with the character theory of the symmetric group? In particular, the orthogonality relations for characters (which correspond to partitions) seem tailor-made for giving rise to probability densities.

Michael Lugo said...

John,

that's a good question, but I don't know the answer. I don't know much about the character theory of the symmetric group. Perhaps I will do something about that.