Think about probability distributions supported on the positive integers. Not all of them have a "square root" -- that is, given a random variable
X supported on the positive integer, there do not exist independent, identically distributed variables
Y1,
Y2 such that X has the same distribution as Y
1 + Y
2.
You might be wondering why it's natural to refer to this as a "square root". Well, let
pk be the probability of the event
X =
k. Then the probability generating function for X is
.
Similarly, let
qj be the probability of the event
Y =
j, and define the probability generating function
.
Let
Y,
Y1,
Y2 be equidistributed. Then
X is equidistributed with
Y1 + Y
2 if and only if f(z) = g(z)
2, since addition of independent random variables corresponds to
convolution of distributions and multiplication of their generating functions.
Conversely, the random variable
X has a square root in this sense if and only if its generating function
f(
z) has a square root which has a Taylor series at
z = 0 with all coefficients positive.
The simplest case is when
X has finite support. Then
X has a square root if and only if its generating function
f(
z) is the square of a polynomial. For example, the random variable which takes values 0, 1, 2 with probabilities 1/4, 1/2, 1/4 has a square root; its generating function is
.
But the random variable taking values 0, 1, 2 with probability 1/3 each is not, since 1/3 + z/3 + z
2/3 is not a square.
But what about distributions with infinite support? Some of these are easy -- the square root of the
Poisson distribution with mean λ is Poisson with mean λ/2. This can easily be seen since the probability generating function of a Poisson(λ) random variable is
.
(In fact, the Poisson distribution is
infinitely divisible; in the nomenclature used in this post one might say it has roots of all orders.)
Now consider a random variable
X, with
geometric distribution with parameter 1/2 supported on {0, 1, 2, 3, ...}; this has P(
X =
k) = 2
-(k+1). This is a special case of the
negative binomial distribution, which is infinitely divisible. We have f(z) = 1/2 + z/2
2 + z
2/2
3 + ... = 1/(2-z). So the square root distribution has generating function
and in general the coefficient of z
n is, by the binomial theorem,
That binomial coefficient is ${1 \over 2} {2n \choose n}$; we have ${2n \choose n} \sim 4^n/\sqrt{\pi n}$, so the coefficient of z
n in our probability generating function, which we'll call q
k, is asymptotic to
In particular, the "square root" distribution decays just a bit faster than the distribution that it's the square root of, the main difference being the additional factor of n
-1/2. This is reasonable, if you think about the process of convolution. We have
p
n = q
0 q
n + q
1 q
n-1 + q
2 q
n-2 + ... + q
n-1 q
1 + q
n q
0and each of the
n terms is roughly 1/(n2
n). This is just another negative binomial distribution. (I actually didn't realize that until I started writing this post; the post was originally titled "what's the name of this distribution?" and then I did some research.)