30 June 2008

A tail bound for the normal distribution

Often one wants to know the probability that a random variable with the standard normal distribution takes a value above x, for some positive constant x.

(Okay, I'll be honest -- by "one" I mean "me", and the main reason I'm writing this post is to fix this idea in my head so I don't have to go looking for my copy of Durrett's text Probability: Theory and Examples every time I want this result. Durrett gives a much shorter proof -- two lines -- on page 6 of that book, but it involves an unmotivated-seeming change of variables, which is why I have trouble remembering it.)

The probability density function of the standard normal is ${1 \over \sqrt{2\pi}} \exp( -x^2/2)$, and so the probability in question is
$$f(x) = \int_x^\infty {1 \over \sqrt{2\pi}} \exp (-t^2/2) \, dt.$$
It's a standard fact, but one that I can never remember, that this is bounded above by ${1 \over \sqrt{2\pi} x} \exp(-x^2/2)$ (and furthermore bounded below by $1 - 1/x^2$ times the upper bound, so the upper bound's not a bad estimate).
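As a quick numerical sanity check of both bounds, here is a minimal Python sketch. The function names are my own, and it assumes the exact tail can be computed as half the complementary error function evaluated at $x/\sqrt{2}$ (via the standard library's `math.erfc`).

```python
import math

def normal_tail(x):
    """P(Z > x) for a standard normal Z, via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2))

def upper_bound(x):
    """The bound exp(-x^2/2) / (sqrt(2*pi) * x) from the post."""
    return math.exp(-x * x / 2) / (math.sqrt(2 * math.pi) * x)

def lower_bound(x):
    """(1 - 1/x^2) times the upper bound."""
    return (1 - 1 / (x * x)) * upper_bound(x)

# Print lower bound, exact tail, and upper bound side by side.
for x in (1.0, 2.0, 3.0, 5.0):
    print(x, lower_bound(x), normal_tail(x), upper_bound(x))
```

The three columns should get relatively closer as x grows, which is the sense in which the upper bound is "not a bad estimate".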

How to prove this? Well, here's an idea -- approximate the tail of the standard normal distribution's density function by an exponential. Which exponential? The exponential of the linearization of the exponent at t = x. The exponent has negative second derivative, so it lies below its tangent line: the new exponent is larger (less negative) than the old one, and this is an overestimate. That is,
$$f(x) < \int_x^\infty {1 \over \sqrt{2\pi}} \exp(-x^2/2-x(t-x)) \, dt$$
where the new exponent is the linearization of $-t^2/2$ at $t = x$.
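To see the inequality between the two exponents directly, without appealing to the second derivative, one can write t as x + (t - x) and expand the square:

$$-{t^2 \over 2} = -{x^2 \over 2} - x(t-x) - {(t-x)^2 \over 2} \le -{x^2 \over 2} - x(t-x),$$

with equality only at $t = x$. So the linearized exponent is strictly larger for every $t > x$.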

Then pull out factors which don't depend on t to get
$${\exp(x^2/2) \over \sqrt{2\pi}}\int_x^\infty \exp(-xt) \, dt$$
and doing that last integral gives the desired bound.
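Spelled out, the last step is an elementary exponential integral (valid because x is positive):

$$\int_x^\infty \exp(-xt) \, dt = {\exp(-x^2) \over x}, \qquad \hbox{so} \qquad f(x) < {\exp(x^2/2) \over \sqrt{2\pi}} \cdot {\exp(-x^2) \over x} = {1 \over \sqrt{2\pi} x} \exp(-x^2/2).$$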

Basically, the idea is that since the density to the right of x is dropping off as the exponential of a quadratic, most of the tail's mass is concentrated very close to x, so we might as well approximate the density by the exponential of a linear function, which is easier to work with.

By similar means one can show that the expectation of a real number selected from the standard normal distribution, given that it's greater than x, is something like x + 1/x. The tail to the right of x looks like an exponential random variable with mean 1/x. For example, the expectation of a real number selected from the standard normal distribution, conditioned on being larger than 10, is 10.09809.... But this is probably useless, because the probability of a real number selected from the standard normal distribution being larger than 10 is, by the previous bound, smaller than 1 in $10 (2\pi)^{1/2} e^{50}$, or about one in $1.3 \times 10^{23}$.
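These numbers can be checked with a short Python sketch. It relies on the identity that the conditional mean equals the density at x divided by the tail probability (because the integral of $t \exp(-t^2/2)$ from x to infinity is $\exp(-x^2/2)$). The function names are my own, and the tail is again assumed to be computed through `math.erfc`.

```python
import math

def phi(x):
    """Standard normal density."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def normal_tail(x):
    """P(Z > x) for a standard normal Z."""
    return 0.5 * math.erfc(x / math.sqrt(2))

def conditional_mean(x):
    """E[Z | Z > x] = phi(x) / P(Z > x)."""
    return phi(x) / normal_tail(x)

def inverse_upper_bound(x):
    """Reciprocal of the upper bound: x * sqrt(2*pi) * exp(x^2/2)."""
    return x * math.sqrt(2 * math.pi) * math.exp(x * x / 2)

x = 10.0
print(conditional_mean(x))      # exact conditional mean, to floating-point accuracy
print(x + 1 / x)                # the rough estimate x + 1/x
print(inverse_upper_bound(x))   # the "one in ..." figure from the bound
```

At x = 10 the first two printed values should agree to about two decimal places, and the third should match the one-in-$1.3 \times 10^{23}$ figure.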


Efrique said...

Have you seen the related literature on approximating Mills' ratio for the normal?

I think there's a section in Kendall & Stuart (at least in the 3rd ed. or thereabouts).

Anonymous said...

Can you please point to a book or another reference where I can learn more about the tricks you used to get the upper bound, especially the linearization trick?

Anonymous said...

Cool article you got here. I'd like to read more concerning this topic. Thanks for giving this information.