08 February 2012

Thousandth, and last, post.

This is the last post here at God Plays Dice. It happens to be the thousandth, but I didn't plan that.

I'm moving to Wordpress, and to gottwurfelt.wordpress.com. (The obvious subdomain was taken, by somebody that I don't want to send traffic to.)

I'm also hoping to update more frequently there. Wordpress seems to have better support for adding images to posts, definitely has better support for mathematical notation, has built-in analytics, and, let's face it, has a better image. So it's (perhaps past) time to move.

So update your bookmarks, your feed readers, or whatever you kids are using to follow blogs these days. I'll see you there.

15 December 2011

Solution to distance between random points from a sphere

So I asked on Sunday the following question: pick two points on a unit sphere uniformly at random. What is the expected distance between them?

Without loss of generality we can fix one of the points to be (1, 0, 0). The other will be chosen uniformly at random and will be (X, Y, Z). The distance between the two points is therefore

$$\sqrt{(1-X)^2 + Y^2 + Z^2}$$

which does not look all that pleasant. But the point is on the sphere! So $X^2 + Y^2 + Z^2 = 1$, and this can be rewritten as

$$\sqrt{(1-X)^2 + 1 - X^2}$$

or after some simplification

$$\sqrt{2 - 2X}.$$

But by a theorem of Archimedes (Wolfram Alpha calls it Archimedes' Hat-Box Theorem but I don't know if this name is standard), X is uniformly distributed on (-1, 1). Let U = 2-2X; U is uniformly distributed on (0, 4). The expectation of √(U) is therefore

$$\int_0^4 \frac{1}{4} u^{1/2} \, du$$

and integrating gives $4^{3/2}/6 = 8/6 = 4/3$.

(The commenter "inverno" got this.)

Of course it's not hard to simulate this in, say, R, if you know that the distribution of three independent standard normals is spherically symmetric, and so one way to simulate a random point on a sphere is to take a vector of three standard normals and normalize it to have unit length. This code does that:

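xx1=rnorm(10^6,0,1); yy1=rnorm(10^6,0,1); zz1=rnorm(10^6,0,1)
xx2=rnorm(10^6,0,1); yy2=rnorm(10^6,0,1); zz2=rnorm(10^6,0,1)
r1=sqrt(xx1^2+yy1^2+zz1^2); r2=sqrt(xx2^2+yy2^2+zz2^2)  # lengths, used to normalize to unit vectors
d=sqrt((xx1/r1-xx2/r2)^2+(yy1/r1-yy2/r2)^2+(zz1/r1-zz2/r2)^2)  # distances between the two points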

and then mean(d), where d contains the distances, comes out to 1.333659; the histogram of the distances d is a right triangle. (The code doesn't make the assumption that one point is (1, 0, 0); that's a massive simplification if you want to do the problem analytically, but not nearly as important in simulation.)
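The same experiment can be sketched in Python using only the standard library. This is a rough illustration, not the code used for the number above: the function names `random_unit_vector` and `simulate`, the sample size, and the seed are all assumptions made for the example. Besides the mean distance, it checks two consequences of the argument: the first coordinate should be uniform on (-1, 1), so about 3/4 of the points have X ≤ 0.5, and the distance D = √U has P(D ≤ d) = d²/4, i.e. density d/2 on (0, 2), which is the right triangle in the histogram.

```python
import math
import random

def random_unit_vector(rng):
    """Normalize three independent standard normals to get a uniform point on the unit sphere."""
    while True:
        x, y, z = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        r = math.sqrt(x * x + y * y + z * z)
        if r > 0:  # guard against the (probability-zero) all-zero draw
            return (x / r, y / r, z / r)

def simulate(n=200_000, seed=0):
    """Return (mean distance, fraction of distances <= 1, fraction of first coordinates <= 0.5)."""
    rng = random.Random(seed)
    total = 0.0
    short = 0
    left = 0
    for _ in range(n):
        p = random_unit_vector(rng)
        q = random_unit_vector(rng)
        d = math.dist(p, q)   # straight-line (chord) distance
        total += d
        short += d <= 1
        left += p[0] <= 0.5
    return total / n, short / n, left / n

mean_d, frac_short, frac_left = simulate()
print(mean_d)      # should be close to 4/3
print(frac_short)  # should be close to 1/4, since P(D <= 1) = 1/4
print(frac_left)   # should be close to 3/4 if the first coordinate is uniform on (-1, 1)
```

With a couple of hundred thousand pairs the three estimates should agree with 4/3, 1/4 and 3/4 to about two decimal places.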

11 December 2011

A geometric probability problem

Here's a cute problem (from Robert M. Young, Excursions in Calculus, p. 244): "What is the average straight line distance between two points on a sphere of radius 1?"

(Answer to follow.)

If any of my students are reading this: no, this should not be interpreted as a hint to what will be on the final exam.

16 November 2011

In which I declare four things which my probability class is not about

In class today, I said approximately this:

So people decide whether to have children by flipping a coin, and if it comes up tails they have a kid, and if it comes up heads they don't. They repeat this until it comes up heads. This is probably not a good model of how people decide whether or not to have children, but maybe it's good in the aggregate. And anyway this isn't a class about how people decide whether to have kids.

Then there are two kinds of children, girls and boys -- well, not always, but this isn't a class about that -- and each child is equally likely to be a boy or a girl -- well, wait, that's not exactly true, but it's not a horrible assumption about how reproduction works on a cellular level, but this isn't a class about that either.

And people's decisions to stop having kids are independent of the sex of the children they've had -- which says this isn't China, because people do interesting things under the one-child policy -- but this isn't a class about that.

(Then I actually did some math -- namely, assume that the number of children a random family has is geometrically distributed with some parameter p, and assume that all children are equally likely to be male or female and that their genders are independent of the gender of any other children or the number of children in the family. Pick a random family with no boys. What is the distribution of the number of children they have?)
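One way to get a feel for the question is to simulate it. The sketch below is only an illustration under a stated assumption: the coin comes up heads (stop) with probability p, so the number of children N satisfies P(N = n) = (1-p)^n p for n = 0, 1, 2, ...; other conventions for "geometric with parameter p" shift the answer. The function names and the choice p = 0.5 are hypothetical. Under this convention, a family with n children and no boys has probability p ((1-p)/2)^n, which suggests that the conditional distribution is again geometric, with ratio r = (1-p)/2; the code prints the simulated frequencies next to (1-r) r^n for comparison.

```python
import random
from collections import Counter

def simulate_family(p, rng):
    """Return (number of children, number of boys) for one family.

    Assumption: each coin flip stops the family with probability p,
    so P(N = n) = (1 - p)**n * p for n = 0, 1, 2, ...
    """
    children = boys = 0
    while rng.random() >= p:         # "tails": have another child
        children += 1
        boys += rng.random() < 0.5   # each child is a boy with probability 1/2
    return children, boys

def no_boy_family_sizes(p, n_families=200_000, seed=0):
    """Empirical distribution of family size among simulated families with no boys."""
    rng = random.Random(seed)
    sizes = Counter()
    for _ in range(n_families):
        children, boys = simulate_family(p, rng)
        if boys == 0:
            sizes[children] += 1
    total = sum(sizes.values())
    return {n: count / total for n, count in sorted(sizes.items())}

p = 0.5
empirical = no_boy_family_sizes(p)
r = (1 - p) / 2  # each extra child must both be born (1 - p) and be a girl (1/2)
for n in range(4):
    # family size, simulated frequency, conjectured value (1 - r) * r**n
    print(n, round(empirical.get(n, 0.0), 4), round((1 - r) * r**n, 4))
```

Note that a family with no children counts as a family with no boys here, which is why n = 0 carries most of the mass.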

11 November 2011


You may have heard that it's 11/11/11. (Or, if you live in the UK, 11/11/11.) When I was growing up, I'd get confused and think that World War I ended on this day, one hundred years ago. You know, at the eleventh hour of the eleventh day of the eleventh month of the eleventh year.

The New York Times says that marketers are viewing this as a singular event -- but they went on about this four years, four months, and four days ago.

The Corduroy Appreciation Club says it is Corduroy Appreciation Day.

A bit more mathematically, you can watch a video about the number eleven by James Grime, which appears to be the first of a series of Numberphile videos.

Edited, November 12, 12:29 pm: from the New York Times, a hundred years ago: "To-day it is possible to write the date with the repetition six times of a single digit." The article also points out that a digit will probably never occur again seven times in the date -- we'd have to make it to November 11, 10011 for that to happen.

09 November 2011

Small sample sizes lead to high margins of error, unemployment version

The ten college majors with the lowest unemployment rates, from yahoo.com. I've heard about this from a friend who majored in astronomy and a friend who majored in geology; both of these are on the list, with an unemployment rate of zero.

The unemployment rates of the ten majors they list are 0, 0, 0, 0, 0, 0, 1.3, 1.4, 1.6, and 2.2 percent.

I would bet that the six zeroes are just the majors for which there were no unemployed people in the sample. The data apparently comes from the Georgetown Center on Education and the Workforce; there's a summary table at the Wall Street Journal, and indeed the majors which have zero unemployment are among the least popular. Just eyeballing the data, some of the majors with the highest unemployment are also among the least popular. The red flag here would be, say, an unemployment rate of 16.7% (one out of six) or 20.0% (one out of five) for some major near the bottom of the popularity table, but I don't see it; I guess their sample is big enough that no major is that small, or maybe they actually made some adjustments for this issue.
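To see how easily a small sample produces a zero, here is a minimal sketch. It assumes each sampled graduate is unemployed independently with the same probability; the 5% rate and the sample sizes are made-up inputs, not figures from the Georgetown data, and `prob_no_unemployed` is a name invented for the example.

```python
def prob_no_unemployed(true_rate, sample_size):
    """Probability that a sample of `sample_size` graduates contains no unemployed people,
    assuming each is independently unemployed with probability `true_rate`."""
    return (1 - true_rate) ** sample_size

# Hypothetical inputs: a 5% true unemployment rate and a range of per-major sample sizes.
for n in (10, 20, 50, 100, 500):
    print(n, round(prob_no_unemployed(0.05, n), 4))
```

Under these assumptions a major with 20 people in the sample shows zero unemployment more than a third of the time, while a major with several hundred people essentially never does.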

The actual Georgetown report seems to be available here but I am having trouble viewing it.

In case you were wondering, mathematics is the 28th most popular major (of 173) and has 5.0% unemployment; "statistics and decision science" is 128th most popular and has 6.9% unemployment, which seems to go against the popular wisdom these days that statistics majors are more employable than math majors. (But I work in a statistics department, so my view of the popular wisdom may be biased.)