Without loss of generality we can fix one of the points to be (1, 0, 0). The other will be chosen uniformly at random and will be (X, Y, Z). The distance between the two points is therefore

√((1-X)

^{2}+ Y

^{2}+ Z

^{2})

which does not look all that pleasant. But the point is on the sphere! So X

^{2}+ Y

^{2}+ Z

^{2}= 1, and this can be rewritten as

√((1-X)

^{2}+ 1 - X

^{2})

or after some simplification

√(2-2X).

But by a theorem of Archimedes (Wolfram Alpha calls it Archimedes' Hat-Box Theorem but I don't know if this name is standard), X is uniformly distributed on (-1, 1). Let U = 2-2X; U is uniformly distributed on (0, 4). The expectation of √(U) is therefore

∫

_{0}

^{4}(1/4) u

^{1/2}du

and integrating gives 4

^{3/2}/6 = 8/6 = 4/3.

(The commenter "inverno" got this.)

Of course it's not hard to simulate this in, say, R, if you know that the distribution of three independent standard normals is spherically symmetric, and so one way to simulate a random point on a sphere is to take a vector of three standard normals and normalize it to have unit length. This code does that:

xx1=rnorm(10^6,0,1); yy1=rnorm(10^6,0,1); zz1=rnorm(10^6,0,1)

d1=radic(xx1^2+yy1^2+zz1^2)

x1=xx1/d1;y1=yy1/d1;z1=zz1/d1;

xx2=rnorm(10^6,0,1); yy2=rnorm(10^6,0,1); zz2=rnorm(10^6,0,1)

d2=radic(xx2^2+yy2^2+zz2^2)

x2=xx2/d2;y2=yy2/d2;z2=zz2/d2;

d=radic((x1-x2)^2+(y1-y2)^2+(z1-z2)^2);

and then the output of mean(d), which contains the distances, is 1.333659; the histogram of the distances d is a right triangle. (The code doesn't make the assumption that one point is (1, 0, 0); that's a massive simplification if you want to do the problem analytically, but not nearly as important in simulation.)