09 November 2011

Small sample sizes lead to high margins of error, unemployment version

The ten college majors with the lowest unemployment rates. I've heard about this from a friend who majored in astronomy and a friend who majored in geology; both of these are on the list, with an unemployment rate of zero.

The unemployment rates of the ten majors they list are 0, 0, 0, 0, 0, 0, 1.3, 1.4, 1.6, and 2.2 percent.

I would bet that the six zeroes are just the majors for which there were no unemployed people in the sample. The data apparently comes from the Georgetown Center on Education and the Workforce; there's a summary table at the Wall Street Journal, and indeed the majors which have zero unemployment are among the least popular. Just eyeballing the data, some of the majors with the highest unemployment are also among the least popular. The red flag here would be, say, an unemployment rate of 16.7% (one out of six) or 20.0% (one out of five) for some major near the bottom of the popularity table, but I don't see it; I guess their sample is big enough that no major is that small, or maybe they actually made some adjustments for this issue.
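To put rough numbers on the small-sample effect, here is a minimal sketch. It assumes each sampled graduate is independently unemployed with the same probability; the 5% rate and the sample sizes are hypothetical choices for illustration, not figures from the Georgetown data, and the function names are my own.

```python
def prob_zero_unemployed(n, p):
    """Chance that a sample of n graduates contains no unemployed people,
    assuming each is independently unemployed with probability p."""
    return (1 - p) ** n


def upper_bound_given_zero(n, alpha=0.05):
    """One-sided upper confidence bound on the true rate after seeing
    0 unemployed out of n: the p that solves (1 - p)**n == alpha."""
    return 1 - alpha ** (1 / n)


# Hypothetical true unemployment rate and hypothetical sample sizes.
TRUE_RATE = 0.05
for n in (5, 20, 50, 200, 1000):
    p0 = prob_zero_unemployed(n, TRUE_RATE)
    ub = upper_bound_given_zero(n)
    print(f"n={n:5d}  P(no unemployed in sample)={p0:.4f}  "
          f"95% upper bound if none observed={ub:.4f}")
```

Under these assumptions, a major with only 20 people in the sample and a true unemployment rate of 5% would show up as "zero unemployment" about 36% of the time, so a handful of zeroes among the least popular majors is not surprising.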

The actual Georgetown report seems to be available here but I am having trouble viewing it.

In case you were wondering, mathematics is the 28th most popular major (of 173) and has 5.0% unemployment; "statistics and decision science" is 128th most popular and has 6.9% unemployment, which seems to go against the popular wisdom these days that statistics majors are more employable than math majors. (But I work in a statistics department, so my view of the popular wisdom may be biased.)

02 May 2010

Arithmetic geometers write about statistics

Jordan Ellenberg, in yesterday's Washington Post: The census will be wrong. We could fix it.

This continues a proud tradition of mathematicians whose area of expertise is nowhere near statistics writing newspaper pieces saying that statistical sampling in censuses is a good idea; see Brian Conrad, 1998, New York Times.

In some sense it carries more weight when mathematicians who don't spend most of their time battling randomness of some sort or another say this. Statisticians of course think that doing statistical adjustments to the census in order to make it more accurate is a Good Idea; it gets them, their students, or their friends jobs!

As a combinatorialist I admire the theoretical elegance of our country's once-a-decade exercise in large-scale, brute-force combinatorics. But in practice, well, of course it needs some statistical help.

And here's something interesting:
"Since 1970, a mail-in survey has provided the majority of census data, so what we enumerate is not people but numbers written on a form, which are as likely to be fictional as any statistical estimate."
I wonder if people are actually lying on their census forms. I suspect this would skew the count upwards. People who deliberately lie on their census forms, at least the sort of people I know, are likely to give "joke" answers. And large numbers are funnier. I live in a one-bedroom apartment, and if I were the sort of person who lied on government forms I could easily say that ten people live in my apartment. I can't give a comically low number of people living here, because the census insists that a positive integer number of people live in each place. Does the census have some way to correct for this?