12 July 2008

A prediction-making quiz

I just read Ian Ayres' book Super Crunchers, which talks about how the large amounts of data that are now routinely collected enable better predictions than before. Sort of like Freakonomics but a bit more statistical. (Although all the math is hidden -- but I knew that going in.)

Now, there was a recent article The End of Theory which predicts that we don't need theories, we can just mine our data for correlations; I don't believe this. And Ayres talks about how some predictive models need human input -- for example, a model for predicting how Supreme Court justices will vote needs people to read previous input on the cases in order to decide whether the ruling being appealed was liberal or conservative, and also to determine what the major issues involved in the case are. But he points out that people are bad at predicting things because we are overconfident about our predictions.

This piqued my curiosity. Here's a quiz; I want to see how good you are at calibrating your own predictions. (This is taken from Ayres' book, p. 113.) For each of the following ten questions, give a range that you are 90 percent confident contains the correct answer. Ayres' test implicitly uses English units, but if you want to use metric (which I suspect a lot of you are more comfortable in) that's fine; I'll convert.

So, for example, if one of the questions were "What is the population of Philadelphia?", and you gave the numbers "1.2 million, 1.6 million", that would indicate that you believe with probability 90 percent that the population of Philadelphia is in that interval. (The 2006 Census estimate for this, by the way, is 1,448,394.)

Your goal is to get exactly nine of these right. Yes, I know that sounds weird! But the point is that if you get all ten right, you're probably underestimating your own abilities to predict things. If you get eight or less, you're probably overestimating them.
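To make the scoring concrete, here is a minimal sketch of how one might count the intervals that contain the true value. The function name `count_hits` and the one-item data are my own illustration; the only numbers used are the Philadelphia example above, not any of the quiz answers.

```python
def count_hits(intervals, truths):
    """Count how many (low, high) intervals contain the matching true value."""
    return sum(low <= truth <= high
               for (low, high), truth in zip(intervals, truths))

# Illustrative data only: the Philadelphia interval and the 2006 Census estimate.
intervals = [(1_200_000, 1_600_000)]
truths = [1_448_394]

hits = count_hits(intervals, truths)
print(hits, "of", len(intervals), "intervals contain the true value")
```

With ten intervals and ten true values, a result of 9 is the target; 10 suggests the intervals were too wide, and 8 or less suggests they were too narrow.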

Send your answers to me at izzycat AT gmail DOT com; don't leave them in comments.

Here are the questions:
1. How old was Martin Luther King, Jr. at death?
2. What is the length of the Nile River?
3. How many countries belong to OPEC?
4. How many books are there in the Old Testament?
5. What is the diameter of the moon?
6. What is the weight of an empty Boeing 747-400?
7. In what year was Mozart born?
8. What is the gestation period of an Asian elephant?
9. What is the air distance from London to Tokyo?
10. What is the depth of the deepest known point in the ocean?

1. Feel free to forward this quiz to other people. (I encourage it, although there's a non-negligible chance I might regret this if I get too many answers. I'll survive.)
2. If you have stories about how you made your guess, send them to me; I may use them in a future post.
I'm not going to post the answers; none of them are hard to find. Once answers stop coming in I'll make a post about how good you are at making these predictions.


Jonah said...

I've seen this test before. I think I might have gotten 8/10, but it's been a while.

While it's a cute concept, playing with only 10 questions and aiming for exactly 9 just seems too stringent to me. Even if you're perfectly good at picking that 90% confidence interval for your answers, you'll get exactly 9 of them right only 38.7% ({10 \choose 9} * .9^9 * .1^1) of the time. This is barely more likely than the perfect score, which would show up 34.9% of the time.
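The arithmetic here can be checked with a short sketch. It assumes the ten intervals are independent and that each contains the true value with probability exactly 0.9; the function name `prob_exactly` is just for illustration.

```python
from math import comb

def prob_exactly(k, n=10, p=0.9):
    """Binomial probability of exactly k hits in n independent tries."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

p9 = prob_exactly(9)    # exactly nine intervals contain the answer
p10 = prob_exactly(10)  # all ten do

print(f"P(exactly 9 right) = {p9:.3f}")
print(f"P(all 10 right)    = {p10:.3f}")
print(f"P(not exactly 9)   = {1 - p9:.3f}")
```

Under those assumptions the first two values round to 0.387 and 0.349, matching the percentages quoted above.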

misha said...

Facts without theories? Mathematics, especially combinatorics, is heading in the same direction, mostly due to computer generated or computer assisted discovery and proof.

Michael Lugo said...


I thought of that. But I really was just going with this as a cute concept, nothing more. (For one thing, if I wanted to know something about the psychology of average people, this blog's readers are a horrible sample.)

Guy Srinivasan said...

Hm, not quite - the goal is to get exactly 9 right AND not be able to guess which one you'll get wrong.

Jonah, what do you mean by too stringent? Not all games are winnable 100% of the time.

Michael Lugo said...

That's a good point; otherwise you could easily get 9 right by just guessing ridiculously wide intervals for nine questions (for example, Martin Luther King Jr. was between 0 and 120 at his death) and making an obviously wrong single-point estimate for the tenth (a Boeing 747 weighs exactly one pound).

Jadagul said...

Okay, having checked my answers I was embarrassingly off on almost all of them. (In my defense, three of them were a result of a wild misestimation of the diameter of the earth, but still).

Interestingly, only two of my answers were too large, and they both had to do with age.

Anonymous said...

Personally, I found that it was basically impossible for me to generate something with a 90% confidence interval without feeling like my interval was too large. As a result, the vast majority of my intervals were actually at a much lower confidence rate than 90% (and it turns out that many of them were wrong).

I think this stems from the fact that in my daily life, when making an estimate, it's implicitly required that I make an estimate that's useful. An estimate that New York City's population is between 2 million and 40 million may be the 90% confidence interval for some people, but it's so large as to be functionally equivalent to answering "I don't know". Because of that, all of my estimates were small enough to be useful, which resulted in them being much lower confidence than 90%.

Efrique said...

Your goal is to get exactly nine of these right. Yes, I know that sounds weird! But the point is that if you get all ten right, you're probably underestimating your own abilities to predict things. If you get eight or less, you're probably overestimating them.

If you're doing it "exactly right", the number correct will be binomial (10,.9). The probability of getting 10 right is 35%, the probability of getting 9 right is 39%.

That is, if you're doing it right, you're about as likely to get all ten intervals right as to get 9.

Moreover, if you're doing it exactly right, the chance that you WON'T get 9 intervals containing the value that they should exceeds 60%!

Anonymous said...

It's true that a person could do this the correct way and often end up with the wrong number of correct answers. But then Isabel can plot the results and see how close her readers come to 90% distribution on the aggregate.
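A rough sketch of that aggregate view, assuming each reader's result is recorded as a count of correct intervals out of ten. The scores below are made up purely for illustration; they are not actual quiz results.

```python
from collections import Counter

# Hypothetical scores out of 10, one per reader (NOT real data).
scores = [4, 6, 9, 5, 7, 10, 3, 6]
QUESTIONS = 10

# Pooled hit rate: fraction of all submitted intervals that held the answer.
pooled_rate = sum(scores) / (QUESTIONS * len(scores))
print(f"Pooled hit rate: {pooled_rate:.1%} (target: 90%)")

# Crude text histogram of how many readers got each score.
histogram = Counter(scores)
for score in range(QUESTIONS + 1):
    print(f"{score:2d} | {'#' * histogram[score]}")
```

A pooled rate well below 90% would point to overconfidence across the group, even though any single reader's score is too noisy to say much.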