18 August 2007

Social interactions aren't random

When I graduated from high school six years ago, I received a prestigious national award. The people who received it were selected based on some combination of standardized test scores, academic ability, a bunch of essays we wrote, and some sort of correction for geography I never quite understood. About one hundred and forty people -- all high school seniors -- receive this award each year.

In the two years since I've graduated from college, I have met two other people who have received this award, and who I did not meet through the schools that I attended. (It doesn't seem quite fair to include people who went to the same schools as I did, or indeed who I met through any academic channels, because the award was supposedly selecting for academic ability.) One is an ex-girlfriend; the other is someone that I met the most recent time that I was looking for a place to live. (I live alone; after a string of potential roommates had "friends" materialize that were suddenly interested in their empty rooms, I realized that people were trying to tell me something.) Neither of them volunteered this particular piece of information (to be honest, it's basically irrelevant); in both cases I found it out by Googling them.

And actually, upon scanning the names of people who've gotten this award, I found a third one I know, the semi-invisible housemate of some friends of mine.

So, how likely is it that I would have met two (or three) such people in the last two years, out of all the people I've met in that time? The word "met" is a bit vague here, but since in both cases I found out this piece of information by Googling, in order to identify this fact I would have to have enough information to find them on Google. In particular, I'd have to know their last name.

How many people have I met over the last two years? I don't know an exact answer, but it seems like one person a day would be fairly accurate, if perhaps a bit high -- for a total of seven hundred and thirty people. This is a tricky number to estimate because on most days I meet no new people, but on some days I meet lots of new people. I suspect I'm not unique here.

Furthermore, I'm going to make the very crude assumption that everyone I meet is between one year younger than me and three years older than me. This is surprisingly close to being the truth -- and actually, as you'll see, the width of the range is irrelevant. This means that there are five years' worth of potential award recipients for me to meet, or 700 people. (As I said previously, there are 140 recipients each year.) The total number of 17-year-olds in the country in 2000 was about four million. (There were about twenty million people between the ages of 15 and 19; I'm assuming one-fifth of them were seventeen.) So the fraction of people who were recipients of this award is 700 (the number of recipients in five years) in twenty million (the number of people that were the right age in those five years), or about one in thirty thousand. (The five years cancel, which is why the range doesn't matter: it's the same as 140 in four million.) Since I've met 730 people, you'd expect me to have met 730/30,000, or about one-fortieth, of one of these people. I've met three. Just how unlikely is this?
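For anyone who wants to check the arithmetic, here is a minimal Python sketch of the estimate. The figures are the ones quoted in the post; the variable names are made up for illustration.

```python
# Figures quoted in the post; the variable names are invented for this sketch.
recipients_per_year = 140
cohort_years = 5                # one year younger through three years older
people_per_cohort = 4_000_000   # about one-fifth of twenty million 15-to-19-year-olds
people_met = 730                # one new person a day for two years

recipients = recipients_per_year * cohort_years   # 700
population = people_per_cohort * cohort_years     # 20,000,000
fraction = recipients / population                # the five years cancel here

print(f"recipients are about one in {1 / fraction:,.0f}")
print(f"expected recipients met: {people_met * fraction:.4f}")
print(f"with the rounded 1/30000: {people_met / 30000:.4f}")
```

The unrounded fraction is a little larger than one in thirty thousand, so rounding nudges the expected count down slightly; either way it is in the neighborhood of one-fortieth.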

We can compute this using the binomial distribution. I'll let C(n,k) denote the binomial coefficient "n choose k", which is n!/(k! (n-k)!). (Incidentally, these are some of my favorite numbers. Yes, I have favorite numbers.) We can look at the probability that each person I've met is an award recipient, which is about one in thirty thousand; let p = 1/30000. Let q = 1-p be the probability that a given person is not an award recipient. (Note that in probability, q is almost always 1-p. This differs from the convention in number theory, where p is a prime, and q is a prime that isn't p.)

The binomial distribution assumes that the people I meet are independent of one another. That assumption is a bit sketchy, but each of the three people in question I've met through different people and in different ways, so it seems reasonable. This sort of thing isn't always reasonable; for example, one feature of my current life is that I know a lot of people who went to Moravian College, which seems noteworthy. But I met one of them and then she introduced me to the others, so it's not all that weird.

Anyway, the probability that I've met exactly k award recipients is

P(k) = C(730,k) p^k q^(730-k)

and we compute P(0) = 0.9760, P(1) = 0.0237, P(2) = 2.89 × 10^-4, P(3) = 2.33 × 10^-6.
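These values can be reproduced with a few lines of Python. This is only a sketch under the post's assumptions (730 independent meetings, each with probability 1/30000); the function name prob_exactly is hypothetical.

```python
from math import comb

def prob_exactly(k, n=730, p=1 / 30000):
    """Binomial probability of meeting exactly k recipients among n people."""
    q = 1 - p
    return comb(n, k) * p**k * q**(n - k)

# Print the first few terms of the distribution.
for k in range(4):
    print(f"P({k}) = {prob_exactly(k):.3g}")
```

math.comb needs Python 3.8 or later; on older versions the coefficient can be built from math.factorial, exactly as in the definition n!/(k! (n-k)!).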

The chance that I've met at least two award recipients is P(2) + P(3) + ... + P(730); since all the other terms are ridiculously small compared with P(2), we'll call it just P(2), and we see that the chance of meeting two award recipients -- assuming that I meet people at random -- is about one in 3400. The chance that I've met at least three is very nearly P(3), or one in 420,000 or so.
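The "call it just P(2)" shortcut is easy to check by summing the whole tail instead of keeping only its first term. The sketch below assumes the same n = 730 and p = 1/30000; prob_at_least is a made-up helper name.

```python
from math import comb

def prob_at_least(k, n=730, p=1 / 30000):
    """Sum the binomial terms P(k) + P(k+1) + ... + P(n) directly."""
    q = 1 - p
    return sum(comb(n, j) * p**j * q**(n - j) for j in range(k, n + 1))

# Express each tail probability as "one in N".
for k in (2, 3):
    tail = prob_at_least(k)
    print(f"at least {k}: about one in {1 / tail:,.0f}")
```

The terms far out in the tail underflow to zero in floating point, which is harmless here because they are negligible next to the leading term. The full sums land close to the one-term approximations, so the shortcut costs very little.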

What do I conclude from this? That I don't meet people randomly. Neither do you. We generally tend to meet people who are like us in terms of socioeconomic status, level of education, age, political leanings, and so on -- all things that are probably correlated with this award. (Yes, I said "political leanings" are correlated with an academic award. I invite you to contradict me.) This is also a problem that occurs in the small world phenomenon (sometimes more popularly known as "six degrees of separation") -- we generally know people that are like us, but somehow we're linked by short chains to people who have absolutely nothing in common with us. I expect I'll write about this in the future.

1 comment:

Mary Pat said...

Hmmm, if the award is what I think it is, I got it, too, back in 1992. I was one of two people from North Carolina who got it.