Here's something interesting from a New York Times poll a couple of weeks ago. People were asked what percentage of all Americans are black. Among the results: 8 percent of whites and 17 percent of blacks guessed that more than 50 percent of all Americans are black. (It's question 80 in the poll.)
The actual figure, from the 2006 census estimates, is 12.4 percent. (If you had asked me, I would have probably said twelve percent, which is the figure I learned quite some time ago.)
Jordan Ellenberg, who linked to this poll, asks whether people are ignorant of what "50 percent" means, or whether they're ignorant of the actual makeup of the United States population. I'm not sure how to answer this.
But I'd be interested to know how people's guesses of the percentage of the population which is black are correlated with the percentage of the population in their immediate area which is black. People probably expect that the people around them are representative of the general population, because psychologically we may be wired that way; numbers, even numbers obtained from counting millions of people, just don't have the same psychological impact as the faces you see while walking down the street. (You might have to factor in some other things, though, such as people's choice of television shows, movies, etc.; subconsciously we might not be that good at distinguishing between people that we're seeing on television and people we're seeing in reality.)
Similar questions could be asked in other populations. For example, if you ask Philadelphians about the racial distribution of Philadelphia, what do they say? The actual figures are 44.3% black and 41.8% white, from this Census Bureau page with a ridiculously long URL. But most Philadelphians live in neighborhoods that are mostly black or mostly white, so I suspect you'd get a lot of extreme answers.
Although the extreme answers might not correspond to what people actually see day to day! There may be people living in mostly-white neighborhoods who think most Philadelphians are white, or people living in mostly-black neighborhoods who think most Philadelphians are black. But you might also see people living in mostly-white neighborhoods who feel like their neighborhood is one of the only places where white people live, and guess that the city is mostly black, or vice versa. (Note to people who know anything about Philadelphia -- I am not saying that such neighborhoods exist, or that I know which ones they are. I'm just saying I can imagine them.)
Yes, in my secret other life I want to study things like that.
Showing posts with label demographics. Show all posts
29 July 2008
26 July 2008
Bill Rankin's population density graphs
Last week I wrote a post about population densities.
Take a look at the interesting graphs at Bill Rankin's Radical Cartography; they show how population density is related to:
- racial and ethnic groups (American Indians and Alaska Natives, not surprisingly, live at the lowest population densities; what surprised me was the large Hispanic population living at densities between 1 and 10 per square mile, which Rankin says might correspond to ranchers);
- age. Roughly speaking, people ages 18 to 39 or under 5 are overrepresented at "high" densities (above 4000 or so), and other ages are overrepresented at "low" densities (below that same cutoff). This is, I suspect, a reflection of people moving to the city when they leave their parents' house, and then leaving the city when it's time for their kids to go to school.
- income is highest at suburban and central-city densities, with a valley in between. Not surprising; in general the central part of a city is rich, it's surrounded by poorer neighborhoods, and then eventually income starts going up again. Rural places are poor as well.
- gender -- there are more women at high density, which I can't explain.
- population and area -- I tried to make a plot like this but had some trouble, because I was just playing around with output from another web site and didn't have the raw data.
17 July 2008
Population densities vary over nine orders of magnitude
The United States has an area of 3,794,066 square miles, and a population, as of the 2000 census, of 281,421,906. This gives a population density of 74.2 people per square mile.
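The arithmetic is just population divided by area; here it is as a quick check, using only the two figures quoted above.

```python
# Figures quoted in the post (2000 census population, total area).
area_sq_miles = 3_794_066
population = 281_421_906

# Overall density: every square mile counts equally, empty or not.
density = population / area_sq_miles
print(f"{density:.1f} people per square mile")
```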
But what is the average population density that Americans live at? It's not 74.2 per square mile. Only about 11 percent of Americans live in census block groups (the smallest resolution the census goes down to; there are about 200,000 of these, corresponding to about 1,500 people each) with a density lower than this. That's not too surprising; that average includes lots of empty space.
But the median American, it turns out, lives in a block group with a density of 2,521.6 per square mile. At least, when I asked the web site I was using for the distribution of block groups by population density that's what it said; the front page says this number is 2,059.23. I suspect the smaller number is actually the median population density of block groups, not of individuals; the block groups tend to have lower populations in less dense areas, which explains the difference. This number was surprisingly high to me, and seems to illustrate how concentrated population is.
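To see why the two medians can differ, here is a minimal sketch. The block groups in `toy_groups` are invented for illustration (they are not Census data), and the function names are my own; the only point is that counting each block group once and counting each person once give different answers when sparse block groups hold fewer people.

```python
import statistics

def median_density_of_groups(groups):
    """Median density when every block group counts once."""
    return statistics.median(density for _, density in groups)

def median_density_of_people(groups):
    """Density of the block group that the median person lives in."""
    total = sum(pop for pop, _ in groups)
    running = 0
    # Walk through block groups from sparsest to densest, counting people.
    for pop, density in sorted(groups, key=lambda g: g[1]):
        running += pop
        if 2 * running >= total:
            return density

# Hypothetical (population, density per square mile) pairs, made up so that
# the sparser block groups also have fewer residents.
toy_groups = [(600, 10.0), (900, 200.0), (1500, 2000.0),
              (1800, 4000.0), (2200, 9000.0)]

print(median_density_of_groups(toy_groups))
print(median_density_of_people(toy_groups))
```

In this toy data the per-person median lands in a denser block group than the per-group median, which is the direction of the gap between 2,521.6 and 2,059.23.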
In case you're wondering, the most densely populated block group is one in New York County, New York -- 3,240 people in 0.0097 square miles, for about 330,000 per square mile. The least dense is in the North Slope Borough of Alaska -- 3 people in 3,246 square miles, or one per 1,082 square miles. The Manhattan block group I mention here is 360 million times more dense than the Alaska one; population densities vary over a huge range.
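A quick check of that ratio, using only the four numbers quoted above (the variable names are mine):

```python
import math

# Densest block group: New York County, New York.
manhattan_density = 3240 / 0.0097   # people per square mile
# Least dense block group: North Slope Borough, Alaska.
alaska_density = 3 / 3246           # people per square mile

ratio = manhattan_density / alaska_density
orders_of_magnitude = math.log10(ratio)

print(f"ratio: {ratio:.3g}")
print(f"orders of magnitude: {orders_of_magnitude:.2f}")
```

The ratio comes out near 360 million, which is between eight and nine orders of magnitude.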
Here's a table; in the first row is a percentile n, in the second row the population density such that n% of Americans live in a block group with that density (in people per square mile) or less. (Generating such a table at fakeisthenewreal.com is slow, which is why I'm providing it here.)
Percentile | 5 | 10 | 20 | 30 | 40 | 50 | 60 | 70 | 80 | 90 | 95
Density | 29.3 | 64.9 | 226.9 | 677.5 | 1499.8 | 2521.6 | 3737.2 | 5257.1 | 7529.0 | 13261.9 | 24219.5
I hesitate to interpret this. But I must admit that I'm curious if demographers have some way of predicting the general shape of this data. It's clear in the US that more people live at "intermediate" densities than at very high or low ones -- but that's not exactly a meaningful statement.
(Facts from fake is the new real, crunching Census Bureau data.)
By the way, Wikipedia has an article entitled list of U. S. states by area. This includes an almost entirely useless map which colors the larger states darker. I can see which states are larger without the colors, because they're larger, which is kind of the point of a map. The area the state takes up on my screen should be proportional to its actual area.
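Going back to the percentile table for a moment: it can be used to roughly recover the "about 11 percent" figure for the overall density of 74.2 per square mile. This is only a sketch. Interpolating linearly in the logarithm of density is my own assumption (the densities span several orders of magnitude, so a log scale seemed reasonable), not something the source site is known to do, and `percentile_at` is a made-up helper name.

```python
import math

# Percentile table copied from the post.
percentiles = [5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95]
densities = [29.3, 64.9, 226.9, 677.5, 1499.8, 2521.6,
             3737.2, 5257.1, 7529.0, 13261.9, 24219.5]

def percentile_at(density):
    """Estimate the share of Americans living at or below `density`.

    Interpolates between neighboring table entries, linearly in log(density).
    """
    if not densities[0] <= density <= densities[-1]:
        raise ValueError("density is outside the range of the table")
    for i in range(len(densities) - 1):
        lo, hi = densities[i], densities[i + 1]
        if lo <= density <= hi:
            frac = math.log(density / lo) / math.log(hi / lo)
            return percentiles[i] + frac * (percentiles[i + 1] - percentiles[i])

print(f"{percentile_at(74.2):.1f}")
```

With these assumptions the estimate for 74.2 per square mile falls close to 11 percent, which agrees with the figure quoted earlier.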
14 August 2007
how many Asians are there in Philadelphia?
In today's Philadelphia Inquirer: Asian market interests banks, about how there are banks in Asian-American enclaves of Philadelphia (such as Chinatown) that specifically target that market. There are two major reasons why this market has been poorly served by traditional banks, according to the article: the language barrier, and the fact that many Asian-Americans don't have as much credit history or financial documentation as other people in similar financial situations, because they tend to deal more in cash.
The online version of the article doesn't include the following chart of ten-county population by race, attributed to the American Community Survey of the U. S. Census Bureau, which caught my interest:
Category | 2002 | 2005 | Percentage change
White | 3928263 | 4036597 | 2.76%
Black or African American | 1115683 | 1167451 | 4.64%
American Indian and Alaska Native | 8817 | 8848 | 0.35%
Asian | 209246 | 250552 | 19.74%
Native Hawaiian and Other Pacific Islander | 1262 | 1768 | 40.10%
Hispanic or Latino | 319101 | 371502 | 16.42%
Total population | 5667690 | 5747353 | 1.41%
The point of the table, of course, is to illustrate that the Asian market has been growing in this area faster than the population as a whole, and that therefore it is logical that banks would be reaching out to this community.
Now, when I read this, something seemed suspicious. The population in every category except the negligible "American Indian and Alaska Native" one went up by at least 2.76%. But the total population went up by only 1.41%. Something is wrong here.
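The reason this looks wrong: if the six categories simply partitioned the population, the growth of the total would be an average of the category growth rates, weighted by each category's 2002 share. A small sketch of that calculation, using the numbers from the table (the variable names are mine, and "partitioned" is the assumption being tested, not a fact about the data):

```python
# (2002, 2005) counts copied from the table in the post.
counts = {
    "White": (3_928_263, 4_036_597),
    "Black or African American": (1_115_683, 1_167_451),
    "American Indian and Alaska Native": (8_817, 8_848),
    "Asian": (209_246, 250_552),
    "Native Hawaiian and Other Pacific Islander": (1_262, 1_768),
    "Hispanic or Latino": (319_101, 371_502),
}
reported_total_growth = (5_747_353 - 5_667_690) / 5_667_690

base = sum(old for old, _ in counts.values())
# Weighted average of category growth rates, weights = share of the 2002 sum.
blended_growth = sum(
    (old / base) * ((new - old) / old) for old, new in counts.values()
)

print(f"blended growth of the six categories: {blended_growth:.2%}")
print(f"reported growth of the total:         {reported_total_growth:.2%}")
```

The blended figure comes out around 4.6 percent, well above the reported 1.41 percent, so the categories cannot be a straightforward partition of the total.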
If you then add up the first six entries in each column, you get 5,582,372 and 5,836,718, respectively. For some strange reason, the sum of the racial numbers is 85,318 less than the total population in 2002, but 89,365 more than the total population in 2005. The obvious explanations are the following:
- "Hispanic or Latino" is being treated differently than the other items here, which is actually fairly common in discussions of race in America; basically if your ancestors come from somewhere Spanish-speaking, then you're somehow outside of the usual "race" categories.
- A lot more people have suddenly started to identify as multiracial between 2002 and 2005. A lot. Three percent of the total population. I'm aware that a lot more people have started to identify as multiracial recently instead of feeling like they have to pick one identity or the other, but it doesn't seem like it could happen that fast.
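The sums and gaps quoted above are easy to reproduce from the table. In this sketch the lists hold the six category rows in table order; nothing here goes beyond the published numbers.

```python
# Six category rows from the table, in order: White, Black or African
# American, American Indian and Alaska Native, Asian, Native Hawaiian and
# Other Pacific Islander, Hispanic or Latino.
col_2002 = [3_928_263, 1_115_683, 8_817, 209_246, 1_262, 319_101]
col_2005 = [4_036_597, 1_167_451, 8_848, 250_552, 1_768, 371_502]
total_2002, total_2005 = 5_667_690, 5_747_353

sum_2002, sum_2005 = sum(col_2002), sum(col_2005)
gap_2002 = total_2002 - sum_2002   # positive: categories fall short of total
gap_2005 = total_2005 - sum_2005   # negative: categories exceed the total
swing = gap_2002 - gap_2005        # how far the gap moved in three years

print(sum_2002, sum_2005)
print(gap_2002, gap_2005)
print(f"swing as a share of the 2005 total: {swing / total_2005:.1%}")
```

The swing is about three percent of the total population, which is where the "three percent" in the second explanation comes from.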
Usually on tables like this, one sees some note saying that the various pieces of population don't add up to the total population, due to certain minor categories not being listed, or due to people being included in more than one category. But there is no such note.
I had actually been anticipating something like Simpson's paradox, but it doesn't seem to be anything that complicated.
By the way, the ten-county area here is Bucks, Delaware, Chester, Montgomery, and Philadelphia counties in Pennsylvania; Camden, Burlington, Gloucester, and Atlantic counties in New Jersey; and New Castle County in Delaware. This strikes me as being slightly "off"; I would replace Atlantic with either Mercer or Salem. I suspect this doesn't make a huge difference in the data; however, it might be an example of "selective reporting", where one draws the boundaries in a nonstandard way to make some sort of point. The Census Bureau itself uses a nine-county area (the first eight above, plus Salem County, New Jersey); I'd say the most common definition of the Philadelphia area is as the first eight counties above.
23 July 2007
life expectancies
The Numbers Behind Life Expectancy, from Carl Bialik's "The Numbers Guy" column at the Wall Street Journal.
Michael Moore said in Sicko that life expectancy is higher in Cuba than in the U.S.; CNN says it's the other way around; it turns out that so much computation goes into these calculations that there's probably a substantial amount of error. You might naively think that if life expectancy is, say, 77 years, that means that the average person born 77 years ago (in 1930) is just now getting around to dying. But the problem is that medical care isn't static, so this doesn't tell us how long people being born now should expect to live. So what's actually done is that one looks at how many people of age N in, say, 2005 survive to age N+1 (in 2006), and then these are chained together to tell us how many people would live to, say, age 80 if medical care remained as it is today and so the mortality rates remained constant. Basically, life expectancy is a moving target, because medical care changes substantially during a single person's life.
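The chaining described above can be sketched in a few lines. This is a simplified illustration, not the method any statistical agency actually uses: the mortality rates below are invented, the function names are mine, and I assume that people who die during a year live half of it on average.

```python
import math

def share_surviving_to(q, age):
    """Fraction of newborns reaching `age` if this year's rates never change.

    q[x] is the probability that someone aged x dies before turning x + 1,
    with every q[x] measured in the same calendar year.
    """
    return math.prod(1 - qx for qx in q[:age])

def period_life_expectancy(q):
    """Expected years lived at birth under fixed rates q (last entry 1.0)."""
    alive = 1.0
    years = 0.0
    for qx in q:
        deaths = alive * qx
        # Survivors live the whole year; those who die are assumed to live
        # half of it on average.
        years += (alive - deaths) + 0.5 * deaths
        alive -= deaths
    return years

# Invented rates: a flat 1% chance of dying each year up to age 99, and
# certain death at age 100 so that the table closes.
toy_q = [0.01] * 100 + [1.0]

print(f"{share_surviving_to(toy_q, 80):.3f}")
print(f"{period_life_expectancy(toy_q):.1f}")
```

Lowering any entry of `toy_q` raises the result, which is why a number computed from today's mortality rates understates how long people will live if medical care keeps improving.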
However, although the number "77" might not be that meaningful, I would guess that differences between those numbers for different populations which had been computed in the same way are valid to look at. A society where this number is 80 is probably healthier than one where it's 70.
But as many people point out, the Cuban statistics might not reflect what's actually going on in that country. It's difficult to know for sure.
Also, this means that the low life expectancies reported for countries in sub-Saharan Africa which have been affected by the AIDS epidemic probably understate how long people born there today will actually live; one hopes that the AIDS epidemic won't keep killing people at the same rate that it is now.