
20 September 2007

Broccoli causes lung cancer, or why most science is wrong

Robert Lee Hotz wrote in the Wall Street Journal that most science studies appear to be tainted by sloppy analysis. (My impression here is that "science" means "biomedical science", which is the kind of science the media report on most, for the simple reason that it has the most direct effect on people's lives.) I learned about this from Marginal Revolution; NeuroLogica and denialism have picked it up as well; it's all based on a grand meta-analysis by John Ioannidis.

I'd like to share with you an example I made up a few days ago, while I was randomly pontificating. Imagine that for some reason scientists suspect there is a link between broccoli and lung cancer. One hundred scientists do one hundred separate studies trying to find this link. Three of them say "there appears to be a positive correlation between broccoli and lung cancer", meaning that they found a correlation that is statistically significant at the 95% confidence level (that is, p < 0.05). The media will hear this and will say "look! broccoli causes lung cancer!" and the broccoli industry will be very unhappy. (The elder George Bush will be happy, though.) But if there were no correlation at all, you'd expect five of the hundred scientists to have found one just by sheer luck! The fact that only three of them found it is, if anything, evidence towards broccoli not causing lung cancer.
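Here's a minimal simulation of that made-up example, assuming only the numbers from the story (one hundred studies, a 5% significance threshold); Python is used purely for illustration.

```python
# One hundred studies of an effect that does not exist, each declaring
# "significance" at the 5% level. Under the null hypothesis a study's
# p-value is uniform on [0, 1], so each study "finds a link" with
# probability 0.05.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_studies, alpha, n_worlds = 100, 0.05, 10_000

# Simulate many parallel "worlds", each containing 100 null studies,
# and count the false positives in each world.
positives = (rng.uniform(size=(n_worlds, n_studies)) < alpha).sum(axis=1)

print("expected false positives per 100 studies:", n_studies * alpha)  # 5.0
print("average in simulation:", positives.mean())                      # about 5

# Seeing only 3 positives is entirely unremarkable if there is no effect:
print("P(3 or fewer positives | no effect) =",
      stats.binom.cdf(3, n_studies, alpha))                            # about 0.26
```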

But you'll never hear that on the evening news, because the news people want you to think you're sick, or you're going to get sick. Their audience is aging and their main advertisers are drug companies.

A slightly less toy example can be found in an old Marginal Revolution post.

A bit more seriously, it seems like a lot of people are tinkering with their data too much:
Statistically speaking, science suffers from an excess of significance. Overeager researchers often tinker too much with the statistical variables of their analysis to coax any meaningful insight from their data sets. "People are messing around with the data to find anything that seems significant, to show they have found something that is new and unusual," Dr. Ioannidis said.

But at the usual 5% significance level, one in twenty hypotheses that are "neither true nor false" will appear true just by chance. Perhaps it makes sense to require a wider confidence interval for results obtained by this sort of "data mining" than for results one originally set out to find? It's turning out that a lot of published results are simply not reproducible.
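To see what that sort of tinkering produces, here is a rough sketch (the subject and variable counts are invented for illustration, not taken from any real study): correlate a single outcome with a few dozen unrelated noise variables, and some of them will clear the 5% bar purely by chance.

```python
# A sketch of the "excess of significance" problem: test one outcome
# against many unrelated noise variables and some will look significant
# at p < 0.05 purely by chance. All numbers here are made up.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_subjects, n_variables = 200, 40

outcome = rng.normal(size=n_subjects)               # the thing being "explained"
noise = rng.normal(size=(n_variables, n_subjects))  # unrelated measurements

false_hits = 0
for i, x in enumerate(noise):
    r, p = stats.pearsonr(x, outcome)
    if p < 0.05:
        false_hits += 1
        print(f"variable {i}: 'significant' correlation, r = {r:+.2f}, p = {p:.3f}")

print(f"{false_hits} of {n_variables} pure-noise variables passed p < 0.05")

# Demanding p < 0.05 / n_variables instead (the Bonferroni correction) is one
# standard way of raising the bar for this kind of after-the-fact data mining.
```

The Bonferroni correction in the last comment is essentially the "wider confidence interval" suggested above, stated in terms of p-values rather than intervals.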

Although I don't presume to be qualified to say how medical researchers should do their work, it seems to me that they may need more forums for reporting negative results. In the broccoli-and-lung-cancer example, I suspect that the researchers publishing the three papers with positive results wouldn't know about enough of the negative results to make them doubt their claim. As Steven Novella points out, the fact that "most published research is wrong" is probably a combination of the lack of such forums and something like my example.

There are growing suggestions that this would even be useful in mathematics, where you'd think we wouldn't need it because we can prove our results beyond a shadow of a doubt. But we don't publicize our negative results -- we don't publish papers saying "I thought proposition X might be true, and here's why, but then I came up with this counterexample", although we might say these things in informal conversation. So there's still probably a tremendous duplication of work. Some duplication of work is probably desirable, even in mathematics; different people will have different perspectives on the same problem. But a lot of people probably have the sense that they are going down an already-explored dead end and it would be nice if they had at least some ability to confirm or refute that. This can only be more important when we're talking about the sort of research where lives are at stake.

12 July 2007

can scientists make up their minds?

How should unproven findings be publicized? at Statistical Modeling, Causal Inference, and Social Science, via Notional Slurry.

A year or so ago Satoshi Kanazawa claimed that, roughly, attractive people are more likely to have daughters than sons. I've also heard recently that successful people are more likely to have sons; for example, something like sixty percent of all the children of U.S. presidents have been male. The mass media have recently picked up this story. Andrew at Statistical Modeling, Causal Inference, and Social Science criticized this finding; you can read his commentary to find out why, but he believes the findings are statistical artifacts. However, he thinks there may be some truth to the conjectures.

Of course, any given mass media outlet isn't going to report "maybe attractive people have more daughters than sons". They'll report one of the two following things:

  • Scientists say that attractive people have more daughters than sons.

  • Scientists say that there is no link between attractiveness and how many daughters you have.


Then in the first case they'd go and find an attractive couple that had, say, three sons and no daughters, to "disprove" this.

This reminds me of the way that the mass media treats, say, dark chocolate. Chocolate's supposed to be bad for you, because it is full of fat and sugar. But it's also supposed to be good for you because it contains certain antioxidants. One day they'll report one thing, one day they'll report the other. You know what I do? I ignore all the studies and eat chocolate, because I like it. (I happen to prefer dark chocolate to milk chocolate, which is probably a good thing in terms of health, but my preference is motivated purely by taste.)

This probably leads laypeople to the idea that scientists are constantly changing their minds (which is true -- good scientists change their minds as new evidence comes in, or as new interpretations of old evidence become clear). I fear, though, that this may also lead laypeople to distrust science -- if scientists can't even decide whether chocolate is "good for you" or "bad for you", what good are they?

But science is more complicated than that -- no food is entirely "good" or "bad", and so on.

What I'd like to see -- and perhaps it's already out there -- is more reporting of the meta-literature. Not "one group of scientists said today that chocolate is good" or "another group of scientists said today that chocolate is bad", but "some scientist looked at what all the other scientists said, and most of them think chocolate is bad". You're not going to hear that on the evening news, because chocolate makers advertise there. But what about aggregating all that stuff online?
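As a toy sketch of what such aggregation can buy you (the five p-values below are invented for illustration, not drawn from real studies): several studies that are each only "probably yes" can combine, via Fisher's method, into a fairly firm "yes" -- but only if someone actually pools them.

```python
# Combining several individually unconvincing studies with Fisher's method.
# The p-values are invented; none of them clears 0.05 on its own.
from scipy import stats

study_pvalues = [0.09, 0.12, 0.08, 0.11, 0.10]  # five "probably yes" results

statistic, combined_p = stats.combine_pvalues(study_pvalues, method="fisher")
print("individual p-values:", study_pvalues)
print(f"combined p-value (Fisher's method): {combined_p:.3f}")  # about 0.01
```

Of course, a real meta-analysis also has to worry about which studies got published in the first place, but the arithmetic of pooling is the easy part.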

But I think most people will want a one-bit answer -- "yes" or "no". And it's more complicated than that. Some studies don't even give you a whole bit of information -- they tell you "probably yes". And a bunch of these "probably yes" answers can add up to a "yes". But not if everybody's working in isolation. This applies to ordinary life as well -- and I believe that it would be useful if there were some way that all the anecdotal experiences of people with, say, a particular company could be aggregated into something statistically significant. But that's a matter for another post.