20 September 2007

Broccoli causes lung cancer, or why most science is wrong

Robert Lee Hotz wrote in the Wall Street Journal that most science studies appear to be tainted by sloppy analysis. (My impression here is that "science" means "biomedical science", which is the sort of science that is most reported on by the media, for the simple reason that it is the kind of science which has the most direct effect on people's lives.) I learned about this from Marginal Revolution; NeuroLogica and denialism have picked it up as well; this is all based on some grand meta-analysis by John Ioannidis.

I'd like to share with you an example I made up a few days ago, while I was randomly pontificating. Imagine that for some reason, scientists suspect that there is a link between broccoli and lung cancer. One hundred scientists do one hundred separate studies trying to find this link. Three of them say "there appears to be a positive correlation between broccoli and lung cancer", meaning that they found such a correlation and that it is statistically significant at the 95% confidence level. The media will hear this and will say "look! broccoli causes lung cancer!" and the broccoli industry will be very unhappy. (The elder George Bush will be happy, though.) But if there were no correlation, you'd expect about five of the hundred scientists to have found this correlation just by sheer luck! The fact that only three of them found it is, if anything, evidence towards broccoli not causing lung cancer.
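Here is a minimal sketch of the arithmetic behind that example. It assumes that the hundred studies are independent and that each one has exactly a 5% chance of a false positive when there is no real link; those assumptions, and names such as `N_STUDIES` and `prob_at_most`, are made up purely for illustration.

```python
from math import comb

N_STUDIES = 100  # hypothetical number of independent studies
ALPHA = 0.05     # false-positive rate per study at the 95% level


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k false positives among n null studies."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_at_most(k: int, n: int, p: float) -> float:
    """Probability of k or fewer false positives among n null studies."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))


# Average number of spurious "positive" studies when there is no link at all.
expected_false_positives = N_STUDIES * ALPHA

# Chance that three or fewer studies report a link purely by luck.
p_three_or_fewer = prob_at_most(3, N_STUDIES, ALPHA)

# Chance that at least one study reports a link purely by luck.
p_at_least_one = 1 - (1 - ALPHA) ** N_STUDIES

print(f"expected false positives: {expected_false_positives:.1f}")
print(f"P(3 or fewer by luck):    {p_three_or_fewer:.3f}")
print(f"P(at least one by luck):  {p_at_least_one:.3f}")
```

Under these assumptions, three or fewer positive studies turn up by luck roughly a quarter of the time, and at least one turns up almost always, so three positive papers out of a hundred is entirely consistent with there being no link at all.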

But you'll never hear that on the evening news, because the news people want you to think you're sick, or you're going to get sick. Their audience is aging and their main advertisers are drug companies.

A slightly less toy example can be found in an old Marginal Revolution post.

A bit more seriously, it seems like a lot of people are tinkering with their data too much:
Statistically speaking, science suffers from an excess of significance. Overeager researchers often tinker too much with the statistical variables of their analysis to coax any meaningful insight from their data sets. "People are messing around with the data to find anything that seems significant, to show they have found something that is new and unusual," Dr. Ioannidis said.

But at the 95% level, one in twenty effects that aren't really there will appear to be real. Perhaps it makes sense to require a stricter significance threshold for results which are obtained by this sort of "data mining" than for the results which one originally intended to find? It's turning out that a lot of results are not reproducible.
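One standard way to make the threshold stricter is the Bonferroni correction, which divides the significance level by the number of hypotheses tested. Neither the article nor Dr. Ioannidis's quote names this technique; I'm using it only as an illustrative sketch, and the figure of twenty comparisons is hypothetical.

```python
def family_wise_error_rate(alpha: float, m: int) -> float:
    """Chance of at least one false positive among m independent null tests."""
    return 1 - (1 - alpha) ** m


def bonferroni_threshold(alpha: float, m: int) -> float:
    """Per-test threshold that keeps the overall false-positive rate at or below alpha."""
    return alpha / m


ALPHA = 0.05        # the usual 95% level
N_COMPARISONS = 20  # hypothetical number of things tried while mining the data

uncorrected = family_wise_error_rate(ALPHA, N_COMPARISONS)
per_test = bonferroni_threshold(ALPHA, N_COMPARISONS)
corrected = family_wise_error_rate(per_test, N_COMPARISONS)

print(f"chance of a spurious finding, uncorrected: {uncorrected:.3f}")
print(f"Bonferroni per-test threshold:             {per_test:.4f}")
print(f"chance of a spurious finding, corrected:   {corrected:.3f}")
```

With twenty independent looks at the data, the chance of at least one spurious "finding" at the 95% level is well over half; requiring each individual test to clear 0.0025 instead of 0.05 brings the overall rate back below 5%.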

Although I don't presume to be qualified to speak for how medical researchers should do their work, it seems to me that perhaps they need more forums in which to report negative results. In the broccoli-and-lung-cancer example, I suspect that the researchers publishing the three papers with positive results wouldn't know about enough of the negative results to make them doubt their claim. As Steven Novella points out, the fact that "most published research is wrong" is probably due to a combination of this lack of forums and something like my example.

There are growing suggestions that this would even be useful in mathematics, where you'd think we wouldn't need it because we can prove our results beyond a shadow of a doubt. But we don't publicize our negative results -- we don't publish papers saying "I thought proposition X might be true, and here's why, but then I came up with this counterexample", although we might say these things in informal conversation. So there's still probably a tremendous duplication of work. Some duplication of work is probably desirable, even in mathematics; different people will have different perspectives on the same problem. But a lot of people probably have the sense that they are going down an already-explored dead end and it would be nice if they had at least some ability to confirm or refute that. This can only be more important when we're talking about the sort of research where lives are at stake.

10 comments:

Anonymous said...

Several issues at play here:
1. The pressures of the marketplace are forcing biochemical scientists to produce reports based on statistical significance. This happens a lot with blind studies, double-blind studies, etc. Your work as a scientist was to develop the drug, and you spent a lot of money, time and effort doing it. Now that the drug needs to be put on the market, the statisticians come in. In the case of drugs, you will find that in many cases the benefit of a drug is only marginally significant compared to placebo. However, you can prove that this marginal benefit is "statistically significant": to do so, you create a new study and change the statistics, because you can't change the drug.
2. The press lives off news; it needs news to survive. Most of the AP articles you will see on science are clearly not written by people with scientific knowledge, only with knowledge of what "will sell". You see, the more hits your story gets, the happier you feel as a journalist (you must know this as a blogger).

Anonymous said...

I thought a bit about this while I was writing my master's thesis.

The problem is you can't write a thesis on a negative result because you just don't publish that sort of thing. "Yeah, it didn't work" is not a master's thesis.

If it hadn't worked, I would have had to write something essentially dishonest about how, while it didn't actually work, it was still highly useful in contrived situation a under wrong assumptions b and c.

The problem has to be even worse for a Ph.D. dissertation, where you are required to produce something that actually moves the art forward.

Alfredo said...

"One hundred scientists do one hundred separate studies trying to find this link. Three of them say "there appears to be a positive correlation between broccoli and lung cancer", meaning that they found such a correlation and that it is outside a 95% confidence interval."

Not so. Presumably each scientist has looked at two groups of people, one who periodically eats broccoli, and one who doesn't. And he has looked at the incidence of cancer in both groups. After discounting other effects like smoking, the scientist finds that the difference in the incidence of cancer between the two groups is statistically significant. Each scientist, looking at different samples, will reach the same conclusion. It is not at all the case that 95 scientists will find a correlation, and 5 will not. What the statistical test does is answer the question, given the fact that you are looking at a finite sample from a larger population, how likely is it that the sample is representative of the larger population as a whole?

Isabel said...

Anonymous (2:47 pm),

I'm familiar with the situation of having to write up a negative result. It's never happened on anything big of mine (but I'm sure it will someday) but I distinctly remember having that problem in chemistry labs in college (I double-majored in math and chemistry) and I remember writing up more than my fair share of lab reports where I desperately tried to salvage my crappy experimental data.

I say "more than my fair share" because my experiments rarely worked, which is why I got out of chemistry.

John Armstrong said...

Anonymous 2:47 PM, you could write that for your thesis in physics...

Anonymous said...

I think your mathematics bias is showing.

Most scientists are interested in getting grant money, not finding some kind of absolute scientific truth. Most funding institutions are interested in studies that support whatever agenda they are interested in, not in "scientific truth."

Mathematicians routinely fall into the fallacy of assuming that because their field is focused on "objective truth", other fields and other people also care about "objective truth." Indeed, this myopia among mathematicians is probably one of the main reasons the field is the least well-funded of any science.

In sum, no one *cares* whether broccoli does or does not cause cancer, or whatever other absurd correlation the soft scientists like to find and announce. P-values are useful not in *spite* of the fact that they can be used to prove anything at all, but *because* of that fact. The data massaging to reach a p-value is handy to funding agencies.

Thus, all the statisticians and mathematicians ranting about misused p-values are just utterly misunderstanding how real science and real politics work in the "real world" rather than in some textbook proof.

John Armstrong said...

Anonymous 7:40 AM: if this be myopia, make the most of it!

Isabel said...

anonymous 7:40 AM,

Sure, the search for objective truth might not be how science works. But it's how it should work, I think, and how it does work at its best. Furthermore, a lot of science is funded through tax dollars; in some sense, then, don't the scientists owe results that are true to the people who will be the beneficiaries of their research, since those people will be sticking certain things in their mouths on the basis of those results? Of course there will be errors. But that doesn't mean that scientists should ignore those errors.
