Statistical Errors in Mainstream Journals

While we at SBM frequently target the worst abuses of science in medicine, it’s important to recognize that doing rigorous science is complex and mainstream scientists often fall short of the ideal. In fact, one of the advantages of exploring pseudoscience in medicine is that it develops a sensitive detector for errors in logic, method, and analysis. Many of the errors we point out in so-called “alternative” medicine also crop up elsewhere in medicine – although usually to a much lesser degree.

It is not uncommon, for example, for a paper to fail to adjust for multiple comparisons – if you compare many variables, you have to take that into account in the statistical analysis; otherwise the probability of finding a chance correlation is greatly increased.
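To see why, here is a minimal sketch in Python (the number of outcome measures, the group sizes, and the simple Bonferroni correction are illustrative assumptions of mine, not drawn from any particular study):

```python
# Minimal sketch: comparing many outcome measures with no correction.
# All the data here are pure noise, so any "significant" result is a false
# positive. At an uncorrected threshold of p < 0.05, roughly 5% of the tests
# will come out significant by chance; a simple Bonferroni correction
# (dividing the threshold by the number of tests) suppresses most of them.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_outcomes, n_per_group, alpha = 100, 30, 0.05

p_values = []
for _ in range(n_outcomes):
    group_a = rng.normal(0, 1, n_per_group)  # no true difference
    group_b = rng.normal(0, 1, n_per_group)
    p_values.append(stats.ttest_ind(group_a, group_b).pvalue)
p_values = np.array(p_values)

print("significant, uncorrected:", int(np.sum(p_values < alpha)))
print("significant, Bonferroni: ", int(np.sum(p_values < alpha / n_outcomes)))
```

Out of 100 comparisons on pure noise, a handful will typically cross the uncorrected threshold – which is exactly the “chance correlation” problem.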

Just yesterday on NeuroLogica I discussed the misapplication of meta-analysis – in this case to the question of whether CCSVI correlates with multiple sclerosis. This is very common in the literature – essentially a failure to appreciate the limits of this particular analytic tool.

Another recent example comes from the journal Nature Neuroscience (an article I learned about from Ben Goldacre over at the Bad Science blog). The paper, “Erroneous analyses of interactions in neuroscience: a problem of significance,” investigates the frequency of a subtle but important statistical error in high-profile neuroscience journals.

The authors, Sander Nieuwenhuis, Birte U Forstmann, and Eric-Jan Wagenmakers, report:

We reviewed 513 behavioral, systems and cognitive neuroscience articles in five top-ranking journals (Science, Nature, Nature Neuroscience, Neuron and The Journal of Neuroscience) and found that 78 used the correct procedure and 79 used the incorrect procedure. An additional analysis suggests that incorrect analyses of interactions are even more common in cellular and molecular neuroscience.

The incorrect procedure is this – test whether the effects of an intervention are statistically significant compared to a no-intervention group (whether the subjects are rats, cells, or people), then test whether a placebo intervention has a statistically significant effect compared to the same no-intervention group, and then compare the two results. This seems superficially legitimate, but it isn’t.

For example, if the intervention produces a barely statistically significant effect and the placebo produces a barely non-significant effect, the authors might conclude that the intervention is significantly superior to placebo. However, the proper approach is to compare the two effects directly, to see whether the difference of the differences is itself statistically significant (which it likely won’t be in this example).

This direct comparison is standard procedure, for example, in placebo-controlled medical trials – the treatment group is compared to the placebo group. But in roughly half of the reviewed articles that made this kind of comparison, the researchers compared both groups to a no-intervention group without comparing them to each other. This creates the illusion of a statistically significant difference where none exists – a false positive (erroneously rejecting the null hypothesis).
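Here is a minimal simulation sketching the problem (the effect sizes, sample sizes, and number of runs are made-up numbers of my own, chosen only to illustrate the logic):

```python
# Minimal sketch of the error described above, using simulated data.
# The "intervention" and "placebo" groups are given the same small true
# effect, so there is no real difference between them. The flawed procedure
# declares the intervention superior whenever it is significant against the
# no-intervention group while the placebo is not; the proper procedure tests
# the intervention against the placebo directly.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, alpha, n_sims = 40, 0.05, 2000
flawed_claims = 0
direct_claims = 0

for _ in range(n_sims):
    no_intervention = rng.normal(0.0, 1.0, n)
    placebo = rng.normal(0.4, 1.0, n)       # same true effect size...
    intervention = rng.normal(0.4, 1.0, n)  # ...as the placebo group

    sig_int = stats.ttest_ind(intervention, no_intervention).pvalue < alpha
    sig_plc = stats.ttest_ind(placebo, no_intervention).pvalue < alpha
    sig_direct = stats.ttest_ind(intervention, placebo).pvalue < alpha

    flawed_claims += int(sig_int and not sig_plc)  # "one significant, one not"
    direct_claims += int(sig_direct)               # proper direct comparison

print(f"flawed criterion claims a difference: {flawed_claims / n_sims:.1%} of runs")
print(f"direct comparison is significant:     {direct_claims / n_sims:.1%} of runs")
```

With these illustrative parameters the flawed “significant versus not significant” criterion flags a difference in a substantial fraction of runs even though the two treatments are identical, while the direct comparison stays near the nominal 5% false-positive rate – exactly the illusion described above.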

The frequency of this error is huge, and there is no reason to believe that it is unique to neuroscience or more common there than in other areas of research.

I find this article to be very important, and I thought it deserved more play than it seems to be getting. Keeping to the highest standards of scientific rigor is critical in biomedical research. The authors do an important service in pointing out this error, and researchers, editors, and peer reviewers should take note. This should, in fact, be part of a checklist that journal editors employ to ensure that submitted research uses legitimate methods. (And yes, this is a deliberate reference to The Checklist Manifesto – a powerful method for minimizing error.)

I would also point out that one of the authors on this article, Eric-Jan Wagenmakers, was the lead author on an interesting paper analyzing the psi research of Daryl Bem. (You can also listen to a very interesting interview I did with Wagenmakers on my podcast here.) To me this is an example of why it is worthwhile for mainstream scientists to pay attention to fringe science – not because the subject of the research itself is plausible or interesting, but because fringe claims often provide excellent examples of pathological science. Examining pathological science is a great way to learn what makes legitimate science legitimate, and it also gives one a greater ability to detect logical and statistical errors in mainstream science.

What the Nieuwenhuis et al. paper shows is that more scientists should be availing themselves of the learning opportunity afforded by analyzing pseudoscience.
