Is Replication the Answer?

One question that is forced on us by the publication of papers like Daryl Bem’s Feeling the Future is what went wrong and how it can be fixed.

One demand that often arises is for replication. It is one of the standard demands made by interested skeptics in forums and such places. I can understand why calling for replication is seductive.
It is shrewd and skeptical. It says: Not so fast, let’s be sure first while at the same time offering a highly technical criticism. Replication is technical jargon, don’t you know?. On the other hand it’s also nice and open-minded. It says: This is totally serious science and some people who aren’t me should spend a lot of time on it.
And perhaps most important of all, it requires not a moments thought.

Cynicism aside, replication really is important. As long as a result is not replicated it is very likely wrong. If you don’t replicate you’re not really generating knowledge. Not only can you not rely on the results, you also lose the ability to determine if you are using good methods or are applying them correctly. Which I’d speculate will decrease reliability still further over time.

Replication is essential but is replication really all that is needed?

Put yourself in the shoes of a scientist. You have just run an experiment and found absolutely no evidence that people can see the future.  That’s going to be tough to publish.
Journals are sometimes criticized for being biased against negative results but the simple fact is that they are biased against uninteresting results. Attention is a limited quantity; there’s only so much time in a day that can be spent reading. Most ideas don’t work out and so it is hardly news when an idea fails in experiment. Think for an example of all the chemicals that are not drugs of any kind.

Before computers and the information age it probably wouldn’t even have been possible to handle all the information about failed ideas. Things have changed now but the scientific community is still struggling to incorporate these new possibilities. However, one still can’t expect real life humans to pay attention to evidence of the completely expected.

Now you could try a new idea and hope that you have more luck with that.
Or you could do what Bem did and work some statistical magic on the data. And by magic I mean sleight of hand. The additional work required is much less and it is almost certain to work.
The question is simply if you want to advance science and humanity or your career and yourself.

If you go the 2nd route, the Bem route, your result will almost certainly fail to replicate.

So you might say that replication, if it is attempted solves the problem. Until then you have a confused public by premature press reports, perhaps bad policy decisions, and certainly a lot of time wasted trying to replicate the effect. Establishing that an effect is not there always takes more effort than just demonstrating it.

To this one might say that the nature of science is just so, tentative and self-correcting. Meanwhile the original data magician, our Bem-alike, has produced a publication in a respectable journal, which indicates quality work, and received numerous citations (in the form of failed replications), which indicates that the paper was fruitful and stimulated further research. These factors, number of publications, reputation of journal and number of citations are usually used to judge the quality of work by a scientist in some objective way.

Eventually, if replication is all the answer needed, one should expect science to devolve into producing seemingly, amazing results that are then slowly disproven by subsequent failed replications. Any of that progress we have come to expect would be merely an accidental byproduct.

The problem might be said to lie rather in judging scientists in such a way. Maybe we should include the replicability of results in such judgments. But now we’re no longer talking about replication as the sole answer. We’re now talking about penalizing bad research.

And that’s the point. Science only works if people play by the rules. Those who won’t or can’t must be dealt with somehow. In the extreme case that means labeling them crackpots and ostracizing them.
But there’s less extreme examples.

The case of the faster than light neutrinos

You probably have heard that some scientists recently announced that they had measured neutrinos to go faster than light. This turned out to be due to a faulty cable.

This story is currently a favorite of skeptics who pointed out that few physicists took the result seriously, despite the fact that it was originally claimed that all technical issues had been ruled.. It makes a good cautionary tale about how implausible results should be handled and why. Human error is just always possible and plausible.

There’s another chapter to this story, one that I fear will not get much attention.

The leaders of the experiment were forced to resign as a consequence of the affair.

There were very many scientists involved in the experiment due to the sheer size of the experimental apparatus. Among them, there was much discontentment about how the results were handled. Some said that they should have run more tests, including the test that found the fault, before publishing. Which means, of course, that they shouldn’t have published at all.

It is easy to see how a publish-or-perish environment that puts a premium on exciting results encourages not looking too closely for faults. But what’s the alternative? No incentive to publish equals no incentive to work. No incentive for exciting results just cements the status quo and hinders progress.


1 Comment

  1. April 30, 2012 at 11:41 am

    […] I have written about how replication cannot be the whole answer before. In a nutshell, by cunning abuse of statistical methods it is possible to give any mundane and boring result the impression of showing some amazing, unheard of effect. That takes hardly any extra work but experimentally debunking the supposed effect is a huge effort. It takes more searching to be sure that something is not there than to simply find it. For statistical reasons, an experiment needs more subjects to “prove” the absence of an effect with the same confidence as finding it. But there’s also that there might be some difference between the original experiment and the replication that explain the lack of effect. In this case it was claimed that maybe the 3 failed because they did not believe in the effect. It takes just seconds to make such a claim. Disproving it requires finding a “believer” who will again run an experiment with more subjects that the original. […]

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: