“Extraordinary claims require extraordinary evidence” is a common skeptical quote and there has already been a lot written about it.
So rather than reinvent the wheel by recounting the history of the statement or giving some abstract justification, I am going to give an example of how it is applied.
A Medical Example
Think of a medical test like an HIV test or a pregnancy test. Such tests can go wrong. A pregnancy test could say you or your partner is pregnant when she is not. Or it may fail to say so when she actually is. There can be a false positive (aka Type I error) or a false negative (aka Type II error). Such medical tests are extensively tested themselves before being marketed. It is therefore well known how often they are wrong on average.
For our example we will imagine a test that has a 5% chance of a false positive, and to keep things simple we will ignore false negatives. What happens when we apply that test to 10,000 people, 20 of whom have the disease and 9,980 of whom don’t?
We get 20 true positives and 9,980 × 5% = 499 false positives, for a total of 519 positives. That means that although the test proudly claims to show a false positive in only 5% of cases, we find that fewer than 4% of all the positives are true.
Of course, in reality one will only know the test results and not what the actual truth is but one may have a good idea from previous experience. And, of course, one will rarely get a result that is exactly average but let’s not get into probability theory.
Now let’s say we test 10,000 people of whom 1,000 have the disease. We find 1,450 positives, 450 of which are false (9,000 × 5%). This time over two thirds of the positives are true.
In both cases we have the same evidence, namely a positive test result, but in the first case it is very, very likely false, while in the second it is probably true.
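For readers who like to check the arithmetic, here is a minimal Python sketch of the two scenarios. The function name `positive_breakdown` is made up for this illustration, and it rests on the same simplification as the text: no false negatives, and every result exactly average.

```python
def positive_breakdown(population, diseased, false_positive_rate):
    """Return (true positives, false positives, share of positives that are true).

    Assumes, as in the text, that the test never misses a real case.
    """
    healthy = population - diseased
    true_positives = diseased                     # no false negatives
    false_positives = healthy * false_positive_rate
    total_positives = true_positives + false_positives
    return true_positives, false_positives, true_positives / total_positives


# Same test, same population size, different number of people who are ill.
rare = positive_breakdown(10_000, 20, 0.05)
common = positive_breakdown(10_000, 1_000, 0.05)

for label, (tp, fp, share) in (("rare disease", rare), ("common disease", common)):
    print(f"{label}: {tp} true, {fp:.0f} false, {share:.1%} of positives are true")
```

The only thing that changes between the two calls is how many people actually have the disease, and that alone moves the share of true positives from under 4% to over two thirds.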
Such situations are encountered frequently in medicine. It is the reason that only at-risk populations are screened for diseases. Screening everyone would produce an overwhelming number of false positives that would cause needless distress.
Think back to the first example where we had 519 positives. If we apply another test to them, even a better one with only a 1% rate of false positives, we would still get about 5 false positives besides the 20 true ones. So even though we applied two tests, one with a 5% rate of false positives and another with a 1% rate, only 80% of our positives are true.
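The two-step screening can be sketched the same way. This is only an illustration under stated assumptions: the helper `retest` is hypothetical, the second test is taken to be independent of the first, and neither test ever misses a true case.

```python
def retest(true_positives, false_positives, false_positive_rate):
    """Apply a follow-up test to everyone who tested positive so far.

    Assumes the follow-up is independent of the earlier test and has
    no false negatives, so all true positives stay positive.
    """
    return true_positives, false_positives * false_positive_rate


# First test: 5% false positives among the 9,980 healthy people.
tp, fp = 20, 9_980 * 0.05

# Second, better test: 1% false positives, applied only to the positives.
tp, fp = retest(tp, fp, 0.01)

share_true = tp / (tp + fp)
print(f"{tp} true and about {fp:.0f} false positives remain; {share_true:.0%} are true")
```

If the two tests tend to fail for the same reasons, the independence assumption breaks down and the second test helps less than this sketch suggests.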
There’s an implication here. Have you ever heard of a case where a tumor was suddenly gone? Well, maybe it wasn’t there in the first place. Even though medicine is well aware of this logic and compensates for it by using tests that are very reliable, we cannot, as a matter of principle, achieve certainty. Especially since real life also throws mixed-up paperwork and human error at us, besides any technical faults.
From The Medical To The General
The same logic can be applied to real life in a very straightforward way. Say someone claims to have developed a perpetual motion machine. Many people have claimed that in the past but it never panned out. This is so extreme that patent offices these days refuse to review patents on such machines.
So at the very least we must assume that there are thousands if not tens of thousands of perpetual motion claims that are untrue for every one that is true, even if such a thing is possible.
But what does a test look like? Typically there will be a demonstration where the machine is shown in action. This will at least prove that there is some sort of machine. Any claims that exist merely in the form of an April Fools’ press release or suchlike will fall by the wayside. So we can say that this is a true test in that it may be either negative or positive.
On the other hand, a demonstration only proves that some machine exists; it does not prove that this machine is perpetual, so the rate of false positives must be very high indeed.
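To get a feel for how little such a demonstration moves the needle, here is a rough sketch using Bayes’ rule. Every number in it is an assumption chosen for illustration: one true claim per 10,000 false ones (the upper end of the guess above), a demonstration that 90% of false claims would still pass, and a genuine machine that always passes. The function name `posterior_probability` is likewise made up.

```python
def posterior_probability(prior, false_positive_rate, true_positive_rate=1.0):
    """Probability that a claim is true after it passes a test (Bayes' rule)."""
    true_and_passes = prior * true_positive_rate
    false_and_passes = (1 - prior) * false_positive_rate
    return true_and_passes / (true_and_passes + false_and_passes)


# Assumed prior: one true claim for every 10,000 false ones.
prior = 1 / 10_001

# Assumed: a staged demonstration is passed by 90% of false claims.
after_demo = posterior_probability(prior, false_positive_rate=0.9)

# For comparison: a hypothetical test that only 1 in 10,000 false claims passes.
after_strict_test = posterior_probability(prior, false_positive_rate=0.0001)

print(f"before any test:      {prior:.5%}")
print(f"after demonstration:  {after_demo:.5%}")
print(f"after a strict test:  {after_strict_test:.1%}")
```

With these made-up numbers the demonstration barely changes anything, while a test whose false positive rate matches the rarity of true claims brings the odds to roughly even. That is the sense in which the evidence has to be as extraordinary as the claim.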
The next test might be allowing an engineer or physicist to inspect the machine. How weighty is that evidence? Hard to say. If it is a scam, that person may simply be in on it. And even if not, that person may be fooled. How likely is that? That depends very much on how much leeway he or she is allowed. Your chances of seeing through a magic trick while sitting in an audience are pretty much nil but very high if you have free rein to roam backstage and to set up cameras etc. Don’t mistake knowing how a trick is done for actually seeing through it.
At what point would it be reasonable to believe in perpetual motion? Generally, when a radical change to something considered a law of nature is proposed one would like to have many independent replications in different labs and by different groups. In the case of a perpetual motion machine this should be quite easy.
An entirely different example of an extraordinary claim is Bigfoot. There are many, many species around the world. There’s even a fair number of large mammals that are different enough to be easily distinguishable by an amateur. There also are many fictitious beasts. I couldn’t put a number on it but the ratio of fictitious to real can’t be all that bad.
Two or three hundred years ago one would surely have considered the report of some reasonably credible person as sufficient to establish the existence of Bigfoot. So what has changed?
Nowadays, there are people everywhere. Zoologists have scoured every corner of the globe for new species. Discovering a large mammal means fame eternal. But there simply can’t be many left, especially not in North America. One needn’t even find a live specimen. With modern DNA techniques, simply finding a few hairs, feces or bones is enough.
Every expedition that fails to find Bigfoot, or even every hiker who comes home without having seen one, equals a negative test result. Of course, there can be false negatives. Just because you failed to find something doesn’t mean it is not there, but it does mean that it is less likely.
You probably have a usual spot for your keys. But when you don’t see them there you look somewhere else. Before you looked, you thought it most likely that you’d find them there, but when you didn’t see them you adjusted the probabilities. You’ve probably had the experience of finding something in a spot where you had searched before, even exhaustively. There you were a victim of a false negative. Every test can throw a false result!
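The adjusting of probabilities can be written down in a few lines. The numbers are assumptions picked for illustration, not measurements: an 80% prior that the keys are in the usual spot and a 10% chance of overlooking them when they are there. The function name `update_after_negative` is hypothetical too.

```python
def update_after_negative(prior, miss_rate):
    """Probability the thing is there after a search that found nothing.

    P(nothing found | there)     = miss_rate  (a false negative)
    P(nothing found | not there) = 1          (you cannot find what is absent)
    """
    return prior * miss_rate / (prior * miss_rate + (1 - prior))


belief = 0.8          # assumed prior: keys are probably in the usual spot
history = [belief]
for search in range(1, 4):
    belief = update_after_negative(belief, miss_rate=0.1)
    history.append(belief)
    print(f"after {search} fruitless search(es): {belief:.1%} chance they are there")
```

Each fruitless search lowers the probability, but it never reaches zero. One failed search tells you little; a long run of them tells you a lot.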
Back to Bigfoot: if we think of every person who might have found evidence of Bigfoot as a test, then we have a massive lot of negative results. Yet we don’t know if that means anything because we should expect a lot of false negatives even if Bigfoot exists. At the same time we also have a few positive results: people who reported seeing Bigfoot or who found Bigfoot tracks. There are even Bigfoot films. These might be false positives, though: misidentifications, hoaxes and the like.
On balance what carries more weight, positives or negatives? The answer is, as you might have guessed, the negatives. There is no DNA sample and no carcass, evidence which would be all but conclusive. And we should have such evidence if even a fraction of the positives, i.e. the tracks, the sightings and especially the films, were true.
You’re probably thinking that the medical analogy, this talk about tests, gets quite strained here. Well, it’s what I’m thinking anyway, so let’s just let it be. There’s just one more thing I want to mention.
Think about what would happen if I tested 100 healthy people with a test that has a 5% rate of false positives. You get 5 positives, of course (on average). Test 100 more and you have 10. Test 1,000 more and you have 60! Why! You have an epidemic on your hands!
At least that’s how it might seem to a casual observer. The point is, it only takes dedication to have a growing pile of evidence for anything. So whenever you hear of a “statistically significant result”, and newspapers are full of them, remember that.
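A small sketch makes the growing pile visible. It assumes, as above, that nobody tested is actually ill and that every batch comes out exactly average; the name `expected_false_positives` is invented for this example.

```python
FALSE_POSITIVE_RATE = 0.05


def expected_false_positives(healthy_tested, rate=FALSE_POSITIVE_RATE):
    """Average number of positives when everyone tested is in fact healthy."""
    return healthy_tested * rate


running_totals = []
tested_so_far = 0
for batch in (100, 100, 1_000):
    tested_so_far += batch
    positives = expected_false_positives(tested_so_far)
    running_totals.append((tested_so_far, positives))
    print(f"{tested_so_far:>5} healthy people tested: about {positives:.0f} positives "
          f"({positives / tested_so_far:.0%} of those tested)")
```

The count of positives keeps climbing with every batch, yet the share of people testing positive never moves from the false positive rate. Anyone who reports only the count is reporting how hard they looked.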
Science as a whole has ways of dealing with that and will probably not be fooled for long, at least, if reasonably unbiased people take an interest. The lasting facts tend to emerge more slowly, over the course of several years, or even decades. They rarely make newspaper headlines because there is no one event that could be reported on.
I hope this was enough to bring home the principle behind “extraordinary claims require extraordinary evidence”.
The important thing to take away is that claims are not different by their nature but by what we know. Ordinary claims can become extraordinary once counter-evidence piles up and vice versa.
One might equally well say: “Look at all the evidence!”
Don’t just look at the demonstration for that perpetual motion machine, look at all the identical claims in the past that have failed. Don’t just look at those Bigfoot tracks, look at what’s not been found.
We already know a thing or two. There’s nothing open-minded about ignoring that for the sake of evidence that may or may not materialize.