Feeling the Future: Part 2

In my first post on Feeling the Future, I discussed mainly how its misuse of statistics relates to science in general. I said little about how exactly the statistics were misused. My thinking was that a detailed examination would be too boring for the average reader.
I still think that, but nevertheless I will lay out exactly how we know that Bem misused statistics.

The Problem Explained

The good news is that you don’t need to know statistics to understand this problem. You surely know games that use dice, Monopoly for example. In that game you throw 2 dice that tell you how far to move.
What if someone isn’t happy with the outcome and decides to roll again? That’s cheating!

Even small kids intuitively understand that this is an advantage, something that skews the outcome in a direction of one’s choosing. There’s no knowledge of statistics or probability theory necessary. While the outcome of each roll is still random, there’s a non-random choice involved.

If you roll 3 dice and pick 2, that’s the same thing, right?

How about we roll 4 and then pick 2, with the stipulation that the 2 remaining dice must be used on the next turn? Again there’s a choice involved. Within limitations the player can choose how to move, which allows an advantage. The player’s moves are no longer random.
This is despite the fact that the dice rolls are random and none are discarded.
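A quick simulation makes the advantage concrete (my own illustration, not anything from Bem’s paper): compare the average move when you must use 2 dice with the average when you roll 3 and keep the best 2.

```python
import random

random.seed(1)

def fair_move():
    # Roll 2 dice and move their sum: no choice involved.
    return random.randint(1, 6) + random.randint(1, 6)

def cherry_picked_move():
    # Roll 3 dice and keep the highest 2: each die is still random,
    # but the choice is not.
    dice = sorted(random.randint(1, 6) for _ in range(3))
    return dice[1] + dice[2]

n = 100_000
fair = sum(fair_move() for _ in range(n)) / n
picked = sum(cherry_picked_move() for _ in range(n)) / n
print(f"average move with 2 dice:  {fair:.2f}")
print(f"average move, best 2 of 3: {picked:.2f}")
```

With enough rolls the first average settles near 7 and the second near 8.5, even though every individual die is perfectly fair.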

Now we’re ready to get to Feeling the Future.
The results presented were very unlikely to have arisen by chance. Therefore, the argument goes, they probably didn’t arise by chance. Which means there must have been some unknown factor influencing the outcome.

You may realize that this is a shaky argument. Just because something is unlikely does not mean it doesn’t happen. The impossible doesn’t happen but the unlikely, by definition, must and does. The unlikely is set apart from the likely merely by happening less often.
Then again, the impossible is only impossible as far as we know. And that we’re wrong on something is at best unlikely, if that. In reality, as opposed to in mathematics, we’re always dealing with probability judgements, never with absolutes.
In other words, that argument is all we have. It is used in the same way in almost every scientific experiment.

So the argument is solid enough. In fact, I believe that there is something other than chance involved. Of course, dear reader, if you didn’t know that already you must have skipped the beginning of the post.

Bem’s experiments each had, according to Feeling the Future, 100-200 participants. In reality, at least some of them were assembled from smaller blocks of maybe around 50. This is a problem for exactly the same reason as in the dice examples. Even if the outcomes in every block were completely random, once hand-picked blocks are assembled into a larger whole, the whole no longer is.
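A toy simulation shows how selecting blocks skews the whole (the block sizes and counts here are hypothetical, chosen only for illustration): every “trial” is a fair coin flip, so each block is pure chance, yet pooling only the best blocks yields an above-chance hit rate.

```python
import random

random.seed(2)

BLOCK = 50  # hypothetical block size

def run_block():
    # One block of pure chance: count 'hits' among fair coin flips.
    return sum(random.random() < 0.5 for _ in range(BLOCK))

# Assemble a 150-participant 'experiment' from the best 3 of 10 chance blocks.
best_blocks = sorted((run_block() for _ in range(10)), reverse=True)[:3]
pooled = sum(best_blocks)
print(f"pooled hit rate: {pooled / (3 * BLOCK):.1%}")
```

Each block hovers around 50% on its own; the pooled rate of the hand-picked blocks sits noticeably above chance, purely because of the selection.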

Proof that it happened

How do we know that this happened? This doesn’t require knowledge of statistics either, just a bit of sleuthing.
First we note what it says in the footnote about experiment 5:

This experiment was our first psi study and served as a pilot for the basic procedures adopted in all the other studies reported in this article. When it was conducted, we had not yet introduced the hardware based random number generator or the stimulus seeking scale. Preliminary results were reported at the 2003 convention of the Parapsychological Association in Vancouver, Canada (Bem, 2003); subsequent results and analyses have revised some of the conclusions presented there.

Fortunately, this presentation is also available in written form. Unfortunately it is immediately obvious that it doesn’t present anything corresponding to experiment 5.
The presentation from 2003 reported not 1 but 8 experiments, each with at most 60 participants. The experimental design, however, matches that reported in 2011.
The 8 experiments are grouped into 3 experimental series, so perhaps he pooled these together for the later paper? But no, that doesn’t work either.

I could write several more paragraphs of this kind, trying to write up a logic puzzle full of numbers as if it were a car chase. But my sense of compassion wins out. I know I would merely bore you half blind, my dear readers, and I won’t have that on my conscience.

Therefore I shall only give my answers as one does with puzzles. Check them with the links at the bottom if you like. I could easily have overlooked something or made a typo.

Experimental series 300 of the presentation is the “small retroactive habituation experiment that used supraliminal rather than subliminal exposures” that is mentioned in the File-Drawer section of “Feeling the Future”.
Experiment 102 with 60 participants must have been excluded because it has 60 rather than 48 trials per session.
Experiments 103, 201, 202, 203 combined form experiment 6. They have the same number of participants (n=150). Moreover, the method matches precisely. 100 of these 150 were tested for “erotic reactivity“. This is true for experiment 6 as well as the combination.
Experiment 101 could be part of experiment 5 but there aren’t enough participants. Additional data must have been collected later.
Note that the footnote points to “subsequent results“.

Warning signs

Even without following up the footnotes and references there are some warning signs in Feeling the Future that hint that something is amiss. For example:

The number of exposures varied, assuming the values of 4, 6, 8, or 10 across this experiment and its replication.

The only reason one would change something in an experiment is to determine if this one factor has any influence on the results. Here we learn that a factor was varied, but there is neither reason nor justification given, much less any results.

These two items were administered to 100 of the 150 participants in this replication prior to the relaxation period and experimental trials.

The same thing applies here. A good experiment is completely preplanned and rigidly carried through. There’s no problem with doing less formal, exploratory work to find good candidate ideas that merit the effort necessary for a rigid test. But such exploratory experiments have almost no evidential weight.

Such warning signs are also present in the other experiments described in Feeling the Future. That could indicate that the same thing was done there as well. But don’t make the mistake of assuming that this issue is the only one that invalidates Bem’s conclusions. There’s also the issue of data dredging, which is like deciding which card game to play depending on what hand you were dealt. Small wonder then, if you find your cards to be unusually good according to the rules of the game you chose.

In terms of an experiment that means analyzing the results in various ways and then reporting those results that favor the desired conclusion. That Bem did this is also evident from a comparison of the 2003 and 2011 descriptions of what is apparently and purportedly the same data.

Particularly worrying is that Bem has explicitly and repeatedly denied using such misleading methods. I shall restrain myself from speculating about what made him deny such an obvious, documented fact. It does not have to be dishonesty but none of the other possibilities is flattering, either.

There’s a common conceit among believers that skeptics don’t look at the data. Whenever someone claims this, ask them if there is anything wrong with Feeling the Future and you will know the truth of that.

Sources:
Bem, D. J. (2003, August). Precognitive habituation: Replicable evidence for a process of anomalous cognition. Paper presented at the 46th Annual Convention of the Parapsychological Association, Vancouver, Canada.
Bem, D. J. (2011). Feeling the future: Experimental evidence for anomalous retroactive influences on cognition and affect. Journal of Personality and Social Psychology.

Modern Mediumship Research: Negative

In this post I am going to deal with results that speak against the reality of mediumship. We begin with mediumship tests where a negative outcome was admitted.
For starters there’s a study by Jensen and Cardena. It involved only one medium, however. So there’s not really much we can conclude about mediumship in general.

A larger study was published by O’Keefe and Wiseman. It involved five mediums and five sitters (a jargon term for client). It discusses methodological issues in great detail and can be downloaded from Wiseman’s website. It is recommended reading for anyone thinking about putting together their own medium test. In fact, similar methods could be used to test someone who claims telepathy or merely being a good judge of character.

One argument against such negative studies is that they simply didn’t have a genuine medium on their hands. This, however, is not a very common argument in my experience. Perhaps because it is uncomfortably close to accusing the tested mediums of fraud, or perhaps because it leaves the implication that there is no way to tell genuine mediums from fake ones.

The other more common argument is that the conditions didn’t permit mediumship. Usually, some factor is said to have blocked the ability, such as the presence of skeptics. This has by now turned into a much-ridiculed cliché.
Others say it was simply that the unfamiliar surroundings of the lab and strict limitations of the protocol made the mediums uncomfortable. This has been compared (by males, of course) to the inability of people to have sex under such conditions. I find it amazing that in this day and age there are still men who are unaware of the existence of porn.
As a remedy, Julie Beischel of the Windbridge Institute argued that O’Keefe and Wiseman should have tested their mediums before testing them to ensure that they can pass the test. It sounds better if you call the pretest a “screening”.
Some more cerebral believers argue that the conditions for the paranormal just so happen to be the same as those for error or fraud (though they will, of course, not put it so bluntly). Such an argument is implicit in Kelly and Arcangel’s paper when they talk about “priming the pump”.

Challenges

Perhaps one might include skeptic challenges as another negative piece of evidence. These challenges are meant to call out people who claim abilities they don’t have. The most famous of these, and the blueprint for most others, is James Randi’s Million Dollar Challenge.
The rules are both simple and scrupulously fair. Whoever wishes to take up the challenge must not only claim some ability but also design a task impossible for ordinary people. So rather than just claim mediumship, one would need to say, for example, ‘I can tell if someone’s parents are alive or dead.’ Of course, the conditions would have to be such as to preclude ordinary means of inference, however unlikely. And it would have to be stipulated how many tries there would be and how many would have to be successful.
Basically, one has to convince James Randi and his advisors that there is no way to do the promised feat with ordinary means. Once they accept this, the claimant has a legal contract that stipulates payout on success.
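The stipulation about tries and successes is just binomial arithmetic. As a hypothetical example (the numbers are mine, not the challenge’s actual terms): if the claimant calls alive-or-dead with a 50% chance of being right by luck each time, how many consecutive correct calls push pure luck below one in a million?

```python
# Chance of acing N fifty-fifty calls in a row is (1/2)**N.
# Find the smallest N that pushes pure luck below one in a million.
n = 1
while 0.5 ** n > 1e-6:
    n += 1
print(n)  # 20, since 2**20 = 1_048_576
```

Twenty straight hits would do it; fewer tries, or allowing some misses, raise the chance of a lucky win accordingly, which is exactly why the terms have to be fixed in advance.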

So far, the money has been quite safe.

The arguments against the challenge are numerous but ill-informed. Some think that whether or not someone wins is a judgement call by James Randi. In fact, the rules explicitly forbid any judgement calls. Success or failure must be obvious to anyone.
Others accuse Randi of dishonesty. They say he only allows weak claimants or even that he manipulates the result. This is some rather outrageous slander, but even so, there are other challenges. In any case, one can also contact parapsychologists. Yet by all appearances, they don’t have any potential winners either.
Naively, one would expect something like Randi’s challenge to be treated as the X-Prize of parapsychology. But what if there are no humans with super-powers? Then those who claim them would be frauds or lunatics. The reaction to the offer of a million dollars certainly does nothing to dissuade one from considering that possibility.

Still, it is true that just because no one has risen to the challenge, this does not disprove mediumship. One might claim, for example, that a true mediumistic gift always comes with extreme shyness.

Science?

Finally, there is also the fact that what science has revealed over the last centuries about the world, about us and about our place in it is rather inconsistent with mediumship. Not so long ago science believed that there was a life-force that distinguished living from non-living matter. Today this is considered refuted.
Something similar is true for the human mind, which today is considered to be the workings of the brain.

Not only has science not uncovered support for mediumship, it has refuted some of its core tenets.

In regard to this, some will point out that science has been wrong before. This is quite true and, indeed, science is nothing if not a quest to be ever less wrong. But here we have a case where this quest simply leads away from the proffered answer.
There are even many examples where scientists held onto ideas that should have been recognized as discredited by the evidence. Who is the dogmatic hold-out in this story?

Still, the fact remains that while science as a whole offers strong arguments against the possibility of mediumship, it cannot be ruled out. Science could be wrong and that is a simple, unalterable fact.
If the evidence indicated that it was real then one would have to find a way to reconcile it with other seemingly contradictory evidence.
However, I have so far never come across a serious attempt at this. There are plenty of demands that mainstream science should take parapsychology seriously. Demands that parapsychology take mainstream science seriously are usually met with the charge of closed-mindedness.

The Universal Negative

Ultimately, what all this boils down to is quite simply that you can’t prove a universal negative. You can’t test everyone, and even when you test someone it only shows that they failed in this specific test on this one occasion.

Modern Mediumship Research: Kelly and Arcangel

This paper is almost hot off the presses, having been published only this year.

Emily Kelly and Dianne Arcangel are firm believers in mediumship. Their stated goal is to find a medium who can perform under laboratory conditions, as supposedly existed back in the old days.
To do this they administer a test to volunteer mediums. A skeptic would see this as simply a test of mediumship, or of someone’s paranormal ability. For them it is only a test of whether mediums can do something paranormal under certain conditions. The possibility that perhaps one or the other medium may not be able to do anything paranormal at all never enters the article.
I wonder why that is. Are they somehow convinced of the abilities of their research mediums? Do they want to spare their feelings? Are they afraid of the hostile reaction of the paranormal scene that skepticism invariably provokes?

Despite being firmly in the believer camp, they find the same problems in previous modern research on mediumship (i.e., by Schwartz, Beischel and Robertson & Roy) as yours truly. Naturally I find this quite endearing. Indeed, making allowances for the authors’ convictions, there is nothing about the paper that I could call wrong. Sure, it’s not evidence for mediumship, but neither is it claimed to be.

The Experiments

Their first experiment involved 4 mediums and 12 sitters (that is, people who receive a reading). Each medium read 3 sitters for a total of 12 readings. The readings were held over the telephone, with Emily Kelly standing in for the actual sitter. She only knew the first name and birthday (but not year) of the deceased person that was to be contacted. Care was taken that only Dianne Arcangel had contact with the sitters before the reading.
That way it was ensured that no information was leaked to the mediums.

The sitters were given their own reading as well as 3 randomly selected others. Mentions of names and birthdays had been deleted. They then rated the readings for accuracy and had to pick their own.
The results were completely and unambiguously negative.

Therefore they decided to make a few changes in their next experiment. Mediums would now receive a photograph of the deceased on top of the other information. Also the experiment would be conducted in a sloppier manner. In the previous experiment it was made sure that the stand-in for the sitter knew nothing besides name and birthday. This good practice was abandoned.
This introduces the possibility that the medium received clues before or during the reading. It also makes one wonder if any other corners were cut.

This second experiment involved 40 sitters and 9 mediums. 14 of the 40 chose their reading as the most applicable, which is significantly more than you would expect if they had been guessing. The question, of course, is why.

The Conclusion

The failure of the first experiment is in itself interesting. The mediums, we are told, had thought that they could perform under the conditions. This tells us that they overestimated their abilities.
It also seems inconsistent with a study by Gary Schwartz and Julie Beischel that is widely touted as evidence for mediumship (discussed here, under Triple Blind!). That study found significant results under very similar conditions.
Why the difference? There’s a number of possibilities. For one, it could have been a fluke. But it must also be said that while similar, the task in this experiment was actually more difficult. Ultimately, there is too little data to engage in fruitful speculation.

The success of the second experiment, in contrast with the first, may seem to imply that either the loosened protocol or the photograph played a key role, but one shouldn’t forget that the mediums were different, too. There is ample room for a normal, rather than paranormal, explanation, and indeed the differences between the outcomes of the various experiments can be seen as pointing to one.
Some believers have argued that some of the “hits”, quoted in the paper, are too specific to be explicable. They forget that some of those who picked the wrong reading also found specific information, just for them. Illusory correlation goes a long way to explain this.

Further Research

If Kelly and Arcangel have found one or more real mediums then they should be able to present results from well designed experiments within the next few years. Of course, this has never happened in any similar situation in the past, so don’t hold your breath.

Nevertheless, that doesn’t mean that we shouldn’t expect more research. For one, they express the belief that it may be necessary to “prime the pump” by feeding information to mediums. Then the mediums can supposedly come up with even more information. This is effectively what they have done in this study and I’d expect them to do more of the same in the future. What I don’t expect is for them or anyone else to address the contradiction between this claim and the claims of others that mediums don’t need any previous information at all, not even to contact the deceased (see, e.g., Julie Beischel).

Source:
An Investigation of Mediums Who Claim to Give Information About Deceased Persons by Kelly and Arcangel

Modern Mediumship Research: Robertson & Roy

This post will examine the relatively extensive experiments conducted by Tricia Robertson and Archie Roy. In some ways, these are the best I know of. But I’m getting ahead of myself, for they were off to a rather shaky start.

Archie Roy is Professor Emeritus of Astronomy from Glasgow University (Emeritus means that he is retired). He has had an interest in the paranormal for several decades and was president of the Society for Psychical Research (SPR) from 1993-1995. Tricia Robertson is a past president of the Scottish SPR (SSPR) but I don’t know that she has mainstream qualifications.

The First Article

In 2001 they published the first paper that will interest us here. Interestingly, the results are presented not as evidence of mediumship but as a test of “the sceptical hypothesis that the statements made by mediums to recipients are so general that they could as readily be accepted by non-recipients.”

To do so they conducted a “two-year study involving 10 mediums, 44 recipients and 407 non-recipients”. That’s really awesome in many ways. They really spent time and effort on this and they don’t even claim proof of mediumship. Compare that to the junk that Gary Schwartz has produced!

Unfortunately, despite all that the study is scientifically worthless. A complete waste of time and effort.

The first problem is obviously that no halfway knowledgeable skeptic actually claims that. The second problem is that the experiment was carried out so badly that no conclusions are possible. It is indeed a maximally shaky start.

What they did

They assembled an audience and then had a medium perform a reading for one of the people in that audience. The reading was then broken down into individual items. All subjects (the entire audience), including the intended recipient of the reading, then rated these items as applicable to themselves or not.

This then allows them to determine if the recipient endorsed more items as applicable than the non-recipients, which would mean that they found the reading as a whole more applicable. It also allows them to deduce how specific an item was. If many people say that an item applies to them, it is quite general; if few, or just one, say so, then it is specific.
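The underlying idea can be sketched with a toy weighting of my own devising (emphatically not Robertson and Roy’s actual formula): an item’s specificity falls as more of the audience accepts it.

```python
def specificity(acceptances, audience_size):
    # 1.0 when only one person accepts the item; 0.0 when everyone does.
    return 1 - (acceptances - 1) / (audience_size - 1)

print(specificity(1, 100))    # 1.0 -- maximally specific
print(specificity(100, 100))  # 0.0 -- a pure Barnum statement
print(specificity(50, 100))   # roughly half-way in between
```

Any reasonable scheme of this kind gives an item accepted by the whole audience no evidential weight at all, which is the point of collecting the non-recipients’ ratings in the first place.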

They develop a fairly complicated method to statistically evaluate the readings based on both the number and the specificity of the accepted statements but we needn’t concern ourselves with the details now.

The First Problem

Now we face the first problem. What they test is if the entire reading is so general that it could be accepted by anyone. Yet, if we actually look at the literature on how to fake such readings we learn that this is only part of the art. An accomplished “faker” (actually, many mentalists, like magicians, are quite upfront about their use of trickery) will use any means at his or her disposal to tailor his or her statements specifically to the client.

But all right. Just because no one actually believes a certain hypothesis to be true, does not mean that it shouldn’t be tested. It just means that falsifying it is as exciting as finding that water is not dry.

This gets us to problem number two.

The Second Problem

Now it’s time to ask ourselves what results we would expect from the experiment, depending on whether the hypothesis is true or not.

If the medium makes statements that are true only for the recipient and for few or no others then we would find that the intended recipients accept more statements than non-recipients. Obvious.

But what if the medium only makes statements that are true for most or every one of the subjects?

We would expect exactly the same result. This is counter-intuitive but supported by ample psychological research. Such phenomena are called Barnum effect, illusory correlation, confirmation bias and a number of other names. It’s not necessary to go into detail now. Suffice it to say that Robertson and Roy will be able to confirm this expectation in a later experiment.

This simply means that the experiment did not test the hypothesis it was supposed to test. In fact, I don’t think any conclusion can be drawn from the data so collected. 2 years of research, 451 participants and all for nothing.

The Second Article

Others would have dug in, insisted on their worthless research and ranted about closed-minded skeptics. Robertson and Roy set out to make amends. A few months later they published a protocol for a new experiment that was supposed to demonstrate mediumship under conditions eliminating all ordinary, skeptical explanations.

Publishing the protocol in advance enabled them to take criticism into account before carrying out the experiments. Indeed, the experimental design found in their 3rd and last paper was improved over this one.

Basically, it is quite similar to the one in the first article, except that the audience doesn’t learn for whom the reading is intended and the medium performs in a separate room so that he or she does not get any usable clues.

I won’t bore you, dear readers, with the minutiae but as far as I can tell, in this final, rigorous protocol, skeptics would not expect the intended recipient to accept more statements in a reading than anyone else.

The Third Article

The third and final article in the series was published in 2004.  It presents data collected in 13 sessions that took place over 2 and a half years. They involved some 300 participants and 10 mediums giving 73 readings.

However, few of the experiments conducted actually followed the rigorous protocol they took so much care designing. Ostensibly the reason for this was to assess other factors that may influence subjects’ acceptance or rejection of statements, but there is not much in the way of analysis of such factors. There’s not even a complete table of results to allow readers to perform their own analyses. But I’m getting ahead of myself.

Skeptics vindicated

One thing that they did find was that misleading the subjects about who the intended recipient is really does affect the results. As I’ve said previously, psychology leads us to expect that someone who believes that a reading is intended for him or her will rate it as more applicable than someone who believes it is for someone else. This expectation was found to be true, confirming that the experiment presented in the first article was indeed a waste of time.

Despite the ample evidence that psychology has amassed that leads us to expect this, it is good to have it confirmed. The situation is not exactly the same as in the standard experiments demonstrating the relevant cognitive biases. This leaves a slight possibility that these biases are not operating under the conditions in which Robertson and Roy studied mediumship.

Rigorous Results

Now what about the results of the rigorous protocol? That’s, after all, the supposed main point of the exercise. I’m not sure how to put this but… they aren’t there. Yes. Seriously.

We never learn how mediums did under conditions precluding normal explanations.

All right, this needs some more explanation.

There is a lot of analysis presented in the article but it’s almost all completely pointless filler. The results come from experiments that differed in crucial design aspects. In principle, this allows one to determine if a variable factor correlates with greater mediumistic success. However, such analyses are almost completely absent with the notable exception mentioned.

Instead, the results from these different experimental conditions are pooled without any reason being given or even inferable. Such pooled results are also compared with other pooled results, even though there is nothing to be learned from that. It’s just bizarre. I can only conclude that they don’t have anything that they actually want to report, and so they stuff the paper with filler.

The reviewers seriously failed in not demanding that the analyses be cleaned up and all the results reported.

An Abstract Untruth

So where does the supposed main result come from that they report in the abstract?

Due to the design of the experiments the results cannot be due to normal factors such as body language and verbal response. The probability that the results are due to chance is one in a million.

In context one would be led to believe that these results are actually the results of the rigorous protocol that was so much talked about. One would be wrong.

Indeed, these results come from pooling data from several different experimental conditions. That includes the rigorous protocol but also data from experiments where normal factors operate. For one, the bias of the subjects introduced by whether they believe the reading to be for them or not. There are also other factors which might conceivably have influenced the result, but one is enough.

To make this clear: if I have results that indicate that something is going on and I combine them with other data, then the combination should also indicate that something is going on, even if there’s nothing in the added data.
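The arithmetic is simple enough to sketch with hypothetical numbers (mine, not Robertson and Roy’s): pool one condition with an above-chance hit count together with one that is exactly at chance, and the combined total still looks “significant”, driven entirely by the first part.

```python
import math

# Hypothetical numbers: 140 hits in 200 trials (70%) from one condition,
# pooled with exactly chance-level data: 400 hits in 800 trials (50%).
hits = 140 + 400
trials = 200 + 800

# Normal-approximation z score against a 50% chance rate.
z = (hits - trials / 2) / math.sqrt(trials / 4)
print(f"pooled: {hits}/{trials} hits, z = {z:.2f}")  # z = 2.53
```

A z around 2.5 is nominally significant, yet all the excess over chance came from the unpooled 200 trials; the 800 chance trials contributed nothing and diluted nothing away.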

In a very strict sense the quoted statement is true. The mentioned factors do not operate, different factors do. It is merely completely misleading in context.

Conclusion

Clearly Robertson and Roy confirmed one skeptical explanation despite their protestations.  Moreover, all results that they report have ordinary, conventional explanations. Speculating about paranormal effects is not justified.

I personally believe that the results of the rigorous protocol must have been negative or else they would surely have been reported.

As such, I consider Robertson and Roy’s work the best evidence against mediumship obtained in modern times.

I also think that the misleading nature of their last article was not intentional. I think they just couldn’t face up to the results. Although, the fact that Tricia Robertson has threatened to sue people who claim that they fixed or rigged the result does not speak of a clean conscience. Not that I see why anyone would claim that, given that the available results are so ordinary.

Sources:

Robertson, T. J. and Roy, A. E. (2001) A preliminary study of the acceptance by non-recipients of mediums’ statements to recipients. JSPR Vol 65.2

Roy, A. E. and Robertson, T. J. (2001) A double-blind procedure for assessing the relevance of a medium’s statements to a recipient. JSPR 65.3

Robertson, T. J. and Roy, A. E. (2004) Results of the application of the Robertson-Roy Protocol to a series of experiments with mediums and participants. JSPR 68.1

JSPR, Journal of the Society for Psychical Research, can be accessed via Lexscien. Skeptical minds, interested in double-checking my work, can do so for free by signing up for a free trial.

Tricia Robertson on critics: “I am aware that critics will say the tests were somehow rigged. But, rest assured, we could not have been more scientific in the way this was carried out. If anyone claims it is fixed or rigged, we would sue.”

Modern Mediumship Research: The Afterlife Experiments

The Afterlife Experiments is a book by Gary Schwartz, Ph.D. It is a chronicle of his mediumship experiments and his personal journey. These experiments were also published in various disreputable journals devoted to such ideas. This article deals primarily with these experiments.
Schwartz’ work has been extensively torn to bits. I recommend the review by Ray Hyman and if you have time this one is also nice but more limited.

The true challenge lies not in refuting Schwartz but in bringing home just how horrible his work was. Ahh, but how to explain the vastness of the ocean to one who has never seen it. I know I must fail and yet I will try.

Schwartzian Theory

Let’s begin as he begins his book, let’s look at his “theory”. He invokes systems theory and quantum physics as foundations for it. In truth, his theory contains as much science as the technobabble in a TV show like Stargate.
Light travels forever between the stars, we are still receiving photons from the Big Bang, he observes, and that much is true. Some photons coming from our bodies will also travel through space forever, assuming they don’t bump into anything on their way out. To him this somehow means that we live forever.
By that kind of logic, taking a picture is stealing someone’s soul.
He says that according to quantum mechanics, matter is mostly empty space. Yes, matter is mostly empty space, in a sense, but no, that was established by the gold foil experiment, with no quantum physics as such involved. Schwartz somehow concludes that you could take away the matter and still go on. Kind of like you still have power when you take the batteries out, I guess. The analogy is fitting because it’s the electric force that keeps electrons and protons together in an atom and that gives solidity to what is otherwise ‘empty space’.
It was pretty painful getting through all that nonsense. It’s not just wrong or mistaken; it betrays a deeply irrational mind. I will be honest with you: when I read it, I was reminded of the ravings of a schizophrenic, not of the ruminations of a scientist.

But nevermind that. Eventually it’s all about the data. Right?

The Data Speaks
But which data should we let speak first? How about some data from the late 1940s.
A psychologist named Forer gave a personality test to his students. Then he played a little prank on them by giving everyone the same text as a phony result. Asked to rate the accuracy of the personality assessment on a scale of 1-5 (with 5 being best), they gave it an average of 4.26.
What this data tells us, or rather shouts in our faces, is that we need to be real careful when relying on human judgement.
This effect has become known as the Forer or Barnum Effect. It is very easy to elicit. So easy, in fact, that not eliciting it is usually the challenge for researchers.
Since then, more has been learned. For example, such Barnum statements (at least some of them) are viewed as more applicable to oneself than to others. Some statements are rated as more accurate than others, with people typically preferring favorable statements. It also makes a difference whether the receivers believe in the credibility of the method: statements presented as resulting from astrology, for example, will go over well with believers, not so well with skeptics.

Psychology knows a number of similar effects. I will mention only one more to emphasize the point. It is called Illusory Correlation and was described around 1970. The experimental subjects were shown drawings and were told that the people who had drawn them had certain psychological problems. In reality the psychological problems had been assigned randomly to the pictures.
Still, the subjects found ample signs of these non-existent problems in the drawings.
This was connected to the uncovering of serious problems with diagnostic methods in psychiatry but since then the effect has been demonstrated in a variety of different guises. It is also thought responsible for the persistence of racial and gender stereotypes.

Mediumship

So now that we have listened to what some data tells us, we can think about what that means for mediumship.
For one, we cannot simply ask someone to rate the accuracy of a reading. A human being is simply unable to give an objective measure of that.
This isn’t just a problem for proving mediumship, one that need only concern skeptics.
Say, you want to find an especially good medium. You have clients rate the accuracy of the reading. Mediums that consistently get higher ratings are surely better at something. But are they better at generating readings that are accurate or just at generating readings that are perceived as accurate?
With that method you might end up selecting fakes over real mediums!
Even worse, think about how budding mediums usually learn their trade. They practice and refine their skill based on feedback from their sitters. Are they really practicing a paranormal skill there? And if not, might they be ruining whatever such talent they have?

Schwartz’ folly

Let’s get back to The Afterlife Experiments.
Schwartz holds a PhD in psychology; it is virtually unthinkable that he was not aware of these pitfalls. And even if he were that incompetent, he sought out the advice of experts like Ray Hyman, who told him about them.
The closest Schwartz comes to acknowledging such problems is hand-waving dismissal. He finds it ‘implausible’ that sitters might misrate their reading, without a word about all the evidence to the contrary.

Schwartz inadvertently offers an example of how these psychological effects work in practice.
First what the medium said according to the transcript in the book.

[The medium just said that the dead grandmother was at the client’s wedding.]
And she’s talking about … some kind of flower connection. And what’s weird is she’s showing me flowers that I wouldn’t think about being at a wedding, and these are daisies. Um. they’re showing me daisies.
So I don’t know what the reference is to daisies, but they’re showing me daisies.

The medium could have equally well said:
Come up with a connection between your wedding and daisies!
The connection was eventually made between the wedding of the sitter’s mother and daisies. The grandmother had brought daisies for that wedding.
What Schwartz says is that the medium knew this, even though the medium states quite clearly otherwise.

Pretend Science

But while Schwartz completely fails to deal with this problem he still pretends to. One way is by letting other people besides the recipient rate the reading. Of course, when people know that a reading is not meant for them, this lowers the perceived accuracy.

Schwartz also made a stupid and amateurish attempt at using a control group. He wanted to know if anyone could guess information like a medium. So he had psychology undergraduates answer some questions.
Now you might think: But psychology undergraduates are not just anyone but a select group! Or you might think: So what if mediums can guess better than these guys? They are pros and should be expected to know the statistics better. That doesn’t tell us anything about afterlife communication.

Both true but neither captures the depth of Schwartz’ incompetence. Here’s an example.
Medium says:

I think that this is her mother, she is definitely a pistol, she must have had false teeth because she is taking them in and out, in and out. And she’s not supposed to do that in front of everybody.

The control group was asked:

Who had false teeth?
What did she do with her teeth?

A significant proportion of people have false teeth. And they do a lot more with them than just taking them in and out. The actual task of the “control group” was to guess not facts, but what the medium said and how it was interpreted.

Finally, I should say something about Schwartz’ attempts to calculate probabilities. In short, Schwartz clearly has no understanding of probability theory. A detailed explanation of where the errors lie would require as much space again as I have spent on his ignorance of known psychological effects, and would moreover be quite “mathy”, which to most people reads “boring”.
The necessary math is taught in school; if it has been forgotten, or was never properly understood, there is probably little interest in catching up now.

Experiments deserving special attention

There are two experiments, the first and the last in the book, that deserve a closer look. They avoid the problems that make most of Schwartz’ work scientifically worthless.

In the first, there were two mediums. Medium one was given four deceased persons to contact. She then made one drawing for each, based on information supposedly from that person.
Medium two then had the task of matching the names to the drawings.
This is not ideal as one might speculate that the drawings might contain hints about the names, put there either consciously or unconsciously. Still, it’s hard to imagine a high degree of accuracy unless the mediums are in collusion and have an agreed upon code.
In some ways, this is one of the best conceived experiments that Schwartz ever reported. But there’s a catch.

You’d think that medium two would simply match the drawings to the names in the absence of medium one, to avoid being influenced. Indeed, there was a session with the three experimenters and medium two, in which some vague impressions of colors and shapes were recorded, but the medium was unable to receive any clear descriptions of the drawings.
We are not told of any attempt at matching the drawings to the persons at that point, which seems incredible!
Then things proceeded in the presence of medium one. The three experimenters attempted to match pictures and persons but without success. We are also told that medium two had no success. She was unable to make good contact, we are told, and she couldn’t recall much of the information that came through in the previous session.
But never fear. When they went back to the impressions recorded without medium one, suddenly everyone was able to guess correctly.

We can reasonably infer that after the first round of guessing, medium one must have revealed at least part of the answer. If she had not told them that they had no success then they would not have kept guessing, right?
I think the most plausible scenario is that medium one revealed the full answer, whereupon everyone attempted to fit medium two’s vague utterings to the drawings in an exercise of illusory correlation.
Of course, that is just a possibility but I’m thinking, if it had been possible, even easy, to match correctly with the information first received, why didn’t they do it?

Basically, it looks like the experiment failed and Schwartz just couldn’t face it.

The Last Experiment

The last experiment in the book was apparently added at the last minute. It is described on only a page and a half, without any additional information in the appendix.
In the experiment, the medium made the reading without having any information on the sitter. That means she could not tailor her readings to specific clients based on appearance or feedback.
Also every sitter received two readings to rate, his or her own and one other. Sitters had to choose which of the two was meant for them personally.
In most previous experiments the expected outcome was the same regardless of whether the mediums were real or not. As such they did not provide any evidence for the reality of mediumship.
In this experiment there was finally a difference. Conventionally, one would expect every sitter to have a 50/50 chance of choosing their own reading. If mediumship is real, the chance should be much higher.

The outcome was that 4 of 6 sitters chose their own reading. Assuming that everyone had a 50/50 chance, getting that many (ie 4 or more) has a probability of 34%. By normal scientific standards that means that there is no reason to assume that anything noteworthy happened.
For comparison, if one assumes a 95% chance of picking the right reading then the chance of getting that few (ie 4 or less) is only 3%.
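The percentages above are plain binomial arithmetic and are easy to check. Here is a quick sketch in plain Python (the function names are mine; nothing in the code comes from Schwartz):

```python
from math import comb

def binom_tail_ge(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p): at least k successes in n tries."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def binom_tail_le(n, k, p):
    """P(X <= k) for X ~ Binomial(n, p): at most k successes in n tries."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(0, k + 1))

# 4 or more of 6 sitters choosing correctly, if each has a 50/50 chance:
print(round(binom_tail_ge(6, 4, 0.5), 2))   # 0.34

# 4 or fewer of 6, if each sitter had a 95% chance of picking correctly:
print(round(binom_tail_le(6, 4, 0.95), 2))  # 0.03
```

A 34% chance is about one in three: nothing that calls for an explanation beyond luck.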

In Schwartz’ mind this somehow morphs into a ‘breathtaking’ finding.
He points out that one of those sitters rated the intended reading as very accurate and the non-intended reading as 0% accurate. That sitter is supposedly an especially talented sitter.
The whole argument would be more convincing, or rather the least bit convincing at all, if it had been tested in a specially designed experiment rather than being based on some quirk noticed after the fact.

He also says that if the experiment had been much larger, with 25 sitters, with again 2/3rds picking the right reading, then that would have been ‘statistically significant’ (i.e. fairly unlikely according to the conventional 50/50 expectation). If wishes were horses…
It’s now been almost ten years and Schwartz hasn’t done any experiment of that size, so don’t hold your breath.

Triple Blind!

In science, some participant in an experiment is ‘blind’ if he doesn’t know relevant facts which might bias him.
Single-blind means that the experimental subjects don’t know the relevant facts, double-blind means that the experimenters in contact with the subjects don’t know them either. Terms such as triple-blind or quadruple-blind are sometimes encountered but don’t have a truly standardized meaning.
This is doubly true for mediumship research, which doesn’t have any standard experimental designs. Still, Schwartz and his troupe employ such terms most haphazardly. “Triple-blind” seems to be more a marketing term than anything with scientific sense. As a consequence, they have begun touting blindness much like razor makers the number of blades. They are now at quintuple-blind, something that in mainstream science is only encountered in the punchline of jokes.

A few years after the book came out, Schwartz, together with Julie Beischel, published a paper with a triple blind medium test. After what Schwartz produced during The Afterlife Experiments era, this experiment completely blew me away with its good design. I think Julie Beischel is a good influence on the project. Objectively, that shows just how low my expectations have become.

There were 8 sitters and 8 mediums. Each medium gave 2 readings and each sitter rated 2 for a total of 16.
The sitters were paired up: one sitter who had lost a parent together with a sitter who had lost a peer. Moreover, the pairings were done so that the deceased each one hoped to contact would be maximally different. Both sitters in a pair received both readings and had to pick theirs, again while being ‘blind’.
The pairing with maximization of differences was to ensure that picking the right reading would be especially easy. That’s a neat idea but that it was found necessary has some interesting implications.
Years earlier, Schwartz entertained us with ‘breathtaking’ findings and ‘breakthrough evidence’. Now things appear to have become a little more difficult. Could it be that mediumship is harder to demonstrate when known psychological effects are taken into account rather than merely being dismissed?

Now things get a little tricky.
You could look at the mediums. Each medium is faced with a pair of sitters and needs to “guess” who has the deceased parent and who the peer. That would mean that each medium has a 50/50 chance of getting her readings right. In total that’s eight 50/50 chances.
Or you could look at the sitters. Each sitter is faced with the task of picking the right reading out of a pair, which is a 50/50 chance. But each sitter has to do that twice which makes a total of sixteen 50/50 chances.
Which of these views is correct? I don’t know. That depends on what the mediums did. The choice was up to them.

The reason why this matters is in the probabilities. For example, if there are 6 of 8 pairs that are correct then the likelihood for that is 14%, fairly high. If you say that there are 12 out of 16 readings correctly picked then there is only a 4% chance. It is like either getting 6 heads in 8 coin tosses or 12 in 16.

Unfortunately, Schwartz and Beischel fail to recognize this problem and adopt the second view. In fact, the sitters picked 13 out of 16 readings correctly, which has a likelihood of only 1%. Given that the number of correct picks is odd, the first view cannot be entirely correct, but different mediums may have made different choices, so we shouldn’t assume that it is entirely incorrect either. The likelihood of getting that many readings correct may well exceed 10%.
That’s the difference between interesting and boring.
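How much the choice of counting unit matters is easy to verify: both views are just tail probabilities of fair coin tosses. A sketch with the same numbers as above:

```python
from math import comb

def tail(n, k):
    """P(X >= k) when X ~ Binomial(n, 1/2): k or more heads in n fair coin tosses."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2**n

print(round(tail(8, 6), 2))    # 0.14 -- per-pair view: 6 of 8 pairs correct
print(round(tail(16, 12), 2))  # 0.04 -- per-reading view: 12 of 16 correct
print(round(tail(16, 13), 2))  # 0.01 -- the actual result: 13 of 16 correct
```

Same data, same outcome, yet the probability moves by an order of magnitude depending on how the trials are counted.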

Of course, even if it’s a 1 in 100 chance it can still have been chance, or if it’s a 1 in 10 chance it may still have been mediumship.

Related to that problem is the problem of small size. In comparison to Schwartz’ previous experiments, 16 readings is a lot. But compared to the huge amount of work that appears to be going on at Beischel’s “Windbridge Institute”, it is only a drop in the bucket.
So if you say that the results could arise by chance only once in so many experiments, well, in this case it is distinctly possible that there are so many experiments out there.

The biggest problem however lies in the interpretation. Each medium received the first name of the “discarnate” they were to contact. But names contain information about a person.
Think of Gertrude vs. Britney. Who is the dead parent, who the dead peer? Think of Tyrone vs. Cletus, who has the higher skin cancer risk?

So if anything was going on in the experiment what was it? A display of mediumship or of statistical knowledge?
We know that one of those is possible. We don’t know that the other is.

On the whole, I find the flimsy results coming from a research program spanning well over a decade quite telling.

Randi’s Prize: Answering Chapter 3

Chapter Three: Communicators

With this chapter, things get at once more and less interesting. It is less interesting because we no longer deal with movie-style magic. No more psychokinetic kids and ectoplasmic ghosts. Instead we get something even juicier: laboratory experiments.
Alright, that’s not quite as exciting. But if you want to get to the bottom of things it is very, very interesting.

The key to the past
It is said (especially by geologists) that ‘the present is the key to the past’. It is assumed by the historical sciences that the laws of nature that we observe today are the same that operated in the past. This assumption seems to hold up well, judging from what we learn from distant (and thus old) starlight or from the convergence of different radiometric dating methods.
The assumption essentially follows from Occam’s razor but that it works is really the important point.
Much has been written on this issue. Especially in response and in opposition to the efforts of creationists to inject religious dogma into scientific inquiry. I’m not going there now.

There is also another reason for the assumption that is based on the constraints imposed by the scientific method. Science can only study the testable. It is an essential characteristic of science that claims are judged by experiment.
If mediumship (or other psychic powers) is shown to be active in the present, then science will assume it operated in the past. Arguments that explain certain recorded events in terms of these “new” laws of nature will be credited. The reverse will never stand up.

Back to the book
What this chapter deals with are mental mediums. Unlike the physical mediums of the previous chapter, these don’t produce effects straight out of a Ghostbusters movie but only purport to communicate with the spirit world. They only talk.

The first example we are given, at great length, is Leonora Piper, a so-called trance medium from the second half of the 19th century. McLuhan has succeeded in rousing my curiosity as to Piper’s accomplishments, but a historical case will never be acceptable scientific proof.
He cites example after example of her amazing feats, and then more examples from Gladys Osborne Leonard. Then also the Edgar Vandy case.

He frequently points to a piece of transcript and argues that it doesn’t look like cold reading. Yet all his bibliography offers on cold reading is a single chapter in The Elusive Quarry by Ray Hyman. For the curious, it includes this article.
That’s awfully little to speak with authority on the subject.
By the way, my personal recommendation would have been Ian Rowland’s The Full Facts Book of Cold Reading, for the dry wit. It’s a much more entertaining read than Hyman’s academic prose.

He points out that these mediums were never caught in fraud which seems a rather curious argument. A sleight-of-hand conjurer can easily be caught in his trick. A cold reader only talks. She can only be shown to have made statements that look like cold reading. McLuhan points out that this is true for the mediums he mentions but apparently doesn’t consider it significant.
Of course, it is reasonable to focus on the more interesting instances. A true psychic should still be expected to use some cold reading (maybe unconsciously), just like any hot reader will make use of these techniques.
But how do we know that these tidbits were gained paranormally rather than conventionally?
For one, the mediums were sometimes trailed by detectives to catch them in any information gathering. Without success. Yet, it should be immediately obvious that catching someone gathering information is vastly more difficult than catching someone using a conjuring trick right before one’s eyes.
The possibility that they picked up information incidentally, in chats and such, is discounted on the grounds that the researchers were well-respected and surely would not have made the mistake of enabling this.
As I am quite familiar with the antics of contemporary mediumship researchers I find this appeal to authority more than doubtful. That aside, even if psi exists, it must be rare to encounter it so strongly as in these mediums. Is that a counter argument against Piper and the others having been real?
If improbability is not an argument against paranormality, then why is it an argument against normality?

The Fundamental Error
McLuhan shows one proposition to be improbable and then concludes that another, also improbable, proposition must be true. That has been McLuhan’s fundamental mistake from the start, and I have pointed it out from many angles. It is one thing to show an idea to be unlikely and quite another to show that it is less likely than another. Besides, who says that the right idea isn’t one that no one thought of?

Consistent application of his erroneous reasoning leads McLuhan to the only possible conclusion: it is proven that these mediums were real.
By McLuhan’s logic, only travelling back in time to find a satisfying, conclusive explanation for each of the mediums’ apparent paranormal deeds can undo this proof.

Conversely, bringing science, even outspoken skeptics, round to McLuhan’s opinion would be much easier. Just show mediumship to work today.

Modern Mediumship Research
This modern research consists of the work done by Gary Schwartz. Not mentioned in the book is the quite extensive work by Robertson and Roy, perhaps because it was not available to McLuhan at the time of printing.
I will need to write a few posts on the current state of mediumship research at some point.

For now we are focused on Randi’s Prize. McLuhan gives an outline of Schwartz’ experiments and also of the criticisms. The whole thing is related in a he said/she said style that remains sterile. One does not get the sense that he actually engaged the material on a deeper level.
Eventually he does give some credence to the critics but still finds that “these experiments give a strong suggestion” of psi. Given how abysmally horrible Schwartz’ research was, I find such a conclusion mind-boggling. What went wrong there? I don’t know.

I will not go into more detail here, dear reader. Please await the upcoming series on modern mediumship research.

Feeling The Future, Smelling The Rot

Daryl Bem is (or was?) a well-respected social scientist who used to lecture at Cornell University. The Journal of Personality and Social Psychology is a peer-reviewed, scientific journal, also well-respected in its field. So it should be no surprise that when Bem published an article that claimed to demonstrate precognition in that journal it made quite a splash.

It was even mentioned, at length, in more serious newspapers like the New York Times, though at least with the skepticism a subject with such a lousy track record deserves. In fact, if the precognition effect that Bem claims were real, casinos would be impossible, as a reply by Dutch scientists around E.J. Wagenmakers points out.

By now, several people have attempted to replicate some of Bem’s experiments without finding the claimed effect. That’s hardly surprising but it does not explain how Bem got his results.

What’s wrong with the article?

It becomes obvious pretty quickly that the statistics were badly mishandled and a deeper look only makes things look worse. The article should never have passed review but that mistake didn’t bother me at first. Bem is experienced, with many papers under his belt. He knows how to game the system.

The mishandled statistics were not just obvious to me, of course. They were pointed out almost immediately by a number of different people.

These issues should be obvious to anyone doing science. If you don’t understand statistics you can’t do social science. What does statistics have to do with understanding people? About the same thing that literacy has to do with writing novels. At its core nothing, it’s just a necessary tool.

Mishandled statistics are not all that uncommon. Statistics is difficult, and fields such as psychology are not populated by people with an affinity for math. Nevertheless, omitting key information and presenting the rest in a misleading manner really stretched my tolerance. That he simply lied about his method when responding to criticism went too far. But that’s just my opinion.

Such an accusation demands evidence, of course. The article is full of tell-tale hints which you can read about here or in Wagenmakers’ manuscript (link at the bottom).
But there is clear proof, too. As Bem mentions in the article, some results were already published in 2003. Comparing that article to the current one reveals that he originally performed several experiments with around 50 subjects each. He thoroughly analyzed these batches and then assembled them into packets of 100-200 subjects, which he presents as experiments in his new paper.

[Update: There is now a more extensive post on this available.]

That he did this is the omitted key information. The tell-tale hints suggest that he did this and more in all experiments. Yet he has stated that exploratory analysis did not take place, something that is clearly shown to be false by the historical record.
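Why does peeking at small batches before pooling them matter so much? A toy simulation makes it plain. The data here are simulated noise, not Bem’s, and the batch size of 50 simply echoes the description above:

```python
import random
import statistics

random.seed(0)

def batch(n=50):
    """One null 'experiment': n scores whose true mean is exactly 0."""
    return [random.gauss(0, 1) for _ in range(n)]

# Run small experiments, peek at each result, and keep only the batches
# that happen to trend in the desired direction...
survivors = []
while len(survivors) < 3:
    b = batch()
    if statistics.mean(b) > 0:
        survivors.append(b)

# ...then pool the survivors into one "experiment" of 150 subjects.
pooled = [score for b in survivors for score in b]

# The pooled mean is positive by construction, even though every score
# is pure noise. A reader shown only the pooled data cannot know that.
print(statistics.mean(pooled) > 0)  # True
```

The point is not that Bem did exactly this, but that any selection step between collecting batches and reporting the pooled result biases the outcome, and the reader of the pooled "experiment" has no way to correct for it.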

Scientists aren’t supposed to do that sort of thing. Honesty and integrity are considered to be pretty important and going by the American Psychological Association’s ethics code that is even true for psychologists. But hey, it’s just parapsychology.

And here’s where my faith in science takes a hit…

The Bem Exploration Method

The Bem Exploration Method (BEM) is what Wagenmakers and company, with unusual sarcasm for a scientific paper, called the way by which Bem manufactured his results. They quote from an essay Bem wrote that gives advice on “writing the empirical journal article”. In this essay, Bem outlines the very methods he used in “Feeling the Future”.

Bem’s essay is widely used to teach budding social psychologists how to do science. In other words, they are trained in misconduct.

Let me give some examples.

There are two possible articles you can write: (a) the article you planned to write when you designed your study or (b) the article that makes the most sense now that you have seen the results. They are rarely the same, and the correct answer is (b).
The conventional view of the research process is that we first derive a set of hypotheses from a theory, design and conduct a study to test these hypotheses, analyze the data to see if they were confirmed or disconfirmed, and then chronicle this sequence of events in the journal article.

I just rolled a die 3 times (via random.org) and got the sequence 6, 3, 3. If you, dear reader, want to duplicate this feat you will have to try an average of 216 times. Now, if I had said in advance that I was going to get 6, 3, 3, that would have been impressive, but of course I didn’t. I could have said the same thing about any other combination, so you’re probably just rolling your eyes.
Scientific testing works a lot like that. You work out how likely it is that something happens by chance and if that chance is low, you conclude that something else was going on. But as you can see, this only works if the outcome is called in advance.
This is why the “conventional view” is as it is. Calling the shot after making the shot just doesn’t work.
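The dice example can be put into a few lines of code (the sequence and trial count are mine, for illustration):

```python
import random

random.seed(1)
TRIALS = 100_000

called_in_advance = (6, 3, 3)   # the shot is called BEFORE rolling
hits = sum(
    tuple(random.randint(1, 6) for _ in range(3)) == called_in_advance
    for _ in range(TRIALS)
)

# A specific 3-roll sequence has probability 1/216, about 0.46%,
# so the simulated hit rate comes out close to that.
print(hits / TRIALS)

# Called after the fact, the "prediction" matches whatever came up,
# so the post-hoc hit rate is always 100%. That is why the shot must
# be called in advance for the probability to mean anything.
```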

In real life, it can be tricky finding some half-way convincing idea that you can pretend to have tested. Bem gives some advice on that:

[T]he data. Examine them from every angle. Analyze the sexes separately. Make up new composite indexes. If a datum suggests a new hypothesis, try to find additional evidence for it elsewhere in the data. If you see dim traces of interesting patterns, try to reorganize the data to bring them into bolder relief. If there are participants you don’t like, or trials, observers, or interviewers who gave you anomalous results, drop them (temporarily). Go on a fishing expedition for something—anything—interesting.

There is nothing wrong, as such, with exploring data to come up with new hypotheses to test in further experiments. In my dice example, I might notice that I rolled two 3s and proceed to test whether the die is biased towards 3s.
Well-meaning people, or those so well-educated in scientific methodology that they can’t believe anyone would advocate such misbehavior, will understand this passage to mean exactly that. Unfortunately, that’s not what Bem did in Feeling The Future.

And again, he was only following his own advice, which is given to psychology students around the world.

When you are through exploring, you may conclude that the data are not strong enough to justify your new insights formally, but at least you are now ready to design the “right” study. If you still plan to report the current data, you may wish to mention the new insights tentatively, stating honestly that they remain to be tested adequately. Alternatively, the data may be strong enough to justify recentering your article around the new findings and subordinating or even ignoring your original hypotheses.

The truth is that once you go fishing, the data (or more precisely, the result) is never strong.
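A quick simulation shows why. Suppose there is nothing to find, but you test ten post-hoc slices of the data (sexes, composite indexes, dropped outliers...) and report whichever slice comes up "significant". The slice count and sizes here are hypothetical:

```python
import random

random.seed(42)

RUNS = 2000      # repeat the whole fishing expedition many times
SLICES = 10      # subgroups examined per expedition
N = 20           # coin-flip "subjects" per subgroup

false_alarms = 0
for _ in range(RUNS):
    for _ in range(SLICES):
        hits = sum(random.random() < 0.5 for _ in range(N))
        if hits >= 15:   # one-sided p of about 0.02 for a single pre-planned test
            false_alarms += 1
            break        # report this slice, quietly ignore the rest

# Roughly 1 - (1 - 0.02)**10: nearly one expedition in five "succeeds"
# even though every data point is a fair coin flip.
print(false_alarms / RUNS)
```

A reported p of 0.02 sounds strong, but after ten silent tries the real false-alarm rate is closer to 20%. The strength was fished away before the paper was written.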

Bem claimed that his results were not exploratory. Maybe he truly believes that “strong data” turns an exploratory study into something else?
In practice, this advice means that it is okay to lie (at least by omission) if you’re certain that you’re right. I am reminded of a quote by a rather more accomplished scientist. He said about science:

The first principle is that you must not fool yourself -- and you are the easiest person to fool. So you have to be very careful about that. After you’ve not fooled yourself, it’s easy not to fool other scientists. You just have to be honest in a conventional way after that.

That quote is from Richard Feynman, who won a Nobel Prize in physics and advocated scrupulous honesty in science. I imagine he would have used Bem’s advice as a prime example of what he called cargo cult science.

Bayesians to the rescue?

Bem has inadvertently brought this widespread malpractice in psychology into the limelight.
Naturally, these techniques of misleading others also work in other fields and are also employed there. But it is my personal opinion that other fields have a greater awareness of the problem. Other fields are more likely to recognize them as being scientifically worthless and, when done intentionally, fraud.
If anyone knows of similar advice given to students in other fields, please inform me.

The first “official” response had the promising title: Why psychologists must change the way they analyze their data by Wagenmakers and colleagues. It is from this paper that I took the term Bem Exploration Method.
The solution they suggest, the new way to analyze data, is to calculate Bayes factors instead of p-values.
They aren’t the first to suggest this. Statisticians have long been arguing the relative merits of these methods.
This isn’t the place to rehash this discussion or even to explain it. I will simply say that I don’t think it will work. The Bayesian methods are just as easily manipulated as the more common ones.

Wagenmakers & co. show that the specific method they use fails to find much evidence for precognition in Bem’s data. But this is only because that method is less easily “impressed” by small effects, not because it is tamper-proof. Bayesian methods, like traditional methods, can be more or less sensitive.

The problem can’t be solved by teaching different methods. Not as long as students are simultaneously taught to misapply these methods. It must be made clear that the Bem Exploration Method is simply a form of cheating.

Sources:
Bem, D. J. (2003). Writing the empirical journal article. In J. M. Darley, M. P. Zanna, & H. L. Roediger III (Eds.), The compleat academic: A career guide (pp. 171–201). Washington, DC: American Psychological Association.
Bem, D. J. (2003, August). Precognitive habituation: Replicable evidence for a process of anomalous cognition. Paper presented at the 46th Annual Convention of the Parapsychological Association, Vancouver, Canada.
Bem, D. J. (2011). Feeling the future: Experimental evidence for anomalous retroactive influences on cognition and affect. Journal of Personality and Social Psychology.
Bem, D. J., Utts, J., & Johnson, W. O. (2011). Must psychologists change the way they analyze their data? A response to Wagenmakers, Wetzels, Borsboom, & van der Maas (2011). Manuscript submitted for publication.
Wagenmakers, E.-J., Wetzels, R., Borsboom, D., & van der Maas, H. L. J. (in press). Why psychologists must change the way they analyze their data: The case of psi. Journal of Personality and Social Psychology.

See here for an extensive list of links on the topic. If I missed anything it will be there.

The Saga of Rupert Sheldrake and the Psychic Dog

The saga starts in 1994 with a book whose title makes a not-quite-modest promise: Seven Experiments That Could Change the World.

Sheldrake relates how some pet owners think that their pets can tell when they are coming home, even when the pets should have no normal way of knowing. He believes that there is a telepathic link between pet and owner. One of these seven world-changing experiments is to demonstrate this behavior.

Three Surveys

The saga continues with three surveys of pet owners in England and California, published in 1997/98. About 50% of dog owners, and 30% of cat owners, said their pet anticipated the return of a family member. Almost 20% of dog owners said that this behavior started more than 10 minutes before the person’s arrival.

Jaytee

Meanwhile a specific dog by the name of Jaytee was the center of an exhaustive investigation. Jaytee’s owner, Pam Smith, and her parents had noticed as early as 1991 that Jaytee anticipated Pam’s return. They put this down to routine, as she returned home from work at the same time every day. However, the behavior seemed to persist even after Pam was laid off in 1993 and no longer followed a set routine.
Pam Smith learned of Rupert Sheldrake’s interest in psychic pets in April 1994 from a newspaper article. She volunteered for an experiment. In the following month her parents began taking notes.
The first observations seemed promising, so the notes got more detailed and eventually led to several specific tests. In a few, Pam returned by an unusual mode of transport so that the dog would not hear the familiar car. In two tests the return time was determined by coin toss.
There was also a test by Austrian state television (ORF) for a documentary.

What Jaytee could do

Based on these observations and tests Sheldrake argued that Jaytee reacted whenever Pam Smith decided to journey home.
Sometimes the dog reacted before Pam Smith started journeying home but Sheldrake said that, in fact, this was because Jaytee had reacted when Pam prepared to travel home, rather than when the journey actually started. When the dog reacted late this might have been because the parents had not been paying attention and simply missed the proper time, so that the dog only seemed to have been late.
For some failures other reasons were found such as distractions outside (like a bitch on heat) or the dog being ill.

Such arguments to explain failures away may not be too convincing, but Sheldrake could also point to some successes. Yet those successes relied greatly on the reliability of Pam Smith’s parents as unbiased observers, or in one case, of a film crew.

Videotaped experiments

The next step was to videotape the whole thing. The camera was trained on a certain spot, in front of a window. Going there and looking out was, according to the Smiths, how Jaytee anticipated his owner.
This would take place in several locations. On 30 occasions, Jaytee was left with Pam’s parents, as before. Five times, he was left with Pam’s sister, and 50 times he was left alone.
When Jaytee was with Pam’s sister at her place, he spent altogether less time at the window, but Sheldrake describes his behavior as similar to when he was at the parents’ place.
When Jaytee was alone, he usually did not go to the window at all; only in 15 of the 50 cases did he show his usual response. However, no graphs or other information are given to support this statement.

Jaytee’s behavior when he was with Pam’s parents is shown here:

Graph from Sheldrake 1998

The 30 trials were separated according to how long Pam Smith was absent.
Each step on the x-axis represents a 10-minute (600-second) period. The y-axis tells us how many seconds of these 10 minutes Jaytee spent at the window. The filled circle/square indicates the first 10 minutes of Pam’s return journey. The lower line, marked with squares, excludes 7 observations where Jaytee spent especially much time at the window before Pam returned; I won’t be using it.

But what does that mean?
For one thing it clearly contradicts Sheldrake’s earlier conclusion. Jaytee does not suddenly go to the window and wait there as soon as Pam starts returning. He simply spends more and more time there.

It does seem as if he had a rough idea of when Pam would return and behaved accordingly, but maybe he was merely reacting to the parents’ anticipation. Even though she was not supposed to tell them, they may have been able to guess from clues like what she took along. Or indeed, Jaytee may have guessed himself.
I must admit, though, that I am not entirely certain that this is not simply a statistical illusion.

Weird!

Now things get seriously weird. A normal person, or at least a normal scientist, faced with that data would now seriously reevaluate his assumptions. Maybe when the parents took the notes, they were picking up some different, more subtle clues from the dog. Maybe just looking at when the dog goes to a certain spot is not good enough.
Or maybe the telepathic link was between Pam and her parents in the first place.

That’s however not what Sheldrake does. Sheldrake argues that the data confirms his idea. Jaytee spends the most time at the window right before Pam returns, therefore he’s psychic. That’s the argument. No kidding.
It gets worse.
Yes.
Really.

He is aware that if the dog goes to the window more and more this will also have him at the window most when Pam returns. And this is why he produced that graph. I took it right from his paper. And, you see, it shows how Jaytee did not go to the window more and more. You don’t see? Good for you.
His argument is simply wrong, but for the morbidly curious, here it is: He compares the short, medium, and long absences. For example, when Pam returned after 80 minutes (short), the dog spent an average of about 300 of the last 600 seconds (10 minutes) at the window. But after 80 minutes in the medium and long absences, he only spent about 100 or 50 seconds there respectively.

That’s true, and as I said, might indicate that the dog knew something. It just tells us nothing about whether the dog really did go to the window more and more. That much is obvious from the graph anyway, but if you wanted to test it mathematically you would use a so-called linear regression. Based on an off-hand remark in a different section, this seems to have been done (by Dean Radin) with the expected results, but it was not included.
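For readers unfamiliar with the technique, here is a minimal sketch of such a trend test. The numbers are purely illustrative, loosely shaped like the rising curve in the graph, not Jaytee’s actual data:

```python
import numpy as np
from scipy import stats

# Hypothetical seconds-at-window per successive 10-minute period since
# Pam left (illustrative numbers only, not the real measurements).
period = np.arange(1, 9)
seconds = np.array([10, 20, 35, 60, 90, 140, 210, 300])

# Regress time-at-window on elapsed period: a clearly positive slope
# is exactly the "more and more" pattern Sheldrake's short/medium/long
# comparison never tests for.
res = stats.linregress(period, seconds)
print(f"slope = {res.slope:.1f} s per bin, p = {res.pvalue:.4f}")
```

With data like these, the slope comes out strongly positive, confirming the rising trend that the short/medium/long comparison leaves entirely unexamined.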

This may seem like the end but the saga is not finished yet.

Randomization

There is one final experiment to be done. By determining a return time for Pam at random and only communicating it to her once she is on her way, we can make sure that Jaytee has no clue when she is going to return. Sheldrake performed 12 such experiments, which naturally showed Jaytee being at the window most right before Pam’s return. He still thinks that this indicates telepathy, in complete defiance of the facts and any rational argument.
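Why “naturally”? A toy simulation makes the confound obvious. Assume a dog with no telepathy whatsoever that simply spends more time at the window the longer its owner has been away. Pick the return time at random, as in these experiments, and the dog will still be at the window most in the bin right before the return, every single time:

```python
import random

def window_seconds(bin_index):
    """Toy model: the dog spends steadily more time at the window the
    longer the owner has been away. No telepathy involved."""
    return 20 + 30 * bin_index  # seconds at window in 10-minute bin i

random.seed(1)
trials = 12
hits = 0
for _ in range(trials):
    return_bin = random.randint(3, 9)  # randomly chosen return time
    bins = [window_seconds(i) for i in range(return_bin + 1)]
    # Is the dog at the window most in the bin just before the return?
    if bins[-1] == max(bins):
        hits += 1

print(f"{hits}/{trials} trials 'peak' right before the return")  # 12/12
```

Because the trend is rising, the final pre-return bin is always the maximum, regardless of when the random return happens. A 12-for-12 “success” rate under this design is therefore exactly what a completely non-psychic dog would produce.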

Richard Wiseman

When you hear this saga related elsewhere you will always hear of Richard Wiseman as well. Wiseman is a British psychologist with a well-known skeptical interest in the paranormal. The seemingly stunning performance of Jaytee that was filmed by the Austrian television crew led him to contact Sheldrake. Pam Smith graciously agreed to take part in his experiment and Sheldrake allowed him to use the same video-camera.
Wiseman, with the assistance of two colleagues, Matthew Smith and Julie Milton, performed four experiments.

Since Sheldrake had already done all the preliminary groundwork, Wiseman could jump right in. The dog was supposed to do a certain thing, namely go to the spot at the window right when Pam Smith was about to return. Wiseman would simply test whether that was the case. He would stay with the dog, filming him. Smith would go with Pam and tell her to return at the appointed time.
If the dog went to the window in the same 10-minute time frame as the return, the test would be a success.
As we would expect, the dog was much too early.
However, the dog stayed there only a brief moment, maybe because of some distraction outside. It was decided to try again but this time the dog would have to stay at the window for a full two minutes.
Same thing again, of course.
So it was decided to wait with the next try until winter, when there would be fewer distractions outside.
Yet again too early.
In the fourth experiment the dog didn’t ‘signal’ at all.

Of course, Jaytee’s pattern of going to the window more and more is present in this data as well. By Sheldrake’s twisted logic this means that Wiseman found evidence of telepathy. This is where the saga takes an unsavory turn.
Wiseman has bluntly stated that he failed to find evidence of Jaytee being psychic, moreover he finds Sheldrake’s own data unconvincing. To Sheldrake this is an outrage.
When Wiseman agreed that his data showed the same pattern as Sheldrake’s, Sheldrake took this as an admission that Wiseman had found telepathy. To Sheldrake, Wiseman is simply being dogmatic and irrational in not saying so.
It may seem hard to believe that anyone could read through Sheldrake’s work and not see the foolishness in his logic, but it isn’t just fans of Rupert Sheldrake who uncritically accept his twisted reality. So do authors such as Chris Carter and Robert McLuhan, who pride themselves on having investigated such issues. This has, by now, turned into a character assassination campaign against Wiseman.
I must add that Wiseman himself has largely ignored this and never criticized Sheldrake for his irrationality. He has only expressed disagreement and laid out his arguments.

Kane
There was also a small number of tests with another dog called Kane. His pattern seems to have been slightly different, but there were too few tests to say anything with confidence.

Conclusion?

You may now think that all this psychic dog business is completely debunked. Well, in a way.
We have seen how Sheldrake’s original hypothesis seemingly collapsed under more stringent tests, but one could claim that this was due to error on the part of the scientists.
Wiseman was with the dog, filming him; did that throw the dog off? Sheldrake switched to a different, nonsensical statistical analysis, which may cover up evidence.

And even if Sheldrake’s hypothesis about how the dog expresses telepathy is completely wrong, the dog may still be telepathic but just expressing it in a different way.
There are any number of reasons why the tests would have failed to find telepathy.

Is there anything we could interpret as possible evidence of telepathy?
When we look at the twelve highest-quality experiments, those with videotape and a random return time, we find that in four of these Jaytee only went to the window when Pam was on her way home, not any sooner. Maybe that means something?
On the other hand, that only happened when the return time was very early. When the return time was late, he was always too soon (except in one case when he did not do anything at all). That makes it seem much less interesting. It makes it look like the “hits” depend more on the random time being just right than anything else.
The case for telepathy can be strengthened again by excluding those trials where there was some identifiable distraction that may have caused Jaytee to go to the window. Two tests then turn from failures into successes.
But how reliable is such a retrospective judgement? A worrying detail is that the graphs published in the parapsychological literature and those contained in Sheldrake’s books show slight differences.
Also there are Wiseman’s results which were all clear failures by this standard.

There’s another issue and it’s the most important one. These few cases that might be telepathy are the result of me going over the results in detail, searching for anything that, at least, doesn’t contradict telepathy. I had to completely ignore Sheldrake’s argument which is simply wrong.
I also had to ignore that the failed tests suggest that this was just chance.

That makes the whole evidence not very convincing. We’d need additional tests to determine if this idea stands up.

The question is, how much effort do we put in before giving up?

Most people, surely most scientists, would look at the track record of telepathy claims. Perhaps they would also look at Sheldrake’s own track record; he had a well-deserved reputation for irrationality well before this episode. Based on that, they would dismiss the whole thing from the start.
Wiseman gave it more of a chance than most would. His fate may hold something of an answer to those who wonder why people aren’t more open-minded.

How much effort would you personally expend?

Sheldrake thinks he has good evidence of telepathy in dogs. And yet he, too, has given up on the research. One would think that finding a telepathic dog would be only the beginning of the science. One would think that your average scientist would continue by uncovering the physiological basis for it.

If one wanted to pick up the work that Sheldrake dropped, one would first have to find a psychic dog. Going by Sheldrake’s surveys, this should be easy, provided people don’t fool themselves about their pets being telepathic.
There was one person who tried: Alex Tsakiris, a former high-tech entrepreneur turned podcaster. He put in quite some effort and money.
His plan was to turn the project over to professional scientists once he had found some suitable dogs but it never happened. He found candidates that seemed promising to him but nothing worked out. Eventually he quietly abandoned the project.

So here’s my personal conclusion: I am going to live my life as if there is no such thing as a psychic pet or telepathic dog or whatever. I am also going to be highly doubtful about anything coming from Rupert Sheldrake.
You draw your own conclusion.

Sources:
Papers by Rupert Sheldrake
Papers by Richard Wiseman
Dogs that know by Alex Tsakiris

Randi’s Prize: Answering Chapter 2

Chapter Two: Eusapia Palladino and the Phantom Narrative

In this chapter we get an overview of spiritualism at the end of the nineteenth and the beginning of the twentieth century. Back then, séances were all the rage. Mediums like Eusapia Palladino produced ghosts made from ectoplasm and performed real magic.

McLuhan compares these to contemporary charlatans like Uri Geller. They must be conjurers of genius, he concludes from their effect on the audiences. He forgets that not everyone was impressed by Geller, and those who were, were impressed by his psychic abilities rather than his magic skills. What sets people like Geller apart from other conjurers is his cunning ability to manipulate people and the media. Look at how he used naive people like Targ and Puthoff to further his reputation. That may require genius of a sort but it most of all requires ruthlessness.

We are treated to a number of descriptions of miraculous events that took place in the séance room. In many ways it is a repeat of chapter 1. He is incredulous that so many sober people could be fooled. Same old, same old…

We also learn of skeptical magicians who find themselves stumped and even endorse paranormal explanations. McLuhan doesn’t understand why skeptics ignore such admissions but only retell explanations. The reason is simple, of course. Because explanations are interesting and ignorance is not.
Many people devote themselves to the study of physics where they learn the explanations for a variety of phenomena, such as gravity. No one is interested in a list of people who do not know these explanations.

Eusapia Palladino features prominently in the chapter, as the title implies. We learn that she was caught “cheating” frequently. However, one team of scientists stuck it out with her nonetheless. They figured that she used trickery only sometimes and at other times not.

Palladino herself seemed aware of this. She explained – and it seemed to be confirmed by observation – that psychokinetic effects occurred during her trance state by a process of will. The initial channel for the will would be physical: if you or I want to lift something we grasp it with our hands and raise it up, and this was a natural impulse in her also. It was by checking this impulse, allegedly, that the psychokinesis could be unlocked. For this reason she is recorded shouting ‘Controllo!’ at moments when she felt the energy building, to ensure that she was properly held and did not release it by reaching out to perform an action manually.

Amazing. And the evidence seems to confirm it even!

Of course, how could it not? They can’t catch her every time. Their very persistence ensures that they must be fooled and yet it is this very persistence that impresses McLuhan. People who caught her cheating and gave up on her were just being shoddy debunkers.

In this chapter McLuhan also develops his concepts of “rational gravity” and the “phantom narrative”.

Rational gravity, people’s tendency to gravitate towards rational explanations, is certainly a real phenomenon. The reason is quite simply that it works; oohing and aahing over mysteries, not so much. He also suggests that stories change over time to become more compatible with rational explanations. I’m pretty sure that happens, but I cannot understand why McLuhan fails to see that this works in both directions.

Richard Hodgson and Davey staged a fake séance in 1887. That the séance was fake was unknown to the sitters who duly took notes of the proceedings. The descriptions of the happenings were so inaccurate as to prompt this conclusion:

…the account of a trick by a person ignorant of the method used in its production will involve a misdescription of its fundamental conditions…so marked that no clue is afforded the student for the actual explanation.

Richard Hodgson, Proceedings of the Society for Psychical Research, 9, 360, 1894.

Practically this means that some happenings will be literally inexplicable not because they were paranormal but simply because the account is garbled.
McLuhan cites work done by Hodgson and Davey but clearly fails to realize the implication.

The phantom narrative is what McLuhan calls attempts by skeptics to explain what went on in some séances, that is, how the tricks were performed. McLuhan does not find these speculations convincing. No problem; after all, they are just speculations. We don’t have a time machine; we can’t go back.
The problem is that he takes his doubts about these speculations as evidence for the paranormal. To him, either there is a perfectly convincing and satisfying normal explanation, or the event is evidence for the paranormal.
Showing how one explanation falls short is meritorious but it does nothing to show that another explanation is right.
What’s worse is that at least since Hodgson and Davey, we have had a mechanism that explains the inexplicable: even if nothing paranormal happened, the accounts of it can still be inexplicable!

In the end McLuhan mentions the possibility that modern technologies like infra-red cameras might settle the matter. Yet he has doubts; according to him, it is a question of reconciling ourselves with the idea of psychokinesis.
I found this curious because it suggests that McLuhan is unaware that infra-red videos of séances have been made and also that modern physical mediums generally disallow that.

One example would be psychologist Kenneth Batcheldor who filmed himself with some students while they rocked a table in pitch darkness. McLuhan quotes Batcheldor’s claim that during one séance the table levitated for 9 seconds. The filmed séances show nothing of the sort. See for yourself:

There are 3 more parts; go to YouTube to watch them.
Why would I believe that these people rock the table with their minds, or via some spirit, rather than with their hands? I guess I am just not reconciled with the idea of psychokinesis.

Another example that I want to mention is the scandal that took place at Camp Chesterfield. There’s a bit of footage of that to be found, too.

I guess McLuhan would say that just because people clearly have their hands on a table, they might still be using their minds to actually move it. He would probably also say that just because infra-red videos of séances do nothing but uncover trickery does not mean there aren’t real cases, too.
Both would be true. Of course, the logical impossibility of proving a negative isn’t actually evidence for anything either.

Randi’s Prize: Answering Chapter 1

Chapter 1: NAUGHTY ADOLESCENT SYNDROME

Chapter 1 deals with Poltergeist cases. It starts out by retelling some cases that were soundly debunked by various skeptics, including the infamous James Randi. These are related with hardly a counter argument and are so damning towards any idea of paranormal involvement that I actually checked the cover to make sure I was reading the right book.

The conclusion seems inescapable that poltergeist cases are usually caused by a troubled teen out for mischief.

It seems impossible that anyone who acknowledges the work of skeptics on these cases as valid and valuable would go on to argue for their paranormality. And yet McLuhan does exactly that.

Here’s how McLuhan puts his misgivings:

Yet there was something here that didn’t seem quite right to me, and it kept drawing me back. If you read the literature on the subject you’ll find that poltergeist incidents tend to be extraordinarily fraught. The people involved are overcome with panic and confusion, not just for a few hours but for days and weeks on end. This isn’t an effect one expects to result from mere children’s pranks. And as I said before, I often wondered how these children managed to create such convincing illusions and remain undetected.

The fallacy, which we see here twice in one paragraph, is a common one, sometimes called by the unwieldy name of the fallacy of the transposed conditional. Never mind the name.

I, too, wouldn’t expect a child’s prank to spook a normal person so thoroughly. But there’s another thing I wouldn’t expect: to hear of it.

We only hear of those cases where some people got really spooked. Is it possible that some pranks could do that? In my opinion, yes. Either the child may be gifted with an ability to play with people’s expectations, or the victims may be especially prone to seeing a paranormal influence, or maybe everyone just wants to get into the news and does more to promote the case than to find a solution. Whatever the individual circumstances, the exceptional cases are the only ones we hear about.

This fallacy is central to the chapter and indeed the book.

He just can’t see why normal children would do this, or believe that they could do it at all. He is right in his disbelief, but normal children just don’t get involved in poltergeist cases either.

There’s another problem with what he says. He says the children remained undetected and yet he has just related several cases in which they were caught red-handed. Oh, and what he calls children are all teenagers, one as old as nineteen.

He has more arguments:

There are a large number of similar cases, which suggests a distinct natural phenomenon. I could agree with that, but I would have to point out that the cases also suggest a distinct natural cause: the troubled teen.

And, of course, he mentions cases where no trickery was found. Unfortunately we already know that sometimes people hoax others. Should we really assume that every hoaxer is found out? If so then perhaps we should also count unsolved crimes as poltergeist cases.

Some parapsychologists compound the problem by insisting that some cases are “real” even when someone was found hoaxing. It is normal, they say, that people under pressure should use trickery to produce the phenomena that previously happened spontaneously and paranormally.

McLuhan comes closest to addressing this by pointing out that believing investigators expose some cases as hoaxes. This means we should assume that they know what they are doing. If someone can uncover one hoax, he must be able to uncover them all. It’s just like with police detectives: if a detective can solve one case, he or she is able to solve all cases, or else it must be alien abduction, right?

Skeptical investigators, meanwhile, deal with too few cases. The more cases someone investigates, the more credible they are. This may seem sensible; practice makes perfect. But who other than a believer will devote so much of their life to this? To the skeptic, this is just an endless parade of dysfunctional families. Dragging them into a paranormal investigation is not just a waste of time, it is downright unethical. What they need is a social worker.

Eventually, it will be the truest of believers, the downright delusional, who investigate most cases.

McLuhan does his best to raise doubts about the “normal explanation”, and some of his arguments have merit. If we knew that hauntings were “for real” and had only been looking at cases to find which were probably real and which faked, then his arguments might even have a point. But, as it is, we don’t know that; that is what these cases were supposed to establish.

One has to give McLuhan credit. He sees that the cases are not convincing by their nature. Where he fails is in taking the unremarkable as evidence.
