## Feeling the Future: Part 2

In my first post on Feeling the Future, I mainly discussed how its misuse of statistics relates to science in general. I said little about how exactly the statistics were misused. My thinking was that a detailed examination would be too boring for the average reader.
I still think that, but nevertheless I will spell out exactly how we know that Bem misused statistics.

### The Problem Explained

The good news is that you don’t need to know statistics to understand this problem. You surely know games that use dice, Monopoly for example. In that game you throw 2 dice that tell you how far to move.
What if someone isn’t happy with the outcome and decides to roll again? That’s cheating!

Even small kids intuitively understand that this is an advantage, something that skews the outcome in a direction of one’s choosing. No knowledge of statistics or probability theory is necessary. While the outcome of each roll is still random, there’s a non-random choice involved.

If you roll 3 dice and pick 2, that’s the same thing, right?

How about we roll 4 and then pick 2, but with the stipulation that the 2 remaining dice must be used on the next move? Again there’s a choice involved. Within limitations the player can choose how to move, which allows an advantage. The player’s moves are no longer random.
This is despite the fact that the dice rolls are random and none are discarded.
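
A quick way to check this intuition is to enumerate every possible roll. The sketch below is only an illustration: it assumes a player who always wants the highest possible move, and the helper names (`mean_move`, `best_two`, `worst_two`) are made up for this post.

```python
from fractions import Fraction
from itertools import product

FACES = range(1, 7)  # one ordinary six-sided die


def mean_move(n_dice, pick):
    """Exact average of pick(roll) over all equally likely rolls of n_dice dice."""
    rolls = list(product(FACES, repeat=n_dice))
    return Fraction(sum(pick(roll) for roll in rolls), len(rolls))


def best_two(roll):
    """The player keeps the two highest dice."""
    return sum(sorted(roll)[-2:])


def worst_two(roll):
    """What is left over after the two highest of four dice are taken."""
    return sum(sorted(roll)[:2])


honest = mean_move(2, sum)                # roll 2 dice, use both
three_pick_two = mean_move(3, best_two)   # roll 3 dice, keep the best 2
four_first = mean_move(4, best_two)       # roll 4 dice, best 2 used now
four_second = mean_move(4, worst_two)     # remaining 2 used on the next move

print("honest roll:          ", float(honest))
print("best 2 of 3:          ", float(three_pick_two))
print("4 dice, first move:   ", float(four_first))
print("4 dice, second move:  ", float(four_second))
```

An honest roll averages 7. Picking the best 2 of 3 dice raises that to 203/24, about 8.46. With 4 dice and nothing discarded, the two moves still add up to 14 on average, but the player decides which move gets the good dice.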

Now we’re ready to get to Feeling the Future.
The results presented were very unlikely to have arisen by chance. Therefore, the argument goes, they probably didn’t arise by chance. Which means there must have been some unknown factor influencing the outcome.

You may realize that this is a shaky argument. Just because something is unlikely does not mean it doesn’t happen. The impossible doesn’t happen but the unlikely, by definition, must and does. The unlikely is set apart from the likely merely by happening less often.
Then again, the impossible is only impossible as far as we know. And that we’re wrong on something is at best unlikely, if that. In reality, as opposed to in mathematics, we’re always dealing with probability judgements, never with absolutes.
In other words, that argument is all we have. It is used in the same way in almost every scientific experiment.

So the argument is solid enough. In fact, I believe that there is something other than chance involved. Of course, dear reader, if you didn’t know that already you must have skipped the beginning of the post.

Bem’s experiments each had, according to Feeling the Future, 100-200 participants. In reality, at least some of them were assembled from smaller blocks of maybe around 50. This is a problem for exactly the same reason as the dice examples. Even if the outcomes in every block were completely random, once hand-picked blocks are assembled into a larger whole, that whole is no longer random.
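
For those who like to see it in numbers, here is a rough simulation of the effect. It is a sketch under loud assumptions, not a reconstruction of what Bem did: every block is summarized by a single z-score drawn from a standard normal distribution (so there is no effect at all), the blocks are equally sized, and the counts of 8 blocks run and 4 blocks kept are purely hypothetical. It also assumes the most extreme kind of hand-picking, namely always keeping the best-looking blocks.

```python
import random
from math import sqrt

Z_CRIT = 1.645  # one-sided 5% threshold for a standard normal z-score


def pooled_z(zs):
    """Combine the z-scores of equally sized blocks into one overall z-score."""
    return sum(zs) / sqrt(len(zs))


def false_positive_rate(n_run, n_kept, n_sims=20000, seed=1):
    """Share of pure-chance 'experiments' that come out significant when
    n_run blocks are run and only the n_kept best-looking ones are pooled."""
    rng = random.Random(seed)
    significant = 0
    for _ in range(n_sims):
        # Each block is pure chance: no effect whatsoever.
        blocks = sorted(rng.gauss(0.0, 1.0) for _ in range(n_run))
        kept = blocks[-n_kept:]  # hand-pick the highest-scoring blocks
        if pooled_z(kept) > Z_CRIT:
            significant += 1
    return significant / n_sims


honest = false_positive_rate(n_run=4, n_kept=4)       # preplanned: use everything
hand_picked = false_positive_rate(n_run=8, n_kept=4)  # hypothetical: best 4 of 8

print(f"all 4 of 4 blocks pooled: {honest:.3f}")
print(f"best 4 of 8 blocks pooled: {hand_picked:.3f}")
```

Under these assumptions the honest procedure raises a false alarm about 5% of the time, as it should, while the hand-picked one does so several times as often, even though every single block is pure chance.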

### Proof that it happened

How do we know that this happened? This doesn’t require knowledge of statistics either, just a bit of sleuthing.
First we note what it says in the footnote about experiment 5:

This experiment was our first psi study and served as a pilot for the basic procedures adopted in all the other studies reported in this article. When it was conducted, we had not yet introduced the hardware based random number generator or the stimulus seeking scale. Preliminary results were reported at the 2003 convention of the Parapsychological Convention in Vancouver, Canada (Bem, 2003); subsequent results and analyses have revised some of the conclusions presented there.

Fortunately, this presentation is also available in written form. Unfortunately it is immediately obvious that it doesn’t present anything corresponding to experiment 5.
The presentation from 2003 reported not 1 but 8 experiments, each with at most 60 participants. The experimental design, however, matches that reported in 2011.
The 8 experiments are grouped into 3 experimental series, so perhaps he pooled these together for the later paper? But no, that doesn’t work either.

I could write several more paragraphs of this kind, trying to write up a logic puzzle full of numbers as if it were a car chase. But my sense of compassion wins out. I know I would merely bore you half blind, my dear readers, and I won’t have that on my conscience.

Therefore I shall only give my answers as one does with puzzles. Check them with the links at the bottom if you like. I could easily have overlooked something or made a typo.

Experimental series 300 of the presentation is the “small retroactive habituation experiment that used supraliminal rather than subliminal exposures” that is mentioned in the File-Drawer section of “Feeling the Future”.
Experiment 102 with 60 participants must have been excluded because it has 60 rather than 48 trials per session.
Experiments 103, 201, 202, 203 combined form experiment 6. They have the same number of participants (n=150). Moreover, the method matches precisely. 100 of these 150 were tested for “erotic reactivity”. This is true for experiment 6 as well as the combination.
Experiment 101 could be part of experiment 5 but there aren’t enough participants. Additional data must have been collected later.
Note that the footnote points to “subsequent results”.

### Warning signs

Even without following up the footnotes and references, there are some warning signs in Feeling the Future that hint that something is amiss. For example:

The number of exposures varied, assuming the values of 4, 6, 8, or 10 across this experiment and its replication.

The only reason one would change something in an experiment is to determine if this one factor has any influence on the results. Here we learn that a factor was varied but there is neither reason nor justification given. Much less results.

These two items were administered to 100 of the 150 participants in this replication prior to the relaxation period and experimental trials.

The same thing applies here. A good experiment is completely preplanned and rigidly carried through. There’s no problem with doing less formal, exploratory work to find good candidate ideas that merit the effort necessary for a rigid test. But such exploratory experiments have almost no evidential weight.

Such warning signs are also present in the other experiments described in Feeling the Future. That could indicate that the same thing was done there as well. But don’t make the mistake of assuming that this issue is the only one that invalidates Bem’s conclusions. There’s also the issue of data dredging, which is like deciding which card game to play depending on what hand you were dealt. Small wonder, then, if you find your cards to be unusually good according to the rules of the game you chose.

In terms of an experiment, that means analyzing the results in various ways and then reporting those results that favor the desired conclusion. That Bem did this is also evident from a comparison of the 2003 and 2011 descriptions of what is apparently and purportedly the same data.
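
How much room does that leave for chance? A back-of-the-envelope sketch, assuming (unrealistically) that the different analyses are independent of one another and that each has the usual 5% chance of a false alarm. Analyses of the same data overlap, so the true figure lies somewhere between 5% and the number computed here. The function name `chance_of_false_hit` is made up for this illustration.

```python
def chance_of_false_hit(n_analyses, alpha=0.05):
    """Chance that at least one of n_analyses independent tests comes out
    'significant' at level alpha when nothing but chance is at work."""
    return 1 - (1 - alpha) ** n_analyses


for n in (1, 2, 5, 10, 20):
    print(f"{n:2d} ways of analyzing: {chance_of_false_hit(n):.1%}")
```

Under that assumption, 5 ways of analyzing the data give about a 23% chance of finding something reportable, and 10 ways give about 40%.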

Particularly worrying is that Bem has explicitly and repeatedly denied using such misleading methods. I shall restrain myself from speculating about what made him deny such an obvious, documented fact. It does not have to be dishonesty but none of the other possibilities is flattering, either.

There’s a common conceit among believers that skeptics don’t look at the data. Whenever someone claims this, ask them if there is anything wrong with Feeling the Future and you will know the truth of that.

1. #### des said,

June 5, 2011 at 9:45 am

In his seminar at Harvard, Dr. Bem acknowledged such optional stopping in methodology would be problematic and he stated firmly that it is not what he did. I think when we’re dealing with a scientist of the caliber of Bem, maybe we should give him the benefit of the doubt and consider the implications that his data might be telling us something.

I think the critique about using classical stats is missing something. The odds internal to the experiment are not everything that determines how reliable or not the data is. There is a probability associated with a professor at Cornell deciding to pull off an 8 year experiment based on an extraordinary belief. We cannot ignore the effect of this probability when interpreting what the data means. Taken together, the odds of the results with the odds that a study like this is carried out in the first place become astronomical. Because now, to factor the null hypothesis we don’t only have to believe that the results are not “significant enough”, but also that Bem’s beliefs are completely unfounded, which given his track record in psychology has a certain element of low probability associated with it.

By ignoring this fact, we would be allowing statistics to take over science.

3. #### Continued said,

March 22, 2012 at 6:23 pm

Also, experiment 6 can’t simply be 103 plus series 200, as the combined hit rate for erotic trials would be 48.0%, not 48.2% as reported for exp 6. In any case, 103 plus series 200 would contain no trials with 4 exposures, and Bem states that exp 6 had trials with 4, 6, 8 and 10 exposures. That said, the numbers of male (63) and female (87) subjects in exp 6 would match 103 (15M, 35F) plus series 200 (48M, 52F) exactly.