The Entity case reviewed

‘The Entity’ is a 1982 movie (trailer), “based on a true story”. It’s about a woman, Carlotta “Carla” Moran, who is harassed and raped by a ghost and who seeks the help of parapsychologists.

Just a few days ago, someone posted a YouTube link to a video in which parapsychologist Barry Taff, one of the real-life investigators on the case, talks about it. I decided to investigate. One result is that I think the YouTube clip is originally an extra on the DVD release of the movie (see Cinema of the Psychic Realm: A Critical Survey by Paul Meehan).
Between the case and the movie there is a novel by Frank DeFelitta, who is mostly a writer, director and producer.

A thorough investigation would require interviewing witnesses and, most importantly, reviewing what documentation is available. It is a fact that human memory is malleable, leading to things like False Memory Syndrome. This problem is especially acute in this case because the novel and the movie offer rich sources of entirely fictitious events and imagery. So-called source amnesia is very common. That means that you can remember a fact but not where or how you learned it.
For example, you may know that the US Constitution asserts the existence of “certain inalienable Rights, [and] that among these are Life, Liberty and the pursuit of Happiness.”
Where did you learn that? Did you read it? Were you told? Or did you hear on TV?

If you’re saying that you never learned this because it is not actually in the US Constitution, then grab yourself a cookie; otherwise, you have some more food for thought. Either way, I’m sure the point is clear. Memory is not too reliable.

So, basically, a proper investigation must rely on recordings made at the time, photos, video and sound, as well as contemporary written accounts or notes. I will try to do that but I’m only a poor blogger and have only so much time and energy. This post is restricted to what one can find on the net. That said, let’s go.

Sources

The case was investigated by Kerry Gaynor and Barry Taff of UCLA in 1974. Their findings were apparently published in IEEE publications, for some reason. The IEEE is an electrical/electronics engineering association.
A deleted Wikipedia entry gives these two references:

Taff, B.E. & Gaynor, K., “Another Wild Ghost Chase? No, One Hell of a Haunt”, in Wescon Special Session: “Psychotronics, 1975”, 1975 Wescon Professional Program, Western Electronic Show & Convention, San Francisco, September 16-19th, 1975 (Proceedings of the IEEE) [Institute of Electrical and Electronics Engineers, Inc.]

Taff, B.E. & Gaynor, K., “Another Wild Ghost Chase? No, One Hell of a Haunt”, in ELECTRO/76, Special Session: “Psychotronics III”, Electro 76/Professional Program, Boston, May 11-14th, 1976 (Proceedings of the IEEE) [Institute of Electrical and Electronics Engineers, Inc.]

Apparently these do not exist in electronic form at present, so I am unable to confirm or deny the accuracy of these references. The TV show Sightings seemed to at least confirm that there was something in ELECTRO/76.

Parapsychological articles generally cite a different source.

Taff, B.E. & Gaynor, K., “A New Poltergeist Effect”, in Theta: A Journal For Research On the Question of Survival After Death, Journal of the Psychical Research Foundation, Durham, N.C., Vol. 4, No. 2, Spring 1976, pp. 1-7

Apparently there are plans to make that archive electronically available, but that has not happened yet.

What I did find was a sort of reprint in PSI Journal of Investigative Psychical Research, volume 4(1) from 2008. The introduction mentions changes made.

Desiring to set matters straight on what really occurred thirty-four years ago in a Los Angeles suburb, I am reprinting the original article with only minor upgrades and adjustments to compensate for three decades of time and acquired knowledge regarding this type of research.

I hope this only refers to background information. I am basing most of this post on that article.
I also listened to a few interviews with Barry Taff and watched a few segments from TV shows. Notably, a segment in the show Sightings (I think Season 1, Episode 3) featured Kerry Gaynor talking a lot and showing the photos. Unfortunately, that show was first aired in 1992, almost 20 years after the original occurrences and a decade after the movie. Gaynor seems to back up Taff’s article.
California’s Most Haunted is not very informative but has sound bites by Mort Zarkoff.

Problems with the Witnesses

I’ve already pointed out that long-term memory is not too reliable. Now I must point out that even accounts written directly after an event are problematic. This realization dates back over 100 years. At the time, parapsychology was very interested in séances. These took place in dark, even pitch-dark rooms and were led by a medium.
Supposedly all sorts of supernatural events, objects flying, ghosts materializing and so on, took place. Skeptics explained this as the medium using magic tricks while believers pointed out that what witnesses recounted could not possibly be trickery.
In 1887, Richard Hodgson, one of the leading psychical researchers of his day, put the skill of witnesses to the test. With the help of the amateur conjurer S. J. Davey, he staged fake séances for people who were not exactly unsuspecting, but who did not know that anything out of the ordinary would happen.
Their accounts were so incomplete and jumbled that it was impossible to reconstruct the tricks used. In fact, going by these accounts, trickery could be ruled out, in defiance of the facts. (Read more)

Another consideration is that some of those involved have a financial motive to propagate the mystery of the case. They draw revenue from the movie and the novel. That is particularly true for DeFelitta, of course. The woman on whom all this was based also received money until her death or disappearance. I don’t know if she received any immediate, proper compensation for letting dozens of spectators into her home. It would have been fair.

According to my main source, DeFelitta was only present, with cameraman Mort Zarkoff, on one evening when nothing much happened. That’s not the impression one would get from TV shows on the case, where he invariably appears. He happily relates seeing things that apparently happened when he was not there…
There were a few sound bites from Zarkoff in California’s Most Haunted. They make it appear as if he is, like DeFelitta, talking about something he didn’t see; however, that could be due to misleading editing.

Gaynor and Taff were technical advisors on the movie. I do not know how substantial the royalties are.

Barry Taff appears in a number of interviews, for example on Coast to Coast. He comes across as a very colorful character. He claims amazing psychic powers for himself, such as the ability to psychically diagnose medical conditions. He also relates how he was once beaten up by a ghost, except that he also says that other witnesses saw him get beaten up by a young man. As a witness he certainly has a credibility problem.

Kerry Gaynor now seems to be a hypnotist specializing in smoking cessation. At least, I think the man in the photo accompanying this article is the same person. I feel he’s the most credible of the bunch.

Another person who appears in TV shows is Dick Thompson, a professional photographer. The implication is that he was also a witness, but he is not mentioned by name in my main source.

Allegedly, there were over 50 witnesses in total to one paranormal event or another. The problem is that we do not have that many witnesses actually testifying. We only have these five, who testify both to the phenomena and to the presence of other witnesses. What those other witnesses would say is simply unknown.

If anyone feels like digging up the original reports in a library, by all means check the following issues of the journals as well. Someone may have written a letter to the editor in which they confirm or deny the accuracy of the account.
However, even someone who saw absolutely nothing might still assume that all the good bits happened on another occasion or that the matter was simply not important enough to follow up on.

In any case, all those people that were present cannot give us much additional confidence in the accuracy of the account. With so many people present, the likelihood is higher that one of them would be a “bad witness”. Maybe one of them is not quite sane or not quite honest and will see or say anything with the right prodding.
Having so many people present also increases the possibility that one of them was motivated to play a little hoax.

At the end of the day, that so many people were present is a problem for the paranormal interpretation, rather than a plus.

Recent developments

As late as 2011, Taff said on Coast to Coast (January 23, 2011) that he did not know what had happened to the female victim. He had not maintained contact with her, but he knew from DeFelitta that she had stopped cashing the checks sent to her and that DeFelitta was unable to contact her.
More recently someone surfaced who claims to be the woman’s second child. I don’t know how or if the identity of that person was established. He claims that she died in 1996 from pulmonary disease but otherwise I see no new information. Read more about this here.

I am going to go through what allegedly happened according to Taff’s article in PSI. The article is contradicted in a few details by other sources.

What happened?

The meeting

In real life, the victim/experiencer was called Doris Bither at the time. She was married multiple times and apparently changed her last name accordingly. She is described as having been intoxicated during most of her dealings with the parapsychologists. She had four children: three boys aged 10, 13 and 16, who were interviewed, and one daughter, aged 6, who was never seen.
She lived in Culver City, California, in a shabby house that was “twice condemned” by the city.

Her first meeting with the parapsychologists was a chance encounter. She overheard Kerry Gaynor talking with a friend about hauntings in a bookstore and approached them.

The first visit

On August 22, 1974, Taff and Gaynor visited her small home in Culver City and interviewed the family.

Their accounts were fairly uniform in reference to a particular apparition whom they called “Mr. Whose-it.” The alleged apparition would appear in semi-solid form and was well over six feet in height, according to their testimony. Both Doris and her eldest son claimed to have seen two dark, solid figures with Asian features appear from out of nowhere within their mother’s bedroom, who at times appeared to be struggling with each other.
This particular event occurred several times, with one episode where Doris claimed to have physically bumped into the apparition in the hallway. Neither Doris nor her eldest son would accept the possibility that the apparitions might have been imagined or simply prowlers or intruders who forcibly entered the house.

Doris Bither also reported that she was sexually assaulted on several occasions, suffering large bruises. However, there were no medical records to substantiate this claim and the investigators could not see any bruising. The last attack had allegedly happened weeks prior, so any injuries would have healed in the meantime.
The ‘spectral rape’ was pretty much what the movie centered on, but there is no evidence, no matter how tenuous, that could corroborate Bither’s claims. By the by, California’s Most Haunted asserts that the investigators saw the bruising, complete with a dramatic reenactment in which an actress pulls her top down.

Another incident featured in the movie is simply hearsay as well.

Even more dramatic was Doris’ claim that during one particular attack, her eldest son overheard the scuffle and entered the bedroom. According to Doris, he witnessed her being tossed around like a rag doll by the entities. She alleges that when her son came to her aid, an invisible force picked him up and threw him backwards into the wall. The son corroborated his mother’s story, speaking of the sheer terror he experienced during that struggle.

Apparently these stories are only the tip of the iceberg, for Taff goes on:

I will refrain from going into all the bizarre stories that were related to us for we cannot substantiate them.

The bleak picture that emerges is one of a dysfunctional family, with an alcoholic mother, living in a run-down house. If you’re thinking, based on this, that these interviews promised an intriguing case, then Taff and Gaynor would, reportedly, have disagreed with you.

Our initial impression was to totally discount Doris’ claims and simply refer her to one of the psychiatrists at the NPI.

They changed their opinion only because:

However, a few days hence, Doris called to inform us that five individuals outside her family had now seen the alleged apparitions.

There is no mention of what exactly was seen, or whether the claim was corroborated by asking the witnesses. Presumably it was not.

The second visit

The parapsychologists returned to the house with cameras. They noticed cold spots, a rotting smell and a feeling of pressure on the inner ear. None of these reports is particularly odd. Such sensations can be easily induced by suggestion.
There is no mention of measurement devices being used to confirm that any objective temperature or air pressure differences existed. Still, I can’t help speculating about what might have caused these sensations if they were real.
The stench is obviously consistent with the destitute circumstances of the family. On a more speculative note, the stench may also have been connected with the feeling of pressure in the ears. The feeling might have simply resulted from breathing differently. Or there might have been noxious chemicals or mildew spores in the air, coming with the stench from whatever was rotting there, which might cause swelling of mucous membranes in the nose or ears and so lead to that feeling of pressure. But that’s really a lot of speculation without knowing if there was anything more than suggestion going on.
About the feeling of cold:

An intriguing factor, which in our opinion is highly significant, was that from the very first occasion we entered Doris’ bedroom, we both immediately noticed that the temperature was unusually low in comparison to the rest of the house, even though it was a hot August night and all the bedroom’s windows were closed.

A room can be cold because of closed windows, rather than despite them. They prevent warm air from flowing in. If the room is also shadowed by the rest of the house, then it will be cooler than the outside or the rest of the house, because it neither receives warm air nor hot sun.

Moving on…

The first of many to come, seemingly inexplicable happenings, occurred while Gaynor was talking to the elder son in the kitchen.
Gaynor was standing approximately one foot away from the lower cabinets when suddenly the cabinet door swung open. A frying pan flew out of the cabinet, following a curved path to the floor over 2.5 feet [ca. 75 cm] away, hitting with quite a thud. Now, of course, the immediate thing to surmise is that the pan was leaning against the cabinet door and finally pushed it open as it fell out. But we cannot accept this explanation for the trajectory of the pan as it came out of the cabinet was elliptical. It was seemingly propelled out of the cabinet by a substantial force.

2.5 feet does not suggest much force to me. Moreover, it suggests a very short flight. It seems doubtful that either Gaynor or the son was paying much attention to the cabinet, which raises the question of how much of the flight path was really seen rather than just subconsciously surmised. This event is only as inexplicable as eyewitness testimony is reliable. That can be summed up in three words: Not at all.
By the by, over the years the pan has begun flying further. In California’s Most Haunted, Taff says that it flew across the kitchen, many feet. Retroactive PK at work, perhaps?
As far as I am concerned, the falling hypothesis is entirely viable. Of course, it’s also possible that one of the kids played a prank. Come to think of it, the report doesn’t even say that Gaynor and the elder son were alone in the kitchen.
By the way, the term “elliptical trajectory” is nonsense in this context. A pan falling or knocked out of a cabinet follows, to a very good approximation, a parabolic arc. The wording shows a lack of formal physics knowledge, but whether that is relevant here is a different matter.

During this visit, an alleged psychic and friend of Bither called “Candy” was also present. At some point she called out that she felt a presence in the bedroom. Taff rushed in from the kitchen and immediately took a Polaroid photo. At the time, there were no digital cameras. Normal cameras used chemical film which had to be developed in a lab, but Polaroid cameras produced pictures that developed on their own, on the spot.
So they could immediately examine the picture. It was completely “bleached” white. Taff uses the word “bleached”; to me these photos simply look overexposed. Psychic Candy felt a presence several more times and each time either Gaynor or Taff took another photo.
It’s not particularly remarkable that they would have produced a number of bad photos. Electronics in the 1970s was not as advanced as it is now, so cameras required a lot of manual adjustment. The question is rather why the photos for which Candy indicated a “presence” were overexposed.
One answer may simply be that this is selective memory. Maybe they just didn’t bother mentioning pictures that came out normal when Candy shouted.
Another point may be that when Candy shouted they simply did not have the time to adjust the settings properly. They may also have held the camera differently at those times, covering the built-in brightness sensor. It may also be that the sensor was not fast enough to adjust to different conditions when they rushed from one room to another.
Whatever the case may be, a few overexposed photos hardly deserve to be called evidence.

In one close-up of Candy, which was taken when she said the presence was right in front of her, her face is “bleached” but the surroundings not as much. That can happen when using a flash. The bright flash is reflected by the face but quickly dissipates over longer distances.
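
For a rough sense of why a flash can bleach a nearby face while leaving the background looking more or less normal, here is a minimal back-of-the-envelope sketch of the inverse-square falloff involved. The distances are assumptions of mine, purely for illustration.

```python
# Hypothetical numbers, only to illustrate inverse-square falloff of a flash.
face_distance_m = 0.5   # assumed camera-to-face distance
wall_distance_m = 3.0   # assumed camera-to-background distance

# Illumination from a small light source falls off roughly as 1/r^2.
relative_illumination = (wall_distance_m / face_distance_m) ** 2
print(f"The face receives roughly {relative_illumination:.0f}x "
      "as much flash light as the background.")
```

With these made-up distances the face gets on the order of 36 times more light, easily enough to blow out the exposure there while the background stays comparatively dim.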

One more picture showed “a small ball of light” but none of those present had seen it. White spots may result from damage to the photographic material. The specific camera used, the SX-70, was popular among artists, not just because it was the first camera to use integral instant film but also because the pictures could be manipulated by applying the right kind of pressure afterwards.
I haven’t seen the specific photo, though.

Standing there in amazement for several minutes discussing this phenomenal picture, I happened to glance over toward the bedroom’s eastern window and suddenly observed several rapidly moving, electric-blue balls of light.

The most obvious explanation here is that Taff simply saw a car’s headlights, or whatever, reflected in the window glass. No one else saw anything.

What is interesting is that these are the first strange lights mentioned. There is no mention that the family had ever seen any strange lights.
Now something appears on a picture that they interpret as a ball of light and immediately Taff sees a strange light.

The last paranormal phenomenon of the day:

[…]the Polaroid suddenly and inexplicably took the last photograph by itself.

duN-dun-DUNNNN! Boo!
I have to ridicule this because I have absolutely no explanation for this. File under total proof.

They also had an infrared camera but due to mishandling the film was exposed. They do not consider the possibility that it was malicious manipulation by the entity.

The third visit

On this occasion they took a professional photographer with them. Unfortunately, just on that day they did not obtain any photographic evidence. What a strange coincidence. Or maybe the work of a malicious spirit?

Our third [visit] to Doris’ house was most notable in that it was the first occasion where we both collectively witnessed identical visual phenomena. On more than twenty separate occasions, all of us present in the bedroom, including Doris and the female photographer, simultaneously observed what appeared to be small, pulsing flashes of light. It was at this point that we decided to further darken the candle-lit room and hung several heavy quilts and bedspreads over the windows and curtains. Our attempt partially succeeded in that we significantly attenuated the outside light.
The change in light intensity within the bedroom did not affect our most unusual luminous “friend” that now appeared even more brilliant against the darkened surround. It should be noted that we both alternately watched the various window areas in the hope of determining if the source of light was originating from outside the house, perhaps from a passing vehicle or neighbor’s flashlight.
After several such attempts, we were satisfied that whatever these moving, pulsing lights were, they were not originating from outside the house as the thick quilts draped over the window curtains would have easily told us of such an photonic intrusion. The sudden and rapid appearance and disappearance of the lights on this night made it virtually impossible to obtain any photographs, regardless of the fact that it appeared over ten times on the front of the bar area alone.

I don’t know what I should say to that. I don’t understand how they became satisfied that the lights did not come from outside.
If I had to guess, then my guess would be that it was light from outside after all.
Whatever the case, with a bit of creativity I can think of other causes for such elusive lights. One is reflected, flickering candlelight, or perhaps light from equipment. Also, it just might be that someone played a trick on them. The three teenaged sons seem ideal suspects.
Another possibility is misperceptions in the low light environment, aided by suggestion.
The quirks of human vision also make it more difficult to track down a dim light source in a dark room. The human retina is not equally light-sensitive everywhere. The edges of our vision are much more light-sensitive than the center.
So we might see a light that is really there out of the corner of the eye and then lose it when looking straight at it. An occasional glance into a bright candle may leave afterimages.

We learned the next day that our attractive female photographer, after being dropped off by us at her apartment, became so overpoweringly ill from the effects of the bedroom’s malodorous environment that she regurgitated heavily before retiring.

They don’t say why they consider this particular detail noteworthy, nor why they feel it necessary to point out that the photographer was attractive. I suppose the implication is that she suffered because of the rapist ghost.
One might also conjecture that she, or maybe all of them, was (accidentally or not) drugged or poisoned, which caused them to see lights and her to throw up. That is not the most likely possibility, though, in my opinion.

The fourth visit

On their fourth visit they brought a number of other people with them and conducted a séance. The séance is usually glossed over in documentaries, for some reason. They only mention dozens of witnesses, or whatever the number was. Perhaps the producers fear that saying that they were there for a séance detracts from their credibility.

The séance circle consisted of some eight individuals, including more than ten spectators, many of who came with cameras loaded with high-speed infrared film with IR flashes, and high-speed black and white film with deep-red filtered strobes.

There is no mention of who these people were or how they were recruited. Nor is there any mention that written testimony was collected.

On this night we all observed what appeared to be extremely intense lights, which were not stable either in size or luminosity. The lights were at times three-dimensional in nature, reaching out between various individuals within the circle. Judging from the rapidly changing size, dimensional characteristics and intensity of the lights observed, it is our opinion that these manifestations were not fraudulently created, nor the result of collective hallucinations.

Apparently the lights were different from those on the previous night. That suggests a different cause.
Taff asserts that these lights could only have been “faked” using lasers and spends several paragraphs arguing that a concealed laser apparatus is an impossibility. He never argues, however, for the assertion that a laser would have been required in the first place.

In Sightings Gaynor mentions the possibility that it might have been a flashlight but dismisses the idea based on a photo. I wonder why that is not mentioned by Taff. The photo he bases his dismissal on was apparently taken during the next visit, according to my primary source so it is discussed below.

As far as photos on this day go:

With three 35 mm. cameras continuously firing at these oscillating greenish white, three-dimensional lights, only one photograph depicted anything significant. The camera loaded with Kodak Tri-X black and white film with a deep-red filtered strobe captured what appears to be a small ball of light flying across the corner of the room. The sixth obtained photograph displayed an object bearing strong resemblance to a comet with a tail behind it.

And later:

However, several other pictures showed what appeared to be faces or figures outlined in light against a sliding closet door. But, as these images are highly subjective in nature, much like a Rorschach, we did not subject them to further analysis. Another photograph depicted an intense light against the south-facing wall in several separate frames.
The professional photographer who took these pictures was convinced that this exposure could not be explained away as irregularities in paint or a “hot spot” of reflection. The photographer was similarly convinced that the flying ball of light, discussed earlier, which he also caught, was not an artifact of overdeveloping or scratch marks on the negative.
The criticisms raised against the facial and figure outlines on the walls were, in most respects valid, in that the lack of uniformity of paint on the bedroom walls in conjunction with the slight penetrating power of the pushed Kodak Tri-X film could have conceivably accounted for these unidentifiable figures, which unfortunately were not recognizable by everyone examining the photographs.

The phenomenon of seeing shapes in random patterns is known as pareidolia. There is no mention of who examined the photographs.
The assertions of the professional photographer can be believed or not. That he reportedly denies pareidolia harms his credibility, but whether that matters here is another question.

So what happened that night? No one knows; remember the limitations of eyewitness reports. Nevertheless, I will indulge in some speculation.

Green afterimages are seen after looking at a bright red light. This could be a candle or the aforementioned red strobe. Particularly in a dark room, such an afterimage can appear as a three-dimensional glowing mass. Though it would take an iron will to believe in order not to see it for what it is.
Whether that will was present in the witnesses from whom we heard, you decide.

Lights from outside may also have played a role.

Then, there is still the possibility that someone, one of the family’s kids or a spectator, played a prank with a flashlight. It seems to me, though, that there should then be more photos with a bright light in them. Still, an iron will to disbelieve might lead one to dismiss such photos, especially if the flashlight was usually dimmed. So, who knows.

The fifth visit

Our fifth visit to Doris’ house resulted in a large-scale magnification of all phenomena. We began by duct taping large black poster boards up on the walls and ceiling of the bedroom, all of which were numbered and identified with a magnetic orientation. White duct tape was placed between the dark panels that formed a grid network, like graph paper, therein providing us with a reference for further attempts at photographing the lights. Black poster boards were also used to seal off all the light entrances into the bedroom that rendered the environment almost pitch black.
With over 30 individuals, some of whom were volunteers from our UCLA Parapsychology laboratory, the lights returned and were even more brilliant than before, as well as demonstrating a direct responsiveness to our verbal suggestions. The three-dimensional lights seemingly reacted and responded to our jokes and various provoking remarks, especially those of Doris.

They asked the light to answer questions by blinking on certain panels, which it did.

[…]two blinks in panel three for “yes” and four flashes in panel six for “no.”

There is no full account of the questions asked, nor the answers given. In short:

The answers we received could not be confirmed, and never really made any sense.

These answers are interesting because if everyone saw the same answers then that would indicate that there really was a light, regardless of origin.
Unfortunately, I don’t think that can be taken as a given. They were sitting around in the dark waiting for something. Now if someone just shouts out that they see something, then, quite possibly, others will go along. That someone would buck the trend and say that he or she does not see anything seems less likely. They might figure that their night sight is not good enough.
And then there’s the question of to what degree Taff and Gaynor would have been willing or able to credit skeptical remarks.

This night produced the best piece of evidence in the form of this photo.

In the photo you see two arcs of light, which is not what was seen, according to the report.
The smaller one on the left looks to me like a “kink mark” (don’t google that at work). That’s what happens when a chemical film is roughly handled and bent, which causes the light-sensitive chemicals to be displaced.
Often, the smaller, left arc is cut off, leaving only the large arc above Doris Bither. Much is made of that.
The arcs are interpreted as traces of the balls of light, left because the shutter speed was not high enough. On its face that seems plausible but notice how the arcs fade out at the ends. That points rather to a kink mark in my opinion.
In either case, I have to agree with the often made argument that these arcs cannot be light projected on the background since otherwise there should be bends and discontinuities. This is particularly true for the left arc that covers someone’s head. Though for some reason Taff always points to the larger arc not being bent by the corner.

I don’t see any poster boards on the walls or windows. That could mean that Taff is mixing up dates and this photo was actually taken the previous day.
Or it may be that it simply was not taken in the bedroom. That would indicate that it was not taken during the séance when the lights were seen.
Either way, it is a worrying discrepancy.

When Adrian Vance, the West Coast Editor of Popular Photography examined the negatives of these photos, he was as perplexed as we were. According to Vance, the very nature of optical glass in a 35 mm. SLR camera prohibits such inverted arcs from occurring. Yet here they are. Vance could not conceive of any known artifact or anomaly to account for such images.

I find it hard to believe that the smaller arc is not a kink mark as it matches the examples I have seen. I don’t know about the bigger one.
There is reason to believe that Vance judges the likelihood of errors in pictures differently than many of his colleagues: he is, or was, also an occasional UFO researcher who has validated some UFO photos. Whether his judgment is more or less accurate than that of his colleagues, I cannot say.
The photo was published in the magazine. It would be very interesting to know what letters-to-the-editor that produced. (“UCLA Group Uses Camera to Hunt Ghosts”, Popular Photography, May 1976, pp. 102 & 115.)

They also had a Geiger counter which gave no reading at one point during the evening. This is interpreted as radiation being actually shielded rather than the device simply malfunctioning. No comment.

In the same night, the poster boards were pulled from the walls. It was suggested that maybe the heat could have weakened the adhesive tape but Taff asserts that some paint and plaster had been pulled from the wall as well and was still sticking to the tapes.
The possibility that one of the family was responsible was not discussed. In some interviews the idea that Doris Bither may have done it herself is repudiated by saying that she was petite. I think ladder technology had already been developed by 1974 but whatever. When I hear of petty vandalism, I’m thinking teen-age male.

The sixth visit

Our sixth session at the house took place five days later and in most respects was a repeat performance of our fifth visit with the exception that the lights repeatedly began to take shape, forming the lime green, partially three-dimensional, apparitional image of a very large, muscular man whose shoulders, head and arms were readily discernible by the more than twenty individuals present. However, no salient facial characteristics of this apparition were discernible.

Nothing is made of it but this is the first time an apparition is seen by outsiders, similar to what the family had claimed.
Allegedly two persons, called Jeff and Craig, fainted when they saw the apparition. Perhaps fainting is not so unusual in the California heat in a crowded, ill-smelling room.

The display of lights this evening was so intense that they easily illuminated the numbered poster boards covering the walls of the bedroom’s corner. Even the clothes of the individuals observing the lights from outside the séance circle were brightly lit by the luminous activity. In fact, so piercing were the lights that they were seen to reflect off the camera’s aluminum frame and lenses, all of which were aimed directly at the corner where the optical display was concentrated.

Not surprising to us, considering the past two attempts at photographing the lights, all the negatives were perfectly clear, as if no light whatsoever was present to expose the film.

Undoubtedly, that was the most amazing night. If there was real light, then someone must have very skillfully manipulated seven cameras unnoticed, which seems implausible.
The apparition is attested to by both Gaynor and Taff. Could they have convinced each other of having seen something that was not there at all? The answer is yes, of course. And while that explanation is not implausible, it is deeply unsatisfying to me. What really happened there?

The seventh visit

Again, the poster boards had been torn down. Also Doris Bither and two of her sons reported other psychokinetic phenomena. She showed a large bruise on her arm which supposedly resulted from having been hit by a candelabra thrown by an invisible force.

Accompanying us on this evening was Dr. Thelma Moss, head of our laboratory at UCLA’s Neuropsychiatric Institute, various assistants from the lab, several psychiatrists from the institute who professed an interest in such phenomena, and Frank De Felitta, a renowned writer, producer, and director of The Stately Ghosts of England (NBC, 1965).

This appears to be the only occasion on which DeFelitta and Mort Zarkoff accompanied them, despite how it sounds in interviews.

There were some faint glimmerings of light, but they were in no way intense enough to cause any real excitement.
[…]
Sadly, those attending only on this evening did not witness the “magnificent” display of swirling, three-dimensional lights or the apparition that had occurred within the house.

The obvious question is: why the no-show? An obvious but speculative answer is the presence of some more skeptical authority figures in the form of Thelma Moss and several psychiatrists. If the lights were really just afterimages and hysteria, then a few more sober persons, not going with the flow and not afraid to voice doubt, would have been quite a damper.
Of course, it’s also possible that a hoaxer simply lost interest, was not invited or any number of other things.
Taff remarks that Bither was much calmer during that visit and for the first time not intoxicated.

However, at one point during the séance, Gaynor suggested to the “presence” in the house, whatever it was, that it should demonstrate its strength by again tearing the poster boards off the walls, but this time, in our presence. As if in immediate reply, within five seconds following Gaynor’s request, several of the poster boards directly over Doris’ head were suddenly torn loose from their position and sharply struck her in the face.
Both Gaynor and I, as well as others in the room, could easily observe the bizarre sight of the duct tape being pulled, again as if by unseen hands, from the boards on the wall.

That sounds quite amazing and even more so when you hear Gaynor relating that in an interview.

Needless to say, the opinions of some of those attending only for the seventh visit to Doris’ house were anything but positive, as our claims were only marginally supported. As far as the activity surrounding the poster boards was concerned, many of those present felt that their sudden removal from the walls and ceiling was explainable under the heating and humidity hypothesis discussed earlier. Yeah, right. And pigs can fly too.

This is a remarkable passage. It reveals that there are two completely different views of the happenings. And it is the only passage that reveals that. I also wonder why it should be needless to say that.
Apparently what Gaynor and Taff witnessed was a solidly fixed board being torn from the wall and flying straight at Bither, right on cue. What the others witnessed was boards just falling down. I wonder if these less impressed witnesses would even agree that it happened on cue. Witness reports are often unreliable with regards to timing.

Needless to say, Frank and Mort were absolutely amazed by even this, less than expected, occurrence. Unfortunately, when they had their special films processed, it did not reveal anything significant as related to what we all had observed that night in Doris’s bedroom.

That nothing remarkable was caught on camera means either that the cameras were not pointed at Bither or the right poster boards at the right time, or that the falling boards did not look impressive.

Note that Taff refers to events that all witnessed. That makes one wonder about the other occasions where supposedly many people saw something. Did really everyone see something amazing or just something?

The last visit

The 8th and last visit took place on Halloween 1974. It is described as even less remarkable than the previous one. No more details are given.


Radin for a Rerun

This is the 3rd and currently last part in my series on parapsychological double-slit experiments. It discusses the pilot study by Dean Radin and the six following experiments by him and others.

The Less Said…

In 2008, Dean Radin published an article in Explore, The Journal of Science & Healing, a journal dedicated to alternative medicine. That’s certainly a good way to hide it from anyone who knows or cares about physics. Or ethics.
Dean Radin is currently coeditor-in-chief.

Now hold tight because this paper is bad.

Obviously, the original design and justification for the experiment are taken from Jeffers, though Radin fails to give him the appropriate credit. However, the implementation is different.

Radin used a so-called Michelson interferometer for the experiment. This is physically equivalent to Jeffers’ set-up in all relevant aspects. However, due to being extremely sensitive to environmental factors, like temperature or vibrations, it seems a less than ideal choice.
Another thing that is different is the outcome measure. That’s where things really go south. He uses a CCD chip to capture the interference pattern. Unfortunately, he then completely disregards it.
Instead, he computes the total intensity of the light reaching the chip.
With that, the experiment becomes a test of whether the subjects can cast a shadow by concentrating on a spot.
Radin seems completely oblivious to that simple fact.
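
To see why summing all the light on the chip throws away the interference information, here is a minimal numpy sketch. It is my own toy illustration, not Radin’s data or code: a fringe pattern with visibility V is integrated over many fringes, and the total comes out the same whether the fringes are fully there, washed out, or gone entirely.

```python
import numpy as np

# Toy fringe pattern on a detector: I(x) = I0 * (1 + V * cos(x)),
# where V is the fringe visibility (contrast). Purely illustrative numbers.
x = np.linspace(0, 50 * 2 * np.pi, 100_000, endpoint=False)  # 50 full fringes
I0 = 1.0

for visibility in (1.0, 0.5, 0.0):  # perfect, partial, and no interference
    pattern = I0 * (1 + visibility * np.cos(x))
    print(f"visibility={visibility:.1f}  summed intensity={pattern.sum():.1f}")

# All three sums come out essentially identical: the total amount of light
# on the chip does not care whether the fringes are there or not. Only an
# overall dimming, i.e. a "shadow", would change this number.
```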

In all likelihood, Radin simply did not get a positive result and then, instead of accepting it, went fishing. How that works (except not really) is simply explained here.

The title of the paper, Testing Nonlocal Observation as a Source of Intuitive Knowledge, tells us that Radin sees this as relevant to intuition. Intuition, as he points out, is regarded as an important source of artistic inspiration or scientific insight. It’s a bit hard to see what the connection between that and casting shadows should be. But remember that Radin doesn’t know what he is doing. He believes this is like Jeffers’ original design.
His first error is in thinking that the original design could show that someone gains knowledge about the photons’ paths. Learn why not in this post. This is a simple misunderstanding and easy to follow.
His second error is stranger and more typically parapsychological. He thinks that, if people can gain knowledge about the paths of a few photons in some unknown way, then it’s reasonable to assume that they can gain any knowledge via the same unknown mechanism.
We detect photons all the time with our eyes, and much more effectively. If that doesn’t tell us anything about intuition, then why should this?

The bottom line is that evidently the experiment was botched. And even if it hadn’t been, the conclusions he tries to draw just wouldn’t follow. The connections he sees are just not there, at least as far as there is evidence.

Six More Experiments

Finally, we get to the current paper by Dean Radin, Leena Michel, Karla Galdamez, Paul Wendland, Robert Rickenbach, and Arnaud Delorme. I think Radin wrote most of the paper, since it repeats so many errors of the previous one and the statistics are in his style.

Still, the paper is miles better. It follows Jeffers’ original set-up quite closely. For one, they use a standard double-slit set-up. They don’t measure the contrast in the way Jeffers did but something that should work just as well. At least, the justification seems solid to me, though I don’t know enough to actually vouch for it.
What’s more, they say that the ups and downs of the measure were given as feedback to the subjects. This again follows Jeffers’ lead and, importantly, gives me some confidence that it was not chosen after the fact. But again Jeffers is not properly credited with coming up with the original design of the experiment.

Unfortunately this paper again does not report the actual physical outcome. They don’t report how much the subjects were able to influence the pattern. What seems clear is that none of them was able to alter the pattern in a way that stood out above the noise.
It would have been interesting to know if the measurements were any more precise than those conducted by Jeffers and Ibison. If their apparatus is better, and the effect still couldn’t be clearly measured, then that would suggest confirmation that Ibison’s result was just chance.

They calculate that, on average, the pattern in periods where subjects paid attention was slightly different from the pattern when they did not pay attention. That is valid in principle because you can get a good measurement with a bad apparatus by simply repeating the process a lot of times.
However, any physicist or engineer would try their utmost to improve that apparatus rather than rely on repetitions.
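
To make the repetition point concrete, here is a minimal sketch with made-up numbers of my own: a noisy instrument is read many times, and the uncertainty of the average shrinks roughly as one over the square root of the number of readings.

```python
import numpy as np

# Hypothetical "bad apparatus": it returns a true value of 1.0 plus a lot
# of noise. Averaging many repeated readings narrows the uncertainty.
rng = np.random.default_rng(0)
true_value, noise_level = 1.0, 0.5

for n in (10, 1_000, 100_000):
    readings = true_value + noise_level * rng.standard_normal(n)
    mean = readings.mean()
    std_error = readings.std(ddof=1) / np.sqrt(n)  # scales roughly as 1/sqrt(n)
    print(f"N={n:>6}  mean={mean:.4f}  standard error={std_error:.5f}")
```

That is the statistical sense in which repetition can stand in for a better instrument, though, as said, a physicist would normally prefer to fix the instrument.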

On the whole, this is not handled like a physics experiment but more like one in social science.
Most importantly, the effect size reported is not about any physically measurable change to the interference pattern. It is simply a measure of how much the result deviated from “chance expectation”. That’s not exactly what should be of top interest.

The paper reports six experiments. To give credit where credit is due: some would have pooled the data and pretended to have run fewer but larger experiments to make the results seem more impressive. That’s not acceptable, of course, but it is still done (e.g. by Daryl Bem).

Results and Methods

The first four experiments presented all failed to reach a significant result, even by the loose standards common in social science. However, they all pointed somewhat in the right direction, which might be considered encouraging enough to continue.

Among these experiments there were two ill-conceived attempts to identify other factors that might influence the result.
The first idea is that somewhat regular ups and downs in the outcome measure could have coincided with periods of attention and no attention. I can’t stress enough that this would have been better addressed by trying to get the apparatus to behave.
Instead, Radin performs a plainly bizarre statistical analysis. I’m sure this was thought up by him rather than a co-author because it is just his style.
Basically, he takes out all the big ups and downs. So far, so good. This should indeed remove any spurious ups and downs coming from within the apparatus. But wait, it should also remove any real effect!
Radin, however, is satisfied with still getting a positive result even when there is nothing left that could cause a true positive result. The “positive” result Radin gets is obviously a meaningless coincidence that almost certainly would not repeat in any of the other experiments. And indeed, he reports the analysis only for this one experiment.
Make no mistake here, once a method for such an analysis has been implemented on the computer for one set of data, it takes only seconds to perform it on any other set of data.

The second attempt concerns the possibility that warmth might have affected the result. A good way to test this is probably to introduce heat sources into the room and see how that affects the apparatus.
What is done is quite different. Four thermometers are placed in the room while an experiment is conducted. The idea seems to have been that if the room gets warmer, this indicates that warmth may have been responsible. Unfortunately, since you don’t know if you are measuring at the right places, you can’t conclude that warmth is not responsible just because you don’t find any warming. Besides, it might not be a steady increase you are looking for. In short, you don’t know if your four thermometers could pick up anything relevant, or how to recognize it if they did.
Conversely, even if the room got warmer with someone in it, this would not necessarily affect the measurement adversely.
In any case, temperature indeed seemed to increase slightly. Why the same temperature measurements were not conducted in the other experiments, or why the possible temperature influence was not investigated further, is unclear to me. They believe this should work, so why don’t they continue with it?

The last two experiments were somewhat more elaborate. They were larger, comprising about 50 subjects rather than about 30, and took an EEG of subjects. The fifth experiment is the one success in the lot insofar that it reports a significant result.

Conclusion

If you have read the first part of this series then you have encountered a mainstream physics article that studied how the thermal emission of photons affects the interference pattern. What that paper shares with this one is that both are interested in how a certain process affects the interference pattern.

And yet the papers could hardly be more different. The mainstream paper contains extensive theoretical calculations that place the results in the context of known physics. The fringe paper has no such calculations and relies mainly on pop science accounts of quantum physics.

The mainstream paper presents a clear and unambiguous change in the interference pattern. Let’s look at it again.

The dots are the particles and the lines mark the theoretically expected interference patterns fitted to the actual results. As you can see the dots don’t exactly follow the lines. That’s just unavoidable random variation due to any number of reasons. And yet the change in the pattern can be clearly seen.

From what is reported in Radin’s paper we can deduce that the change associated with attention was not even remotely as clean. In fact, the patterns should be virtually identical the whole time.
That means, that if there is a real effect in Radin’s paper, it is tiny. So tiny that it can’t be properly seen with the equipment they used.

That is hardly a surprising result. If paying attention to something were able to change its quantum behavior in a noticeable way, then this should have been noticed long ago. Careful experiments would be plagued by inexplicable noise, depending on what the experimenters were thinking about.

The “positive” result that he reports suffers from the same problem as virtually all positive results in parapsychology, and also many in certain recognized scientific disciplines. It may simply be due to kinks in the social science methodology employed.
Some of the weirdness in the paper, not all of which I mentioned, leaves me with no confidence that there is more than “flexible methods” going on here.

Poor Quantum Physics

Radin believes that a positive result supports “consciousness causes collapse”.  He bemoans a lack of experimental tests of that idea and attributes it, quite without justification, to a “taboo” against including consciousness in physics.
Thousands upon thousands of physicists, and many times more students, have, out of some desire to conform, simply refused to do a simple and obvious experiment. I think it says a lot about Radin and the company he keeps that he has no problem believing that.
I don’t know about you, my dear readers, but if I were in such a situation I would have thought differently. Either all those people who should know more about the subject than me have their heads up their behinds, or maybe it is just me. I would have wondered if there was perhaps something I was missing. And I would have found out what it was and avoided making an ass of myself. Then again, I would have (and have) also avoided book deals and the adoration of many fans and the like, all of which Radin secured for himself.
So who’s to say that reasonable thinking is actually the same as sensible thinking.

But back to the physics. As is obvious when one manages to find the relevant literature, conscious awareness of any information is not necessary to affect an interference pattern. Moreover, wave function collapse is not necessary to explain this. Both of these points should be plain from the mainstream paper mentioned here.

Outlook

My advice to anyone who thinks that there’s something about this is to try to build a more sensitive apparatus and/or to calibrate it better. If the effect still doesn’t rise over the noise, it probably still wasn’t there in the first place. If it does, however, future research becomes much easier.
For example, if tiny magnetic fields influence this, as Radin suggests, that could be learned in a few days.

Unfortunately, it does not appear that this is the way Dean Radin and his colleagues are going about it, but I’ll refrain from comment until I have more solid information.
At least they are continuing this line of investigation, though, and they deserve some praise for that. It is all too often the case that parapsychologists present a supposedly awesome, earth-shattering result and then move on to do something completely different.

 

Update

I omitted to comment on a lot of details in the second paper to keep things halfway brief. In doing so I overlooked one curiosity that really should be mentioned.

The fourth experiment is “retrocausal”. That means, in this case, that the double-slit part of the experiment was run and recorded three months before the humans viewed this record, and tried to influence it. The retrocausality in itself is not really such an issue. Time is a curious thing in modern physics and not at all like we intuit.

The curious thing is that it implies that the entire recording was in a state of quantum superposition for a whole three months. Getting macroscopic objects into and keeping them in such states is enormously difficult. It certainly does not just happen on its own. What they claim to have done there is simply impossible as far as mainstream quantum physics is concerned. Not just in theory, but it can’t be done in practice despite physicists trying really hard.

Attention! Double-slit!

Recently Dean Radin and others published an article that purports to study the effects of attention on a double slit experiment.

Originally I wanted to do just a rebuttal to that but then found it necessary to also review the entire background. The simple rebuttal spiraled out of control into a 3-part series. My old math teacher was right: once you add the imaginary, things get complex, for reals. And not only for them.

A Word of Caution

People often ask for evidence when they are faced with something they find unlikely. The more skeptical will also ask for evidence for something they consider credible, at least sometimes. For the academically educated, evidence means articles published in peer-reviewed, reputable, scientific journals.
For example, all the articles I cite as evidence in the first part, where I look at mainstream quantum physics, are from such journals.
So here comes the warning. Not all journals that call themselves peer-reviewed are reputable. For example, there is a peer-reviewed journal dedicated to creationist ideas. And I probably don’t need to tell you what scientists on the whole think of creationism.

The journals that published the articles discussed in this series are not reputable. Mainstream science does not take note of them. Physics Essays, where the most recent article appeared, may very well be the closest to the mainstream, and still it is mostly ignored.
It is largely an outlet for people who believe that Einstein was wrong. We’re not talking about scientists looking for the next big thing, we’re talking about people who are to Einstein’s theory what creationists are to evolution.
This is not meant as an argument against these ideas, I just don’t want to mislead anyone into believing that there is a legitimate scientific debate going on here.

That’s not to say that science ignores fringe ideas. For example, Stanley Jeffers who appears in the second part of this series is a mainstream physicist who decided to follow up on some of those.
He just didn’t find that there was anything there. It was a dead end.
James Alcock has a few words on that in his editorial Give the Null Hypothesis a Chance.

There are many cranks out there. These are people who hold onto some theory in the face of contrary evidence. They will not go away, but they will, almost invariably, accuse mainstream science of being dogmatic. Eventually, there is nothing to be done but ignore them.

On to the Review

The first part gives a brief overview of the quantum physics background to the experiment. Dean Radin gets this completely wrong, and I fear the misunderstandings he propagates will pop up in many places.

Part 1: A Quantum Understanding

In the next part we will look at the experiment in question. Let’s call it the parapsychological double-slit experiment. We will learn who came up with the idea and what he found out and also what a positive result should look like and what it might mean.

Part 2: A Physicist Investigates

The 3rd and last part, for now, looks at the two articles authored by Dean Radin, presenting seven replications of the original design.

Part 3: Radin for a Rerun

Further studies are being conducted so more parts are likely to follow at some point.

Getting Wagenmakers wrong

EJ Wagenmakers et al published the first reply to the horribly flawed Feeling the Future paper by Daryl Bem. I’ve blogged about it more times than I care to count right now.

Their most important point was regarding the abuse of statistics. Or, as they put it, that Bem’s study was exploratory rather than confirmatory.
They also suggested a different statistical method as a remedy. I’ve expressed doubts about that because I don’t think that there is a non-abusable method.

Unfortunately, what they proposed has been completely and thoroughly misunderstood. The latest misrepresentation appeared in an article by three skeptics in The Psychologist. I blogged.

How to get Wagenmakers right

The traditional method of evaluating a scientific claim or idea is Null-Hypothesis Significance Testing (NHST). This involves coming up with a mathematical prediction of what happens if the new idea is wrong. It's not enough to say that people can't see into the future; you must say what the results should look like if they can't.
After the experiment is done, this prediction is used to work out how likely it was to get results such as those one actually got. If that is unlikely, one concludes that the prediction was wrong. The null hypothesis is refuted: something is going on, and this is then taken as evidence for the original idea.
There are a number of things that can go wrong with this method. One is choosing the null prediction after the fact, based on whatever results you got.
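
To make the mechanics concrete, here is a minimal Python sketch of NHST for a precognition-style guessing task; the task and the numbers (60 hits in 100 binary guesses) are hypothetical and only illustrate the logic described above.

```python
from scipy.stats import binomtest

# Null prediction: with no precognition, a guess is right 50% of the time.
# Hypothetical data: 60 hits in 100 trials.
result = binomtest(k=60, n=100, p=0.5, alternative="greater")

# The p-value is the probability of seeing 60 or more hits if the null is true.
print(result.pvalue)  # roughly 0.028

# Conventionally, p < 0.05 leads to rejecting the null hypothesis,
# which is then taken as evidence for the original idea.
```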

The method that Wagenmakers argued for is different. It involves not only making a prediction about what happens when the original idea is wrong, but also making a prediction about what happens if it is right.
Then, with the results of the experiment, one works out how likely the result was under either prediction. Finally, calculate how much more likely the result is under one hypothesis rather than the other. This last number is called the Bayes Factor.

For an example, imagine an ordinary 6-sided die but instead of being ordinarily labeled it has only the letters “A” and “B”. The die comes in 2 variations, one has 5 “A”s and 1 “B”, the other 1 “A” and 5 “B”s.
You roll a die once and get an "A". This result is 5 times as likely under the first variant as under the second.
You could use this result to speculate about what kind of die you rolled. But what if there is a third variant of die? One that has, say, 3 “A”s and 3 “B”s. Then your Bayes Factor would be different.
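
Worked through as a small Python sketch, using exactly the numbers from the example above:

```python
# Probability of rolling an "A" under each die variant
p_variant1 = 5 / 6  # five "A"s, one "B"
p_variant2 = 1 / 6  # one "A", five "B"s
p_variant3 = 3 / 6  # three "A"s, three "B"s (the hypothetical third variant)

# A single roll came up "A". The Bayes factor is the ratio of the likelihoods.
bf_1_vs_2 = p_variant1 / p_variant2
print(bf_1_vs_2)  # 5.0 -- five times as likely under the first variant

# Compare the first variant against the third instead and the factor changes:
bf_1_vs_3 = p_variant1 / p_variant3
print(bf_1_vs_3)  # about 1.7
```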

The Bayes Factor depends crucially on the two hypotheses being compared. Depending on which two hypotheses you compare, one or the other can come out looking more likely.

In the case of Feeling the Future, the question is basically what we should assume happens if there really is something there. How much feeling for the future should we assume?
Wagenmakers et al said that if one cannot assume anything for lack of information, then one should use a default assumption suggested by several statisticians. This assumption implied that people might be a little good at feeling the future, or maybe very good.
Bem, along with two statisticians, countered that we already know that people are not good at feeling the future. Parapsychological abilities are always weak, and therefore one should use a different assumption, under which the evidence came out looking much stronger.

Let's make this intuitively clear. Think again of the dice with 5 As or 5 Bs. You are told that one die was rolled 100 times and showed 30 As and 70 Bs. Clearly that is more likely to be a 5-B die than a 5-A die. But wait: what if, instead of comparing those two with each other, we compare either of them with a die that has 2 As and 4 Bs? That die would win.
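
A quick sketch shows why, comparing the binomial likelihood of 30 As in 100 rolls under each of the three dice:

```python
from scipy.stats import binom

n, k = 100, 30  # 100 rolls, 30 "A"s

like_5A = binom.pmf(k, n, 5 / 6)  # die with five "A"s and one "B"
like_5B = binom.pmf(k, n, 1 / 6)  # die with one "A" and five "B"s
like_2A = binom.pmf(k, n, 2 / 6)  # die with two "A"s and four "B"s

print(like_5B / like_5A)  # enormous: the 5-B die easily beats the 5-A die
print(like_2A / like_5B)  # but the 2-A/4-B die beats either of them by a wide margin
```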

I have simplified a lot here. If something doesn’t seem to make sense it’s probably because of that and not because of a problem in the original literature.

Bem's argument makes a lot of sense but overlooks that belief in strong precognition is widespread, even among parapsychologists. Tiny effects are what they get, not what they hope for or believe in. Both parties have valid arguments for their assumptions but neither makes a compelling case. On the whole, however, this dispute does show a problem with the default Bayesian t-test.

Let me emphasize again that Wagenmakers made two points: first, that Bem made mistakes in applying the statistics, and second, that it would be better to use the default Bayesian t-test rather than traditional NHST. These are separate issues.
In my opinion, the abuse of statistical methods is the crucial issue that cannot be solved by using a different method.

How to get Wagenmakers wrong

Bayesian statistics is often thought of as involving a prior probability. In fact, the defining characteristic of Bayesian statistics is that it includes prior knowledge.

Again, let's go with the example. You're only concerned with the two die variants, the one with 5 "A"s and the one with 5 "B"s. Someone keeps throwing the same die and telling you the result. You can't see the die, of course, but are supposed to guess which die was thrown based solely on the results.
Intuitively, you'll probably tend more toward the first kind with every "A" and more toward the second with every "B".
But what if I told you that I randomly picked the die out of a box with 100 dice of the 5 "A" variant and only one of the 5 "B" variant? You'll start out assuming it should be the 5 "A" variant and will require a lot of "B"s before switching.
Formally, we’d compute the Bayes Factor from the data and then use that factor to update the prior probability to get the posterior probability. The clearer the data is, and the more data one has, the greater the shift in what we should hold to be the case.
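
In the box example the update would look roughly like this; the particular sequence of rolls (two As followed by six Bs) is made up purely for illustration.

```python
# Prior odds from the box: 100 dice of the 5-A variant for every 1 of the 5-B variant
prior_odds = 100 / 1

# Hypothetical observed rolls: 2 "A"s and 6 "B"s
p_A_die, p_B_die = 5 / 6, 1 / 6
likelihood_A_die = p_A_die**2 * (1 - p_A_die)**6  # probability of those rolls for the 5-A die
likelihood_B_die = p_B_die**2 * (1 - p_B_die)**6  # ... and for the 5-B die

bayes_factor = likelihood_A_die / likelihood_B_die  # 5**-4 = 0.0016
posterior_odds = prior_odds * bayes_factor          # 100 * 0.0016 = 0.16

print(bayes_factor, posterior_odds)
# The data alone favor the 5-B die by a factor of 625, so even 100:1 prior odds
# in favor of the 5-A die are overwhelmed and the 5-B die becomes the better bet.
```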

In reality one will hardly ever know which of several competing hypotheses is more likely to be true. Different people will make their own guess. Some, maybe most, people will regard precognition as a virtual impossibility, a few as a virtual certainty.
Wagenmakers et al showed that even if one assigns a very low prior probability to the idea that people can feel the future (or rather to the mathematical prediction based on that idea), 2,000 test subjects would yield enough data to shift opinion firmly towards precognition being true.

Unfortunately, some people completely misunderstood that. They thought that Wagenmakers et al were saying that we should not regard Bem’s data as convincing because they assigned a low prior probability. In truth the only assumption that went into the Bayes factor calculation was regarding the effect size. That point was strongly emphasized but still people miss it.

[sound of sighing]

The May issue of The Psychologist carries an article by Stuart Ritchie, Richard Wiseman and Chris French titled Replication, Replication, Replication plus some reactions to it. The Psychologist is the official monthly publication of The British Psychological Society. And the article is, of course, about the problems the 3 skeptics had in getting their failed replications published.

Yes, replication is important

That the importance of replications receives attention is good, of course. Repositories for failed experiments are important and have the potential to aid the scientific enterprise.

What is sad, however, is that the importance of proper methodology is largely overlooked. Even the 3 skeptics, who should know all about the dangers of data-dredging, cavalierly dismiss the issue with these words:

While many of these methodological problems are worrying, we don’t think any of them completely undermine what appears to be an impressive dataset.

But replication is still not the answer

I have written before about how replication cannot be the whole answer. In a nutshell, by cunning abuse of statistical methods it is possible to give any mundane and boring result the appearance of showing some amazing, unheard-of effect. That takes hardly any extra work, but experimentally debunking the supposed effect is a huge effort. It takes more searching to be sure that something is not there than to simply find it; for statistical reasons, an experiment needs more subjects to "prove" the absence of an effect with the same confidence as finding it.
But there is also the possibility that some difference between the original experiment and the replication explains the lack of effect. In this case it was claimed that maybe the 3 skeptics failed because they did not believe in the effect. It takes just seconds to make such a claim. Disproving it requires finding a "believer" who will again run an experiment with more subjects than the original.

Quoth the 3 skeptics:

Most obviously, we have only attempted to replicate one of Bem’s nine experiments; much work is yet to be done.

It should be blindingly obvious that science just can’t work like that.

There are a few voices that take a more sensible approach. Daniel Bor writes a little about how neuroimaging, which has, or had, severe problems with useless statistics, might improve by fostering greater expertise among its practitioners. Neuroimaging seems to have made methodological improvements. What social psychology needs is a drink from the same cup.

The difficulty of publishing and the crying of rivers

On the whole, I find the article by the 3 skeptics to be little more than a whine about how difficult it is to get published, hardly an unusual experience. The first journal refused because they don’t publish replications.
Top journals are supposed to make sure that the results they publish are worthwhile. Showing that people can see into the future is amazing; not being able to show it is not. Back in the day, the constraint was simply the limited number of pages that could be stuffed into an issue; these days, with online publishing, it is the limited attention of readers.
The second journal refused to publish because one of the peer-reviewers, who happened to be Daryl Bem, requested further experiments to be done. That’s a perfectly normal thing and it’s also normal that researchers should be annoyed by what they see as a frivolous request.
In this case, one more experiment should have made sure that the failure to replicate wasn’t due to the beliefs of the experimenters. The original results published by Bem were almost certainly not due to chance. Looking for a reason for the different results is good science.

I’ve given a simple explanation for the obvious reason here. If the 3 skeptics are unwilling or unable to actually give such an explanation they are hardly in a position to complain.

Beware the literature

As a general rule, failed experiments have a harder time getting published than successful ones. That's something of a problem because it means that information about what doesn't work is lost to the larger community. When there is an interesting result in the older literature that seems not to have been followed up on, it is probably the case that it didn't work after all: the original report was a fluke and the "debunking" largely went unpublished. Of course, one can't be sure that it wasn't simply overlooked, which is a problem.
One must be aware that the scientific literature is not a complete record of all available scientific information. Failures will mostly live on in the memory of professors and will still be available to their ‘apprentices’ but it would be much more desirable if the information could be made available to all. With the internet, this possibility now exists and that discussion about such means is probably the most valuable result of the Bem affair so far.

Is Replication the Answer?

One question that is forced on us by the publication of papers like Daryl Bem’s Feeling the Future is what went wrong and how it can be fixed.

One demand that often arises is for replication. It is one of the standard demands made by interested skeptics in forums and such places. I can understand why calling for replication is seductive.
It is shrewd and skeptical. It says: not so fast, let's be sure first, while at the same time offering a highly technical criticism. Replication is technical jargon, don't you know? On the other hand, it's also nice and open-minded. It says: this is totally serious science and some people who aren't me should spend a lot of time on it.
And perhaps most important of all, it requires not a moment's thought.

Cynicism aside, replication really is important. As long as a result is not replicated it is very likely wrong. If you don't replicate, you're not really generating knowledge. Not only can you not rely on the results, you also lose the ability to determine whether you are using good methods or applying them correctly, which, I'd speculate, decreases reliability still further over time.

Replication is essential but is replication really all that is needed?

Put yourself in the shoes of a scientist. You have just run an experiment and found absolutely no evidence that people can see the future. That's going to be tough to publish.
Journals are sometimes criticized for being biased against negative results, but the simple fact is that they are biased against uninteresting results. Attention is a limited quantity; there's only so much time in a day that can be spent reading. Most ideas don't work out, so it is hardly news when an idea fails in an experiment. Think, for example, of all the chemicals that are not drugs of any kind.

Before computers and the information age it probably wouldn’t even have been possible to handle all the information about failed ideas. Things have changed now but the scientific community is still struggling to incorporate these new possibilities. However, one still can’t expect real life humans to pay attention to evidence of the completely expected.

Now you could try a new idea and hope that you have more luck with that.
Or you could do what Bem did and work some statistical magic on the data. And by magic I mean sleight of hand. The additional work required is much less and it is almost certain to work.
The question is simply if you want to advance science and humanity or your career and yourself.

If you go the 2nd route, the Bem route, your result will almost certainly fail to replicate.

So you might say that replication, if it is attempted, solves the problem. Until then you have a public confused by premature press reports, perhaps bad policy decisions, and certainly a lot of time wasted trying to replicate the effect. Establishing that an effect is not there always takes more effort than just demonstrating it.

To this one might say that the nature of science is just so, tentative and self-correcting. Meanwhile the original data magician, our Bem-alike, has produced a publication in a respectable journal, which indicates quality work, and received numerous citations (in the form of failed replications), which indicates that the paper was fruitful and stimulated further research. These factors, number of publications, reputation of journal and number of citations are usually used to judge the quality of work by a scientist in some objective way.

Eventually, if replication is all the answer needed, one should expect science to devolve into producing seemingly amazing results that are then slowly disproven by subsequent failed replications. The progress we have come to expect would be merely an accidental byproduct.

The problem might be said to lie rather in judging scientists in such a way. Maybe we should include the replicability of results in such judgments. But now we’re no longer talking about replication as the sole answer. We’re now talking about penalizing bad research.

And that’s the point. Science only works if people play by the rules. Those who won’t or can’t must be dealt with somehow. In the extreme case that means labeling them crackpots and ostracizing them.
But there are less extreme examples.

The case of the faster than light neutrinos

You probably have heard that some scientists recently announced that they had measured neutrinos to go faster than light. This turned out to be due to a faulty cable.

This story is currently a favorite of skeptics, who pointed out that few physicists took the result seriously, despite the fact that it was originally claimed that all technical issues had been ruled out. It makes a good cautionary tale about how implausible results should be handled and why. Human error is always possible and plausible.

There’s another chapter to this story, one that I fear will not get much attention.

The leaders of the experiment were forced to resign as a consequence of the affair.

There were very many scientists involved in the experiment due to the sheer size of the experimental apparatus. Among them there was much discontent about how the results were handled. Some said that they should have run more tests, including the test that found the fault, before publishing. Which means, of course, that they shouldn't have published at all.

It is easy to see how a publish-or-perish environment that puts a premium on exciting results encourages not looking too closely for faults. But what’s the alternative? No incentive to publish equals no incentive to work. No incentive for exciting results just cements the status quo and hinders progress.

A Pigasus for Daryl Bem

Every year on April Fools day, James Randi hands out the Pigasus Award. Here is the announcement for the 2011 awards, delivered on April 1 2012.

One award went to Daryl Bem for “his shoddy research that has been discredited on many accounts by prominent critics, such as Drs. Richard Weisman, Steven Novella, and Chris French.”

I've called this well deserved, but there's certainly much that could be quibbled with. For example, these critics are hardly the ones who delivered the hardest-hitting critiques. Far more deserving of honorable mention are Wagenmakers, Francis and Simmons (and their respective co-authors) for their contribution of peer-reviewed papers that tackle the problem.

A point actually concerning the award is whether it is fair to single out Bem for a type of misconduct that may be very widespread in psychological research. Let's be clear on this: his methods are not just "strange" or "shoddy", as Randi kindly puts it; they border on the fraudulent. Someone else, in a different field, might have found themselves in serious trouble over a paper like this. Though I think it would be very hard to get such a paper past peer review in a more math-savvy discipline.
But even if you think it is just a highly visible example of normal bad practice, surely it is appropriate to use the high visibility to bring attention to it. Numerous people have done exactly that, either using it to argue for different statistical techniques or to draw attention to the lack of necessary replication in psychology.

I doubt that Randi calling this out will do much good, since I doubt that many psychologists will even notice. And even if they do, I doubt it will cause them to rethink their current (mal)practice. There's a good chance that Bem will be awarded an Ig Nobel prize later this year. That would probably get more attention, but even so…

 

The reactions from believers have been completely predictable. They have so far ignored the criticisms of the methods and so they ignore that Randi explicitly justifies the award with the “strange methods”. They simply pretend that any doubt or criticism is the result of utter dogmatism.

Sadly, some skeptical individuals have also voiced disappointment, for example Stuart Ritchie on his Twitter feed. Should I ever come across a justification for such reactions I will report and comment.

Why doesn’t experiment 9 replicate?

I have written about Daryl Bem’s paper “Feeling the Future” before and laid out a few of the serious issues that invalidate it.

Recently it’s been in the news again because one of the nine experiments presented in it, experiment 9, was repeated and failed to yield a positive result. Of course, no one was particularly surprised by this, except perhaps the usual die hard believers. Still, some may wonder where the positive result came from in the first place. Just chance or something more?

Before we can look at the actual research we need to look at the dangers of pattern seeking…

Patterns are for kilts

[Image: a group of nine people]

Let’s do a little game. We pick a few people in this image and then we try to find some way to split those nine people into two groups in such a way that most of our picks end up in one group.
For example let’s take the 1st from the left in the first row and the 2nd in the bottom row.
Answer: Males vs. Females.

Again: We take the 2nd in the top row and the 2nd and 4th in the bottom row.
Possible answer: People with and without sunglasses.
It doesn’t work perfectly but mostly.

If you’re creative you can find a more or less good solution for any possible combination of picks. That’s the first take-away point.

Now let’s add a bit of back story and extend our game. The group went to a casino and some of them won big and those are the people we point out.
The goal of the game is now not only to find a good grouping but also to make up some story for why the one group had most of the winners.

For example: The sunglasses are a lucky charm and that’s why the group with glasses did better.
That’s alright, but lucky charm is kind of lame.
How about: Hiding the eyes helps bluffing in poker. Much better…
But wait, correlation does not equal causation as statisticians never tire of telling us. Pro-players like to wear sunglasses, as everyone knows, and that’s why that group did better.

So if you’re creative you can even find some semi-plausible explanation for why a group did better than another.
And when the explanation need not even be semi-plausible then you can always find one without any creativity. Lucky charm, magic or divine favor fits any case. That’s the second take home point.

You can always find some sort of pattern in any set of random data. For example, shapes in clouds. Random means that you rarely find the same pattern again.

For one final encore, let's make up, for each person in that picture, how much money they won or lost in the casino. Say top left: lost $145; top, 2nd from left: won $78; and so on…
Now find some feature that tracks the winnings. An answer might be skin bared in square cm, or height in inches, and so on.

Experiment 9

Experiment 9 is derived from a simple psychological experiment that could run something like this:
Step 1
Ask a subject to remember a list of words. The words are flashed one at a time on a computer screen for 3 seconds each.
Step 2
Then randomly select some words for the subject to practice. The selected words appear on the screen again and the subject types them. Of course, the subject can’t make notes.
Step 3
The subject is asked to recall the words.

The result is, unsurprisingly, that more of the practiced words are recalled.
Bem switched steps 2 and 3. That is, the words are practiced after they are recalled. You wouldn't expect that what one does after the fact makes a difference, but Bem claimed that the experiment was a success.

If you are new to parapsychology you would probably assume that this means that more practiced words were recalled. In fact, Bem does not tell us that. We don't know whether that was the case, but the omission is telling.
Bem constructs what he calls a "differential recall index" for each subject. You compute this by first subtracting the number of control words (words that did not get practiced) recalled from the number of practiced words recalled. Then you multiply this difference by the total number of words recalled. The result is then turned into a percentage, but I'll omit that in the examples.

So if subject 1 recalls 39 words in total and 20 of these are practiced later (leaving 19 controls), then the index is (20 - 19)*39 = 1*39 = +39.
And if subject 2 recalls only 18 words and 8 are practiced (leaving 10 controls), then the index is (8 - 10)*18 = (-2)*18 = -36.

You can already guess where this is going. The justification that Bem gives for this manipulation is:
Unlike in a traditional experiment in which all participants contribute the same fixed number of trials, in the recall test each word the participant recalls constitutes a trial and is scored as either a practice word or a control word.

This is just massive nonsense. As we have seen, not every recalled word is equal: words that come from participants who recalled many count more heavily. The function of the index runs counter to the stated purpose.
Let's combine the examples above. Subject 1 recalled one more practiced word than control words, but subject 2 recalled two fewer. This indicates that practicing after the fact does not work, although in an actual experiment two subjects would be far too few to state anything with confidence.
But now look at the combined index: 39 – 36 = +3. This indicates success. Obviously the index misleads here.
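
Here is that computation as a short Python sketch, following the description above (the function name is mine, and the percentage normalization mentioned earlier is omitted, as in the examples):

```python
def differential_recall_index(practiced_recalled, control_recalled):
    # (practiced minus control words recalled), weighted by the total number recalled
    total = practiced_recalled + control_recalled
    return (practiced_recalled - control_recalled) * total

# Subject 1: 39 words recalled, 20 of them practiced, 19 controls
s1 = differential_recall_index(20, 19)   # (20 - 19) * 39 = +39
# Subject 2: 18 words recalled, 8 practiced, 10 controls
s2 = differential_recall_index(8, 10)    # (8 - 10) * 18 = -36

raw_difference = (20 - 19) + (8 - 10)    # -1: one fewer practiced word recalled overall
combined_index = s1 + s2                 # +3: the weighting flips the sign
print(raw_difference, combined_index)
```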

That the reviewers let this through is certainly a screw-up on their part. There’s no sugar-coating it.

Charitably one might assume that Bem also made a mistake and just through luck got a significant result. However, that is unlikely.
The evidence, namely the advice he gives on writing articles as well as his handling of the other experiments, indicates that the index was created to force a positive result.
Still, that does not necessarily imply ill intent. He may have played around with a statistics program until he got results he liked without ever realizing that this is scientifically worthless. In fact, objectively, this is scientific misconduct.
Unfortunately, Bem displays an awareness of the inappropriateness of such methods.

The fact that the actual result of the experiment is not reported by Bem but only the flawed and potentially misleading Differential Recall index makes me conclude that the experiment was probably a failure. There was simply a random association between high recall and favorable outcome on which the DR index capitalizes.
By random chance such a pattern may arise again but only rarely, hence the failure to replicate.

Conceptual vs. Close Replication

Believers often insist that Bem has only replicated previous work. The implication being that these experiments are replicable. But when they say replication they mean a so-called “conceptual replication”. By that they mean experiments in general that purport to show retroactive effects, that is the present affecting the past. Of course, when one makes up a whole new experiment one can simply use the now familiar tricks to force a positive result.
A close replication actually repeats the experiment and is therefore bound to the same method of analysis. Only a close replication is a real replication.

Back from hiatus

As you can see, I took a time-out from this blog for half a year and never delivered the promised Ganzfeld series. It's tedious and unrewarding work and I simply had better things to do. Hopefully I'll bring things home in the next couple of months. Even though I have no idea who really cares, I feel a sense of duty to finish what I started.

Randi’s Prize
What I won’t finish is the chapter by chapter review of Randi’s Prize by Robert McLuhan. It simply doesn’t work. He cites a lot of research and it is really this research that should be addressed rather than McLuhan’s take on it. The basic errors he himself makes are already pointed out in the reviews of the first few chapters.

Next up will be my take on the current hoopla about the failed replication of one of Bem’s experiments. Stay tuned.

Randi’s Prize: Answering Chapter 4

Chapter Four:  Uncertain Science

This chapter deals with parapsychological experiments in general, rather than mediumship specifically as the previous chapter did. We are run quickly past a number of claims and rebuttals without dealing with any in detail.

Repeatability

Of the problems, the most hard-hitting is probably the fact that the experiments are not repeatable, but unfortunately this huge problem is not discussed; it is simply ignored. Perhaps McLuhan simply chose to believe assertions to the contrary?

He mentions the card-guessing experiments of JB Rhine, conducted in the 1930s, and tells us of Hubert Pearce, a theology student, who could consistently and repeatably demonstrate his ESP by scoring, on average, 33% hits where 20% was expected.

He also tells us of the Ganzfeld experiments, conducted in the 1980s onward, where, on average, people score 33% instead of 25%. The Ganzfeld is a method of creating a state of mild sensory deprivation which is supposed to enhance someone’s ability to receive extra-sensory information and thus to enable better scoring.

Curious, isn't it? Decades pass during which parapsychologists develop a method to increase scoring, but… the improvement is smaller than what was achieved back then with a "star subject".

McLuhan tells us that card-guessing was abandoned because it was too boring, just sitting there calling out one guess after the other. In the Ganzfeld experiments, someone has to endure 20 minutes of sensory deprivation for a single guess. I am not sure how that relieves the problem.

I wonder if it may be one thing that distinguishes skeptics and believers, that skeptics have a higher need for internal consistency?

Bad statistics are a serious problem in parapsychology as they can create the impression of an effect where there is none. Naturally, not all criticisms are correct. McLuhan incorrectly generalizes rebuttals of some criticisms to mean that such criticisms as a whole are unwarranted. Looking at recent works like Bem’s Feeling the Future, it is obvious how misleading that is.

One thing that stood out to me is how McLuhan speaks with two voices. He generally makes an effort (or a show?) of considering both sides. Sometimes he even intimates that these arguments affected him. Yet every so often a different attitude breaks through, and he tells us why these arguments are made: not because they are true or reasonable, but only to create doubt.

In-Depth Controversy

The first controversy that is addressed in-depth is the “sense of being stared at”. Unfortunately this is not one I have studied and so I will not comment on it. I intend to do so at some time but not in the next few weeks.

The next controversy concerns Sheldrake’s psychic dogs. This has already been examined on this blog.
Someone who goes to the original articles and actually evaluates the data for himself should be able to see past Sheldrake's wall of make-believe, but McLuhan completely falls for his spin and retells it as such.

He is so faithful to that version that he even follows Sheldrake in making nasty attacks on a skeptic whose only bad judgement was to take the claims seriously enough to conduct his own investigation.

After this low of investigative effort comes a more extensive exploration of the Ganzfeld experiments. These are in many ways amongst the best parapsychology has to offer. Many other results shrivel to nothingness under scrutiny or are simply unrepeatable which means we have to take them on faith.

By comparison, this series of experiments is a shining example of methodological rigor and solidity. Some time I will make a post on why I don’t believe that there is no real effect there. I expect they will eventually end up like Rhine’s experiments in the 1930s. Never fully explained but simply abandoned.

McLuhan quotes the same skeptic praising these experiments whom he had, just a few pages earlier, accused of trying to sabotage Sheldrake's research. He seems completely oblivious to the inherent contradiction.

Finally, there come the remote-viewing experiments known as the Stargate project, performed by the US government. Here things get more mixed. Eventually this became a debate between Ray Hyman (skeptic) and Jessica Utts (believer). McLuhan, of course, finds the believer convincing, never realizing the gaping holes in her arguments.
