(International Library of Psychology.) Watt, Caroline - Wiseman, Richard - Parapsychology-Taylor and Francis (2017)
Edited by
11 Ray Hyman (1981), ‘The Psychic Reading’, Annals of the New York Academy of
Sciences, 364, pp. 169-81.
12 Sybo A. Schouten (1994), ‘An Overview of Quantitatively Evaluated Studies
with Mediums and Psychics’, Journal of the American Society for Psychical
Research, 88, pp. 221-54.
Prometheus Books for the essay: John Beloff (1986), ‘What is your Counter-Explanation? A
Plea to Skeptics to Think Again’, in P. Kurtz (ed.), A Skeptic's Handbook of Parapsychology,
Buffalo, NY: Prometheus, pp. 359-77.
Rhine Research Center for the essays: Joseph Banks Rhine (1938), ‘Experiments Bearing on
the Precognition Hypothesis: I. Pre-Shuffling Card Calling’, Journal of Parapsychology, 2,
pp. 38-54; Julie Milton (1999), ‘Should Ganzfeld Research Continue to be Crucial in the Search
for a Replicable Psi Effect? Part I. Discussion Paper and Introduction to an Electronic-Mail
Discussion’, Journal of Parapsychology, 63, pp. 309-33; Helmut Schmidt (1970), ‘A PK Test
with Electronic Equipment’, Journal of Parapsychology, 34, pp. 175-81; William Braud, Donna
Shafer and Sperry Andrews (1993), ‘Further Studies of Autonomic Detection of Remote Staring:
Replication, New Control Procedures, and Personality Correlates’, Journal of Parapsychology,
57, pp. 391-409; Richard Wiseman and Marilyn Schlitz (1997), ‘Experimenter Effects and the
Remote Detection of Staring’, Journal of Parapsychology, 61, pp. 197-207; Marilyn Schlitz
(2001), ‘Boundless Mind: Coming of Age in Parapsychology’, Journal of Parapsychology, 65,
pp. 335-50.
Skeptical Inquirer for the essay: Susan Blackmore (1987), ‘The Elusive Open Mind: Ten Years
of Negative Research in Parapsychology’, Skeptical Inquirer, 11, pp. 244-55. Copyright © 1987
Skeptical Inquirer.
Springer for the essay: Dean I. Radin and Roger D. Nelson (1989), ‘Evidence for Consciousness-
Related Anomalies in Random Physical Systems’, Foundations of Physics, 19, pp. 1499-514.
Copyright © 1989 Plenum Publishing Corporation.
Society for Psychical Research for the essay: Michael Daniels (2002), ‘The “Brother Doli”
Case: Investigation of Apparent Poltergeist-Type Manifestations in North Wales’, Journal of
the Society for Psychical Research, 66, pp. 193-221.
Every effort has been made to trace all the copyright holders, but if any have been inadvertently
overlooked the publishers will be pleased to make the necessary arrangement at the first
opportunity.
Series Preface
Psychology now touches every corner of our lives. No serious consideration of any newsworthy
topic, from eating disorders to crime, from terrorism to new age beliefs, from trauma to happiness,
is complete without some examination of what systematic, scientific psychology has to say on
these matters. This means that psychology now runs the gamut from neuroscience to sociology,
by way of medicine and anthropology, geography and molecular biology, connecting to virtually
every area of scientific and professional life. This diversity produces a vibrant and rich discipline
in which every area of activity finds outlets across a broad spectrum of publications.
Those who wish to gain an understanding of any area of psychology therefore either
have to rely on secondary sources or, if they want to connect with the original contributions
that define any domain of the discipline, must hunt through many areas of the library, often
under diverse headings.
The volumes in this series obviate those difficulties by bringing together under one set
of covers, carefully selected existing publications that are the definitive papers that characterize
a specific topic in psychology.
The editors for each volume have been chosen because they are internationally
recognized authorities. Therefore the selection made by each editor, and the way in which it is
organized into discrete sections, is an important statement about the field.
Each volume of the International Library of Psychology thus collects in one place the
seminal and definitive journal articles that are creating current understanding of a specific
aspect of present-day psychology. As a resource for study and research the volumes ensure
that scholars and other professionals can gain ready access to original source material. As a
statement of the essence of the topic covered they provide a benchmark for understanding
and evaluating that aspect of psychology.
As this International Library emerges over the coming years it will help to specify what
the nature of 21st Century psychology is and what its contribution is to the future of humanity.
DAVID CANTER
Series Editor
Professor of Psychology
University of Liverpool, UK
Introduction
History, Background and Terminology
Many people have experienced seemingly psychic phenomena, such as having a dream that
predicts the future, seeing a ghost, or thinking of a long-lost friend and then receiving a telephone
call from that person moments later. In addition, some individuals appear to possess psychic
abilities, including mediums who claim to communicate with the dead, healers who seem to
help cure illness, and psychics who can apparently bend keys and cutlery using just the power
of their minds.
Although such allegedly psychic experiences and abilities have been reported throughout
history, it is only in the last hundred years or so that researchers have carried out systematic and
scientific work into these topics (for historical reviews of the area see Beloff, 1977; Hyman,
1985a). Much of the early work in this area was conducted under the auspices of one of the first
organizations dedicated to the scientific study of alleged paranormal phenomena, the Society
for Psychical Research (SPR). Founded in 1882 by a group of prominent academics, the majority
of the SPR’s initial research focused on testing individuals claiming to have strong psychic
abilities, including several well-known mediums of the day. Around the turn of the last century,
almost all research into alleged paranormal phenomena was conducted by individuals either
working alone or on behalf of learned societies like the SPR. However, during the 1930s,
Professor Joseph Banks Rhine established a parapsychology laboratory at Duke University
(North Carolina, USA), and initiated the first systematic programme of university-based research
into alleged psychic abilities. Rhine also pioneered a somewhat different approach to the topic,
choosing to work with people who did not claim strong psychic abilities and having participants
take part in easily controlled experiments, such as attempting to guess the order of a shuffled
deck of cards. Since the 1930s a small number of academics have continued to conduct
parapsychological research within universities throughout the world.
Most present-day researchers draw a distinction between two types of ostensible psychic
ability. In extrasensory perception (ESP), a person appears to receive some information via a
channel of communication not presently understood. Researchers frequently draw a distinction
between three types of possible ESP phenomena: clairvoyance, in which the information
received was not known to anyone else; telepathy, in which the information was known to
another person; and precognition, in which the information relates to a future event. In the
second type of ostensible psychic ability, psychokinesis (PK), a person appears to influence an
object or their surroundings using unknown means. Researchers tend to refer to two types of
alleged PK: macro-PK, in which the apparent phenomenon is large and directly observable
(for example, the levitation of an object) and micro-PK, in which small effects are produced
that can only be detected via statistical analyses (for example, causing dice to roll sixes at
above chance levels).
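The kind of statistical analysis involved in micro-PK work can be illustrated with a minimal sketch. It assumes a fair die, independent throws and a one-sided exact binomial test; the figures (600 throws, 120 sixes) and the function name binomial_tail are hypothetical and are not taken from any study discussed in this volume.

```python
from math import comb


def binomial_tail(hits: int, trials: int, p: float) -> float:
    """Return P(X >= hits) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(hits, trials + 1)
    )


# Hypothetical illustration: a participant tries to make a die land on six.
p_six = 1 / 6          # chance probability of a six on one throw (fair die assumed)
trials = 600           # hypothetical number of throws
hits = 120             # hypothetical number of sixes observed

expected = trials * p_six                      # sixes expected by chance alone
p_value = binomial_tail(hits, trials, p_six)   # one-sided exact binomial p-value

print(f"expected by chance: {expected:.0f}, observed: {hits}")
print(f"probability of {hits} or more sixes by chance: {p_value:.4f}")
```

A small p-value here would indicate only that the count is unlikely under chance; it says nothing about why, which is where the methodological debates described later in this Introduction begin.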
Generally speaking, researchers investigate the possible existence of ESP and PK using one
of three approaches. The first is the study of various types of anomalous experience reported by
the public. These studies have involved a diverse range of methods, including, for example,
attempting to identify the types of people that have such experiences and assessing the reliability
of their reports. A second approach has focused on individuals who claim to be psychically
gifted. These studies typically employ just one subject (the alleged psychic) and, when successful,
appear to produce large and impressive effects. The third and final approach assumes that
everybody possesses psychic abilities to a small degree, and usually involves carrying out
laboratory-based experiments involving large numbers of individuals, none of whom claim to
be especially psychic. The effects obtained in these studies are often relatively small and can
only be detected by statistical analysis. All three approaches have yielded interesting and useful
data, and the five Parts of this volume reflect the diversity of work undertaken in these three
main areas.
In addition to employing different approaches to studying alleged psychic experiences and
abilities, researchers also hold a diverse range of theoretical perspectives about such phenomena.
At one end of the spectrum, some proponents argue that certain experiential and/or experimental
data strongly support the existence of psychic abilities, and may believe that they understand
how such abilities are best explained (for example, that they are analogous to normal sensory
systems or indicative of spiritual advancement). Other researchers are less convinced by the
evidence and, even if they do believe that the data suggest some form of unexplained anomaly,
are uncertain about how this anomaly should best be viewed. Finally, towards the other end of
the spectrum, sceptics reject the notion that there exists convincing evidence for alleged psychic
abilities, and instead argue that such evidence is the result of various types of self-deception,
fraud or methodological artefacts. Given these diverse viewpoints, it is perhaps unsurprising
that this field has attracted a considerable amount of controversy. The essays in this volume
have been chosen to provide readers with a general sense of the methods used in this research,
the various viewpoints that have been advanced to account for the findings that have been
obtained and the controversies generated by this work.
This Introduction is designed to help set each of the selected essays in context, and also to
provide additional references for those wishing to delve deeper into the issues surrounding
each of the areas covered.
Additional work explores why people experience such unusual and seemingly paranormal
phenomena. Researchers have approached this issue from a range of quite different perspectives
(see, for example, Cardena, Lynn and Krippner, 2000; Irwin, 1993; Roberts and Groome, 2001).
Some researchers have argued that some anomalous experiences do not reflect the existence of
genuine psychic abilities, but may instead be due to people incorrectly assigning paranormal
causation to normal events (see, for example, Marks and Kammann, 1980; Shermer, 1997;
Zusne and Jones, 1982). Caroline Watt’s essay on coincidences (Chapter 2) represents an example
of this line of research, exploring how various psychological biases may mislead people into
believing that they have experienced a rare and meaningful coincidence. Similarly, Chris French’s
essay (Chapter 3) reviews a large body of work on the psychology of false memory, arguing
that this research could help provide a normal explanation for anomalous experiences involving
altered states of consciousness, such as alleged alien abductions, past-life regression and
near-death experiences.
In contrast, other investigators have argued that certain anomalous experiences may provide
evidence to support the existence of genuine psychic phenomena. The next two essays illustrate
this approach. In the first of these, Ian Stevenson (Chapter 4) describes a series of unusual case
studies that appear to support claims of reincarnation. Stevenson has gained a considerable
reputation for carefully documenting cases (mainly from India and Sri Lanka) in which people
allegedly remember details of past lives (see, for example, Stevenson, 1974, 1997; for a critical
review of this work see Edwards, 1996) and, in this essay, he argues that certain birthmarks
may be indicative of illnesses and accidents suffered by individuals in a previous life. In
Chapter 5, Pim van Lommel and his colleagues present the details of a recent study into
near-death experiences (NDEs). People reporting NDEs describe a remarkably similar set of
phenomena, including moving through a tunnel of light, blissful feelings, life review and so on
(Moody, 1975; Ring, 1980). Some researchers have suggested that these experiences could be
the result of various types of hallucination (Blackmore, 1993), whereas others have argued that
they may reflect some form of genuine separation between mind and body (for example, Parnia
et al., 2001). Van Lommel compares data from those who reported NDEs and those who did
not, and argues that existing medical, pharmaceutical and psychological explanations cannot
account for these experiences.
The final two essays in Part I reflect two quite different approaches to investigating very
different types of alleged paranormal experiences, namely hauntings and poltergeist activity.
Parapsychologists have conducted a considerable amount of work at allegedly haunted locations,
examining both the reliability of eyewitness reports and whether any environmental factors
(for example, air temperature, magnetic field strength and so on) are associated with such
reports (see, for example, Houran and Lange, 1998; Maher and Schmeidler, 1975). The essay
by Richard Wiseman and his colleagues (Chapter 6) illustrates how these types of method were
used to empirically examine two well-known, and allegedly haunted, locations. Research into
alleged poltergeist activity has generated a considerable amount of controversy, with some
researchers arguing that the phenomena represent genuine paranormal activity (Fontana, 1991)
and others that they are the result of self-deception and fraud (Randi, 1985). Mike Daniels’
essay (Chapter 7) describes a recent investigation into alleged poltergeist activity in Wales, and
illustrates not only the methods used in such investigations, but also the difficulties encountered
when trying to reach any firm conclusion during this kind of work.
Extrasensory Perception
As noted above, the first systematic programme of research into the existence of ESP was
initiated by J.B. Rhine in the 1930s. Much of Rhine’s work involved participants attempting to
guess the order of shuffled decks of ‘ESP cards’ (that is, cards printed with one of five simple
symbols - a circle, cross, square, star or wavy lines - on their faces). The first essay in Part III,
written by Rhine in 1938, describes an initial set of precognition experiments and illustrates the
type of methods involved in these early and ground-breaking studies.
Rhine’s research generated a significant amount of controversy, with proponents arguing
that the results supported the existence of ESP and critics claiming that the studies possessed
various methodological and statistical problems (see Palmer, 1986, for a review). This, combined
with the rather tedious procedures involved in the studies, and with a tendency for initially
significant results to decline over time (see the review by Palmer, 1978) eventually resulted in
researchers exploring other ways of running laboratory-based ESP experiments. In Chapter 14
Charles Akers presents a comprehensive and critical review of the key ESP studies that were
conducted between the end of the Rhine era and the early 1980s. This review describes the
many artefacts and biases that can hinder this research and then evaluates the degree to which
these problems were present in a series of studies that both obtained positive results and were
seen as making a significant contribution to the field.
In Chapter 15 Irvin Child reviews a series of well-known studies, conducted in the late
1960s and early 1970s at the Maimonides Medical Centre, exploring the possible existence of
ESP in dreams. These studies obtained highly significant results, suggesting that the content of
participants’ dreams reflected randomly selected target material (for example, pictures) that
were shown to them the following morning. Child also notes how much of the critical
commentary attacking this work misrepresented both the methods and results of the studies.
Partly as a result of the success of the dream ESP work, many researchers focused their
attention on running studies in which participants are placed into an altered state of consciousness.
Much of this work involves participants undergoing the ‘ganzfeld’ procedure (a mild sensory
deprivation procedure originally developed by perceptual psychologists to help people generate
imagery) and then attempting to identify target material, such as a picture or film, being looked
at by another person in a separate room.
Parapsychological research using the ganzfeld procedure has generated a considerable amount
of debate, with some researchers arguing that the work represents some of the best evidence in
favour of ESP (see Honorton, 1985; Honorton et al., 1990; Utts, 1991), and others questioning
the validity and quality of the studies (see Blackmore, 1987; Hyman, 1985b; Scott, 1986).
Daryl Bem’s and Charles Honorton’s essay, published in 1994 and now reprinted as Chapter
16, first presents a review of the early ganzfeld studies and then describes a meta-analysis of a
series of well-controlled and highly statistically significant ganzfeld studies conducted at the
Psychophysical Research Laboratories. Bem’s and Honorton’s essay provoked a considerable
amount of debate and additional meta-analyses (see Bem, 1994; Hyman, 1994; Milton and
Wiseman, 1999; Storm and Ertel, 2001). Towards the end of their essay, the authors note the
importance of a broad range of investigators attempting to replicate the ganzfeld ESP effect. In
the following essay (Chapter 17) Julie Milton begins by picking up on this issue, arguing that
the effect has declined in recent ganzfeld studies. She then discusses various reasons for this
decline and outlines possible strategies for future research in this area. Milton’s essay acted as
the basis for a large-scale electronic discussion about these issues, and the debate about the
replicability of ganzfeld-ESP findings continues (Bem, Palmer and Broughton, 2001).
results. Braud et al. review this research and then describe the methods and results of their
own study.
The next essay (Chapter 22), by Richard Wiseman and Marilyn Schlitz, also describes a
study examining the remote detection of staring, but in addition examines the possible role
that the experimenter may play in determining study outcome. As noted in both this section and
the previous one, much of the debate concerning the existence of psychic ability revolves
around the degree to which the experimental evidence for such abilities can be replicated across
several laboratories. This issue is especially problematic within parapsychology as some
experimenters have a reputation for consistently achieving positive results whilst others obtain
chance findings. Attempting to understand such ‘experimenter effects’ is therefore clearly vital
to the future of the field, and the Wiseman-Schlitz study explores this issue by examining
whether two studies, using the same design but carried out by different experimenters, obtain
significantly different results.
A review of the small number of studies to empirically examine the possibility of ‘distant
healing’ (including, for example, prayer and therapeutic touch) constitutes the next essay. This
review, by John Astin and his colleagues, examined 23 experiments and concluded that, although
the results of these studies revealed an overall effect, the poor methodological quality of the
work made any clear-cut interpretation of this work problematic. Several other studies into
distant healing have been conducted since this review, with mixed results (for example, Leibovici,
2001; Roberts, Ahmed and Hall, 2004).
To close the two Parts on laboratory research into ESP and PK, we have an essay by noted
parapsychology critic, James Alcock (see Alcock, 1981, 1985, 1987, 1990). Alcock’s essay,
which forms his editorial introduction for a special issue of the Journal of Consciousness Studies,
argues that parapsychologists are inclined to search for paranormal interpretations of their data
and tend to neglect the possibility that there simply is no psi in their experiments. He gives a
list of 12 ‘reasons to remain doubtful about the existence of psi’.
well, and has the potential to inform several important academic and social issues in the near
future.
References
Alcock, J.E. (1981), Parapsychology: Science or Magic?, London: Pergamon.
Alcock, J.E. (1985), ‘Parapsychology as a “Spiritual Science”’, in P. Kurtz (ed.), A Skeptic's Handbook of
Parapsychology, Buffalo, NY: Prometheus.
Alcock, J.E. (1987), ‘Parapsychology: Science of the Anomalous or Search for the Soul?’, Behavioral
and Brain Sciences, 10, pp. 553-65.
Alcock, J.E. (1990), Science and Supernature: A Critical Appraisal of Parapsychology, Buffalo, NY:
Prometheus.
Barrington, M.R. (1992), ‘Palladino and the Invisible Man Who Never Was’, Journal of the Society for
Psychical Research, 58, pp. 324-40.
Barrington, M.R. (1993), ‘Palladino, Wiseman and Barrington: Ten Brief Replies’, Journal of the Society
for Psychical Research, 59, pp. 196-98.
Beloff, J. (1977), ‘Historical Overview’, in B.B. Wolman (ed.), Handbook of Parapsychology, New York:
McFarland.
Bem, D.J. (1994), ‘Response to Hyman’, Psychological Bulletin, 115, pp. 25-27.
Bem, D.J., Palmer, J. and Broughton, R.S. (2001), ‘Updating the Ganzfeld Database: A Victim of its own
Success?’, Journal of Parapsychology, 65, pp. 207-18.
Beyerstein, D. (1996), ‘Sai Baba’, in G. Stein (ed.), The Encyclopedia of the Paranormal, New York:
Prometheus Books, pp. 653-57.
Blackmore, S.J. (1987), ‘A Report of a Visit to Carl Sargent’s Laboratory’, Journal of the Society for
Psychical Research, 54, pp. 186-98.
Blackmore, S.J. (1988), ‘Do We Need a New Psychical Research?’, Journal of the Society for Psychical
Research, 55, pp. 49-59.
Blackmore, S.J. (1993), Dying to Live: Science and the Near-death Experience, London: Grafton.
Blackmore, S.J. (1998), ‘Abduction by Aliens or Sleep Paralysis?’, Skeptical Inquirer, 22, pp. 23-28.
Cardena, E., Lynn, S.J. and Krippner, S. (eds) (2000), Varieties of Anomalous Experience: Examining the
Scientific Evidence, Washington, DC: American Psychological Association.
Edwards, P. (1996), Reincarnation: A Critical Examination, Amherst, NY: Prometheus Books.
Fontana, D. (1991), ‘A Responsive Poltergeist: A Case from South Wales’, Journal of the Society for
Psychical Research, 57, pp. 385-402.
Fontana, D. (1992), ‘The Feilding Report and the Determined Critic’, Journal of the Society for Psychical
Research, 58, pp. 341-50.
Fontana, D. (1993), ‘Palladino (?) and Fontana: The Errors are Wiseman’s Own’, Journal of the Society
for Psychical Research, 59, pp. 198-203.
Girden, E. (1962), ‘A Review of Psychokinesis (PK)’, Psychological Bulletin, 59, pp. 353-88.
Girden, E., Murphy, G., Beloff, J., Eisenbud, J., Flew, A., Rush, J.H., Schmeidler, G. and Thouless, R.H.
(1964), ‘A Discussion of Psychokinesis’, International Journal of Parapsychology, 6, pp. 25-137.
Hansel, C.E.M. (1981), ‘A Critical Analysis of H. Schmidt’s Psychokinesis Experiments’, Skeptical
Inquirer, 5(3), pp. 26-33.
Hansen, G.P. (1990), ‘Deception by Subjects in Psi Research’, Journal of the American Society for Psychical
Research, 84, pp. 25-80.
Haraldsson, E. (1985), ‘Representative National Surveys of Psychic Phenomena: Iceland, Great Britain,
Sweden, USA and Gallup’s Multinational Survey’, Journal of the Society for Psychical Research, 53,
pp. 145-58.
Haraldsson, E. and Wiseman, R. (1995), ‘Reactions to and an Assessment of a Videotape on Sathya Sai
Baba’, Journal of the Society for Psychical Research, 60, pp. 203-13.
Honorton, C. (1985), ‘Meta-analysis of Psi Ganzfeld Research: A Response to Hyman’, Journal of
Parapsychology, 49, pp. 51-91.
Honorton, C., Berger, R.E., Varvoglis, M.P., Quant, M., Derr, P., Schechter, E. and Ferrari, D.C. (1990),
‘Psi Communication in the Ganzfeld: Experiments with an Automated Testing System and a Comparison
with a Meta-analysis of Earlier Studies’, Journal of Parapsychology, 54, pp. 99-139.
Houran, J. and Lange, R. (1998), ‘Rationale and Application of a Multi-Energy Sensory Array in the
Investigation of Haunting and Poltergeist Cases’, Journal of the Society for Psychical Research, 62,
pp. 324-36.
Hyman, R. (1981), ‘Further Comments on Schmidt’s PK Experiments’, Skeptical Inquirer, 5, pp. 34-40.
Hyman, R. (1985a), ‘A Critical Historical Overview of Parapsychology’, in P. Kurtz (ed.), A Skeptic's
Handbook of Parapsychology, Buffalo, NY: Prometheus.
Hyman, R. (1985b), ‘The Ganzfeld Psi Experiment: A Critical Appraisal’, Journal of Parapsychology,
49, pp. 3-49.
Hyman, R. (1994), ‘Anomaly or Artifact? Comments on Bem and Honorton’, Psychological Bulletin,
115, pp. 19-24.
Hyman, R. (1995), ‘Evaluation of the Program on Anomalous Mental Phenomena’, Journal of
Parapsychology, 59, pp. 321-51.
Irwin, H.J. (1993), ‘Belief in the Paranormal: A Review of the Empirical Literature’, Journal of the
American Society for Psychical Research, 87, pp. 1-39.
Jahn, R., Dunne, B., Bradish, G., Dobyns, Y., Lettieri, A., Nelson, R., Mischo, J., Boller, E., Bosch, H.,
Vaitl, D., Houtkooper, J. and Walter, B. (2000), ‘Mind/Machine Interaction Consortium: PortREG
Replication Experiments’, Journal of Scientific Exploration, 14, pp. 499-555.
Leibovici, L. (2001), ‘Effects of Remote, Retroactive Intercessory Prayer on Outcomes in Patients with
Bloodstream Infection: Randomised Controlled Trial’, British Medical Journal, 323, pp. 1450-51.
Maher, M.C. and Schmeidler, G.R. (1975), ‘Quantitative Investigation of a Recurrent Apparition’, Journal
of the American Society for Psychical Research, 69, pp. 341-51.
Marks, D.F. and Kammann, R. (1980), The Psychology of the Psychic, Buffalo, NY: Prometheus Books.
Milton, J. and Wiseman, R. (1999), ‘Does Psi Exist? Lack of Replication of an Anomalous Process of
Information Transfer’, Psychological Bulletin, 125, pp. 387-91.
Moody, R.A. (1975), Life after Life, Covington, GA: Mockingbird Books.
Introduction
There have been a number of surveys of spontaneous psychic
experiences reported in the parapsychological literature. However,
most of these surveys involved preselected samples that might be
atypical of a broadly representative population. L. E. Rhine (1961),
for example, based her findings on the reports of persons who
mailed descriptions of their experiences to her on their own initiative
or in response to public appeals. Other studies involved asking
questions about specific experiences to more or less intact groups
such as college students or persons from a particular social class
(Green, 1967; Prasad and Stevenson, 1968; Sidgwick and Committee,
1894; West, 1948).
1 This survey was conducted while I was Research Associate at the Division of
Parapsychology, School of Medicine, University of Virginia. I wish to thank Dr. Ian
Stevenson, Director of the Division, for providing financial and administrative
support, and the Parapsychology Foundation for additional financial support.
Journal of the American Society for Psychical Research
Perhaps the most representative study with
an American sample is a national interview survey of mystical
experiences which unfortunately dealt only superficially with psychic
experiences (McCready and Greeley, 1976). A highly representative
national survey of psychic experiences in Iceland, using a
questionnaire similar to my own, has recently been published
(Haraldsson, Gudmundsdottir, Ragnarsson, Loftsson, and Jonsson,
1977).
In 1974, I decided to undertake a survey of psychic experiences
in the U.S. using random sampling techniques. My colleague in this
endeavor was Mr. Michael Dennis.2 A national survey was beyond
our resources, so we decided on a community mail survey of
Charlottesville, Virginia, and surrounding suburbs. Charlottesville
is a community of about 35,000 people with a diversity of social
and economic groups, although many of its resources are tied to
the University of Virginia, which is located there. We nevertheless
felt that Charlottesville is a reasonably representative American
community, and a businessman of my acquaintance informed me
that it is often considered such for purposes of marketing research.
Our objectives in carrying out the survey were to estimate the
proportion of Americans who claim to have had various kinds of
psychic experiences, and to explore correlations between these
experiences and other variables, including related experiences and
activities, attitudes, and demographic factors.
I want to stress at the outset that the survey dealt with experiences
that our respondents claimed to have been psychic. I am
not prepared to state what percentage of these cases actually require
paranormal explanations, and to my knowledge no attempts
have been made to verify any of them. I nevertheless think that the
information obtained in the survey is of value to parapsychology
both as a source of sociological information and of hypotheses
about the nature of the experiences considered.
Method
Questionnaire
The questionnaire consisted of 46 items,3 many of which contained
several parts. Respondents answered by circling a number
next to their choices.
2 A preliminary report of this survey, co-authored by Mr. Dennis, was presented
at the Seventeenth Annual Convention of the Parapsychological Association,
Jamaica, N.Y. 1974. An abstract was published in Research in Parapsychology
1974. Metuchen, N.J.: Scarecrow Press, 1975. Pp. 130-133.
3 For reasons of space, not all of these items will be discussed in this report.
A Community Mail Survey of Psychic Experiences
They were encouraged to elaborate upon
their answers or describe particularly meaningful experiences on
the back of the questionnaire or on separate pages. The items can
be classified in six main categories, as follows:
IA: Experiences that, if valid, by definition involve psi, i.e.,
either ESP or PK. These include waking ESP experiences, ESP
dreams, being an “agent” for someone else’s ESP experience, and
poltergeist activity (RSPK).
IB: Experiences that are not psychic as such, but are of interest
to parapsychologists because they might provide a context for
either ESP or PK effects. These include out-of-body experiences
(OBEs), apparitions, communication with the dead, hauntings,
“memories” of a previous lifetime, deja vu experiences, and aura
vision. Supplementary questions (see below) explored possible
psychic elements in some of these experiences.
II: Altered states of consciousness that are not of direct interest
to parapsychologists, but are often considered relevant to psi.
These include dreams (addressed in terms of frequency of recall
and vividness), lucid dreams, and mystical experiences. The distinction
between this category and category IB above is admittedly
not a sharp one. For example, some might want to include mystical
experiences in IB, while others might want to include deja vu and
aura vision in II. Our decisions on this matter were based on our
assessment of which topics parapsychologists have historically
considered to fall within their purview. They are to some extent
arbitrary.
III: Activities related to psi. These include meditation, use of
hallucinogenic drugs, analysis of one’s dreams, and seeking the
services of a psychic.
IV: Attitudes related to psi. Included in this category are
attitudes toward astrology, survival of death, reincarnation, and the
value of parapsychological research. We did not include a direct
question about belief in psi (i.e., a “sheep-goat” question) because
we feared it might bias respondents’ answers to some of the other
questions. We hoped that the question about the value of psi
research would get at this issue indirectly, so we interpreted it as a
surrogate sheep-goat question.
V: Demographic questions. These include sex, race, age, birth
order, marital status, political ideology, religious denomination,
religiosity, level of education, occupation, and family income.
VI: The effect of psychic experiences on the respondents’ lives,
including their attitudes, their life decisions, and whether such
experiences had ever saved them or someone else from a crisis or
tragedy.
For purposes of analysis, items in category I (A and B) were
224 Journal of the American Society for Psychical Research
treated as dependent or criterion variables, while items in
categories II through V were treated as independent or predictor
variables. Items in category VI have not been analyzed in relation
to other items in the survey.
Items in categories I through III generally had multiple parts.
First, respondents answered “yes” or “no” to whether they ever
had the experience or engaged in the activity in question. If they
answered “yes,” they then answered a set of from one to eight
supplementary questions asking for more specific information
about these experiences. The first of these questions usually
referred to how many times respondents had the experience, and
they responded by circling a number from “1” to “9-or-more.”
For the other questions, they circled the number of these
experiences that had a given characteristic. We recognized that in cases
where persons had multiple experiences, they may well not be able
to remember the exact number they had, or exactly how many had
a given attribute. This format nonetheless allowed us to make
rough estimates of these numbers.
The primary questions were phrased as precise descriptions of
the experience or activity of interest, using the simplest words
possible. In most cases, we avoided the use of labels such as
“telepathy,” “apparition,” and “out-of-body experience” that
might have different connotations for different respondents. We
also generally avoided giving examples, lest respondents feel they
should answer “no” unless their experience was the same as the
example. These principles should become clear as I quote the
questions in the presentation of results to follow.
Selection of Sample
Our purpose was to obtain as representative and random a sample
as possible of the population of Charlottesville. Toward this
end, we used two sources. The first was the City Directory of
Charlottesville, which lists all persons over 18 living in numbered
street addresses in Charlottesville and surrounding suburbs. The
second was the University of Virginia (UVa) Student Directory,
which gives a complete listing of students registered at the
University at the beginning of the school year.
We decided upon an initial sample of 1,000 persons. Based upon
census figures of the proportion of Charlottesville residents who
were UVa students, we selected 700 names from the City Directory
and 300 names from the Student Directory. These became our
“town” (T) sample and our “student” (S) sample, respectively.
The names to be selected for the samples were determined by
referring to a computer-generated table of pseudo-random numbers.
These numbers defined the page, the column, and the row of
the Directory which was to be sampled. This procedure became
quite complicated with the City Directory, because of the
nonsystematic way in which names were arranged on the pages. Also,
names had to be excluded from each Directory for various reasons
(e.g., UVa students had to be excluded if their names were
“chosen” from the City Directory).4
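The two-part draw described above can be illustrated with a short Python sketch. It is not the original procedure: the study mapped pseudo-random numbers onto page, column, and row positions in printed directories, whereas this sketch simply draws names without replacement from lists. The directory contents, the exclusion set, and the function name `draw_sample` are all hypothetical; only the sample sizes (700 and 300) come from the text.

```python
import random


def draw_sample(directory, size, excluded=frozenset(), seed=0):
    """Draw `size` distinct names at random, skipping excluded entries."""
    rng = random.Random(seed)  # seeded so the draw can be repeated
    eligible = [name for name in directory if name not in excluded]
    return rng.sample(eligible, size)  # sampling without replacement


# Hypothetical stand-ins for the two printed directories.
city_directory = [f"resident_{i}" for i in range(20000)]
student_directory = [f"student_{i}" for i in range(8000)]

# Hypothetical: City Directory entries that are really UVa students
# and therefore must not enter the town sample.
students_in_city = {f"resident_{i}" for i in range(0, 20000, 10)}

town_sample = draw_sample(city_directory, 700, excluded=students_in_city, seed=1)
student_sample = draw_sample(student_directory, 300, seed=2)

print(len(town_sample), len(student_sample))
```

Filtering the exclusions before drawing, rather than redrawing after the fact, keeps each sample at its target size in a single pass.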
Procedure
On March 1, 1974, one copy of the survey questionnaire, along
with a postage-paid “ business reply” return envelope, was sent to
each of the 700 persons sampled from the City Directory. The first
mailing to the 300 students was on March 11, because the week of
March 1 coincided with the University’s spring vacation. It also
corresponded to a postal rate increase, which is why we didn’t wait
until March 11 to mail all the surveys.
There were two additional follow-up mailings to persons who had
not yet returned their questionnaires. Each of these mailings
occurred three weeks after the preceding one. It consisted of a new
copy of the questionnaire, a new return envelope, and a
supplementary letter exhorting the person to return his or her completed
questionnaire.
Each questionnaire had a three-digit code number stamped on
the lower right-hand corner of the back page. This number keyed
the person’s name on our mailing list. When a questionnaire was
returned, this number was circled on the mailing list and the date
we received it was recorded.
If a questionnaire was returned to us by the post office as
undeliverable, or if someone else returned it indicating the person
was deceased, a new name was sampled and the questionnaire was
sent to the new individual. If a person returned an uncompleted
questionnaire or indicated a refusal to cooperate, we simply treated
that person as a “no-return” and did not resample.
Data Scoring and Analysis
The respondents’ answers were transferred directly from the
questionnaires to IBM cards by professional keypunchers at the
UVa Computer Center. The data were subsequently stored on
magnetic tape.
4 A manuscript describing the selection procedure in more detail is available from
the author upon request.
The data were analyzed using the SPSS statistical package (Nie,
Hull, Jenkins, Steinbrenner, and Bent, 1975). Frequency
distributions were first printed out for all items. Most of the items were
then cross-tabulated with each other, resulting in the printout of a
large number of contingency tables and corrected chi-square
values. In some cases it was necessary to combine some of the
response categories for meaningful cross-tabulations.
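For readers unfamiliar with the statistic, here is a minimal Python sketch of a corrected chi-square for a single 2 x 2 cross-tabulation. It assumes that "corrected" refers to Yates' continuity correction, which the text does not state explicitly. The cell counts are hypothetical and do not come from the survey, and the function name `yates_chi_square` is my own.

```python
def yates_chi_square(a, b, c, d):
    """Yates-corrected chi-square for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    # Continuity correction: shrink |ad - bc| by n/2, never below zero.
    adjusted = max(abs(a * d - b * c) - n / 2, 0)
    return n * adjusted ** 2 / (row1 * row2 * col1 * col2)


# Hypothetical counts: rows = predictor yes/no, columns = experience yes/no.
statistic = yates_chi_square(40, 60, 30, 120)

# With 1 degree of freedom, the .05 critical value is 3.841.
print(round(statistic, 2), statistic > 3.841)
```

A value above 3.841 would count as significant at the .05 level for a table of this shape. Tables with more than two categories per variable would need the general chi-square formula instead.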
Results
Return Rates
We obtained usable questionnaires from 354 townspeople and
268 students, corresponding to 51% and 89% of the initial samples.
Not all respondents answered every item, so that the “Ns” for
individual items discussed below are often slightly less than the
above figures.
We were very gratified by the response of the students, and I
consider our final sample to be highly representative of this aspect
of our population. The response of the townspeople, while less
gratifying than the response of the students, was by no means a bad
showing for this type of survey. Nevertheless, the
representativeness of this sample is questionable.5 Although we have not
undertaken a formal comparison with census figures, it is obvious that
there is an under-representation of the lower socio-economic
classes in our final T sample. This is understandable, because such
persons may have had difficulty in understanding the questions or
been averse to “paperwork” generally. The seriousness of this
bias is mitigated somewhat by the fact (to be discussed later) that
socio-economic variables were not strongly correlated with the
frequency of reported psychic experiences.
A second way in which we attempted to assess the bias produced
by the relatively low response rate of the townspeople was to
evaluate the responses on questionnaire items as a function of
when the respondents returned their questionnaires. Respondents
were divided into three groups according to whether they returned
their questionnaires after the first, second, or third mailings. For
the T sample, these three groups contained 183, 112, and 59
respondents, respectively.
The original rationale for this procedure was based upon the
assumption that persons who did not return their questionnaires at
all were more like the persons who returned them after the third
mailing than those who returned them after the first mailing. Thus
if there were, say, a significant decline in the proportion of
respondents who reported having had a psychic dream across the three
groups, we might suspect that the non-respondents had relatively
few psychic dreams and that our sample percentage was an
overestimate of the population percentage.
5 I understand that sociologists consider 60% to be the minimal rate of return
which justifies a claim of representativeness.
What we found, in fact, was that for none of the questionnaire
items was there a significant difference in responses as a function
of date of return. Moreover, the general trend was for a slightly
higher proportion of people in the first and third groups to report
psychic and psi-related experiences than in the second group.
Although I cannot provide any evidence justifying the rationale
outlined in the preceding paragraph, these results do make me more
confident that the results from our T sample are not grossly off
base.
Psychic Experiences
In this section I will present descriptive data regarding the
experiences listed in Category I as defined above. These data are listed
in Table 1. The figures in parentheses refer to the estimated
proportion of experiences that have the characteristic in question.
A few comments about these latter estimates are in order. They
were computed by dividing the total number of experiences
reported as having the characteristic by the total number of
experiences reported by respondents in the sample. Thus the experience
rather than the respondent is the unit of analysis, and some
respondents contributed to this figure more than others. Moreover, it
was necessary to exclude from these computations the data from
respondents who claimed to have nine or more of the experiences,
since the exact number of experiences they had could not be
determined. The adjacent figures not in parentheses refer to the
proportion of those persons claiming to have had the experience at
least once who also claimed to have had at least one such
experience with the characteristic in question. The advantages and
disadvantages of these figures are roughly complementary to those of
the figures in parentheses.
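The difference between the two kinds of figures may be easier to see in a small worked example. The Python sketch below computes both from a list of (number of experiences, number with the characteristic) pairs, one pair per respondent who reported the experience. The data, the function name `characteristic_rates`, and the coding of "9-or-more" as the value 9 are all assumptions made for illustration.

```python
def characteristic_rates(responses, cap=9):
    """Return (respondent-level rate, experience-level rate).

    `responses` holds one (n_experiences, n_with_characteristic) pair per
    respondent who reported the experience; n_experiences == cap stands
    for the "9-or-more" answer, whose exact count is unknown.
    """
    # Figure outside parentheses: the respondent is the unit of analysis.
    with_any = sum(1 for _, k in responses if k >= 1)
    respondent_rate = with_any / len(responses)

    # Figure in parentheses: the experience is the unit of analysis,
    # so respondents at the cap are dropped.
    countable = [(n, k) for n, k in responses if n < cap]
    experience_rate = sum(k for _, k in countable) / sum(n for n, _ in countable)
    return respondent_rate, experience_rate


# Hypothetical respondents; the last one circled "9-or-more".
sample = [(1, 1), (2, 1), (4, 0), (3, 3), (9, 9)]
respondent_rate, experience_rate = characteristic_rates(sample)
print(respondent_rate, experience_rate)
```

In this made-up sample four of five respondents report at least one experience with the characteristic, yet only half of the countable experiences have it, which is the pattern of a larger figure outside the parentheses than inside.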
For reasons of space, I will not quote all the figures in the text.
Therefore, readers may wish to keep Table 1 at hand as they read
the following paragraphs.
Waking ESP Experiences. This question was designed to assess
how many respondents ever had what they considered to be a valid
ESP experience while in the waking state. The question was
worded as follows: “Have you ever had, while awake, a strong
feeling, impression, or ‘vision’ that a previously unexpected event
had happened, was happening, or was going to happen, and
[learned] later that you were right?”
Table 1
Percentage of Respondents Claiming Psi or Psi-Related Experiences

Item                            T (N=354)a    S (N=268)a
Waking ESP                      38            39
  More than one                 86b           79b
  Vision (hallucinations)       45 (30)c      24 (13)c
  Tragic event                  42 (21)       35 (17)
  Family member                 78 (51)       59 (33)
  Within 24 hours               83 (56)       86 (67)
  Told someone                  55 (30)       46 (23)
ESP Dreams                      36            38
  More than one                 85            89
  Especially vivid              81 (66)       80 (68)
  Tragic event                  43 (19)       31 (17)
  Family member                 65 (41)       58 (34)
  Within 24 hours               63 (35)       58 (31)
  Told someone                  46 (22)       40 (16)
ESP Agency                      18            20
  More than one                 72            65
  Emotion                       62 (49)       58 (45)
  Thinking of percipient        62 (49)       51 (43)
  Family member                 61 (47)       42 (32)
RSPK (Poltergeist)              8             6
  More than one                 86            63
  Other person present          46            27
Out-of-Body Experiences         14            25
  More than one                 87            82
  Saw physical body             56 (43)       62 (45)
  Traveled                      29 (21)       27 (14)
  ESP (information acquired)    15 (7)        12 (3)
  Seen as apparition            10 (9)        9 (2)
  Produce at will               16 (12)       22 (16)
Apparitions                     17            17
  More than one                 74            79
  Seen                          46 (34)       41 (33)
  Heard                         70 (58)       65 (45)
  Touched                       61 (53)       56 (43)
  Family member                 49 (42)       29 (17)
  Deceased                      59 (39)       30 (19)
  ESP (information acquired)    24 (18)       11 (2)
  Collective                    22 (12)       29 (13)
Communication with the Dead     8             5
  Seen or heard                 72            31
  Automatic writing             12            0
  Direct voice                  21            23
  Xenoglossy                    8             31
Lived in Haunted House          7             8
Past-Life Memories              8             9
  More than one experience      69            87
  Dream                         65 (42)       68 (38)
  More than one lifetime        36            32
  Famous person                 37            43
were dreaming and felt that you possessed all your waking facul
ties,” were claimed by 56% of the T sample and 71% of the S
sample. However, only 14% of the T sample and 29% of the S
sample said they had such dreams more often than “rarely.”
These items were generally good predictors of psi-related
experiences,8 especially ESP experiences. Frequency of dream recall was
significantly related to waking ESP experiences and ESP dreams in
both samples. In the T sample, it was also related significantly to
8 From here on, the term “psi-related experiences” will also include experiences
distinguished as “psychic”; i.e., those in category IA as well as IB.
ESP agency, communication with the dead, past-life memories, and
deja vu. On the other hand, it related significantly only to RSPK
and apparitions in the S sample.
Vividness of dreams was not quite as reliable a predictor as
frequency of dream recall. The only variable it related to
significantly in both samples was ESP dreams, a finding that should be
interpreted in the context of the fact reported above that
respondents tended to rate their ESP dreams as more vivid than their
ordinary dreams. Vividness also was significantly related to OBEs,
past-life memories, deja vu, and aura vision in the T sample, and to
apparitions and hauntings in the S sample.
Whether or not a respondent had ever had a lucid dream was a
strong predictor of psi-related experiences. This was especially true
in the T sample, where it was significantly related to every
psi-related experience except RSPK. In the S sample, it was
significantly related only to waking ESP, ESP dreams, OBEs, and
apparitions.
Mystical Experiences. The survey revealed that 28% of the T
sample and 35% of the S sample claimed to have had “a profound
and deeply moving ‘spiritual,’ ‘mystical,’ or transcendental
experience.” Of those claiming a mystical experience, about
three-quarters said they had had more than one.
Mystical experiences were significantly related to waking ESP in
both samples, but to ESP dreams in neither. This item also
significantly predicted apparitions and communication with the dead in
both samples. In addition, it significantly predicted ESP agency,
RSPK, deja vu, and aura vision in the T sample, and OBEs,
hauntings, and past-life memories in the S sample.
Activities Related to Psi
Data for items in this section are also in Table 2 and Figure 1.
Dream Analysis. We asked our respondents the following: Have
you “ever tried to remember or analyze your dreams for the
guidance or insight they might give you?” This question was
answered affirmatively by 36% of the T sample and 53% of the S
sample. Fifty-nine percent of these respondents in the T sample
found such analysis to be at least somewhat helpful, as compared
to 49% of the S sample.
I was interested in this as a predictor of psi experiences,
especially because it conceivably might separate out persons who had
been in psychotherapy, and whose experiences might have some
pathologic origin. We did not wish to approach this question
directly for obvious reasons. Of course, an affirmative answer to this
question by no means necessarily implies psychotherapeutic
experience.
This item did prove to be a very strong predictor of psi
experiences, but only for the T sample. There was a significant
relationship between reported self-analysis of dreams and every one of the
11 psi-related experiences for the T sample. For the S sample, this
relationship was only significant for apparitions, although the other
relationships tended to be in the positive direction.
Visits to Psychics. We asked our respondents whether they had
“ever seriously sought information, help, or guidance” from a
“medium, clairvoyant, or psychic,” “palm reader,” “astrologer,”
or “faith (or psychic) healer.” The question was answered
affirmatively by 10% of the T sample and 3% of the S sample.
Interestingly, while 89% of those in the S sample who sought such help
found it at least somewhat helpful, only 45% of the corresponding
respondents in the T sample did.
Again, this item proved to be a better predictor of psi-related
experiences in the T sample than in the S sample. In the T sample, it
was significantly related to every psi-related experience except
OBEs, past-life memories, and deja vu. In the S sample, it was
significantly related only to ESP agency, past-life memories, and
aura vision. The extreme marginal totals in these latter analyses,
however, suggest some caution in their interpretation.
Meditation. We asked our respondents whether they had ever
practiced meditation, in the sense of a “formal technique of stilling
the mind.” The question was answered affirmatively by only 6% of
the T sample and 9% of the S sample. Meditation was described as
at least somewhat helpful by 94% of those in the T sample who
practiced it, and 82% of those in the S sample.
Meditation was not a strong predictor of psi-related experiences,
although the direction of the relationship was almost always
positive. It was significantly related to OBEs and apparitions in the T
sample and to aura vision in the S sample.
Drugs. We asked our respondents whether they had “ever used
‘mind expanding’ drugs or medicines,” and, if so, whether they
had had any psi-related experiences while under their influence.
We used this rather non-specific wording to avoid putting some of
our respondents in the position of admitting the use of illegal
substances. Although our survey was essentially anonymous, we
did keep records of the names associated with the code numbers on
the questionnaires until the solicitation phase of the project was
completed.
The drug question was answered affirmatively by 7% of the T
sample and 32% of the S sample. We suspect that the figure for the
S sample is a gross underestimate of actual drug use. Whether this
$11,000-$15,000 27 —
$16,000-$20,000 17 —
$21,000-$30,000 17 —
Over $30,000 11 —
Note: Figures in some percentage columns do not add up to 100 due to rounding
error.
a Sample size varies slightly from question to question due to non-responders.
h Figures for students are not included because some respondents interpreted the
item as requesting parental income, while others did not.
The Little Oxford Dictionary (1986) defines a coincidence as a
'remarkable concurrence of events without apparent causal
connection'. This definition begs 2 questions: what makes some
concurrences of events remarkable and not others, and how does one
establish an apparent lack of causal connection? By their nature,
remarkable coincidences are one-off, unique events that cannot
realistically be manufactured and controlled in a laboratory setting.
Parapsychologists therefore encounter coincidences after they have
occurred, and must use techniques of interview and meticulous
description to try to reconstruct a picture of events involved in the
coincidence, much as a detective has to piece together evidence
suggesting the events in a crime. Because one can never be 100%
certain that all possible causal links between concurrent events have
been fully investigated and eliminated, the paranormality of
individual coincidences will always be a matter of degree of
confidence. It is only when many coincidences are collected together
and analysed that some common trends or patterns may emerge to
suggest possible process-related hypotheses, some of which may be
quite normal and others paranormal.
In their article 'Methods for Studying Coincidences', mathematicians
Persi Diaconis and Frederick Mosteller (1989) identify 4 factors
that, they feel, can account for the vast majority of coincidences.
These are: hidden cause, multiple endpoints, the law of truly large
numbers, and human psychology. Returning to our dictionary
definition, the first factor, obviously, suggests causal connections
behind coincidences; the others are related to how we find some
concurrences of events more remarkable than others. The term 'human
psychology' is extremely broad, however, and overlaps somewhat with
the first 3 factors; after all, humans experience coincidences, so by
definition human psychology is likely to play a part in all
coincidences. One might refine the broad 'psychology' topic into 2
categories which are by no means mutually exclusive in the real world
but which represent different schools of psychological research. The
first refers to characteristics of our intuitive judgements about
probability or likelihood; the second refers to ways in which our
perceptions, judgements and recollections are modified so as to
confirm our beliefs and expectations.

An earlier version of this paper was presented at an SPR Weekend
Course on Psi and Synchronicity, November 1990; some of the other
speakers focused on paranormal aspects of coincidences. I would like
to thank Charles Honorton, Robert Morris and my referees for their
helpful suggestions for improvements.
WATT
Unlike Diaconis and Mosteller, I am not confident that all
coincidences may be explained away by these factors. Perhaps, though,
an understanding of them may help parapsychologists to separate the
coincidental wheat from the chaff. In this article I will briefly
reiterate Diaconis and Mosteller's arguments on hidden causes,
multiple endpoints, and truly large numbers, introducing other
related research as we go along. I will then expand considerably upon
their brief comments on psychological factors, dealing first with
studies of the 'intuitive statistician' and then with ways in which
our beliefs affect our perception, judgement and memory. While much
of this material may already be familiar to parapsychologists, I hope
to provide some service by drawing together many disparate strands of
research on human judgement under uncertainty, as well as introducing
some of the most recent criticisms of the 'heuristics and biases'
literature.1

1. Hidden Cause

Marks and Kammann (1980) described 'unseen cause' as 'the second root
of coincidence' (their first root is simple probability). A
coincidence is not surprising if we discover a simple reason for it.
But other surprising coincidences can have perfectly straightforward
hidden causes, which we have just not yet discovered. For instance,
imagine a case where a woman wakes up from a nightmare in which
President Gorbachev is attacked in a coup. She thinks nothing more of
it, until she sees from the headlines in the following morning's
newspaper that this actually happened. On first inspection this could
be a meaningful coincidence, suggesting that in her dream she gained
information through precognition or clairvoyance. However, when
various members of the family are interviewed, it emerges that she
went off early to bed the night before. The rest of the family
watched the 10 o'clock news in an adjoining room, and although the
woman was asleep, the news could be heard in her room. Even though
she did not consciously hear the newsflash announcing the coup, this
information may have been subconsciously registered, triggering the
nightmare. Thus, further investigation of this coincidence between
the contents of a dream and a recent news item revealed a possible
hidden cause that made the coincidence less surprising. It's quite
likely that a proportion of meaningful coincidences can be explained
by a hidden cause. Describing the range of such causes is beyond the
scope of the present article, but see Marks and Kammann (1980) and
Morris (1986, 1989) for more comprehensive treatments of this topic.

2. Multiple Endpoints

A coincidence can be very impressive if it is very specific. Often,
however, a 'close' coincidence is also regarded as impressive,
although the chances of a 'close' coincidence happening are far
greater than the chances of an exact or specific coincidence.
For example, someone may get the hunch that the phone is about to
ring, and it will be Auntie Maude, who hasn't been in touch for
years, making the call! As predicted, the phone does ring, only it's
Auntie Maude's neighbour. Well, that's still quite an impressive
coincidence, but you might also be impressed if it had been Maude's
husband Bert on the line, or another auntie, or Maude's
daughter...and so on.
The prediction was quite specific, but if the experient allows for
'close' coincidences to count, then the prediction has multiple
endpoints. That is, there could be many 'close' coincidences that
could also be seen as impressive, although the chances of a 'close'
coincidence are so much higher than the chances of Auntie Maude alone
being the caller.
What is it that makes a coincidence 'close'? Specific events are
members of larger categories (for example, relatives who might
telephone); elements in the same category or readily associated with
each other (for example, a next-door neighbour of Maude) are seen in
degrees of closeness in accordance with the size of the category that
is shared (for example, next
PSYCHOLOGY AND COINCIDENCES
increased surprise for self-coincidences, she was often interrupted
with 'but you should hear what happened to me....' (Falk, 1989,
p.488). Thus, personal involvement is one important consideration in
explaining why some concurrences of events are seen as remarkable
while others are not.
A similar egocentric bias may explain why, although they may be
perfectly aware of the statistics for risk of death in car accidents
or for risk of smoking-related disease, individuals consistently
underestimate the likelihood that they personally will become victims
(Slovic, Fischhoff, & Lichtenstein, 1982). Experience perpetuates
this myth; the newspapers only report accidents that happen to other
people. It is only when someone close to us is involved in an
accident or falls ill that we are suddenly reminded that we are not
immune to disaster and we are not immortal!

4. Psychology

There are several aspects of human psychology that affect how we
judge the likelihood and frequency of coincidences, and that affect
our perception and recall of coincidences. Occasionally these
psychological factors may contribute to us mistakenly judging a
coincidence to be significant or meaningful.
First of all we will consider people as intuitive statisticians. I
will describe the findings of research into how we make judgements
under uncertainty, including estimations of likelihood or
probability, and frequency or base rate information. Secondly I'll
describe psychological research into how our perception, judgement
and recall can be biased by our beliefs and expectations. Not all of
this research has been conducted with coincidences explicitly in
mind, but because the experience of coincidences is one form of
judgement under uncertainty, readers may see how general
psychological research may be relevant to this question.

The Intuitive Statistician

One popular illustration of how we underestimate the likelihood of a
concurrence of events is the birthday problem: how many people would
you need to gather together before there was a 95% chance that 2 of
them would share the same day and month of birthday? The answer is
surprisingly (if you are not familiar with this problem) few people;
only 48 in fact. For only a 50% chance of 2 individuals' birthdays
coinciding, only 23 people need be gathered together. That so few
people are needed is usually quite surprising because we typically
underestimate the number of different combinations of pairs of
birthdays that can occur with a small number of people. We expect
that with 365 possible birthdays you'd need a fairly large number of
people before there was a coincidence of birthdays.
Diaconis and Mosteller (1989) have developed a simple formula that
enables the calculation of the number of people needed to get a
coincidence of birthdays or of any other categories: how many people
(N) do you need for there to be a 50%/95% likelihood that at least 2
of them will fall in the same category from among a number of
categories (c) such as 365 possible birthdates?

Approximately,
$N = 1.2\sqrt{c}$ for 50% chance
$N = 2.5\sqrt{c}$ for 95% chance

Using this formula, Table 1 shows how many people are needed for
coincidences between different numbers of categories.

Table 1
Guide to solving the birthday problem, and other coincidences of
categories (c)

c      =  100  200  300  365  400  500  600  700
N(50%) =   12   17   21   23   24   27   29   31
N(95%) =   25   35   43   48   50   56   61   66
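As a rough check on the rule of thumb above (N is about 1.2 times the square root of c for a 50% chance, and 2.5 times the square root of c for a 95% chance), the following Python sketch compares it with a direct count. The function names are my own, and the direct count assumes that all c categories are equally likely, which real birthdays only approximate.

```python
import math


def approx_people_needed(categories, chance=0.5):
    """Rule of thumb quoted in the text: N = 1.2*sqrt(c) or 2.5*sqrt(c)."""
    coefficient = {0.5: 1.2, 0.95: 2.5}[chance]
    return coefficient * math.sqrt(categories)


def exact_people_needed(categories, chance=0.5):
    """Smallest group size whose probability of at least one shared
    category reaches `chance`, assuming equally likely categories."""
    p_all_distinct = 1.0
    n = 0
    while 1.0 - p_all_distinct < chance:
        # Person n + 1 must avoid the n categories already taken.
        p_all_distinct *= (categories - n) / categories
        n += 1
    return n


for chance in (0.5, 0.95):
    approx = round(approx_people_needed(365, chance))
    exact = exact_people_needed(365, chance)
    print(f"{chance:.0%} chance: approx {approx}, exact {exact}")
```

For 365 categories the rule of thumb gives 23 and 48, and the direct count gives 23 and 47, so the approximation is off by at most one person here. Rounding the formula for the other values of c reproduces Table 1 to within one person.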
It is interesting to note how slowly N rises as c increases, so that
having several hundred more partygoers does not dramatically increase
the chances of a coincidence of birthdays.
Diaconis and Mosteller extend this calculation to apply to other,
more complex situations, for instance, where there is more than one
type of category that could coincide (such as birthdays and year of
birth), and where 'close coincidences' are accepted (the multiple
endpoints situation described earlier). The formula to estimate the
number of people needed for a coincidence within k days in the
latter, 'almost birthdays' situation (with a 50-50 chance of a
coincidence) is:

$N = 1.2\sqrt{\dfrac{c}{2k+1}}$

With c (categories) = 365 and k = 1 day, only around 13 people are
needed for a match.
These formulae may be helpful in estimating the likelihood of
coincidences where the number of possible categories is known or can
be discovered after some research. There remains, however, a large
number of events whose frequency is difficult to measure objectively
or even to estimate, and which therefore cannot be examined using
such formulae. For these, as well as for coincidences that are
quantifiable, people may fall back on rough 'rules of thumb'; the
so-called cognitive heuristics.
Over the last 20 years cognitive psychologists, led by Amos Tversky
and Daniel Kahneman, have developed the idea that people use a number
of rules of thumb or cognitive shortcuts in their everyday processing
of information. Usually these strategies, called cognitive
heuristics, are perfectly adequate to get us through daily life
efficiently. When it comes to assessing the statistical likelihood of
events such as coincidences, however, it has been argued that the use
of these heuristics can introduce a source of bias into our
estimations. Hence, this area of research has come to be known as the
'heuristics and biases' school (e.g., Kahneman, Slovic, & Tversky,
1982; Nisbett & Ross, 1980).
As I said in the introduction, there has recently been a backlash
against the heuristics and biases movement. Before I describe the
reasons for this in more detail, however, I will briefly introduce 2
major heuristics (judgement by representativeness and judgement by
availability) whose use may introduce some bias into people's base
rate and probability estimates.

Judgement by Representativeness has been proposed to explain an
apparent lack of understanding of the 'law of large numbers' (the
larger the random sample, the greater its accuracy in estimating the
characteristics of the parent population from which it is drawn). It
is argued (e.g., Tversky & Kahneman, 1974) that people judge the
likelihood of an event according to the sample's similarity to, or
representativeness of, the parent population on certain essential
features such as means and proportions. Sample size, which should
give some indication of the degree to which one could confidently
predict characteristics of the parent population, was frequently
neglected by subjects in early studies by Kahneman and Tversky.
For instance, subjects were posed this question:
'A certain town is served by two hospitals. In the larger hospital
about 45 babies are born each day, and in the smaller hospital about
15 babies are born each day. As you know, about 50% of all babies are
boys. The exact percentage of baby boys, however, varies from day to
day. Sometimes it may be higher than 50%, sometimes lower. For a
period of 1 year, each hospital recorded the days on which more than
60% of the babies born were boys. Which hospital do you think
recorded more such days?' (Kahneman & Tversky, 1972, p.443).
Subjects' opinions were equally divided between the two hospitals,
despite the fact that by the law of large numbers the smaller
hospital would be expected to show more deviations from the average
50% figure. Later, however, it was demonstrated that subjects could take account of sample size if the wording of questions was simplified (e.g., Bar-Hillel, 1979); indeed, if sample size was the only information provided, then correct responding could approach 100% (Evans, 1989).

Nevertheless, in the real world, people are faced with lots of possibly irrelevant information, which may distract attention from features such as sample size that should be taken into consideration when making judgements under uncertainty. So in the case of coincidences, if people tend not to take sample size sufficiently into account when judging likelihood, they may not appreciate that an extreme outcome is more likely to occur in a small sample, and may therefore mistakenly attribute significant rarity to a coincidence occurring under these conditions.

The representativeness heuristic has also been proposed to explain the so-called 'conjunction fallacy' (Tversky & Kahneman, 1983). Here, subjects judge the conjunction of 2 events as more probable than one of its components because, it is argued, they judge according to the similarity between the paired events and an original descriptive statement; this is despite the basic tenet of probability theory that a conjunction cannot be more probable than one of its constituents. For example, subjects were given the following description (Tversky & Kahneman, 1983, p.297):

'Linda is 31 years old, single, outspoken and very bright. She majored in philosophy. As a student, she was deeply concerned with issues of discrimination and social justice, and also participated in anti-nuclear demonstrations.'

Subjects were asked to indicate which of 2 alternatives was more probable: 'Linda is a bank teller'; or, 'Linda is a bank teller and is active in the feminist movement'. 85% of the respondents indicated that the latter statement was more probably correct, a finding which Kahneman and Tversky interpret as a blatant violation of the conjunction rule.

Judgement by Availability is the second cognitive heuristic that may influence our judgements about coincidences. When we use availability we estimate frequency in terms of how easy it is to think of examples of something (Tversky & Kahneman, 1974). Like representativeness, judgement by availability is usually a good rule of thumb, but it can lead to biased decisions because availability is influenced not only by objective frequency but also by recency, familiarity and vividness. For example, when we estimate how often earthquakes occur in a 10 year period we are too heavily influenced by whether an earthquake has occurred recently.

The apparent neglect of base rate or frequency information in making probability judgements (the 'base rate fallacy') has been widely attributed to the operation of the availability heuristic (e.g., Borgida & Brekke, 1981). Here, it is argued, base rate information is often less vivid, more abstract, less noticeable than other kinds of information and so it tends to get overlooked. In the earthquake example, the base rate or frequency information refers to data about how many earthquakes have occurred in the last 10 years. Typically, this statistical information is overlooked in favour of the vivid memory of a recent earthquake, leading to an exaggerated estimation of the frequency of earthquakes. Studies that have increased the availability of base rate information (for instance by conveying it graphically rather than in tabular form) have shown that it can be taken into account by subjects.

Another consequence of the availability heuristic is that we pay less attention than we should to negative information - to non-occurrences or non-coincidences - because they are less noticeable. Logically, the failure of something to happen can be just as informative for our decision-making as a positive occurrence. Yet, because non-events are less salient or less memorable, their usefulness for judging the frequency of, say, coincidences, is neglected. Take, for example, a person who believes that she can make people telephone her simply by wishing for it to happen. When she [...]
[...] evaluate performance. Yet it is the 'inadequate intuitive statistician' message that caught the imagination and tinged the research approaches of subsequent investigators.

Lopes argues persuasively that evaluative language does not belong in scientific articles; these should be concerned with description and interpretation rather than value judgements. The 'rhetoric of irrationality' may serve to titillate authors and readers, who can feel themselves superior because (with hindsight) they can solve the probability problems; the strong language also gives the impression (misleading, as we shall see) that there is an obvious correct answer to such problems.

2. Statistical Models. Often the authors of papers on heuristics and biases use phrases such as 'subjects' inability to appreciate the laws of probability' or their 'lack of intuitive understanding of the normative theory of prediction'. Whereas anyone reading a standard textbook on statistics could be forgiven for concluding that there is some sort of 'normative probability theory' that provides correct answers to problems posed in some heuristics and biases experiments, those in the know - that is, statisticians - have pointed out that there is no normative probability theory; and, worse still, that the statistical assumptions behind the probability problems come from a school of reasoning that is held by only a minority of statisticians.

The most authoritative critic of the model of probability used in most heuristics and biases literature is Gerd Gigerenzer (e.g., 1991a, 1991b; see Gigerenzer et al., 1989, for a description of the historical development of the different statistical schools of thought; and see Gigerenzer & Murray, 1987, for a detailed consideration of these as they have been applied to the study of judgement under uncertainty). In a paper entitled 'How to make cognitive illusions disappear: Beyond heuristics and biases', Gigerenzer (1991a) makes a strong critique of the heuristics and biases school:

What is called in the heuristics and biases literature the 'normative theory of probability' or the like is in fact a very narrow kind of neo-Bayesian view that is shared by some theoretical economists and cognitive psychologists, and to a lesser degree by practitioners in business, law, and artificial intelligence. It is not shared by proponents of the frequentist view of probability that dominates today's statistics departments, nor by proponents of many other views; it is not even shared by all Bayesians....By this narrow standard of 'correct' probabilistic reasoning, the most distinguished probabilists and statisticians of our century....would be guilty of 'biases' in probabilistic reasoning. (pp.86-87)

Gigerenzer proceeds to demonstrate how 'overconfidence bias' (where subjects answering a series of questions show a discrepancy between their perceived success and their actual performance of a task; overview by Lichtenstein, Fischhoff, & Phillips, 1982), the 'conjunction fallacy' and the 'base rate fallacy' can be made to 'disappear' if questions are re-phrased to take account of alternative statistical models and meanings of probability.

Let us return to the 'Linda is a bank teller and is active in the feminist movement' example used to illustrate the conjunction fallacy. Gigerenzer points out that to choose this description of Linda as more likely is a violation of some subjective theories of probability, including Bayesian theory, but it is not contrary to the dominant frequentist school of probability, because in this latter model, single specific events cannot be considered in terms of probability; probability theory is about frequencies, not single events. If the Linda problem is rephrased in frequentist terms 'There are 100 persons who fit the description above (i.e., Linda's). How many of them are: (a) bank tellers (b) bank tellers and active in the feminist movement' then the 'conjunction fallacy' largely disappears, with only 22% of subjects choosing option (b) as most likely (Fiedler, 1988).
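The frequency version of the Linda problem lends itself to a small counting sketch. The Python below is illustrative only: the function names and the two proportions (`p_teller`, `p_feminist`) are assumptions invented for the example, not figures from Fiedler (1988) or Tversky and Kahneman (1983), and the two attributes are treated as independent purely for simplicity. It builds 100 hypothetical persons and counts options (a) and (b). Whatever proportions are chosen, the count for (b) cannot exceed the count for (a), because every person counted under (b) is also counted under (a).

```python
import random


def simulate_population(n=100, p_teller=0.05, p_feminist=0.8, seed=0):
    """Return n hypothetical persons as (is_bank_teller, is_feminist) pairs.

    The proportions are made-up illustration values, not empirical data.
    """
    rng = random.Random(seed)
    return [(rng.random() < p_teller, rng.random() < p_feminist)
            for _ in range(n)]


def count_options(people):
    """Count (a) bank tellers and (b) bank tellers who are also feminists."""
    tellers = sum(1 for teller, _ in people if teller)
    feminist_tellers = sum(1 for teller, feminist in people
                           if teller and feminist)
    return tellers, feminist_tellers


if __name__ == "__main__":
    for seed in range(5):
        a, b = count_options(simulate_population(seed=seed))
        # The conjunction is a subset of its constituent, so b <= a always.
        assert b <= a
        print(f"seed={seed}: (a) bank tellers = {a}, "
              f"(b) bank tellers and feminists = {b}")
```

Changing `p_teller` or `p_feminist` alters the individual counts but not the ordering between them, which may be one reason the subset relation is easier to see in the frequency format.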
Some of the 'errors' identified by Kahneman and Tversky and their followers may therefore be due to the researchers' adoption of an inappropriate statistical model rather than to weaknesses in their subjects' reasoning abilities. Further, away from the relatively controlled and clean world of the laboratory, the confusions and complexities of the real world may make the application of any statistical models controversial and rather difficult.

3. Experimental methodology. Earlier, when discussing the 'law of large numbers', I cited a study that demonstrated that people are more able to take account of this law if the question is phrased more simply, and if other distracting information is removed. In a similar vein, many of Kahneman and Tversky's original positions have been refined, following demonstrations that variations in experimental methodology cause variations in the apparent influence of cognitive heuristics upon problem solving and judgement under uncertainty. We have already seen how the 'conjunction fallacy' can be made to disappear by rephrasing the question.

Steven Sherman and Eric Corty (1984), for instance, review a number of studies that suggest that the extent to which heuristics are used to solve a problem may depend on the way in which the problem is presented or structured. If there is plenty of time, if the task is not too complex and is clearly presented, if base rate information is made concrete, salient and specific to an individual case, then individuals may reach the normatively correct solution (where there is one). For example, typical biases in judging random sequences can be eliminated simply by instructing subjects that random events may be present or by providing them with a comparison level of nonrandomness (Peterson, 1977).

Related to the question of experimental methodology is another telling criticism of the heuristics and biases paradigm: its lack of ecological validity. There is a considerable gulf between the sorts of paper and pencil probability problems posed to unsuspecting subjects in typical heuristics and biases experiments, and the everyday situations where judgements about probability are made (e.g., when placing a bet; when judging what caused a picture to fall off a wall; when reading about or experiencing coincidences). When such artificial situations are used in conjunction with possibly inappropriate models of probability, any conclusions that may be drawn about the use of cognitive heuristics in more complex situations become severely limited. There is a need for the heuristics and biases researchers to adopt more realistic methodologies; for instance, role-playing, simulations of complex situations, and observational studies of individuals' statistical judgements in their natural environment. As we shall see in the next section, studies of the biasing effects of beliefs and expectations on perception, judgement and memory have successfully used more realistic settings, and have produced findings that have practical applications.

4. Theoretical usefulness of heuristics. Sherman and Corty (1984) also note that Kahneman and Tversky's heuristics are rather vague and are often identified post hoc. They are insufficiently precisely defined to enable prediction of which particular heuristic will be applied in which specific situation. Gigerenzer (1991a) echoes these criticisms thus: 'All three heuristics...are largely undefined concepts and can post hoc be used to explain almost everything. After all, what is similar to what (representativeness), what comes into your mind (availability), and what comes first (anchoring) have long been known to be important principles of the mind' (p.102). Heuristics, he argues, are hardly more than re-descriptions of the phenomena seen in judgement under uncertainty.

Conclusions

Do these criticisms of the heuristics and biases literature negate its applicability to the question of what makes coincidences seem remarkable? Certainly, they seriously weaken those aspects of the literature that deal with probability judgements and prediction where some sort of normative theory of probability has been (questionably) assumed. Further, it is difficult to generalise from the typically artificial methods used, to more complex settings. But although 'overconfidence bias' as typified in the heuristics and biases literature may 'disappear' if an alternative statistical model is adopted, in more realistic situations such as in studies of eyewitness testimony, overconfidence nevertheless remains a problem. Wells and Murray (1984), for instance, reviewed studies of eyewitnesses' confidence in their memory reports and concluded that 'the eyewitness accuracy-confidence relationship is weak under good laboratory conditions and functionally useless in forensically representative settings' (p.165).

Gigerenzer's criticisms have, however, been constructive: he suggests that the study of judgement under uncertainty may explicitly utilise various statistical models to get a clearer idea of which model most closely approximates subjects' intuitive reasoning (one might also have to consider the possibility of individual differences in model selection). Also, many statistical principles, such as the law of large numbers, are uncontroversial, and in this section I have tried to focus on aspects of judgement under uncertainty that are not so vulnerable to criticism of underlying statistical assumptions. The research on the effects of salience or availability on focus of attention and causal attributions, for example, reinforces the apparent importance of availability for judgements under uncertainty (e.g., Taylor & Fiske, 1978; Dow (Watt), 1988). Lopes' comments on evaluative language are well-taken, and are a useful reminder to all concerned with heuristics and biases that they should look out for 'creeping value judgements' in their writings.

We have seen that the degree to which heuristics are used depends greatly on the presentation of problems in the experimental situation, and that careful simplification and manipulation of information can modify or overcome heuristic use. There is no doubt, however, that in the real world, judgements have to be made under much greater uncertainty, with a profusion of distracting information and incomplete data. I believe that it is in these conditions that we are most likely to simplify by resorting to rules of thumb. If relevant information, such as base rates, is readily available and noticeable, then we have seen that it can be applied quite appropriately by individuals. On occasions when all relevant information is not at hand, heuristics may be used. Evans (1989) makes the useful distinction between competence and performance in statistical reasoning. People can be seen in some circumstances competently to apply statistical principles in judgement under uncertainty. What we need to understand is why this competence is not applied under a different set of circumstances.

The final criticism of the cognitive heuristics, that they are vague and post hoc, is, to me, the most telling. At the moment cognitive heuristics are largely descriptive (or heuristic!) devices to help psychologists organise their thoughts about other people's thought processes. Description is a necessary stage in the development of theoretical ideas, but the heuristics literature has yet to progress beyond this descriptive phase. We need a theory or theories of judgements under uncertainty to be developed to a stage where they offer 3 things: falsifiable predictions; an explanation of why humans judge the way they do; and predictions of the circumstances under which the various judgemental biases might be expected to operate. Describing theories of human reasoning as 'fragmented', Evans (1991) states, 'while theorists interested in bias emphasize...the role of non-reasoning processes, those interested in competence emphasize...reasoning processes' (p.97). There is a lack of integration between the various approaches to the study of human reasoning, and Evans makes some constructive recommendations for overcoming this problem. I would agree with Sherman and Corty (1984), however, that cognitive heuristics can potentially identify the processes underlying decision-making, and can
potentially suggest how to solve decision-making problems and improve judgement. For these reasons, they may be useful in evaluating coincidences.

4.2 The Influence of Beliefs and Expectations on Perception, Judgement and Recall

Apart from characteristics of our statistical intuitions that may cause some coincidences to seem remarkable, the sense of meaningfulness of coincidences may be enhanced by other aspects of our information processing. In short, how we perceive, interpret and remember events is, to a large extent, determined by our a priori beliefs, expectations and theories (or schemata) about how the world works. Information that is consistent with our expectations is readily assimilated to strengthen our beliefs; on the other hand, information that does not fit with our expectations may be distorted to make it fit, selectively ignored, or forgotten, so that our prior expectations or interpretations of an event or a coincidence are not challenged.

How Beliefs Can Influence Perception and Judgement

Not only do people tend to overlook non-occurrences or their failures to get the coincidences they predicted; they also tend to see relationships where there are none. This is called the 'illusory correlation' effect, and usually refers to cases where people associate 2 factors, though statistically no relationship exists. Our theories and stereotypes often lead to our perceiving illusory correlations.

The classic studies showing illusory correlation (Chapman & Chapman, 1967, 1969) were concerned with the question of why clinical psychologists persisted in reporting correlations between patients' responses on a projective psychological test, and aspects of the patients' motivations and emotions. Detailed studies of this Draw-A-Person (DAP) test suggested that responses on the test were totally unrelated to clinical symptoms. Yet clinical psychologists found that paranoid or suspicious patients exaggerated the eyes in their drawings, whereas dependent patients, who like to be fed and cared for, exaggerated the mouth. The Chapmans asked patients in a State hospital to take the DAP test. These drawings were then paired completely at random with 6 symptoms, such as suspiciousness and dependence. The Chapmans asked untrained college students to examine the drawings and the symptoms with which they had randomly been paired. Later, the students were asked which features of the drawings had most often been paired with each symptom. The students reported the same kinds of association between symptoms and drawings that the clinicians had, even though it had been arranged that there was no systematic relationship for the students (incidentally, these experiments do not suggest that the DAP test is of no clinical use; it may be helpful to clinicians when taken in the context of a wider clinical investigation).

Sometimes it is valid and efficient for our expectations to influence our interpretation of information; for instance, our knowledge of language may enable us to understand what is being said over a noisy telephone line. At other times, our preconceptions can be misleading; for example, where wishful thinking or preoccupation with a particular idea may lead to a misinterpretation of the caller's words. With regard to the study of coincidences, the challenge is to identify when information may have been distorted or misinterpreted. Though there is no easy answer to this problem, some pointers are given by psychological research.

Nisbett and Ross (1980) identified some factors that increase the likelihood of erroneous bias based on a priori beliefs or theories:

1. Confidence in the theory. If this confidence is based on emotional commitment to the theory rather than on a solid factual foundation then it is more likely that we will selectively process information so as to strengthen our beliefs.
2. Availability of the theory. The likelihood that a theory will influence how we interpret information depends on its availability; its likelihood of being triggered by the data at hand. If you have recently attended a course in Freudian psychoanalysis, this theory might be very available for you and be readily used to interpret the actions and dreams of people around you. A common example of the possible operation of availability in coincidences is where you learn a new word, then suddenly notice it repeatedly cropping up. It is unlikely that you have never before encountered the word; rather, your attention has been drawn to it, and it has become salient or available for you to notice when it occurs again.

3. Ambiguity of the information. Evidently, if information is clear and unambiguous then it may be more difficult (though not impossible) for us to put our own interpretation on that information based on our preconceptions. If, on the other hand, the information is experienced in an ambiguous way - say, in poor light, in confusing circumstances - then it is much easier for us to interpret it so as to fit our expectations. Fading of memory and the operation of our cognitive heuristics can render initially clear information ambiguous. This is why it is so important to take note of, for instance, each prediction that we make, plus whether or not it is fulfilled; and to write down details of a coincidence as soon as possible. The note-taking makes the information less ambiguous than our unassisted memory would.

How Information Often Doesn't Influence Our Beliefs

Psychological research suggests that once we have made up our minds about something we are very resistant to revising our theories. Here, I'll give examples of 3 areas of research into the effects of information on beliefs: firstly, what happens when established beliefs are faced with new information; secondly, the effect of new information on new beliefs; and thirdly, the effect of false information on beliefs.

1. New Information and Established Beliefs. Lord, Ross, and Lepper (1979) took 2 groups of university students: one group strongly believed that capital punishment was a deterrent to potential murderers; the other strongly believed it was worthless as a deterrent. Each subject read about the results of 2 supposedly authentic studies on the deterrent effects of capital punishment. One of the studies concluded that capital punishment was an effective deterrent. The other concluded the opposite. Subjects were asked a number of questions after they had read both studies.

There were 3 main findings from this experiment: 1. Whichever study supported a subject's own initial position was found to be significantly 'more convincing' and 'better conducted' than the study opposing their position; 2. When subjects were asked about their beliefs after reading about only one study, which could be in agreement with or in contradiction to their own views, belief in the subject's original position was strengthened if they had just read a supportive study, but belief in the original position was hardly affected at all by reading an opposing study; and, 3. After reading about both studies, the subjects were more convinced about the correctness of their initial position than they were before reading about any evidence.

In summary, different standards are used for criticizing opposing evidence to those used for criticizing supportive evidence. Mixed evidence, giving equal support to 2 opposing views, does not reduce confidence for holders of either view but instead reinforces confidence for holders of each view.

Perhaps these results were obtained because the subjects were impressionable young students. But even in the supposedly rigorous and objective world of reviewing articles for scientific journals, prior beliefs have a strong influence on evaluations. In a controversial 'real-life' experiment, Douglas Peters and Stephen Ceci (1982) re-submitted 12 already-published research [...]
[...] statistical summaries of a wider survey. Anderson concluded that this effect was not due to memory but to the spontaneous generation of causal explanations that seemed to be facilitated by the case histories. In the case of coincidences, of course, the data are also usually concrete; in the form of personal experiences or anecdotes that are told by others.

To sum up this section: we have seen the interplay of human psychology, beliefs, and data. We tend to cling unduly to our own beliefs or theories, even in the face of contradictory evidence, and we apply a double standard to evidence relevant to our beliefs. We have probably all seen this happening in our everyday life; but we may neglect to consider these facts when we ourselves are involved. We can easily see the weak points in other people's beliefs, while being absolutely certain of the truth of our own. This may be one reason why we are less impressed when coincidences happen to other people than when we are closely involved in them ourselves (Falk, 1989).

How Recall Can Change Due to Beliefs and Expectations

Memory is a construction, based partly on our perceptions and partly on our interpretations, and memories tend to fade and alter over time. It appears that when we recollect something we actively reconstruct our memories so as to fit with our theories and expectations. When we recall coincidences that we have heard of or have been involved in in the past, our memory may blur some details and strengthen others so as to make the coincidence seem more impressive than it was to begin with, a process which may be quite unconscious.

In 1971, Bransford and Franks developed their Constructive Model of Memory. Subjects were presented with sets of simple sentences, some of which they had seen a few minutes before and others which were new sentences, including combinations of the earlier sentences. When they were asked to identify those sentences they had seen before, many subjects were convinced they had seen the new combination sentences before. Bransford and Franks proposed that individuals integrate information from individual sentences so as to construct larger ideas; they think they have already seen these complex sentences because they have been combined in memory and, once combined, they cannot break them down into their original components.

This constructive model of memory is not necessarily limited to recall for sets of sentences. People instinctively try to make sense out of any situation - sets of noises, events happening around them, snippets of conversation - and their memories of these events may contain not only just the original events but also the interpretation put on them by the individual.

One example of the study of recollection change in more realistic situations, which are perhaps more relevant to the evaluation of coincidences, is work in the area of eyewitness testimony (e.g., Wells & Loftus, 1984). In a typical experiment, Loftus and Loftus (1975) showed subjects a film of a traffic accident. Soon after that, subjects were asked questions about their memory of the accident. One of these questions, about the speed of the cars, was asked in 2 different ways. Subjects were either asked, 'How fast were the cars going when they smashed into each other?' or they were asked 'How fast were the cars going when they hit each other?' Apparently, subjects used the different inferences suggested by the words 'smashed' or 'hit' to alter their memory of the accident. 'Smashed' implies a more destructive collision than 'hit'. A week later, subjects were given a memory test, where they were asked 'Did you see any broken glass?' Although there was no broken glass in the original film, those subjects who had been asked the 'smashed' question were more likely to say mistakenly that they remembered seeing broken glass.

The sentence-recall experiment showed how information could be misremembered only a short time after its presentation. Generally, the more time that passes after the original incident, the more chance there
is that recollections will change. You can imagine how recall might change over months or years after an original event. This suggests that sometimes a coincidence that was only moderately impressive to begin with can, over time, be recalled differently, as really very striking.
These experiments into sentence recall and eyewitness testimony demonstrated misremembering. Other studies have demonstrated selective remembering. Hintzman, Asher, and Stern (1978) explored their hypothesis that coincidences seem to occur more often than chance because of selective remembering of meaningfully related events, by asking subjects to rate a series of concrete nouns and, at another time, a series of pictures of objects, in a task ostensibly unrelated to memory. Some of the nouns and pictures were related to each other, but the rest were unrelated (the authors do not say by what criteria the judgements of relatedness were made). Later, participants were unexpectedly asked to recall as many words from the list of nouns as possible. This was therefore an incidental learning task, and the authors regarded the related nouns and pictures as coincidences. They found that significantly more 'related' words were recalled than 'unrelated' words, suggesting that there was selective remembering of the meaningfully-related words. An experiment of similar design but using events rather than nouns (the former being components of coincidences in the real world) replicated this selective memory retrieval effect (Kallai, 1985, cited in Falk, 1989).
In a review of the literature into 'Alterations in recollections of unusual and unexpected events', Hall, McFeaters, and Loftus (1987) described how new information could be absorbed and interpreted as an original memory. A coincidence, of course, is an unusual and unexpected event. New information might be embedded in a misleading message, or in a biasing question, or in a sketch or photograph. Private remembering of the event, discussion with friends or family, or even questioning by a careless investigator can be a source of misleading opinions and information. The experiment on eyewitness testimony described above showed how careless questioning can bias recollection. Hall, McFeaters, and Loftus identified 4 major factors (time delay, warnings, question phrasing, and attitude) which affected the change in recollections for unusual or unexpected events. These 4 factors have been fairly well demonstrated in experiments.
The first is the time delay between an event, a subsequent misleading message, and a final test of recollection. It seems that changes in recollection are greatest if there is a relatively long time delay before the misinformation is given; presumably so that the original memory can fade. Then, the change in recollection is greatest if people are tested about their recall of the original information while the post-event misinformation is still relatively recent.
Secondly, it has been shown that if people are warned just before they are to be exposed to misinformation that the message may contain misleading information, then they are less likely to be influenced to change their original recollections. This effect is quite specific, though. If the warning is not given immediately before the post-event misinformation, then it's not usually effective.
Thirdly, it seems that the way in which a misleading question is phrased affects the likelihood of recollection change. After a surprise intruder interrupted their lecture, subjects who were asked, 'Was the moustache worn by the tall intruder light or dark brown?' were less likely to (mistakenly) recall that the intruder had a moustache than those who were asked, 'Did the intruder who was tall and had a moustache say anything to the professor?' (Loftus, 1981). The latter question included the misinformation in an auxiliary clause, suggesting that memory is more easily altered if misinformation is casually or unintentionally absorbed, rather than being given direct and critical attention. Also, misinformation that is slowly scrutinized may be rejected, whereas if you give brief and minimal attention to the misinformation, it
may be added easily to the original recollections.
I described earlier how attitude can affect how we perceive or remember information. This has also been demonstrated in the experiments into eyewitness testimony. Information that is consistent with attitudes is strengthened in the process of recollection, whereas information that doesn't fit fades, or is replaced. In a classic experiment, subjects were shown a picture of 2 men in an underground train. One of the men was white, the other black. The white man held an open cut-throat razor in his hand. Subjects were asked to describe the picture to others, who in turn described it to others, and so on. It was found that, over time, the razor moved from the white man's hand to the black man's hand (Allport & Postman, 1947).

Summary and Future Directions

Some of the research described in this paper may not be new to parapsychologists, but by drawing together a variety of psychological studies relevant to the evaluation and experience of coincidences, I hope some readers may be stimulated further to consider the implications of this psychological research for the study of coincidences. I am only too aware of the limitations of this paper, which can be subjected to the same sorts of criticisms as have been levelled against the heuristics and biases approach: I have merely cobbled together a number of descriptions of relevant research findings without providing any useful explanatory framework. The various psychological factors I have described may be applied post hoc to account for many coincidences. What would be even more useful would be some theory or theories enabling the prediction of the circumstances under which these factors would be expected to operate. This will probably have to await further developments in mainstream psychology, though Hogarth (1981), Gigerenzer (1991a) and Evans (1991) make some constructive suggestions for how researchers could progress beyond the stage of cataloguing heuristics and biases.
A consideration of techniques for overcoming the many biases in our judgements under uncertainty would also have been helpful, but would have made the paper unacceptably long. The interested reader is referred to Kahneman and Tversky (1982), Fischhoff (1982), Nisbett et al. (1982), Evans (1989), and Lopes (1987) for further information on debiasing. Research has also been conducted into ways of improving recollection of real-world events; for example, police have an obvious interest in eyewitness recall, and Roy (1991) describes how the 'cognitive interview' has been shown to improve eyewitness recall. Four questioning strategies are used, which aim to enhance memory retrieval: the witness is encouraged to reinstate mentally the external scene and the internal thoughts that existed at the time of the crime; he or she is asked to report everything, even incomplete or apparently trivial information; events are recounted in a variety of orders; and the witness is encouraged to report events from a variety of different perspectives. The cognitive interview has been shown to facilitate retrieval of more correct information than either the standard police technique or hypnotic techniques.
In the meantime, this paper can only provide a few guidelines for coincidence research: where possible, try to get an estimation of the likelihood of a coincidence (the formulae given when discussing the birthday problem may be helpful here); search for hidden causes; guard against predictions with multiple endpoints, by documenting predictions when they are made and noting failures to confirm predictions; ask whether the interpretation of a coincidence might have been influenced by the use of representativeness and availability heuristics, especially where judgements of likelihood and causality are concerned; have several people (ideally with differing prior beliefs about coincidences) document thoroughly coincidences, to try to some extent to circumvent belief-confirming distortions in perception, judgement and memory; beware of mis-
Evans, J.St.B.T. (1991) Theories of human reasoning: The fragmented state of the art. Theory & Psychology, 1, 83-105.
Falk, R. (1981-82) On coincidences. The Skeptical Inquirer, 6, 18-31.
Falk, R. (1989) Judgment of coincidences: Mine versus yours. American Journal of Psychology, 102, 477-493.
Fiedler, K. (1988) The dependence of the conjunction fallacy on subtle linguistic factors. Psychological Research, 50, 123-129; cited in Gigerenzer (1991a).
Fischhoff, B. (1975) Hindsight ≠ foresight: The effect of outcome knowledge on judgement under uncertainty. Journal of Experimental Psychology: Human Perception and Performance, 1, 288-299.
Fischhoff, B. (1982) Debiasing. In D. Kahneman, P. Slovic, & A. Tversky (Eds.) Judgment Under Uncertainty: Heuristics and Biases. Cambridge: Cambridge University Press.
Gigerenzer, G. (1991a) How to make cognitive illusions disappear. Beyond 'heuristics and biases'. In W. Stroebe & M. Hewstone (Eds.) European Review of Social Psychology, 2, 83-115.
Gigerenzer, G. (1991b) From tools to theories: A heuristic of discovery in cognitive psychology. Psychological Review, 98, 254-267.
Gigerenzer, G., & Murray, D.J. (1987) Cognition as Intuitive Statistics. Hillsdale, N.J.: Lawrence Erlbaum Associates.
Gigerenzer, G., Swijtink, Z., Porter, T., Daston, L., Beatty, J., & Kruger, L. (1989) The Empire of Chance: How Probability Changed Science and Everyday Life. Cambridge: Cambridge University Press.
Hall, D.F., McFeaters, S.J., & Loftus, E.F. (1987) Alterations in recollection of unusual and unexpected events. Journal of Scientific Exploration, 1, 3-10.
Hintzman, D.L., Asher, S.J., & Stern, L.D. (1978) Incidental retrieval and memory for coincidences. In M.M. Gruneberg, P.E. Morris & R.N. Sykes (Eds.) Practical Aspects of Memory. London: Academic Press.
Hogarth, R.M. (1981) Beyond discrete biases: Functional and dysfunctional aspects of judgmental heuristics. Psychological Bulletin, 90, 197-217.
Jones, E.E., Rock, L., Shaver, K.G., Goethals, G.R., & Ward, L.M. (1968) Pattern of performance and ability attribution: An unexpected primacy effect. Journal of Personality and Social Psychology, 10, 317-340.
Kahneman, D., Slovic, P., & Tversky, A. (1982) (Eds.) Judgement Under Uncertainty: Heuristics and Biases. Cambridge: Cambridge University Press.
Kahneman, D., & Tversky, A. (1972) Subjective probability: A judgement of representativeness. Cognitive Psychology, 3, 430-454.
Kahneman, D., & Tversky, A. (1982) Intuitive prediction: Biases and corrective procedures. In D. Kahneman, P. Slovic, & A. Tversky (Eds.) Judgment Under Uncertainty: Heuristics and Biases. Cambridge: Cambridge University Press.
Kallai, E. (1985) Psychological Factors that Influence the Belief in Horoscopes. Unpublished Master's thesis, The Hebrew University, Jerusalem. Cited in Falk, R. (1989).
Lichtenstein, S., Fischhoff, B., & Phillips, L.D. (1982) Calibration of probabilities: The state of the art to 1980. In D. Kahneman, P. Slovic, & A. Tversky (Eds.) Judgment Under Uncertainty: Heuristics and Biases. Cambridge: Cambridge University Press.
Loftus, G.R. (1981) Mentalmorphosis: Alterations in memory produced by bonding of new information to old. In J.B. Long & A.D. Baddeley (Eds.) Attention and Performance IX.
Loftus, G.R., & Loftus, E.F. (1975) Human Memory: The Processing of Information. New York: Halsted Press.
Lopes, L. (1987) Procedural debiasing. Acta Psychologica, 64, 167-185.
Lopes, L. (1991) The rhetoric of irrationality. Theory & Psychology, 1, 65-82.
Lord, C., Ross, L., & Lepper, M.R. (1979) Biased assimilation and attitude polarization: The effects of prior theories on subsequently considered evidence. Journal of Personality and Social Psychology, 37, 2098-2109.
Marks, D., & Kammann, R. (1980) The Psychology of the Psychic. Buffalo, NY: Prometheus.
Morris, R. (1986) What psi is not: The necessity for experiments. In H. Edge, R. Morris, J. Palmer, & J. Rush Foundations of Parapsychology. London: Routledge & Kegan Paul.
Psychology and Coincidences
Summary: This article presents a selective review of research suggesting possible normal causes for certain coincidences. After a brief discussion of hidden causes, predictions with multiple endpoints, and simple probability, the bulk of the article focuses on psychological research on judgement and decision-making under uncertainty. It examines the information-processing shortcuts held responsible for the apparent weaknesses of our everyday statistical intuitions, as well as the criticisms of this heuristics and biases paradigm. Examples are given of studies showing how perception, judgement and recall can be biased so as to confirm our preconceptions. Some implications of this research for the study of coincidences are highlighted, along with research suggesting promising ways of improving judgement.
[3]
Fantastic Memories
The Relevance of Research into Eyewitness Testimony and
False Memories for Reports of Anomalous Experiences
Christopher C. French
Ian Stevenson
Department of Psychiatric Medicine, University of Virginia, School of Medicine,
Charlottesville, Virginia 22908
Introduction
Although counts of moles (hyperpigmented nevi) have shown that the average
adult has between 15 and 18 of them (Pack and Davis, 1956), little is known
about their cause, except for those associated with the genetic disease
neurofibromatosis, and even less is known about why birthmarks occur in one
location of the body instead of in another. In a few instances a genetic factor
has been plausibly suggested for the location of nevi (Cockayne, 1933;
Denaro, 1944; Maruri, 1961); but the cause of the location of most birthmarks
remains unknown. The causes of many, perhaps most, birth defects remain
similarly unknown. In large series of birth defects in which investigators have
searched for the known causes, such as chemical teratogens (like thalidomide),
viral infections, and genetic factors, between 43% (Nelson and
Presented at the Eleventh Annual Meeting of the Society for Scientific Exploration held at Princeton
University, June 11-13, 1992.
404 I. Stevenson
Holmes, 1989) and 65-70% (Wilson, 1973) of cases have finally been assigned
to the category of “unknown causes.”
Among 895 cases of children who claimed to remember a previous life (or
were thought by adults to have had a previous life), birthmarks and/or birth
defects attributed to the previous life were reported in 309 (35%) of the subjects.
The birthmark or birth defect of the child was said to correspond to a wound
(usually fatal) or other mark on the deceased person whose life the child said it
remembered. This paper reports an inquiry into the validity of such claims.
With my associates I have now carried the investigation of 210 such cases to a
stage where I can report their details in a forthcoming book (Stevenson,
forthcoming). This article summarizes our findings.
Children who claim to remember previous lives have been found in every
part of the world where they have been looked for (Stevenson, 1983; 1987),
but they are found most easily in the countries of South Asia. Typically, such a
child begins to speak about a previous life almost as soon as it can speak,
usually between the ages of two and three; and typically it stops doing so between
the ages of five and seven (Cook, Pasricha, Samararatne, Win Maung, and
Stevenson, 1983). Although some of the children make only vague statements,
others give details of names and events that permit identifying a person whose
life and death corresponds to the child’s statements. In some instances the
person identified is already known to the child’s family, but in many cases this is
not so. In addition to making verifiable statements about a deceased person,
many of the children show behavior (such as a phobia) that is unusual in their
family but found to correspond to behavior shown by the deceased person
concerned or conjecturable for him (Stevenson, 1987; 1990).
Although some of the birthmarks occurring on these children are “ordinary”
hyperpigmented nevi (moles) of which every adult has some (Pack and Davis,
1956), most are not. Instead, they are more likely to be puckered and scarlike,
sometimes depressed a little below the surrounding skin, areas of hairlessness,
areas of markedly diminished pigmentation (hypopigmented macules), or
port-wine stains (nevi flammei). When a relevant birthmark is a hyperpigmented
nevus, it is nearly always larger in area than the “ordinary” hyperpigmented
nevus. Similarly, the birth defects in these cases are of unusual types and rarely
correspond to any of the “recognizable patterns of human malformation”
(Smith, 1982).
Methods
My investigations of these cases included interviews, often repeated, with
the subject and with several or many other informants for both families. With
rare exceptions, only firsthand informants were interviewed. All pertinent
written records that existed, particularly death certificates and postmortem
reports, were sought and examined. In the cases in which the informants said that
the two families had no previous acquaintance, I made every effort to exclude
all possibility that some information might nevertheless have passed normally
to the child, perhaps through a half-forgotten mutual acquaintance of the two
Birthmarks and Birth Defects 405
Fig. 1. Hypopigmented macule on chest of an Indian youth who, as a child, said he remembered
the life of a man, Maha Ram, who was killed with a shotgun fired at close range.
Fig. 2. The circles show the principal shotgun wounds on Maha Ram, for comparison with
Figure 1.
explanations seem to be required to account for the discrepant cases, and I
discuss these elsewhere (Stevenson, forthcoming). Figure 1 shows a birthmark (an
area of hypopigmentation) on an Indian child who said he remembered the life
of a man who had been killed with a shotgun fired at close range. Figure 2
shows the location of the wounds recorded by the pathologist. (The circles were
drawn by an Indian physician who studied the postmortem report with me.)
The high proportion (88%) of concordance between wounds and birthmarks
in the cases for which we obtained postmortem reports (or other confirming
documents) increases confidence in the accuracy of informants’ memories
concerning the wounds on the deceased person in those more numerous cases
for which we could obtain no medical document. Not all errors of informants’
memories would have resulted in attributing a correspondence between
birthmarks and wounds that did not exist; in four cases (possibly five) reliance on
Fig. 3. Large verrucous epidermal nevus on head of a Thai man who as a child said he
remembered the life of his paternal uncle, who was killed with a blow on the head from a heavy
knife.
Fig. 4. Congenital malformation of nail on right great toe of the Thai subject shown in Figure 3.
This malformation corresponded to a chronic ulcer of the right great toe from which the
subject’s uncle had suffered.
nail of the right great toe (Figure 4). This corresponded to a chronic infection
of the same toe from which the subject’s uncle had suffered for some years
before he died.
The series includes 18 cases in which two birthmarks on a subject
corresponded to gunshot wounds of entry and exit. In 14 of these one birthmark was
larger than the other, and in 9 of these 14 the evidence clearly showed that the
smaller birthmark (usually round) corresponded to the wound of entry and the
larger one (usually irregular in shape) corresponded to the wound of exit.
These observations accord with the fact that bullet wounds of exit are nearly
always larger than wounds of entry (Fatteh, 1976; Gordon and Shapiro, 1982).
Figure 5 shows a small round birthmark on the back of the head of a Thai boy,
and Figure 6 shows a larger, irregularly shaped birthmark at the front of his
head. The boy said that he remembered the life of a man who was shot in the
head from behind. (The mode of death was verified, but no medical document
was obtainable.) In addition to the 9 cases I have investigated myself, Mills
reported another case having the feature of a small round birthmark (corresponding
to the wound of entry) and a larger birthmark corresponding to the wound
of exit (both verified by a postmortem report) (Mills, 1989).
Fig. 5. Small, round puckered birthmark on a Thai boy that corresponded to the bullet wound of
entry in a man whose life he said he remembered and who had been shot with a rifle from
behind.
Fig. 6. Larger, irregularly shaped birthmark on the frontal area of the head of the Thai boy shown
in Figure 5. This birthmark corresponded to the bullet wound of exit on the Thai man
whose life the boy said he remembered.
Fig. 7. Two round, puckered, scarlike birthmarks of different sizes on the left breast of a Burmese
woman who as a child said she remembered the life of a woman who was fatally wounded
by a shotgun that used a cartridge containing shot of different sizes.
I have calculated the odds against chance of two birthmarks correctly
corresponding to two wounds. The surface area of the skin of the average adult male
is 1.6 square meters (Spalteholz, 1943). If we were to imagine this area square and
spread on a flat surface, its dimensions would be approximately 127 centimeters
by 127 centimeters. Into this area would fit approximately 160 squares of
the size 10 centimeters square that I mentioned above. The probability that a
single birthmark on a person would correspond in location to a wound within
the area of any of the 160 smaller squares is only 1/160. However, the
probability of correspondences between two birthmarks and two wounds would be
$(1/160)^2$, i.e., 1 in 25,600. (This calculation assumes that birthmarks are
uniformly distributed over all regions of the skin. This is incorrect [Pack, Lenson,
and Gerber, 1952], but I believe the variation can be ignored for the present
purpose.)
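The arithmetic in this paragraph can be checked with a short sketch. It is an illustration only, not part of the original analysis: the function name and parameters are hypothetical, the skin-area figure is read as 1.6 square meters, and the code adopts the same simplifications as the text, namely uniform placement of birthmarks over the skin and (implied by squaring 1/160) independence between the two marks.

```python
from fractions import Fraction


def chance_correspondence(skin_area_m2: Fraction, cell_side_cm: int, n_marks: int) -> Fraction:
    """Probability that n_marks birthmarks each fall in a pre-specified grid cell.

    Hypothetical helper: assumes uniform placement over the skin and
    independence between marks, as the calculation in the text does.
    """
    area_cm2 = skin_area_m2 * 10_000         # 1 square meter = 10,000 square centimeters
    n_cells = area_cm2 / cell_side_cm ** 2   # number of cell-sized squares covering the skin
    return 1 / n_cells ** n_marks            # one target cell per mark


SKIN_AREA = Fraction(16, 10)  # 1.6 square meters, the figure quoted in the text

p_one = chance_correspondence(SKIN_AREA, 10, 1)
p_two = chance_correspondence(SKIN_AREA, 10, 2)
print(p_one, p_two)  # 1/160 1/25600
```

Exact fractions are used so that the result can be compared directly with the 1/160 and 1 in 25,600 figures; changing the cell size or the number of marks shows how sensitive the odds are to those choices.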
Examples of Other Correspondences of Detail between Wounds and Birthmarks
A Thai woman had three separate linear hypopigmented scarlike birthmarks
near the midline of her back; as a child she had remembered the life of a
woman who was killed when struck three times in the back with an ax.
(Informants verified this mode of death, but no medical record was obtainable.) A
woman of Burma was born with two perfectly round birthmarks in her left
chest (Figure 7); they slightly overlapped, and one was about half the size of
the other. As a child she said that she remembered the life of a woman who was
accidentally shot and killed with a shotgun. A responsible informant said the
shotgun cartridge had contained shot of two different sizes. (No medical
record was obtainable in this case.) Another Burmese child said that she
remembered the life of her deceased aunt, who had died during surgery for
congenital heart disease. This child had a long, vertical linear hypopigmented
birthmark close to the midline of her lower chest and upper abdomen; this
birthmark corresponded to the surgical incision for the repair of the aunt’s
heart. (I obtained a medical record in this case.) In contrast, a child of Turkey
had a horizontal linear birthmark across the right upper quadrant of his
abdomen. It resembled the scar of a surgeon’s transverse abdominal incision. The
child said that he remembered the life of his paternal grandfather, who had
become jaundiced and was operated on before he died. He may have had a cancer
of the head of the pancreas, but I could not learn a precise medical diagnosis.
Two Burmese subjects remembered as children the lives of persons who had
died after being bitten by venomous snakes, and the birthmarks of each
corresponded to therapeutic incisions made at the sites of the snakebites on the
persons whose lives they remembered. Another Burmese subject also said as a
child that she remembered the life of a child who had been bitten on the foot by
a snake and died. In this case, however, the child’s uncle had applied a burning
cheroot to the site of the bite — a folk remedy for snakebite in parts of Burma;
and the subject’s birthmark was round and located at the site on the foot where
the bitten child’s uncle had applied the cheroot.
Three Examples of Birth Defects
Figure 8 shows the right side of the head of a Turkish boy with a diminished
and malformed ear (unilateral microtia). He also had underdevelopment of the
right side of his face (hemifacial microsomia). He said that he remembered the
life of a man who had been shot (with a shotgun) at point-blank range. The
wounded man was taken to a hospital where he died 6 days later — of injuries
to the brain caused by shot that had penetrated the right side of the skull. (I
obtained a copy of the hospital record.)
Figure 9 shows fingers almost absent congenitally on one hand (unilateral
brachydactyly) in a child of India who said he remembered the life of another
child who had put his right hand into the blades of a fodder-chopping machine
and lost his fingers. Most cases of brachydactyly involve only a shortening of
the middle phalanges. In the present case there were no phalangeal bones, and
the fingers were represented by mere stubs. Unilateral brachydactyly is
exceedingly rare, and I have not found a published report of a case, although a
colleague (plastic surgeon) has shown me a photograph of one case that came
under his care.
Figure 10 shows congenital absence of the lower right leg (unilateral
hemimelia) in a Burmese girl. She said that she remembered the life of a girl
Fig. 8. Severely malformed ear (microtia) in a Turkish boy who said that he remembered the life
of a man who was fatally wounded on the right side of the head by a shotgun discharged at
close range.
who was run over by a train. Eyewitnesses said that the train severed the girl’s
right leg first, before running over the trunk. Lower hemimelia is an extremely
rare condition, and Frantz and O’Rahilly (1961) found it in only 12 (4.0%) of
300 cases of all congenital skeletal deficiencies that they examined.
Discussion
Because most (but not all) of these cases develop among persons who
believe in reincarnation, we should expect that the informants for the cases
would interpret them as examples according with their belief; and they usually
do. It is necessary, however, for scientists to think of alternative explanations.
The most obvious explanation of these cases attributes the birthmark or
birth defect on the child to chance, and the reports of the child’s statements
Fig. 9. Almost absent fingers (brachydactyly) of one hand in a boy of India who said he
remembered the life of a boy of another village who had put his hand into the blades of a
fodder-chopping machine and had its fingers amputated.
and unusual behavior then become a parental fiction intended to account for
the birthmark (or birth defect) in terms of the culturally accepted belief in
reincarnation. There are, however, important objections to this explanation. First,
the parents (and other adults concerned in a case) have no need to invent and
narrate details of a previous life in order to explain their child’s lesion.
Believing in reincarnation, as most of them do, they are nearly always content to
attribute the lesion to some event of a previous life without searching for a
particular life with matching details. Second, the lives of the deceased persons
figuring in the cases were of uneven quality both as to social status and
commendable conduct. A few of them provided models of heroism or some other
enviable quality; but many of them lived in poverty or were otherwise
unexemplary. Few parents would impose an identification with such persons on
their children. Third, although in most cases the two families concerned were
acquainted (or even related), I am confident that in at least 13 cases (among
210 carefully examined with regard to this matter) the two families concerned
had never even heard about each other before the case developed. The
subject’s family in these cases can have had no information with which to build up
an imaginary previous life which, it later turned out, closely matched a real
one. In another 12 cases the child’s parents had heard about the death of the
person concerned, but had no knowledge of the wounds on that person.
Limitations of space for this article oblige me to ask readers to accept my appraisal
of these 25 cases for this matter; but in my forthcoming work I give a list of the
Fig. 10. Congenital absence of lower leg (unilateral hemimelia) in a girl of Burma who said she
remembered the life of a young woman who was accidentally run over by a train, with
her right leg being severed first.
cases from which readers can find the detailed reports of the cases and from
reading them judge this important question for themselves. Fourth, I think I
have shown that chance is an improbable interpretation for the
correspondences in location between two or more birthmarks on the subject of a case
and wounds on a deceased person.
Persons who reject the explanation of chance combined with a secondarily
confected history may consider other interpretations that include paranormal
processes, but fall short of proposing a life after death. One of these supposes
that the birthmark or birth defect occurs by chance and the subject then by
telepathy learns about a deceased person who had a similar lesion and develops
an identification with that person. The children subjects of these cases,
however, never show paranormal powers of the magnitude required to explain
the apparent memories in contexts outside of their seeming memories.
Another explanation, which would leave less to chance in the production of
the child’s lesion, attributes it to a maternal impression on the part of the
child’s mother. According to this idea, a pregnant woman, having a knowledge
of the deceased person’s wounds, might influence a gestating embryo and fetus
so that its form corresponded to the wounds on the deceased person. The idea
of maternal impressions, popular in preceding centuries and up to the first
decades of this one, has fallen into disrepute. Until my own recent article
(Stevenson, 1992) there had been no review of series of cases since 1890
(Dabney, 1890); and cases are rarely published now (Williams and Pembroke,
1988). Nevertheless, some of the published cases, old and new, show a
remarkable correspondence between an unusual stimulus in the mind of a
pregnant woman and an unusual birthmark or birth defect in her later-born child.
Also, in an analysis of 113 published cases I found that the stimulus occurred
to the mother in the first trimester in 80 cases (Stevenson, 1992). The first
trimester is well known to be the one of greatest sensitivity of the embryo/fetus
to recognized teratogens, such as thalidomide (Nowack, 1965) and rubella
(Hill, Doll, Galloway, and Hughes, 1958). Applied to the present cases,
however, the theory of maternal impression has obstacles as great as the normal
explanation appears to have. First, in the 25 cases mentioned above, the subject’s
mother, although she may have heard of the death of the concerned deceased
person, had no knowledge of that person’s wounds. Second, this interpretation
supposes that the mother not only modified the body of her unborn child with
her thoughts, but after the child’s birth influenced it to make statements and
show behavior that it otherwise would not have done. No motive for such
conduct can be discerned in most of the mothers (or fathers) of these subjects.
It is not my purpose to impose any interpretation of these cases on the
readers of this article. Nor would I expect any reader to reach even a preliminary
conclusion from the short summaries of cases that the brevity of this report
entails. Instead, I hope that I have stimulated readers to examine the detailed
reports of many cases that I am now in the process of publishing (Stevenson,
forthcoming). “Originality and truth are found only in the details” (Stendhal,
1926).
Acknowledgements
I am grateful to Drs. Antonia Mills and Emily W. Cook for critical
comments on drafts of this paper. Thanks are also due to the Bernstein Brothers
Parapsychology and Health Foundation for the support of my research.
Correspondence and requests for reprints should be addressed to: Ian
Stevenson, M.D., Division of Personality Studies, Box 152, Health Sciences
Center, University of Virginia, Charlottesville, VA 22908
References
Cockayne, E. A. (1933). Inherited abnormalities of the skin. London: Oxford University Press.
Cook, E. W., Pasricha, S., Samararatne, G., Win Maung, & Stevenson, I. (1983). Review and
analysis of “unsolved” cases of the reincarnation type: II. Comparison of features of solved and
unsolved cases. Journal of the American Society for Psychical Research, 77, 115-135.
Dabney, W. C. (1890). Maternal impressions. In J. M. Keating (Ed.), Cyclopaedia of the diseases
of children, Vol. 1, (pp. 191-216). Philadelphia: J. B. Lippincott.
Denaro, S. J. (1944). The inheritance of nevi. Journal of Heredity, 35, 215-18.
Fatteh, A. (1976). Medicolegal investigation of gunshot wounds. Philadelphia: J. B. Lippincott.
Frantz, C. H., & O’Rahilly, R.(1961). Congenital skeletal limb deficiencies. Journal of Bone and
Joint Surgery, 43-A, 1202-24.
Gordon, I., & Shapiro, H. A. (1982). Forensic medicine: A guide to principles. (2nd ed.) London:
Churchill Livingstone.
Hill, A. B., Doll, R., Galloway, T. M., & Hughes, J.P.W. (1958). Virus diseases in pregnancy and
congenital defects. British Journal of Preventive and Social Medicine, 12, 1-7.
Maruri, C. A. (1961). La herencia en dermatologia. (2nd ed.) Santander: Aldus, S.A. Artes Grafi-
cas.
Mills, A. (1989). A replication study: Three cases of children in northern India who are said to re
member a previous life. Journal of Scientific Exploration, 3, 133-184.
Nelson, K., & Holmes, L. B. (1989). Malformations due to presumed spontaneous mutations in
newborn infants. New England Journal of Medicine, 320, 19-23.
Nowack, E. (1965). Die sensible Phase bei der Thalidomid- Embryopathie. Humangenetik, 1,
516-36.
Pack, G. T., & Davis, J. (1956). Moles. New York State Journal of Medicine, 56, 3498-3506.
Pack, G. T., Lenson, N. & Gerber, D. M. (1952). Regional distribution of moles and melanomas.
AMA Archives of Surgery. 65, 862-70.
Smith, D. W. (1982). Recognizable patterns of human malformation. (3rd ed.) Philadelphia: W. B.
Saunders.
Spalteholz, W. (1943). Hand atlas of human anatomy. Translated by L. F. Barker. 7th English ed.
Philadelphia: J.B. Lippincott.
Stendhal (1926). Lucien Leuwen. Paris: Librairie Ancienne Honore Champion, 4, 169.
Stevenson, I. (1975). Cases of the reincarnation type. I. Ten cases in India. Charlottesville: Uni
versity Press of Virginia.
Stevenson, I. (1983). American children who claim to remember previous lives. Journal of Ner
vous and Mental Disease, 171, 742-748.
Stevenson, I. (1987). Children who remember previous lives. Charlottesville: University Press of
Virginia.
Stevenson, 1.(1990). Phobias in children who claim to remember previous lives. Journal of Scien
tific Exploration, 4, 243-254.
Stevenson, I. (1992). A new look at maternal impressions: An analysis of 50 published cases and
reports of two recent examples. Journal of Scientific Exploration, 6, 353-373.
Stevenson, I. (Forthcoming). Birthmarks and birth defects: A contribution to their etiology.
Williams, H. C., & Pembroke, A. C. (1988). Naevus of Jamaica. Lancet, ii, 915.
Wilson, J. G. (1973). Environment and birth defects. New York: Academic Press.
[5]
Near-death experience in survivors of cardiac arrest: a
prospective study in the Netherlands
Pim van Lommel, Ruud van Wees, Vincent Meyers, Ingrid Elfferich
Summary

Background Some people report a near-death experience (NDE) after a life-threatening crisis.
We aimed to establish the cause of this experience and assess factors that affected its
frequency, depth, and content.

Methods In a prospective study, we included 344 consecutive cardiac patients who were
successfully resuscitated after cardiac arrest in ten Dutch hospitals. We compared demographic,
medical, pharmacological, and psychological data between patients who reported NDE and
patients who did not (controls) after resuscitation. In a longitudinal study of life changes
after NDE, we compared the groups 2 and 8 years later.

Findings 62 patients (18%) reported NDE, of whom 41 (12%) described a core experience.
Occurrence of the experience was not associated with duration of cardiac arrest or
unconsciousness, medication, or fear of death before cardiac arrest. Frequency of NDE was
affected by how we defined NDE, the prospective nature of the research in older cardiac
patients, age, surviving cardiac arrest in first myocardial infarction, more than one
cardiopulmonary resuscitation (CPR) during stay in hospital, previous NDE, and memory problems
after prolonged CPR. Depth of the experience was affected by sex, surviving CPR outside
hospital, and fear before cardiac arrest. Significantly more patients who had an NDE,
especially a deep experience, died within 30 days of CPR (p<0·0001). The process of
transformation after NDE took several years, and differed from those of patients who survived
cardiac arrest without NDE.

Interpretation We do not know why so few cardiac patients report NDE after CPR, although age
plays a part. With a purely physiological explanation such as cerebral anoxia for the
experience, most patients who have been clinically dead should report one.

Lancet 2001; 358: 2039-45
See Commentary page 2010

Division of Cardiology, Hospital Rijnstate, Arnhem, Netherlands (P van Lommel MD); Tilburg,
Netherlands (R van Wees PhD); Nijmegen, Netherlands (V Meyers PhD); and Capelle a/d Ijssel,
Netherlands (I Elfferich PhD)
Correspondence to: Dr Pim van Lommel, Division of Cardiology, Hospital Rijnstate, PO Box 9555,
6800 TA Arnhem, Netherlands (e-mail: pimvanlommel@wanadoo.nl)

Introduction
Some people who have survived a life-threatening crisis report an extraordinary experience.
Near-death experience (NDE) occurs with increasing frequency because of improved survival
rates resulting from modern techniques of resuscitation. The content of NDE and the effects on
patients seem similar worldwide, across all cultures and times. The subjective nature and
absence of a frame of reference for this experience lead to individual, cultural, and religious
factors determining the vocabulary used to describe and interpret the experience.1

NDE are reported in many circumstances: cardiac arrest in myocardial infarction (clinical
death), shock in postpartum loss of blood or in perioperative complications, septic or
anaphylactic shock, electrocution, coma resulting from traumatic brain damage, intracerebral
haemorrhage or cerebral infarction, attempted suicide, near-drowning or asphyxia, and apnoea.
Such experiences are also reported by patients with serious but not immediately
life-threatening diseases, in those with serious depression, or without clear cause in fully
conscious people. Similar experiences to near-death ones can occur during the terminal phase of
illness, and are called deathbed visions. Identical experiences to NDE, so-called fear-death
experiences, are mainly reported after situations in which death seemed unavoidable: serious
traffic accidents, mountaineering accidents, or isolation such as with shipwreck.

Several theories on the origin of NDE have been proposed. Some think the experience is caused
by physiological changes in the brain, such as brain cells dying as a result of cerebral
anoxia.2-4 Other theories encompass a psychological reaction to approaching death,5 or a
combination of such reaction and anoxia.6 Such experiences could also be linked to a changing
state of consciousness (transcendence), in which perception, cognitive functioning, emotion,
and sense of identity function independently from normal body-linked waking consciousness.7
People who have had an NDE are psychologically healthy, although some show non-pathological
signs of dissociation.7 Such people do not differ from controls with respect to age, sex,
ethnic origin, religion, or degree of religious belief.1

Studies on NDE1,3,8,9 have been retrospective and very selective with respect to patients. In
retrospective studies, 5-10 years can elapse between occurrence of the experience and its
investigation, which often prevents accurate assessment of physiological and pharmacological
factors. In retrospective studies, between 43%8 and 48%1 of adults and up to 85% of children10
who had a life-threatening illness were estimated to have had an NDE. A random investigation
of more than 2000 Germans showed 4·3% to have had an NDE at a mean age of 22 years.11
Differences in estimates of frequency and uncertainty as to causes of this experience result
from varying definitions of the phenomenon, and from inadequate methods of
research.12 Patients’ transformational processes after an NDE are very similar1,3,13-16 and
encompass life-changing insight, heightened intuition, and disappearance of fear of death.
Assimilation and acceptance of these changes is thought to take at least several years.15

We did a prospective study to calculate the frequency of NDE in patients after cardiac arrest
(an objective critical medical situation), and establish factors that affected the frequency,
content, and depth of the experience. We also did a longitudinal study to assess the effect of
time, memory, and suppression mechanisms on the process of transformation after NDE, and to
reaffirm the content and allow further study of the experience. We also proposed to reassess
theories on the cause and content of NDE.

Methods
Patients
We included consecutive patients who were successfully resuscitated in coronary care units in
ten Dutch hospitals during a research period varying between hospitals from 4 months to nearly
4 years (1988-92). The research period varied because of the requirement that all consecutive
patients who had undergone successful cardiopulmonary resuscitation (CPR) were included. If
this standard was not met we ended research in that hospital. All patients had been clinically
dead, which we established mainly by electrocardiogram records. All patients gave written
informed consent. We obtained ethics committee approval.

Procedures
We defined NDE as the reported memory of all impressions during a special state of
consciousness, including specific elements such as out-of-body experience, pleasant feelings,
and seeing a tunnel, a light, deceased relatives, or a life review. We defined clinical death
as a period of unconsciousness caused by insufficient blood supply to the brain because of
inadequate blood circulation, breathing, or both. If, in this situation, CPR is not started
within 5-10 min, irreparable damage is done to the brain and the patient will die.

We did a short standardised interview with sufficiently well patients within a few days of
resuscitation. We asked whether patients recollected the period of unconsciousness, and what
they recalled. Three researchers coded the experiences according to the weighted core
experience index.1 In this scoring system, depth of NDE is measured with weighted scores
assigned to elements of the content of the experience. Scores between 1 and 5 denote
superficial NDE, but we included these events because all patients underwent transformational
changes as well. Scores of 6 or more denote core experiences, and scores of 10 or greater are
deep experiences. We also recorded date of cardiac arrest, date of interview, sex, age,
religion, standard of education reached, whether the patient had previously experienced NDE,
previously heard of NDE, whether CPR took place inside or outside hospital, previous
myocardial infarction, and how many times the patient had been resuscitated during their stay
in hospital. We estimated duration of circulatory arrest and unconsciousness, and noted
whether artificial respiration by intubation took place. We also recorded type and dose of
drugs before, during, and after the crisis, and assessed possible memory problems at interview
after lengthy or difficult resuscitation. We classed patients resuscitated during
electrophysiological stimulation separately.

We did standardised and taped interviews with participants a mean of 2 years after CPR.
Patients also completed a life-change inventory.16 The questionnaire addressed self-image,
concern with others, materialism and social issues, religious beliefs and spirituality, and
attitude towards death. Participants answered 34 questions with a five-point scale indicating
whether and to what degree they had changed. After 8 years, surviving patients and their
partners were interviewed again with the life-change inventory, and also completed a medical
and psychological questionnaire for cardiac patients (from the Dutch Heart Foundation), the
Utrecht coping list, the sense of coherence inquiry, and a scale for depression. These extra
questionnaires were deemed necessary for qualitative analysis because of the reduced number of
respondents who survived to 8 years follow-up. Our control group consisted of resuscitated
patients who had not reported an NDE. We matched controls with patients who had had an NDE by
age, sex, and time interval between CPR and the second and third interviews.

Statistical analysis
We assessed causal factors for NDE with the Pearson χ2 test for categorical and t test for
ratio-scaled factors. Factors affecting depth of NDE were analysed with the Mann-Whitney test
for categorical factors, and with Spearman’s coefficient of rank correlation for ratio-scaled
factors. Links between NDE and altered scores for questions from the life-change inventory
were assessed with the Mann-Whitney test. The sums of the individual scores were used to
compare the responses to the life-change inventory in the second and third interview. Because
few causes or relations exist for NDE, the null hypotheses are the absence of factors. Hence,
all tests were two-tailed with significance shown by p values less than 0·05.

Results
Patients
We included 344 patients who had undergone 509 successful resuscitations. Mean age at
resuscitation was 62·2 years (SD 12·2), and ranged from 26 to 92 years. 251 patients were men
(73%) and 93 were women (27%). Women were significantly older than men (66 vs 61 years,
p=0·005). The ratio of men to women was 57/43 for those older than 70 years, whereas at
younger ages it was 80/20. 14 (4%) patients had had a previous NDE. We interviewed 248 (74%)
patients within 5 days after CPR. Some demographic questions from the first interview had too
many values missing for reliable statistical analysis, so data from the second interview were
used. Of the 74 patients whom we interviewed at 2-year follow-up, 42 (57%) had previously
heard of NDE, 53 (72%) were religious, 25 (34%) had left education aged 12 years, and 49 (66%)
had been educated until aged at least 16 years.

296 (86%) of all 344 patients had had a first myocardial infarction and 48 (14%) had undergone
more than one infarction. Nearly all patients with acute myocardial infarction were treated
with fentanyl, a synthetic opioid agonist; thalamonal, a combined preparation of fentanyl with
dehydrobenzperidol that has an antipsychotic and sedative effect; or both. 45 (13%) patients
also received sedative drugs such as diazepam or oxazepam, and 38 (11%) were given strong
sedatives such as midazolam (for intubation), or haloperidol for cerebral unrest during or
after long-lasting unconsciousness.
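The Pearson χ2 test named under Statistical analysis can be illustrated with the mortality comparison reported in the Results (13 of 62 patients with NDE vs 24 of 282 without). The sketch below is an illustration under stated assumptions, not the authors' analysis: the function name `chi2_2x2` is hypothetical, and the paper does not say whether a continuity correction was applied, so both variants are computed. For one degree of freedom the upper-tail probability of the χ2 distribution equals erfc(sqrt(χ2/2)), which keeps the example within the Python standard library.

```python
import math


def chi2_2x2(a: int, b: int, c: int, d: int, yates: bool = False):
    """Pearson chi-square test for the 2x2 table [[a, b], [c, d]].

    Returns (statistic, p value) for 1 degree of freedom.
    yates=True applies the continuity correction (an assumption here;
    the paper does not state whether it was used).
    """
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    observed = ((a, b), (c, d))
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / n
            diff = abs(observed[i][j] - expected)
            if yates:
                diff = max(diff - 0.5, 0.0)
            stat += diff * diff / expected
    # For 1 df, P(X > stat) = erfc(sqrt(stat / 2)).
    p = math.erfc(math.sqrt(stat / 2.0))
    return stat, p


# Died / survived in each group (counts taken from the Results text).
nde_died, nde_total = 13, 62
ctl_died, ctl_total = 24, 282

for use_yates in (False, True):
    stat, p = chi2_2x2(nde_died, nde_total - nde_died,
                       ctl_died, ctl_total - ctl_died, yates=use_yates)
    print(f"yates={use_yates}: chi2={stat:.2f}, p={p:.4f}")
```

Under these assumptions the corrected variant gives a p value close to the reported 0·008, and the uncorrected one is somewhat smaller. This is consistent with, but does not establish, the use of a continuity correction in the original analysis.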
                            WCEI score*   n
A  No memory                0             282 (82%)
B  Some recollection        1-5           21 (6%)
C  Moderately deep NDE      6-9           18 (5%)
D  Deep NDE                 10-14         17 (5%)
E  Very deep NDE            15-19         6 (2%)
WCEI=weighted core experience index. NDE=near-death experience. *A=no NDE, B=superficial NDE,
C/D/E=core NDE.
Table 1: Distribution of the 344 patients in five WCEI classes*

234 (68%) patients were successfully resuscitated within hospital. 190 (81%) of these patients
were resuscitated within 2 min of circulatory arrest, and unconsciousness lasted less than
5 min in 187 (80%). 30 patients were resuscitated during electrophysiological stimulation;
these patients all underwent less than 1 min of circulatory arrest and less than 2 min of
unconsciousness. This group were only given 5 mg of diazepam about 1 h before
electrophysiological stimulation.

101 (29%) patients survived CPR outside hospital, and nine (3%) were resuscitated both within
and outside hospital. Of these 110 patients, 88 (80%) had more than 2 min of circulatory
arrest, and 62 (56%) were unconscious for more than 10 min. All people with brief cardiac
arrest and who were resuscitated outside hospital were resuscitated in an ambulance. Only 12
(9%) patients survived a circulatory arrest that lasted longer than 10 min. 36% (123) of all
patients were unconscious for longer than 60 min; 37 of these patients needed artificial
respiration through intubation. Intubated patients received high doses of strong sedatives and
were interviewed later than other patients; most were still in a weakened physical condition
at the time of first interview and 24 showed memory defects. Significantly more younger than
older patients survived long-lasting unconsciousness following difficult CPR (p=0·005).

Prospective findings
62 (18%) patients reported some recollection of the time of clinical death (table 1). Of these
patients, 21 (6% of total) had a superficial NDE and 41 (12%) had a core experience. 23 of the
core group (7% of total) reported a deep or very deep NDE. Therefore, of 509 resuscitations,
12% resulted in NDE and 8% in core experiences. Table 2 shows the frequencies of ten elements
of NDE.1 No patients reported distressing or frightening NDE.

Elements of NDE1                             Frequency (n=62)
1   Awareness of being dead                  31 (50%)
2   Positive emotions                        35 (56%)
3   Out of body experience                   15 (24%)
4   Moving through a tunnel                  19 (31%)
5   Communication with light                 14 (23%)
6   Observation of colours                   14 (23%)
7   Observation of a celestial landscape     18 (29%)
8   Meeting with deceased persons            20 (32%)
9   Life review                              8 (13%)
10  Presence of border                       5 (8%)
NDE=near-death experience
Table 2: Frequency of ten elements of NDE

During the pilot phase in one of the hospitals, a coronary-care-unit nurse reported a
veridical out-of-body experience of a resuscitated patient:

“During a night shift an ambulance brings in a 44-year-old cyanotic, comatose man into the
coronary care unit. He had been found about an hour before in a meadow by passers-by. After
admission, he receives artificial respiration without intubation, while heart massage and
defibrillation are also applied. When we want to intubate the patient, he turns out to have
dentures in his mouth. I remove these upper dentures and put them onto the ‘crash car’.
Meanwhile, we continue extensive CPR. After about an hour and a half the patient has
sufficient heart rhythm and blood pressure, but he is still ventilated and intubated, and he
is still comatose. He is transferred to the intensive care unit to continue the necessary
artificial respiration. Only after more than a week do I meet again with the patient, who is
by now back on the cardiac ward. I distribute his medication. The moment he sees me he says:
‘Oh, that nurse knows where my dentures are’. I am very surprised. Then he elucidates: ‘Yes,
you were there when I was brought into hospital and you took my dentures out of my mouth and
put them onto that car, it had all these bottles on it and there was this sliding drawer
underneath and there you put my teeth.’ I was especially amazed because I remembered this
happening while the man was in deep coma and in the process of CPR. When I asked further, it
appeared the man had seen himself lying in bed, that he had perceived from above how nurses
and doctors had been busy with CPR. He was also able to describe correctly and in detail the
small room in which he had been resuscitated as well as the appearance of those present like
myself. At the time that he observed the situation he had been very much afraid that we would
stop CPR and that he would die. And it is true that we had been very negative about the
patient’s prognosis due to his very poor medical condition when admitted. The patient tells me
that he desperately and unsuccessfully tried to make it clear to us that he was still alive
and that we should continue CPR. He is deeply impressed by his experience and says he is no
longer afraid of death. 4 weeks later he left hospital as a healthy man.”

Table 3 shows relations between demographic, medical, pharmacological, and psychological
factors and the frequency and depth of NDE. No medical, pharmacological, or psychological
factor affected the frequency of the experience. People younger than 60 years had NDE more
often than older people (p=0·012), and women, who were significantly older than men, had more
frequent deep experiences than men (p=0·011) (table 3). Increased frequency of experiences in
patients who survived cardiac arrest in first myocardial infarction, and deeper experiences in
patients who survived CPR outside hospital could have resulted from differences in age. Both
these groups of patients were younger than other patients, though the age differences were not
significant (p=0·05 and 0·07, respectively).

Lengthy CPR can sometimes induce loss of memory and patients thus affected reported
significantly fewer NDEs than others (table 3). No relation was found between frequency of NDE
and the time between CPR and the first interview (range 1-70 days). Mortality during or
shortly after stay in hospital in patients who had an NDE was significantly higher than in
patients who did not report an NDE (13/62 patients [21%] vs 24/282 [9%], p=0·008), and this
difference was even more marked in patients who reported a deep experience (10/23 [43%] vs
24/282 [9%], p<0·0001).

Longitudinal findings
At 2-year follow-up, 19 of the 62 patients with NDE had died and six refused to be
interviewed. Thus, we were able to interview 37 patients for the second time. All
patients were able to retell their experience almost exactly. Of the 17 patients who had low
scores in the first interview (superficial NDE), seven had unchanged low scores, and four
probably had, in retrospect, an NDE that consisted only of positive emotions (score 1). Six
patients had not in fact had an NDE after all, which was probably because of our wide
definition of NDE at the first interview.

                                        Frequency of NDE                         Depth of NDE
                                        NDE (n=62)     No NDE (n=282)   p        p (n=62)
Categorical factors
Demographic
  Women                                 13 (21%)       80 (28%)         NS       0·011
  Age* <60 years                        32 (52%)       96 (34%)         0·012    NS
  Religion† (yes)                       26 (70%)       27 (73%)         NS       NS
  Education†‡ Elementary                10 (27%)       15 (43%)         NS       NS
Medical
  Intubation                            6 (10%)        31 (11%)         NS       NS
  Electrophysiological stimulation      8 (13%)        22 (8%)          NS       NS
  First myocardial infarction           60 (97%)       236 (84%)        0·013    NS
  CPR outside hospital§                 13 (21%)       88 (32%)         NS       0·027
  Memory defect after lengthy CPR       1 (2%)         40 (14%)         0·011    NS
  Death within 30 days                  13 (21%)       24 (9%)          0·008    0·017
Pharmacological
  Extra medication                      17 (27%)       70 (25%)         NS       NS
Psychological
  Fear before CPR†§                     4 (13%)        2 (6%)           NS       0·045
  Previous NDE                          6 (10%)        8 (3%)           0·035    NS
  Foreknowledge of NDE†                 22 (60%)       20 (54%)         NS       NS
Ratio-scaled factors
Demographic
  Age (mean [SD], years)*               58·8 (13·4)    63·5 (11·8)      0·006    NS
Medical
  Duration of cardiac arrest
    (mean [SD], min)                    4·0 (5·2)      3·7 (3·9)        NS       NS
  Duration of unconsciousness
    (mean [SD], min)                    66·1 (269·5)   118·3 (355·5)    NS       NS
  Number of CPRs (SD)                   2·1 (2·5)      1·4 (1·2)        0·029    NS
Data are number (%) unless otherwise indicated. CPR=cardiopulmonary resuscitation. NS=not
significant (p>0·05). *3 missing values. †n=74 (data from 2nd interview, 35 NDE, 39 no NDE).
‡2 missing values. §10 missing values.
Table 3: Factors affecting frequency and depth of near-death experience (NDE)

We selected a control group, matched for age, sex, and time since cardiac arrest, from the 282
patients who had not had NDE. We contacted 75 of these patients to obtain 37 survivors who
agreed to be interviewed. Two controls reported an NDE consisting only of positive [...]

Only six of the 74 patients that we interviewed at 2 years said they were afraid before CPR
(table 3). Four [...] and without an NDE are shown in table 4. For instance, people who had
NDE had a significant increase in belief in an afterlife and decrease in fear of death
compared with people who had not had this experience. Depth of NDE was linked to high scores
in spiritual items such as interest in the meaning of one’s own life, and social items such as
showing love and accepting others. The 13 patients who had superficial NDE underwent the same
specific transformational changes as those who had a core experience.

Life-change inventory questionnaire     p
Social attitude
  Showing own feelings                  0·034
  Acceptance of others*                 0·012
  More loving, empathic*                0·002
  Understanding others*                 0·003
  Involvement in family*                0·008
Religious attitude
  Understand purpose of life*           0·020
  Sense inner meaning of life*          0·028
  Interest in spirituality*             0·035
Attitude to death
  Fear of death*                        0·009
  Belief in life after death*           0·007
Others
  Interest in meaning of life           0·020
  Understanding oneself                 0·019
  Appreciation of ordinary things       0·0001
NDE=near-death experience. 35 patients had NDE, 39 had not had NDE. 1 value missing for
patients with NDE in all categories; *2 values missing for patients with NDE (ie, n=33).
Table 4: Significant differences in life-change inventory scores16 of patients with and
without NDE at 2-year follow-up

8-year follow-up included 23 patients with an NDE that had been affirmed at 2-year follow-up.
11 patients had died and one could not be interviewed. Patients could still recall their NDE
almost exactly. Of the patients without an NDE at 2-year follow-up, 20 had died and four
patients could not be interviewed (for reasons such as dementia and long stay in hospital),
which left 15 patients without an NDE to take part in the third interview.

Life-change inventory          2-year follow-up            8-year follow-up
questionnaire                  NDE (n=23)  no NDE (n=15)   NDE (n=23)  no NDE (n=15)
Social attitude
  Showing own feelings         42          16              78          58
  Acceptance of others         42          16              78          41
  More loving, empathic        52          25              68          50
  Understanding others         36          8               73          75
  Involvement in family        47          33              78          58
[...]
  Belief in life after death   36          16              42          16
Others
[...]
[...] no change (0), somewhat decreased (-1), and strongly decreased (-2). Only in the
reported 13 (of 34) items in this table were significant differences found in life-change
scores in the interview after 2 years (table 4).
Table 5: Total sum of individual life-change inventory scores16 of patients at 2-year and
8-year follow-up

All patients, including those who did not have NDE, had gone through a positive change and
were more self-assured, socially aware, and religious than before. Also,
people who did not have NDE had become more emotionally affected, and in some, fear of death
had decreased more than at 2-year follow-up. Their interest in spirituality had strongly
decreased. Most patients who did not have NDE did not believe in a life after death at 2-year
or 8-year follow-up (table 5). People with NDE had a much more complex coping process: they
had become more emotionally vulnerable and empathic, and often there was evidence of increased
intuitive feelings. Most of this group did not show any fear of death and strongly believed in
an afterlife. Positive changes were more apparent at 8 years than at 2 years of follow-up.

Discussion
Our results show that medical factors cannot account for occurrence of NDE; although all
patients had been clinically dead, most did not have NDE. Furthermore, seriousness of the
crisis was not related to occurrence or depth of the experience. If purely physiological
factors resulting from cerebral anoxia caused NDE, most of our patients should have had this
experience. Patients’ medication was also unrelated to frequency of NDE. Psychological factors
are unlikely to be important as fear was not associated with NDE.

The 18% frequency of NDE that we noted is lower than reported in retrospective studies,1,8
which could be because our prospective study design prevented self-selection of patients. Our
frequency of NDE is low despite our wide definition of the experience. Only 12% of patients
had a core NDE, and this figure might be an overestimate. When we analysed our results, we
noted that one hospital that participated in the study for nearly 4 years, and from which 137
patients were included, reported a significantly (p=0·01) lower percentage of NDE (8%), and
significantly (p=0·05) fewer deep experiences. Therefore, possibly some selection of patients
occurred in the other hospitals, which sometimes only took part for a few months. In a
prospective study17 with the same design as ours, 6% of 63 survivors of cardiac arrest
reported a core experience, and another 5% had memories with features of an NDE (low score in
our study); thus, with our wide definition of the experience, 11% of these patients reported
an NDE. Therefore, true frequency of the experience is likely to be about 10%, or 5% if based
on number of resuscitations rather than number of resuscitated patients. Patients who survive
several CPRs in hospital have a significantly higher chance of NDE (table 3).

We noted that the frequency of NDE was higher in people younger than 60 years than in older
people. In other studies, mean age at NDE is lower than our estimate (62·2 years) and the
frequency of the experience is higher. Morse10 saw 85% NDE in children, Ring1 noted 48% NDE in
people with a mean age of 37 years, and Sabom8 saw 43% NDE in people with a mean age of
49 years; thus, age and the frequency of the experience seem to be associated. Other
retrospective studies have noted a younger mean age for NDE: 32 years,9 29 years,6 and
22 years.11 Cardiac arrest was the cause of the experience in most patients in Sabom’s8 study,
whereas this was the case in only a low percentage of patients in other work. We saw that
people surviving CPR outside hospital (who underwent deeper NDE than other patients) tended to
be younger, as were those who survived cardiac arrest in a first myocardial infarction (more
frequent NDE), which indicates that age was probably decisive in the significant relation
noted with those factors.

In a study of mortality in patients after resuscitation outside hospital,18 chances of
survival increased in people younger than 60 years and in those undergoing first myocardial
infarction, which corresponds with our findings. Older people have a smaller chance of
cerebral recovery after difficult and complicated resuscitation after cardiac arrest. Younger
patients have a better chance of surviving a cardiac arrest, and thus, to describe their
experience. In a study of 11 patients after CPR, the person that had an NDE was significantly
younger than other patients who did not have such an experience.19 Greyson7 also noted a
higher frequency of NDE and significantly deeper experiences at younger ages, as did Ring.1

Good short-term memory seems to be essential for remembering NDE. Patients with memory defects
after prolonged resuscitation reported fewer experiences than other patients in our study.
Forgetting or repressing such experiences in the first days after CPR was unlikely to have
occurred in the remaining patients, because no relation was found between frequency of NDE and
date of first interview. However, at 2-year follow-up, two patients remembered a core NDE and
two an NDE that consisted of only positive emotions that they had not reported shortly after
CPR, presumably because of memory defects at that time. It is remarkable that people could
recall their NDE almost exactly after 2 and 8 years.

Unlike our results, an inverse correlation between foreknowledge and frequency of NDE has been
shown.18 Our finding that women have deeper experiences than men has been confirmed in two
other studies,1,7 although in one,7 only in those cases in which women had an NDE resulting
from disease.

The elements of NDE that we noted (table 2) correspond with those in other studies based on
Ring’s1 classification. Greyson20 constructed the NDE scale differently to Ring,1 but both
scoring systems are strongly correlated (r=0·90). Yet, reliable comparisons are nearly
impossible between retrospective studies that included selection of patients, unreliable
medical records, and used different criteria for NDE,12 and our prospective study.

Our longitudinal follow-up research into transformational processes after NDE confirms the
transformation described by many others.1-3,8,10,13-16,21 Several of these investigations
included a control group to enable study of differences in transformation,14 but in our
research, patients were interviewed three times during 8 years, with a matched control group.
Our findings show that this process of change after NDE tends to take several years to
consolidate. Presumably, besides possible internal psychological processes, one reason for
this has to do with society’s negative response to NDE, which leads individuals to deny or
suppress their experience for fear of rejection or ridicule. Thus, social conditioning causes
NDE to be traumatic, although in itself it is not a psychotraumatic experience. As a result,
the effects of the experience can be delayed for years, and only gradually and with difficulty
is an NDE accepted and integrated. Furthermore, the longlasting transformational effects of an
experience that lasts for only a few minutes of cardiac arrest is a surprising and unexpected
finding.

One limitation of our study is that our study group were all Dutch cardiac patients, who were
generally older than groups in other studies. Therefore, our frequency of NDE might not be
representative of all cases; eg, a higher frequency could be expected with
younger samples, or rates might vary in other populations. Also, the rates for NDE could differ
in people who survive near-death episodes that come about by different causes, such as near
drowning, near fatal car crashes with cerebral trauma, and electrocution. However, rigorous
prospective studies would be almost impossible in many such cases.
Several theories have been proposed to explain NDE. We did not show that psychological,
neurophysiological, or physiological factors caused these experiences after cardiac arrest.
Sabom22 mentions a young American woman who had complications during brain surgery for a
cerebral aneurysm. The EEG of her cortex and brainstem had become totally flat. After the
operation, which was eventually successful, this patient proved to have had a very deep NDE,
including an out-of-body experience, with subsequently verified observations during the period
of the flat EEG.
And yet, neurophysiological processes must play some part in NDE. Similar experiences can be
induced through electrical stimulation of the temporal lobe (and hence of the hippocampus)
during neurosurgery for epilepsy,23 with high carbon dioxide levels (hypercarbia),24 and in
decreased cerebral perfusion resulting in local cerebral hypoxia as in rapid acceleration
during training of fighter pilots,25 or as in hyperventilation followed by valsalva
manoeuvre.4 Ketamine-induced experiences resulting from blockage of the NMDA receptor,26 and
the role of endorphin, serotonin, and enkephalin have also been mentioned,27 as have
near-death-like experiences after the use of LSD,28 psilocarpine, and mescaline.21 These
induced experiences can consist of unconsciousness, out-of-body experiences, and perception of
light or flashes of recollection from the past. These recollections, however, consist of
fragmented and random memories unlike the panoramic life-review that can occur in NDE. Further,
transformational processes with changing life-insight and disappearance of fear of death are
rarely reported after induced experiences.
Thus, induced experiences are not identical to NDE, and so, besides age, an unknown mechanism
causes NDE by stimulation of neurophysiological and neurohumoral processes at a subcellular
level in the brain in only a few cases during a critical situation such as clinical death.
These processes might also determine whether the experience reaches consciousness and can be
recollected.
With lack of evidence for any other theories for NDE, the thus far assumed, but never proven,
concept that consciousness and memories are localised in the brain should be discussed. How
could a clear consciousness outside one’s body be experienced at the moment that the brain no
longer functions during a period of clinical death with flat EEG?22 Also, in cardiac arrest the
EEG usually becomes flat in most cases within about 10 s from onset of syncope.29,30
Furthermore, blind people have described veridical perception during out-of-body experiences at
the time of this experience.31 NDE pushes at the limits of medical ideas about the range of
human consciousness and the mind-brain relation.
Another theory holds that NDE might be a changing state of consciousness (transcendence), in
which identity, cognition, and emotion function independently from the unconscious body, but
retain the possibility of non-sensory perception.7,8,22,28,31
Research should be concentrated on the effort to explain scientifically the occurrence and
content of NDE. Research should be focused on certain specific elements of NDE, such as
out-of-body experiences and other verifiable aspects. Finally, the theory and background of
transcendence should be included as a part of an explanatory framework for these experiences.
Contributors
Pim van Lommel coordinated the first interviews and was responsible for collecting all
demographic, medical, and pharmacological data. Pim van Lommel, Ruud van Wees, and Vincent
Meyers rated the first interview. Ruud van Wees and Vincent Meyers coordinated the second
interviews. Ruud van Wees did statistical analysis of the first and second interviews. Ingrid
Elfferich did the third interviews and analysed these results.
Acknowledgments
We thank nursing and medical staff of the hospitals involved in the research; volunteers of the
International Association of Near Death Studies; IANDS-Netherlands; Merkawah Foundation for
arranging interviews, and typing the second and third interviews; Martin Meyers for help with
translation; and Kenneth Ring and Bruce Greyson for review of the article.
References
1 Ring K. Life at death. A scientific investigation of the near-death experience. New York:
Coward McCann and Geoghegan, 1980.
2 Blackmore S. Dying to live: science and the near-death experience. London: Grafton, an
imprint of Harper Collins Publishers, 1993.
3 Morse M. Transformed by the light. New York: Villard Books, 1990.
4 Lempert T, Bauer M, Schmidt D. Syncope and near-death experience. Lancet 1994; 344: 829-30.
5 Appelby L. Near-death experience: analogous to other stress induced physiological phenomena.
BMJ 1989; 298: 976-77.
6 Owens JE, Cook EW, Stevenson I. Features of “near-death experience” in relation to whether
or not patients were near death. Lancet 1990; 336: 1175-77.
7 Greyson B. Dissociation in people who have near-death experiences, out of their bodies or
out of their minds? Lancet 2000; 355: 460-63.
8 Sabom MB. Recollections of death: a medical investigation. New York: Harper and Row, 1982.
9 Greyson B. Varieties of near-death experience. Psychiatry 1993; 56: 390-99.
10 Morse M. Parting visions: a new scientific paradigm. In: Bailey LW, Yates J, eds. The
near-death experience: a reader. New York and London: Routledge, 1996: 299-318.
11 Schmied I, Knoblauch H, Schnettler B. Todesnäheerfahrungen in Ost- und Westdeutschland:
eine empirische Untersuchung. In: Knoblauch H, Soeffner HG, eds. Todesnähe: interdisziplinäre
Zugänge zu einem außergewöhnlichen Phänomen. Konstanz: Universitätsverlag Konstanz, 1999:
217-50.
12 Greyson B. The incidence of near-death experiences. Med Psychiatry 1998; 1: 92-99.
13 Roberts G, Owen J. The near-death experience. Br J Psychiatry 1988; 153: 607-17.
14 Groth-Marnat G, Summers R. Altered beliefs, attitudes and behaviors following near-death
experiences. J Hum Psychol 1998; 38: 110-25.
15 Atwater PMH. Coming back to life: the after-effects of the near-death experience. New York:
Dodd, Mead and Company, 1988.
16 Ring K. Heading towards omega: in search of the meaning of the near-death experience. New
York: Quill William Morrow, 1984.
17 Parnia S, Waller DG, Yeates R, Fenwick P. A qualitative and quantitative study of the
incidence, features and aetiology of near death experiences in cardiac arrest survivors.
Resuscitation 2001; 48: 149-56.
18 Dickey W, Adgey AAJ. Mortality within hospital after resuscitation from ventricular
fibrillation outside hospital. Br Heart J 1992; 67: 334-38.
19 Schoenbeck SB, Hocutt GD. Near-death experiences in patients undergoing cardio-pulmonary
resuscitation. J Near-Death Studies 1991; 9: 211-18.
20 Greyson B. The near-death experience scale: construction, reliability and validity.
J Nervous Mental Dis 1982; 171: 369-75.
21 Schröter-Kunhardt M. Nah-Todeserfahrungen aus psychiatrisch-neurologischer Sicht. In:
Knoblauch H, Soeffner HG, eds. Todesnähe: interdisziplinäre Zugänge zu einem
außergewöhnlichen Phänomen. Konstanz: Universitätsverlag Konstanz, 1999: 65-99.
22 Sabom MB. Light and death: one doctor’s fascinating account of near-death experiences.
Michigan: Zondervan Publishing House, 1998: 37-52.
23 Penfield W. The excitable cortex in conscious man. Liverpool: Liverpool University Press,
1958.
24 Meduna LT. Carbon dioxide therapy: a neuropsychological treatment of nervous disorders.
Springfield: Charles C Thomas, 1950.
25 Whinnery JE, Whinnery AM. Acceleration-induced loss of consciousness. Arch Neurol 1990; 47:
764-76.
26 Jansen K. Neuroscience, ketamine and the near-death experience: the role of glutamate and
the NMDA-receptor. In: Bailey LW, Yates J, eds. The near-death experience: a reader. New York
and London: Routledge, 1996: 265-82.
27 Greyson B. Biological aspects of near-death experiences. Perspect Biol Med 1998; 42: 14-32.
28 Grof S, Halifax J. The human encounter with death. New York: Dutton, 1977.
29 Clute HL, Levy WJ. Electroencephalographic changes during brief cardiac arrest in humans.
Anesthesiology 1990; 73: 821-25.
30 Aminoff MJ, Scheinman MM, Griffing JC, Herre JM. Electrocerebral accompaniments of syncope
associated with malignant ventricular arrhythmias. Ann Intern Med 1988; 108: 791-96.
31 Ring K, Cooper S. Mindsight: near-death and out-of-body experiences in the blind. Palo
Alto: William James Center for Consciousness Studies, 1999.
[6]
An investigation into alleged ‘hauntings’
Richard Wiseman1*, Caroline Watt2, Paul Stevens2,
Emma Greening1 and Ciaran O’Keeffe1
1University of Hertfordshire, UK
2University of Edinburgh, UK
Recent polls reveal that approximately 38% of Americans believe that ghosts exist
(Gallup, 2001), and 13% report having experienced one (MORI, 1998). Such experiences
involve a diverse range of phenomena, including apparitions, unusual odours,
sudden changes in temperature and a strong sense of presence (Lange, Houran, Harte, &
Havens, 1996). In a relatively small number of cases, witnesses consistently report these
experiences in certain locations, often giving rise to the belief that these places are
‘haunted’. The best of these cases appear evidentially impressive, sometimes lasting
several years and involving a large number of seemingly trustworthy witnesses reporting
unusual phenomena in the same ‘haunted’ areas (for further information see Gauld &
Cornell, 1979; Houran & Lange, 2001; Irwin, 1999; McCue, 2002). Many of these alleged
*Requests for reprints should be addressed to Dr Richard Wiseman, University of Hertfordshire, Hatfield Campus,
College Lane, Hatfield, Herts AL10 9AB, UK (e-mail: r.wiseman@herts.ac.uk).
hauntings have been described in several best-selling books on the paranormal, and
reported on both television and radio (see e.g. Auerbach, 1986).
These high-profile claims have been the subject of very little well-controlled,
systematic, research. This is unfortunate, in part, because media reportage of many of
these cases exerts a major influence over the public’s belief in the paranormal (National
Science Board, 2000). In addition, such work clearly has the potential to contribute to
our theoretical understanding of how certain psychological and psychophysiological
phenomena (including e.g. hallucination, suggestion and response to subtle environmental
stimuli) operate in unusual, but naturalistic, settings (see e.g. Houran & Lange,
1996; Houran & Williams, 1998; Lange & Houran, 1997). The work also could contribute
to applied research into several important, and often controversial, areas, including e.g.
contagious psychogenic illness, sick building syndrome and other forms of alleged
‘environmental illness’ (Lundberg, 1998). The present article addresses these issues by
outlining the first investigations into two internationally known cases of alleged
hauntings.
Experiment 1 took place at Hampton Court Palace. This royal palace was home to
many British monarchs for over 500 years, and it is now a popular historical
attraction. The palace is also frequently referred to as ‘one of the most haunted
places in England’ (see e.g. Guiley, 1994; Law, 1918; Underwood, 1971), and allegedly
contains the ghost of Catherine Howard, the fifth wife of Henry VIII. Fifteen months
after her marriage to the King in 1540, Catherine Howard was found guilty of
adultery and sentenced to death (Thurley, 1996). Legend suggests that upon hearing
the news, Catherine Howard ran to the King to plead for her life, but was dragged
back along a section of the Palace now known as ‘the Haunted Gallery’ (Guiley, 1994;
Underwood, 1971). By the turn of the century, the Gallery had become associated
with various unusual experiences, including sightings of a ‘woman in white’ and
reports of inexplicable screams (Law, 1918). Since then, visitors to the Gallery have
reported other ‘ghostly’ phenomena, including a strong sense of presence, a feeling
of dizziness and sudden changes in temperature (Franklin, 1998). The Haunted
Gallery is not the only part of Hampton Court Palace associated with such
phenomena, with visitors and staff reporting similar experiences in other areas of
the building, including an area known as the Georgian Rooms (Franklin, 1998).
Information about the reputation of the Haunted Gallery is widely available to the
public, but specific information about the location of experiences in areas such as the
Georgian Rooms is not widely available.
Experiment 2 was carried out in part of the South Bridge Vaults in Edinburgh,
Scotland. Edinburgh’s South Bridge was constructed in the late eighteenth century to
ease transportation problems in the city. The Bridge consisted of 19 huge stone arches
supporting a wide road lined with several three storey buildings. A series of ‘vaults’
(i.e. small chambers, rooms and corridors) were built into the Bridge’s arches to house
workshops, storage areas and accommodation for the poor (Henderson, 1999). However,
ineffective waterproofing and overcrowding meant that by the mid-nineteenth
century the vaults had degenerated into a disease-ridden slum. The area was abandoned
during the late nineteenth century, but rediscovered and opened for public tours in
1996. During some of these tours, both members of the public and guides have
experienced many unusual phenomena, including, for example, a strong sense of
presence, several apparitions and ‘ghostly’ footsteps (Wilson, Brogan, & Hollinrake,
1999). As a result, the vaults have acquired an international reputation for being one of
the most haunted parts of Scotland’s capital city. The public has relatively easy access to
EXPERIMENT 1
The first part of Expt 1 examined whether participants would report a disproportionately
large number of unusual experiences in apparently ‘haunted’ areas of
the Haunted Gallery and the Georgian Rooms at Hampton Court Palace. Prior to
the study, Ian Franklin (IF), a warder at the palace, catalogued many of the reports
of unusual phenomena associated with the building. IF reviewed this material and
identified areas where people had consistently reported unusual phenomena in
both the Haunted Gallery and the Georgian Rooms. The areas identified were
classified as ‘haunted’ whilst the remaining areas were classified as ‘controls’. The
investigators were blind to these classifications until all data collection had been
completed.
Groups of participants walked around either the Haunted Gallery or the Georgian
Rooms, and reported if they experienced any unusual phenomena. Participants reporting
such phenomena marked the locations of their experiences on a floorplan. It was
predicted that the percentage of experiences reported in the ‘haunted’ areas in both
locations would be significantly above chance.
Some researchers have argued that the witnesses involved in alleged hauntings may
have had prior knowledge about which parts of a building were ‘haunted’, and that this
may be responsible for them reporting a disproportionately large number of unusual
experiences in these areas. There are several ways in which this may happen. For
example, witnesses’ prior knowledge about a ‘haunted’ area may cause them to assign
special significance to any unusual phenomenon experienced in that area, therefore
increasing the likelihood of them telling others about their experience. Alternatively,
such information may have increased witnesses’ anxiety levels when entering these
areas, and this, in turn, may have resulted in the witnesses experiencing mild
psychosomatic and hallucinatory phenomena. A second part of Expt 1 evaluated
whether any disproportionate reporting of unusual experiences in ‘haunted’ areas
would be due to participants’ prior knowledge about previous reports of ‘ghostly’
activity. Prior to visiting either the Haunted Gallery or the Georgian Rooms, participants
rated the degree to which they knew where in these locations people had experienced
‘ghostly’ phenomena in the past. The ‘prior knowledge’ hypothesis predicted that
participants indicating a high level of prior knowledge would report a greater percentage
of experiences in the ‘haunted’ areas than those indicating a low level of prior
knowledge. Of course, participants can only report on their prior conscious knowledge.
It is theoretically possible that participants may be influenced by their unconscious
knowledge of haunted locations (e.g. knowledge acquired earlier but now forgotten).
However, due to the difficulty of assessing unconscious knowledge, and for ease of
expression, we will use the phrase ‘prior knowledge’ throughout this article to refer to
prior conscious knowledge.
Others have challenged the ‘prior knowledge’ hypothesis, noting that witnesses
often claim to have been unaware of the reputation of a ‘haunted’ building prior to their
experiences (see e.g. MacKenzie, 1982). This position has recently received empirical
support from several studies conducted by Maher and her colleagues (for a review of
these experiments, see Maher, 1999), using a quantitative technique pioneered by
Method
Classifying ‘haunted’ and ‘control’ areas
IF had catalogued a large number of reports of unusual phenomena experienced by staff
and visitors at Hampton Court Palace (Franklin, 1998). These reports dated from the end
of the last century to the present day, and consisted of material from newspapers,
magazines, books and IF’s interviews with witnesses. Prior to the experiment, RW asked
IF to identify where in the Haunted Gallery and the Georgian Rooms people had
consistently reported unusual experiences. The palace supplied floorplans of both the
Haunted Gallery and the Georgian Rooms. RW divided each of these floorplans into 24
equally sized areas and asked IF to mark the areas in which people had consistently
reported unusual experiences. Areas marked by IF were classified as ‘haunted’ whilst
unmarked areas were classified as ‘controls’. IF marked seven areas in the Haunted
Gallery and six areas in the Georgian Rooms. These floorplans were not seen by the
investigators until all data collection had been completed. To avoid bias, RW, the
assistant experimenters, who guided participants to the locations, and PS, who mapped
the magnetic fields, were blind to the identity of these areas.
Questionnaires
In Questionnaire 1 participants rated the degree to which they knew where, in the
Haunted Gallery or Georgian Rooms, people had experienced unusual phenomena in
the past (definitely yes, probably yes, uncertain, probably no, definitely no).1
Questionnaire 2 asked participants to quietly walk around the Haunted Gallery or
the Georgian Rooms, write a brief description of any unusual phenomena they
experienced, indicate whether they believed that their experience(s) were due to a
ghost (definitely yes, probably yes, uncertain, probably no, definitely no) and mark
where they were standing when they had their experience(s) on a floorplan. The
floorplan included in this questionnaire had not been divided into areas.
Procedure
Participants were self-selecting members of the public visiting Hampton Court Palace in
late May/early June 2000. They had seen leaflets inviting their participation; thus they
knew they were taking part in a scientific investigation. Participants took part in one of
three daily sessions held over the course of 6 days. Each session involved a maximum of
40 people. Participants were first randomly split into two groups, according to where
they had chosen to sit, with one half of the room forming Group 1 and the other half
forming Group 2, in a counterbalanced order. All participants then completed Questionnaire 1,
with Group 1 being asked about their prior knowledge concerning the
Haunted Gallery, whilst Group 2 were asked about the Georgian Rooms. RW then gave a
short talk about scientific research into ghosts. The talk was presented in an atmospheric
setting, with lowered lighting. RW briefly described the historical tale of
Catherine Howard, as outlined in the introduction, but without mentioning the location
in which related haunting-type experiences had reportedly occurred. The talk also
1 Questionnaire 1 also contained other items including whether participants believed that ghosts exist, the frequency with
which they had experienced ‘ghostly’ phenomena in the past, etc. The results of these items and related analyses are reported
in Wiseman, Watt, Greening, Stevens, and O’Keeffe (2001). In summary, participants reporting prior belief in ghosts reported
significantly more unusual phenomena during the experiment than disbelievers, and were significantly more likely to attribute
the phenomena to ghosts.
illustrated some of the apparatus that could be used in haunting investigations, such as
heat-sensitive cameras and instruments sensitive to magnetic activity. Finally, RW
outlined the purpose and methodology of the experiment.2 Participants were then
escorted by an assistant experimenter to either the Haunted Gallery (Group 1) or the
Georgian Rooms (Group 2). Once at the location, the participants were free to walk
around the location according to their individual preferences, and completed
Questionnaire 2. Although participants were able to drop out of the experiment at
any time without penalty, none did. Assistant experimenters were always on hand if
needed by participants when they were in the test locations. Participants were also
given RW’s contact details in case they required further advice or information following
the conclusion of the studies.
The first six sessions were pilot sessions, whose purpose was to check the
practicability of the planned protocol, and to help identify areas for placement of
measurement equipment. Data from the pilot sessions are not included in the analyses
reported below.
2 As an additional investigation of the effects of suggestion on reported ghostly experiences, during his introductory talk to
participants RW made suggestions that one of the two locations was ‘active’ while the other was ‘inactive’ (in terms of recent
frequency of reported ghostly experiences, but giving no specific suggestions as to where in each location experiences had been
reported). To avoid systematic bias, these suggestions were made in a counterbalanced fashion. For sake of brevity, and
because suggestion appeared to have little effect on reported experiences, this manipulation will receive no further attention in
this article. More detail of the method and results of this manipulation can be found in the article by Wiseman et al. (2002).
3 Once IF’s classification of ‘haunted’ and ‘control’ areas was revealed at the end of the study, it transpired that the 12
areas chosen by RW consisted of six haunted and six control areas as identified by IF. The analyses for magnetic fields
therefore refer to this ‘sub-group’ of six haunted and six control areas and not to the entire group of 13 haunted and 35
control areas.
number of unusual experiences reported in each of the areas whilst setting-up and
operating the magnetic field sensors. Magnetic data was recorded for thirty minutes in
each area. Recording took place while tourists were visiting the area, but not during any
experimental sessions. Hence the magnetic measurement procedure would not bias
participants’ reports.
Participants
There were 678 participants who each attended 1 of the 18 sessions. Some of the
participants (131) were excluded as they did not complete all of the items on
Questionnaire 1 and a further 83 were excluded for not completing all of the items
on Questionnaire 2. The number of participants remaining was 462 (163 males, 299
females; mean age: 35.0, age range: 7 to 82, SD = 16.3). As the 18 groups of participants
were assigned to one of the two locations, there was a total of 36 groups of participants.
Results
Participants reported a total of 431 unusual experiences: 189 (43.8%) of these
experiences were reported in the Haunted Gallery and 242 (56.2%) in the Georgian
Rooms; 215 (46.5%) participants reported at least one experience, and the mean
number of experiences for participants reporting one or more experiences was 2.0
(SD = 1.45). Approximately two thirds of these experiences involved an unusual change
in temperature. The remaining one third involved a mixture of phenomena including,
for example, a feeling of dizziness, headaches, sickness, shortness of breath, some form
of ‘force’, a foul odour, a sense of presence and intense emotional feelings. When asked
whether their experiences were due to a ghost, 8 (3.72%) participants indicated
‘Definitely yes’, 22 (10.23%) ‘Probably yes’, 80 (37.21%) ‘Uncertain’, 87 (40.46%)
‘Probably no’ and 18 (8.37%) ‘Definitely no’. It is difficult to assess the extent to
which these experiences may have been elicited or dampened by the context of the
pre-experiment talk. However, it is worth noting that both locations were well lit and
relatively noisy and busy with tourists and were therefore less atmospheric than the
context in which the talk was given. Given these circumstances, it was perhaps
surprising that so many participants reported having experiences.
Participant grouping
Each of the 36 groups completed Questionnaire 2 whilst walking around either the
Haunted Gallery or the Georgian Rooms. Individual responses to the questionnaire
cannot therefore be considered statistically independent as they may have influenced,
and been influenced by, other members of the group. For example, friends and family
members were likely to have sat beside one another and therefore to have been assigned
to the same group, so they may have interacted more with one another than strangers
might. As a result, participants’ responses to the questionnaire were combined within
each of the 36 groups so the group is the unit of analysis (see Rosenthal & Rosnow,
1991).
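The group-level aggregation described above can be sketched as follows. This is only an illustrative sketch, not the authors' analysis code: the function name, the `(group_id, area_index)` data layout and the example values are all hypothetical. It pools every experience reported within a group and returns one percentage per group, so that later tests treat the group, not the individual, as the unit of analysis.

```python
from collections import defaultdict


def percent_in_haunted_by_group(reports, haunted_areas):
    """Pool reports within each group and return {group_id: percentage}.

    reports: iterable of (group_id, area_index) pairs, one per reported
             experience (hypothetical layout, assumed for illustration).
    haunted_areas: set of area indices classified as 'haunted'.
    Groups that reported no experiences never appear in `reports`,
    so they are simply absent from the result.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for group_id, area in reports:
        totals[group_id] += 1
        if area in haunted_areas:
            hits[group_id] += 1
    return {g: 100.0 * hits[g] / totals[g] for g in totals}


# Hypothetical data: group "A" reported two experiences, group "B" one.
example_reports = [("A", 1), ("A", 5), ("B", 2)]
example_haunted = {1, 2}
print(percent_in_haunted_by_group(example_reports, example_haunted))
```

One consequence of pooling this way is that a group with no reported experiences yields no percentage at all, which would reduce the degrees of freedom of any test run on the group percentages.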
Percentage of experiences reported in ‘haunted’ areas
The floorplans that had been divided into 24 areas were photocopied onto acetate and
used to classify the location of each of the experiences reported by participants. This
classification was carried out by EG and CO, whilst blind to both the location of the
‘haunted’ and ‘control’ areas and the results of the magnetic field measurements. Given
that there were seven ‘haunted’ areas in the Haunted Gallery and six in the Georgian
Rooms, single mean t tests were used to compare the actual percentage of experiences
reported in these areas with the chance baselines of 29.16% and 25% respectively. Both
analyses found the percentage of experiences to be significantly greater than chance
(see Table 1).
Table 1. The df, population means, t values (single group) and p values (two-tailed) comparing the
percentage of experiences reported in the ‘haunted’ areas of the Haunted Gallery and the Georgian
Rooms against chance
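As a rough illustration of the comparison against chance, the sketch below computes the chance baselines from the number of ‘haunted’ areas out of 24 (7/24 is about 29.17%, given in the text as 29.16%, and 6/24 is 25%) and a single-sample t statistic. The group percentages used here are invented for illustration and are not the study's data; the function names are likewise assumptions.

```python
import math
import statistics

AREAS_PER_FLOORPLAN = 24  # each floorplan was divided into 24 equal areas


def chance_baseline(n_haunted, n_areas=AREAS_PER_FLOORPLAN):
    """Percentage of experiences expected in 'haunted' areas if reports
    were spread evenly over all equally sized areas."""
    return 100.0 * n_haunted / n_areas


def one_sample_t(values, mu):
    """Return (t, df) for a single-sample t test of mean(values) against mu."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1 denominator)
    t = (mean - mu) / (sd / math.sqrt(n))
    return t, n - 1


# Hypothetical group-level percentages, for illustration only.
hypothetical_gallery_groups = [40.0, 25.0, 50.0, 33.3, 45.0, 30.0]
baseline = chance_baseline(7)  # seven 'haunted' areas in the Haunted Gallery
t_value, df = one_sample_t(hypothetical_gallery_groups, baseline)
print(f"baseline = {baseline:.2f}%, t({df}) = {t_value:.2f}")
```

The p value is omitted on purpose: obtaining it requires the t distribution, which is not in the Python standard library, so a statistics package would be needed for that step.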
Prior knowledge
Each group’s ‘prior knowledge score’ consisted of the mean of participants’ responses
to the question concerning the extent to which they knew where other people had
reported unusual experiences in either the Haunted Gallery or the Georgian Rooms
(coded on a 5-point scale from 1 (definitely yes) to 5 (definitely no)). Each group was
then classified as having either ‘High’ or ‘Low’ levels of prior knowledge on the basis of a
median split. This resulted in 18 groups being classified as ‘High’ (mean score = 3.89,
SD = .33) and 18 groups as ‘Low’ (mean score = 4.51, SD = .18). There was no
significant difference between the percentage of experiences reported in the
‘haunted’ areas by the ‘High’ and ‘Low’ levels of prior knowledge groups in either the
Haunted Gallery (t(15) = 1.66, unpaired, p = .12, two-tailed) or the Georgian Rooms
(t(16) = −.14, unpaired, p = .89, two-tailed).
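The median split and the unpaired comparison can be sketched as below. Two assumptions are made that the text does not state: groups whose score equals the median are placed in the ‘Low’ category, and the unpaired t test uses a pooled variance (which is at least consistent with the reported df of 16 for 18 groups). Function names and example values are hypothetical.

```python
import math
import statistics


def median_split(scores):
    """Label each group 'High' or 'Low' prior knowledge.

    scores: {group_id: mean response}, coded 1 (definitely yes) to
    5 (definitely no), so a LOWER score means MORE prior knowledge.
    Assumption: a score exactly at the median is labelled 'Low'.
    """
    med = statistics.median(scores.values())
    return {g: ("High" if s < med else "Low") for g, s in scores.items()}


def unpaired_t(a, b):
    """Return (t, df) for a pooled-variance unpaired t test (assumed variant)."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * statistics.variance(a)
              + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    se = math.sqrt(pooled * (1 / na + 1 / nb))
    return (statistics.fmean(a) - statistics.fmean(b)) / se, na + nb - 2


# Hypothetical prior knowledge scores for four groups.
labels = median_split({"g1": 3.5, "g2": 4.0, "g3": 4.5, "g4": 5.0})
print(labels)
```

Because the scale runs from ‘definitely yes’ (1) to ‘definitely no’ (5), the ‘High’ knowledge groups are those with the lower mean scores, which matches the reported means of 3.89 (‘High’) and 4.51 (‘Low’).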
Magnetic fields
There was a nonsignificant difference in the mean magnetic field strength between the
‘haunted’ and ‘control’ areas (unpaired t(10) = 1.55, p = .15, two-tailed). However,
there was a significant difference in the variance of the field between the two types of
areas (unpaired t(10) = 2.34, p = .04, two-tailed), with the ‘haunted’ areas (M = 12.71,
SD = 12.10) displaying a higher variance than ‘control’ areas (M = 2.16, SD = 1.03).
Spearman rank correlation coefficients were calculated between the number of
experiences reported by each group within each of the 12 areas for which magnetic data
was obtained, and mean strength and variance of the magnetic field in those areas.4 One-sample
t tests were then used to examine whether the sample mean of these
correlations differed significantly from zero. These analyses revealed a nonsignificant
4 There were three groups for which no experiences were reported in the 12 areas. As it was not possible to calculate a
correlation in these cases, these three groups were not included in the analyses.
relationship between the number of experiences reported and the mean field strength
(one-sample t(32) = .82, p = .42, two-tailed). A significant relationship was found between
the variance of the field and number of unusual experiences reported (one-sample
t(32) = 2.15, p = .04, two-tailed).
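The two-stage analysis (one Spearman correlation per group, then a one-sample t test on the resulting coefficients) might look roughly like the following. It is a hedged sketch with hypothetical function names, not the authors' procedure. A group that reported no experiences in any of the 12 areas has a constant count vector, so its correlation is undefined and the sketch returns `None` for it; dropping three such groups from 36 leaves 33 coefficients, consistent with the reported df of 32.

```python
import math
import statistics


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation, or None when either variable is constant."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = statistics.fmean(rx), statistics.fmean(ry)
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    if sxx == 0 or syy == 0:
        return None
    return sxy / math.sqrt(sxx * syy)


def mean_correlation_t(correlations):
    """One-sample t (against zero) on the defined per-group correlations."""
    rs = [r for r in correlations if r is not None]
    se = statistics.stdev(rs) / math.sqrt(len(rs))
    return statistics.fmean(rs) / se, len(rs) - 1
```

In use, `spearman` would be called once per group with that group's experience counts across the 12 areas and the corresponding field measure, and the list of results passed to `mean_correlation_t`.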
Discussion
The experiment first examined whether participants would report a disproportionately
large number of unusual experiences in the ‘haunted’ areas. These ‘haunted’ areas had
been classified on the basis of prior reports. By chance, it was expected that
approximately 29% of participants’ unusual experiences would be reported in the
‘haunted’ areas of the Haunted Gallery, and 25% in the Georgian Rooms. However,
groups of participants visiting both rooms reported significantly more unusual experiences
in the ‘haunted’ areas within both locations. These findings strongly support the
notion that people’s unusual experiences are not evenly distributed across the locations,
but instead concentrate in ‘haunted’ areas. In addition, the findings suggest that
the areas in which people report their experiences are consistent across time. In
short, these empirical findings validate several characteristics of spontaneous haunt
experiences suggested by anecdotal reports.
Prior to entering either the Haunted Gallery or the Georgian Rooms, participants
were asked to rate the degree to which they knew where people had reported unusual
experiences in these locations in the past. The results showed that participants’ level of
prior knowledge was not significantly related to the percentage of experiences reported
in the ‘haunted’ areas. These findings do not support the notion that the disproportionately
large number of unusual experiences reported in ‘haunted’ areas is due to
participants’ prior conscious knowledge about the location.
Thirdly, the experiment examined the possibility that there were significant differences
in the strength and variance of the magnetic fields between the ‘haunted’
and ‘control’ areas. Results suggested no significant differences in the mean strength of
the magnetic field between the two types of areas. However, the variance of the local
magnetic field was significantly greater in ‘haunted’ than ‘control’ areas, and there was a
significant relationship between the magnetic variance and the mean number of unusual
experiences reported by groups of participants. These results seem consistent with
previous research suggesting a relationship between local magnetic field activity and
haunt reports.
Experiment 2 (see below) built upon both the methodology and results of Expt 1.
First, in Expt 1, areas within the Haunted Gallery and the Georgian Rooms were
classified as either ‘haunted’ or ‘control’. Experiment 2 provided a more fine-grained
classification of areas by using a venue in which it was possible to rank order each of the
areas from ‘most’ to ‘least’ ‘haunted’. Secondly, in Expt 1, the nature of the venue
resulted in participants having to walk around each of the locations in groups, and thus
their data had to be analysed and interpreted at a group level. Unfortunately, this
resulted in the study having low statistical power, and it is possible that the locations,
having tourists as well as up to 20 participants walking around, were relatively noisy and
therefore not conducive to haunt experiences. These issues were overcome in Experiment 2
by using a venue in which participants could visit areas on their own, and thus
produce data that could be analysed and interpreted independently. Finally, Expt 2
measured a far greater number of environmental variables.
EXPERIMENT 2
The experiment took place in 10 of the South Bridge Vaults in Edinburgh. For the
past few years, the company conducting guided tours through the underground
vaults has maintained a collection of any unusual experiences reported by both
guides and visitors. Prior to the experiment, RW asked Fran Hollinrake (FH), a senior
tour guide, to review this database and rank order the vaults between 1 (‘least
haunted’, i.e. smallest number of unusual experiences) and 10 (‘most haunted’, i.e.
largest number of unusual experiences). This was referred to as the ‘Haunted Order’
of the vaults.
During the experiment, participants were asked to spend approximately 10 min in
one of the vaults on their own, write down any unusual phenomena they experienced
and rate the degree to which they believed that these experiences were due to
a ghost. On the basis of the results obtained in Expt 1, it was predicted that there
would be a significant correlation between the ‘Haunted Order’ and mean number of
experiences reported in each vault. That is, it was predicted that the location of past
haunt reports would be predictive of the location of haunt reports in the current
study.
The experiment also investigated the potential relationship between participants’
prior knowledge about the vaults and their reports of unusual phenomena. Prior to
visiting the vaults, participants noted whether they knew where people had reported
unusual experiences in the vaults in the past. Based on the results of Expt 1, it was
predicted that the correlations between the ‘Haunted Order’ and the mean number of
experiences reported in each vault would be significant among participants who
indicated no prior knowledge of the vaults.
The experiment also examined a wider range of environmental variables than Expt
1, including the mean strength and variance of the local magnetic field, air temperature,
air movement, the vaults’ interior lighting levels, the lighting level directly outside
the entrances to the vaults, the floorspaces of the vaults and their height. It was
predicted that there would be significant correlations between these variables and
both the ‘Haunted Order’ and the mean number of reported experiences in each
vault.
Method
Questionnaires
Questionnaire 1 asked participants whether they had heard (e.g. from friends, the
media, publications about the vaults) where in the vaults people have reported
experiencing unusual phenomena (possible responses: yes, uncertain, no).5
Questionnaire 2 instructed participants to spend a few minutes in a vault and then
report any phenomena that they experienced. They were asked to report all of their
unusual experiences, no matter how faint, and to include all types of experiences
(including e.g. unusual changes in temperature, smells, tastes, a sense of presence, etc.).
The questionnaire contained four boxes, and participants were asked to briefly describe
each of their experiences in one of the boxes. They were also asked to rate whether they
thought that each of their experiences was due to a ghost (definitely yes, probably yes,
uncertain, probably no, definitely no). If participants did not experience anything
unusual then they were instructed to simply return the blank questionnaire.
5 Other items on the questionnaire asked participants whether they believed in the existence of ghosts, whether they believed
that they had previously experienced a ghost, etc. The findings will be reported in a separate article.
Procedure
The experiment was carried out in April 2001. Participants were self-selecting
members of the public who had seen the experiment listed in the programme of
the Edinburgh International Science Festival. Participants took part in one of six
daily sessions held over the course of 4 days. Each session involved a maximum of
10 people. The first part took place in a private function room close to the vaults.
At the start of the experiment, RW handed out numbered clipboards randomly,
which assigned a participant number to each person. RW briefly outlined the
purpose and procedure of the study, and demonstrated the kinds of apparatus that
could be used in scientific research into ghosts. RW then asked participants to
complete Questionnaire 1. Participants were then taken as a group down to the
vaults by FH, and then taken individually to a vault according to their randomly
assigned participant number (i.e. participant number 1 went to vault 1). Note that
RW was blind to the haunted order so he could not introduce bias by, say, assigning
apparently suggestible participants to particular vaults. FH was not blind to the
haunted order, but due to uneven flooring and low ceilings in parts of the vaults
her presence was needed for safety insurance reasons, and she had very limited
interactions with participants. Participants spent approximately 10 min in the vault
and completed Questionnaire 2. During this time FH retired to a separate area of
the vaults so she did not inadvertently influence participants’ reports. Two assistant
experimenters, who were blind to the haunted order, monitored participants while
they completed their questionnaire and were available in case anyone had a query
or a problem. Participants then returned their questionnaires to the assistant
experimenters. Participants were able to drop out of the experiment at any time
without penalty. Two did so. Participants were also given RW’s contact details in
case they required further advice or information following the conclusion of the
studies.
Apparatus
Magnetic fields, air temperature and air movement
Local magnetic fields were measured using the same equipment as employed in Expt 1,
but with an increased sampling rate of 4 Hz. Air temperature and air movement were
measured with a Testo 445 multi-purpose datalogger connected to a Testo Hot Bulb
probe (temperature range: −20 to +70 °C, movement: 0 to 10 m/s: accuracy, sampling at
a rate of 0.5 Hz). Both the magnetic sensors and air temperature/movement probe were
placed into one vault prior to each group’s arrival. The participant in the vault was asked
to remain a few feet from the equipment to prevent potential artifacts. The equipment
logged data for 10 min. All measurements were made by PS, who was blind to the
number of unusual experiences reported in each of the areas whilst setting up and
operating the equipment. The magnetic sensor was sited at head height on a level part of
the floor, at least 1 m away from the participant and on the opposite side of the room
to any lighting circuits. When the participant arrived, PS started the recording and
then left the vault.
Light readings and physical dimensions
The light levels within, and directly outside, each vault were measured using a Vital
Technologies Corporation Tricorder. At the end of the experiment, RW recorded the
light levels and physical dimensions of each vault. Light levels were recorded from the
centre of each vault, and involved pointing the light meter towards each of the walls of
the vault and taking an average of the readings obtained. The light level directly outside
the vault was obtained by placing the light meter in the centre of the vault and pointing
it towards the doorway of the vault.
Participants
The participants (N = 218) each attended one of the 24 sessions in groups of up to 10
(91 males, 127 females); mean age: 35.3 (SD = 13.20, age range: 11 to 77).
Results
Participants reported a total of 172 unusual experiences: 95 (43.58%) participants
reported at least one experience, and the mean number of experiences for
participants reporting one or more experiences was 1.81 (SD = .94). Again, the
majority of these experiences involved an unusual change in temperature, but also
included descriptions of apparitions, a strong sense of being watched, burning
sensations, strange sounds, odd odours, etc. When asked to rate whether experiences
were due to a ghost, 1 (.67%) experience was rated ‘Definitely yes’, 4 (2.67%)
‘Probably yes’, 58 (38.67%) ‘Uncertain’, 65 (43.33%) ‘Probably no’ and 22 (14.67%)
‘Definitely no’.
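The descriptive figures in this paragraph can be checked with a few lines of arithmetic. The counts are taken from the text; the only inference is that the rating percentages use the 150 rated experiences as their denominator, since the five rating counts sum to 150 rather than 172. The text does not say why the remaining experiences were not rated.

```python
# Counts as reported in the Results paragraph.
participants = 218
reporters = 95          # participants reporting at least one experience
experiences = 172       # total unusual experiences reported
ratings = {
    "Definitely yes": 1,
    "Probably yes": 4,
    "Uncertain": 58,
    "Probably no": 65,
    "Definitely no": 22,
}

pct_reporting = round(100 * reporters / participants, 2)
mean_per_reporter = round(experiences / reporters, 2)

# Assumption: percentages are taken over the experiences that received a rating.
rated_total = sum(ratings.values())
rating_pct = {label: round(100 * n / rated_total, 2) for label, n in ratings.items()}

print(pct_reporting, mean_per_reporter, rated_total)
for label, pct in rating_pct.items():
    print(f"{label}: {pct}%")
```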
Hypotheses
The correlation between the ‘Haunted Order’ and the mean number of unusual
experiences reported in each vault was significant (N = 10, rho = .76, p = .02,
two-tailed).
Prior knowledge
Participants indicating ‘yes’ or ‘uncertain’ to the question regarding prior knowledge
about where in the vaults people had experienced unusual phenomena in the past were
then excluded from the data (N = 31). The correlation between the ‘Haunted Order’
and the mean number of unusual experiences reported by the remaining participants
was highly significant (N = 10, rho = .87, p = .009, two-tailed).
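A minimal sketch of a tie-corrected Spearman correlation of the kind reported here, computed as Pearson's r on average ranks. The per-vault means are hypothetical and the function names are illustrative; the original analysis software is not named in the text.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rho as Pearson's r on tie-corrected ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# 'Haunted Order' runs from 1 (least haunted) to 10 (most haunted).
haunted_order = list(range(1, 11))
# Hypothetical mean number of experiences per vault, in the same vault order.
mean_experiences = [0.4, 0.6, 0.5, 0.9, 0.7, 0.7, 1.1, 0.8, 1.3, 1.2]

rho = spearman_rho(haunted_order, mean_experiences)
print(f"N = {len(haunted_order)}, rho = {rho:.2f}")
```

For the prior-knowledge analysis, the same function would be applied after recomputing the per-vault means from only those participants who answered ‘no’.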
Environmental variables
Table 2 contains the correlations between each of the environmental variables, and
both the ‘Haunted Order’ and the mean number of experiences reported by
participants. Overall, the magnetic field readings varied from 47,018 to 51,588 nT, SD
from 4 to 32 nT. All of these measurements are within the natural fluctuation ranges
and are not inherently anomalous. This is to be expected given that the vaults had no
mains wiring other than a single, minimal lighting circuit.
Table 2. Spearman rank correlation coefficients (corrected for ties), and two-tailed p values (in
parentheses), between each of the environmental variables, and both the ‘Haunted Order’ and mean
number of unusual experiences reported by participants with no prior knowledge of the vaults.
Statistically significant values are highlighted in bold
Acknowledgements
The authors would like to thank Ian Baker, Robert Chalmers, Dr Iliya Eigenbrot, Ian Franklin,
Christopher Gidlow, Ricky Glover, Fran Hollinrake, Dr James Houran, Dennis McGuinnes,
Professor Robert Morris, Elizabeth Whiddett, Rachel Whitburn, Jeffrey Wiseman and our referees
for their invaluable advice and assistance with these studies. We would also like to thank
Bartington Instruments, Hampton Court Palace, the Edinburgh International Science Festival,
Mercat Tours, Land Infrared, L’Oriel Technology, Testo, the Perrott Warrick Fund, COPUS, and
Philip Harris Education for supporting this work.
References
Auerbach, L. (1986). ESP, h a u n tin g s a n d p o lte rg e ists. New York: Warner Books.
Beloff, J. (2001). Foreword. In J. Houran & R. Lange (Eds), H a u n tin g s a n d p o lte rg e ists:
M u ltid isc ip lin a ry p e rsp e c tiv e s. Jefferson, NC: McFarland.
Franklin, I. (1998). H a m p to n C o u rt gh osts. Unpublished manuscript.
Gallup (2001). A m e r ic a n s ’ b e lie f in p sy c h ic a n d p a r a n o r m a l p h e n o m e n a is u p o v e r la st d e c a d e
(http://www. gallup.com/poll/releases/prO10608.asp).
Gauld, A., &Cornell, A. D. (1979). P oltergeists. Boston, MA: Routledge &Kegan Paul.
Gearhart, L., & Persinger, M. A. (1986). Geophysical variables and behavior: XXXIII. Onset of
historical poltergeist episodes with sudden increases in geomagnetic activity. P erce p tu a l a n d
M o to r Skills, 62 , 463-466.
Guiley, R. E. (1994). The G u in n e ss e n c yc lo p ed ia o f g h o sts a n d spirits. London: Guinness
Publishing.
Halgreen, E., Walter, R. D., Cherlow, D. D., &Cranall, P. H. (1978). Mental phenomena evoked by
electrical stimulation of the human hippocampal formation and amygdala. B ra in , 1 0 1 ,
83-117.
Henderson, J. (1999). The to w n b e lo w th e g ro u n d . London: Mainstream.
Houran, J. (1997). Ambiguous origins and indications of ‘poltergeists.’ P erce p tu a l a n d M o to r
S kills, 8 4 , 339-344.
Houran, J., & Lange, R. (1996). Hauntings and poltergeist-like episodes as a confluence of
conventional phenomena: Ageneral hypothesis. P erce p tu a l a n d M o to r Skills, 8 3 , 1307-1316.
Houran, J., &Lange, R. (1998). Rationale and application of a multi-energy sensor array in the
investigation of haunting and poltergeist cases.J o u r n a l o f th e S o ciety f o r P sy ch ic a l R esearch ,
62 , 324-336.
Houran, ]., &Lange, R. (Eds.) (2001). H a u n tin g s a n d p o lte rg e ists: M u ltid isc ip lin a ry p e rsp e c tiv e s.
Jefferson, NC: McFarland.
Houran, J., & Williams, C. (1998). Relation of tolerance of ambiguity to global and specific
paranormal experience. P sy ch o lo g ic a l R ep o rts, 8 3 , 807-818.
Irwin, H. J. (1999). A n in tro d u c tio n to p a ra p s y c h o lo g y (3rd ed.). Jefferson, NC: McFarland.
Konig, H., Fraser, J. T., & Powell, R. (1981). B io lo g ic a l effects o f e n v ir o n m e n ta l
e le ctro m a g n etism . Berlin: Springer-Verlag.
Korinevskaya, I. V., Kholodov, Y. A., & Korinevskii, A. V (1993). Effect of bilateral peripheral
application of an alternating magnetic field on the EEG in humans. H u m a n P h ysio lo g y, 19,
213-217.
Lange, R., &Houran, J. (1997). Context-induced paranormal experiences: Support for Houran and
Lange’s model of haunting phenomena. P e rce p tu a l a n d M o to r Skills, 8 4 , 1435-1458.
Lange, R., &Houran, J. (1998). Delusions of the paranormal: A haunting question of perception.
J o u r n a l o f N e rv o u s a n d M e n ta l D isease, 1 8 6 , 6 3 7 -6 4 5 .
Lange, R., &Houran, J. (1999). The role of fear in delusions of the paranormal.J o u r n a l o f N e rv o u s
a n d M e n ta l D isea se, 1 8 7 , 159-166.
Lange, R., Houran, J., Harte, T. M., &Havens, R. A. (1996). Contextual mediation of perceptions in
hauntings and poltergeist-like experiences. P erce p tu a l a n d M o to r Skills, 8 2 , 7 5 5 -7 6 2 .
Law, E. (1918). The H a u n te d G a llery o f H a m p to n C ourt. London: Rees.
Lundberg, A. (1998). The e n v ir o n m e n t a n d m e n ta l h ealth : A g u id e f o r clin icia n s. Mahwah, NJ:
Erlbaum.
MacKenzie, A. (1982). H a u n tin g s a n d a p p a ritio n s. London: Paladin Grafton.
Maher, M. C. (1999). Riding the waves in search of the particles: A modern study of ghosts and
apparitions. J o u r n a l o f P a ra p sych o lo g y, 63, 47-80.
Parapsychology 115
An investigation into alleged ‘hauntings’ 2 1I
Roll, 1976, 1977; Teguis & Flynn, 1983). These include the presence of a child
or adolescent (John-Paul), and preceding family disruptions (e.g. moving house
and the moving away from the family home of four out of five children). A
strict religious background in the family is also quite typical of poltergeist
cases (Teguis & Flynn, 1983). Although this may be observed in the childhood
of both David and Rose-Mary, it does not seem to apply within their own
family. The fact that Rose-Mary’s family apparently experienced a poltergeist
when she was a child may be relevant, however.
Other aspects of the case are more unusual. Stains and especially carvings
of shapes and words are rare in haunting and poltergeist cases, although there
are some famous precedents. Perhaps the clearest parallel is with the controversial
‘Marianne’ writings on walls and paper in the Borley Rectory case
(e.g. Banks, 1996; O’Neil, 2002). Similarities may also be suggested with the
Spanish ‘Faces of Belmez’ (e.g. Tort & Ruiz-Noguez, 1993) although the images
in that case were much more painterly and naturalistic. Like the stains in the
present case, however, the Belmez faces were said not to respond to attempts
at removal using detergents or by scrubbing. Also like the present stains, the
faces were reported to appear, change and eventually disappear without any
obvious explanation.
Another distinctive feature of the present case is the sheer volume, frequency,
duration and variety of the phenomena involved. Although haunting cases may
continue for decades or centuries, poltergeist manifestations typically last for
only a few months, although cases spanning several years have been reported
(Roll, 1976, 1977). The first clear poltergeist-type activity that may be identified
in this case was the appearance of the stained cross on the fireplace in
October or November 1998. At the time of writing, more than three years later,
similar phenomena continue to be reported.
The present case is also notable for the absence of any bombardment by
projectiles or observed movements of objects, which are two of the most
common features of poltergeist activity (Roll, 1976, 1977). Smells, temperature
changes, and visual apparitions, on the other hand, are generally more typical
of hauntings rather than poltergeist manifestations (Owen, 1964; Roll, 1976,
1977). Perhaps the most unusual feature of this case, however, is the apparently
benign quality of the phenomena. In contrast, poltergeist manifestations are
usually annoying, unpleasant, very disruptive or traumatic, often expressing
indirectly underlying emotional tensions within the family (e.g. Roll, 1976,
1977; Teguis & Flynn, 1983). Although the ‘presence’ of Brother Doli undeniably
affected the family dynamics in various ways, his influence generally seemed
to provide the family with a sense of common interest and focus. The benign
nature of the manifestations is most obviously expressed in the clearly and
consistently religious language and imagery of the stains and carvings,
although the apparitions were also reported to impart a sense of peace and
serenity. It is, of course, arguable whether the vision of the Virgin Mary and
subsequent healing allegedly experienced by the Dooleys should be treated as
connected with the other phenomena in this case. My own view, however, is
that there are sufficient similarities to presume a connection.
Although some of the phenomena may be explainable in terms of natural
artefacts, suggestibility of witnesses, errors of perception, or lapses in memory,
Journal of the Society for Psychical Research [Vol. 66.4, No. 869
it is obvious from the evidence of the stains, carvings and photographs that
something very unusual has been going on at the Gowers’ home. The question,
then, is whether these unusual occurrences indicate a genuine case of paranormal
activity or whether, on the contrary, they are a hoax. We should also
remain open to the possibility that the case may represent some complex
mixture of the genuine and fraudulent (cf. Roll, 1976, 1977).
If we consider the evidence of the physical phenomena (e.g. stains, carvings,
photographs, object disappearances, e-mails) then it is clear that all of them
could be hoaxed. Disappearing objects and e-mails would be relatively easy
to hoax. Although we do not yet know the chemistry involved, the stains on
plaster and stone may not be difficult to produce, although their disappearance
would perhaps require a more subtle knowledge of chemistry. David, of course,
has degrees in Chemistry and he readily concedes that this makes him a likely
suspect in any hoax.
I remain somewhat perplexed by the prominent stain of the ‘monk’ shape
that appeared in a photograph I had taken (Figure 13), but that did not seem
to have been there at the time (nor was observed subsequently). It is, of course,
possible that my memory is mistaken, or that there may have been a faint stain
on the wall that was ignored during the survey and for some reason has been
particularly highlighted in the photograph. On the other hand, Rose-Mary has
reported similar photographic additions on several occasions. It is perhaps
significant that I can provide possible independent confirmation of this.
Most of the carvings in plaster, stone and wood could easily be executed with
a little care and time. The very finely executed carving of words on the ‘Monk
Stone’ would seem, however, to be the work of a skilled craftsperson. The
overpainting of the plaster carvings would also be difficult to do convincingly
and undetectably, as would be the apparent disappearance and filling in of
some of these carvings. On the other hand, some of the plaster carvings have
appeared to show very crude attempts at filling and overpainting.
There is some evidence that strongly points to certain phenomena being a
hoax. Most obvious, perhaps, are the apparent spelling mistakes that seem
to be consistent with errors made by a non-Welsh-speaking person, possibly
someone with poor eyesight, who has access to a Welsh dictionary. Then there
is the evidence of the ‘Monk in the Mirror’ photograph, which seems to indicate
quite clearly that the image has been daubed on the mirror.
The sceptic will also point out a number of other weaknesses or suspicious
features in this case that could suggest hoaxing. These include:
• An apparently Welsh monk, who can write Welsh words, but cannot
construct a sentence in the language.
• The apparent attempt to respond to hints dropped by the researcher (e.g. for
inaccessible stains, disappearing and raised carvings).
• The ‘convenient’ power cut during video surveillance.
• The announcement of Brother Doli’s farewell that occurred within six weeks
of sharing my concerns about the ‘Monk in the Mirror’ photograph.
• The failure, following this sharing of concerns, to locate negatives of other
key photographic anomalies.
• The failure to confirm the existence of the Dooleys.
October 2002] The ‘Brother Doli’ Case
If some or all of the phenomena are a hoax, it seems clear that either or
both David and Rose-Mary must be the perpetrators (possibly conspiring with
others). Because of the amount of time she spends at home, Rose-Mary would
appear to have the clearest opportunity. Also, many of the phenomena involve
her directly (e.g. the apparitions and photographic anomalies). It is inconceivable
that John-Paul has the ability to perpetrate such a sophisticated hoax
unaided and the other children would seem to have little opportunity to play
anything more than a minor role in any conspiracy. On the other hand, on
several occasions, new stains and carvings have appeared following family
trips away from home. It is conceivable, therefore, that some other person may
have access to the house during these times.
Given the range of phenomena reported, a hoaxer would need to have at
least a general knowledge of typical poltergeist manifestations, although this
is not difficult to acquire. Rose-Mary had her own experience of poltergeist
activity during her childhood. As a result of this, both Rose-Mary and David
have a long-standing interest in poltergeist and related phenomena. They are
also past associate members of the SPR.
The question of the possible motivation for a hoax is a relevant consideration.
The family does not seem to be making substantial money from the case at this
time, although it is possible that some future financial exploitation might be
anticipated (e.g. a book or film). It is true that Rose-Mary’s art business may
benefit from the connection she has established with the Brother Doli phenomena.
However, Rose-Mary’s artistic work did not commence until January
2000, nearly three years after the first phenomena (the Marian visions) were
reported. Another obvious possibility is that the family enjoys the interest
generated in the case, especially among the media and on the Internet. Rose-Mary
in particular may be said to have ‘promoted’ the case in various ways,
through radio and TV appearances, in newspaper interviews and on several
websites (e.g. Gower, 2000, and others cited in the references). I do not know
the individuals or the family dynamics well enough to speculate whether there
might be other, more complex, psychological motives for a hoax. Perhaps
the most straightforward explanation might be that the phenomena simply
represent the hoaxer’s hobby, or are an expression of creative fun, with an
additional reward being the delight of fooling others.
Against the argument that this is a hoax perpetrated by a single person,
there is the testimony of the other members of the family confirming that
strange noises are frequently heard in the house and that the stains and
carvings often appear under seemingly impossible conditions. Also there are
the apparition-type experiences reported by the daughters (John-Paul’s
experiences are, of course, very difficult to assess). This confirmation could
indicate either a family conspiracy, or else that at least some of the phenomena
may be genuine (i.e. not fabricated, albeit not necessarily paranormal).
Alternatively, it may indicate that other members of the family have simply
responded to the suggestions provided by the framing of the phenomena as the
work of ‘Brother Doli’ and have therefore been set to experience and interpret
events at the house accordingly.
If the case involves genuine poltergeist-type activity, the question arises as
to who may be the focal person or catalyst (Bender, 1982; Owen, 1964; Roll,
1976, 1977). John-Paul might appear to be the obvious candidate, especially
considering his age together with his activities and reported statements in the
final days before Brother Doli left. However, as mentioned, he seems to lack
understanding of, and shows no real interest in, the phenomena and there is
little evidence generally for his direct involvement. In contrast, Rose-Mary
does appear to be the central figure in many of the reported events. It is
therefore possible that she may be the focus and that she possesses some
latent psychokinetic potential. Rose-Mary’s own reported childhood poltergeist
experiences may be indicative in this context.
The third main possibility is that this is a mixed case in which there is a
core of genuine but relatively low-level paranormal phenomena (e.g. noises,
apparitions, and possibly some staining) that has been deceptively imitated or
elaborated upon. Such elaboration or ‘imitative fraud’ (Cox, 1961), perhaps
occurring in a dissociated state of mind, has been suggested as a feature to be
considered in poltergeist investigations (e.g. Bender, 1982; Roll, 1976, 1977).
On this assumption, investigation becomes particularly difficult because
the suspicion or discovery of some hoaxed elements does not immediately
invalidate all other aspects, although it inevitably tarnishes the overall
reputation of the case. The investigator’s problematic task then becomes one of
carefully examining each phenomenon separately in the attempt to establish
or eliminate the various possibilities for fraud.
In conclusion, the phenomena that have occurred, and are continuing to
occur, at the Gowers’ home are extraordinary. They also remain highly
ambiguous. The reader may wish, of course, to draw his or her own conclusions
from the evidence I have outlined in this paper. In my opinion it is not possible,
at this stage, to be certain about the status of this interesting case.
Acknowledgements
I would like to express my particular thanks to David and Rose-Mary Gower
for their invitation to conduct this investigation, for their continuing openness
and co-operation, and for their most generous hospitality on my visits. To their
family, thanks for their willingness to discuss the case. Keith Nicholson, Mike
Kavanagh, and Steve Lawler of Liverpool John Moores University contributed
invaluable technical advice and assistance. Dr Branwen Jarvis of University
of Wales Bangor provided most helpful consultation on the Welsh language.
Three anonymous reviewers and Dr Zofia Weaver gave helpful and detailed
suggestions for improvements in content and style.
School of Psychology
Liverpool John Moores University
Henry Cotton Campus
15-21 Webster Street
Liverpool L3 2ET m.i.daniels@livjm.ac.uk
REFERENCES
Banks, I. (1996) The Enigma of Borley Rectory. London: Foulsham.
Bender, H. (1982) Poltergeists. In Grattan-Guinness, I. Psychical Research: A Guide
to Its History, Principles & Practices, 123-133. Wellingborough, Northamptonshire:
Aquarian Press.
Part II
Testing Psychic Claimants
[8]
A BRIEF OVERVIEW OF MAGIC FOR PARAPSYCHOLOGISTS
By George P. Hansen
George Hansen is Research Fellow at the Institute for Parapsychology, Durham,
North Carolina.
It has long been recognized that a knowledge of magic (e.g., legerdemain) is
quite useful in investigating certain types of psychic phenomena. In the early
years of psychical research, there was considerable contact between researchers
and magicians (for a superb overview see Truzzi, 1983). For instance, Howard
Thurston, the greatest magician of his time, witnessed table levitation by
Eusapia Palladino and seemed quite convinced of its being genuine (Carrington,
1954). However, in more recent years, with the increased emphasis on laboratory
work, there has been much less interaction. Nevertheless, in a 1981 poll of
parapsychologists, 92 percent agreed with the statement “Magicians can play a
positive role in parapsychology” and 70 percent agreed that “Magicians should be
consulted by parapsychologists in setting up tests for alleged psychics”
(Truzzi, 1983). Indeed a number of parapsychologists have consulted magicians in
the course of their work (e.g., Beloff, 1984; Bender, Vandrey and Wendlandt,
1976; Bersani and Martelli, 1983; Crussard (Randall, 1982); Eisenbud, 1967;
Haraldsson and Osis, 1977; Hasted, 1981; Roll and Pratt, 1971; Shafer and
Phillips, 1982; Targ and Puthoff (Marks and Kammann, 1980)). In fact,
parapsychologists Arthur Hastings, Loyd Auerbach and W. E. Cox are themselves
magicians. Also, the Parapsychological Association at its 1983 meeting prepared
a formal statement indicating that more cooperation with magicians should be
encouraged (PA Statement on Magicians, 1984).
There are a number of reasons parapsychologists should learn about magic. A
knowledge of legerdemain can give the field investigator a better chance of
making a preliminary evaluation of uncontrolled, ostensible paranormal
phenomena. In some cases it is possible to know whether trickery was used just
given a brief description of the effect (e.g., driving a car blindfolded). In
laboratory work, a knowledge of trickery can give a greater appreciation of
required controls. This is especially important when developing new methods of
testing gifted subjects. Even if a researcher sticks to well developed methods
and tests only unselected subjects, he may be called upon to referee papers
which do require a knowledge of conjuring.
I think that it is also worth noting that the major critics of parapsychology in
this country (Diaconis, Randi, Hyman, Truzzi, Christopher and Gardner) have all
performed magic professionally or have published in magic periodicals. In
addition, Diaconis, Randi, Hyman and Truzzi have been consulted regarding
federal funding of psi research (McRae, 1984). Gardner and Christopher are both
very highly regarded authorities on magic. The field of parapsychology has no
one comparable. My own experience indicates that very few parapsychologists are
familiar with (let alone read) even the most basic works on the topic. The
parapsychology students at John F. Kennedy University are a notable exception (a
class, Creation of Illusions, is required of them).
A knowledge of legerdemain is also helpful when consulting magicians. Conjurors’
norms and values are considerably different from those of scientists. For
instance, publicly revealing the modus operandi of a trick is considered by most
magicians to be a serious breach of ethics. Randi has been bitterly attacked by
magicians for his exposés of Uri Geller. Thus, in some quarters, exposing such
fraud is considered unethical! On the other hand, other magicians have strongly
criticized mentalists for posing as true psychics (e.g., David Hoy has been
posthumously attacked for this (Price, 1983)). For in-depth discussions of some
of these issues and the sociology of magic see Collins (1983), Gloye (1964) and
Stebbins (1982).
A confusing, but fascinating aspect of this area is the actual belief in psi by
conjurors. Gardner (1983) claims that magicians are the enemy of psychical
researchers, yet Truzzi (1983) noted two polls of magicians in which over 70
percent believed in some form of psi. Many magicians seem to have misperceptions
about other magicians’ actual beliefs. For instance, I know two professional
conjurors who claim to have been friends of David Hoy; one said that Hoy did not
really believe himself to have psychic abilities, the other emphatically stated
that Hoy certainly believed himself psychically gifted. Today there are a number
of practitioners who believe that part of what they do is “real” (Ruthchild,
1983).
A full discussion of the value and limitations of magic and magicians in
parapsychology is far beyond the scope of this paper. A number of papers could
(and should) be written addressing this topic. Certainly having a magician
present during an experiment is no guarantee against fraud. Indeed, there are
instances in which magicians are of no use and could even be detrimental.
Collins (1983) very effectively delineates some of the problems involved in
working with conjurors.
Nevertheless, magicians do have a place in psi research. The first step in
attaining the benefits of magicians’ knowledge is for parapsychologists to
become a bit more educated on the topic. In this paper I want to discuss some
factors in learning about magic and give a few tips on educating oneself. Here I
can provide only the briefest of introductions. To become really proficient at
magic takes years, but a greater appreciation can be had with less effort.
Hopefully, when more parapsychologists know more about this area, a more
fruitful discussion with magicians may be possible. Thus, my goal here is to
give the reader a bit of familiarity with magic so that he can start educating
himself. I do want to strongly emphasize that a little learning can be a
dangerous thing. It is very easy to fool oneself into thinking that one can
detect trickery after reading a book or two. This is definitely not the case.
Types of Magic
There are many different specialties within magic; only some of these are of
importance to parapsychologists. The various types of magic are pertinent to
different areas within psychical research and the level of useful knowledge of
magic will depend upon the area being investigated. For instance, a person
involved in studying reincarnation or near-death experiences will have little or
no need of magic or magicians. On the other hand, a person investigating a
physical medium under field conditions will have a great deal of need for
knowledge of conjuring. Similarly, most magicians specialize in only a few areas
and the vast majority would not be worth consulting by parapsychologists.
Some of the types of conjuring include escapology, illusion, kid magic, bizarre
magic, spirit effects, rope tricks, close-up, card magic, stage magic and
mentalism. This list is by no means inclusive and in practice the various types
blend and overlap. Of these, mentalism and close-up have the greatest
implications for the psychical researcher generally, though spirit tricks and
bizarre magic are relevant for certain types of investigation.
Mentalism is the branch of conjuring specializing in simulation of psychic
events. Examples of these include headline prediction, cold reading, mind
reading, billet reading, blindfold sight and Hellstromism or so-called muscle
reading. Close-up magic, as its name implies, allows spectators to observe at
very short range. In fact, many of these tricks can be performed within one foot
of the spectators. These feats typically involve the appearance and vanishing of
small objects such as coins or cards, increasing the size of coins and other
effects virtually identical to those reported around Sai Baba by Haraldsson and
Osis (1977). Some of these require skillful sleight of hand, but many take only
a few minutes to learn.
Bizarre magic involves such things as decapitation, cremation and demonstration
of occult powers. It is usually performed for a small intimate group rather than
on stage. The following description of one effect will give the flavor of this
rather obscure area: “After a talk about Charon, the legendary boatman on the
river Styx who escorted the souls of the damned across to Hell, you begin. You
introduce a packet of five yellowed Tarot cards, and a parchment covered with
mysterious symbols. One of the cards is chosen and signed. It is placed with its
mates into the vessel of earth, each card being covered by a small mound of
dirt. Suddenly the lights dim and unearthly sounds echo through the room. One of
the mounds of dirt catches fire—a gleaming white spark. There’s an incredible
flash of red light and one of the spectators shrieks and jumps to his feet
clutching his arm in pain. The mage ends the ceremony quickly, turning up the
lights. The spectator is examined and is seen to be branded with the mark of
Charon in the form of a blood red welt. Examining the vessel, the smouldering
mound of earth is found to contain the chosen Tarot card. Neat, huh?” (Epoptica,
1983, 206-207). Needless to say, this is not typical parlor magic.
Parapsychologists or anthropologists studying psi in other cultures should at
least be aware of this branch. Such information is available not only in the
developed countries; in fact, African witch doctors have been known to buy
materials from a British magical supply house (Booth, 1982).
There is another small specialty area that might be termed “opportunistic
psychic magic.” This does not involve performing a routine act, but rather
taking advantage of opportunities as they arise. This brand of trickery can be
especially powerful and has definite implications for parapsychology. Two works
which describe this in detail are ostensibly by Uriah Fuller (1975, 1980). This
can be especially insidious when the observer or experimenter thinks that he is
in charge of things. Parapsychologists (e.g., Cox, 1974) have claimed that one
can easily say that a phenomenon is genuine if the observer-experimenter has
control of the procedure. Unfortunately it is not always easy to say what
constitutes “control.” Randi (1981) has clearly demonstrated that, in some
cases, an experimenter thought he had been in control, but in reality it was the
subject who was calling the shots.
Observing Magic
Probably the best way to learn about being deceived is to actually experience
it. Watching a skilled close-up artist can be mind-blowing. I would urge all
researchers to spend some time doing this. There needs to be some warning,
however; the vast majority of magicians present very bad magic. Most perform the
same old tricks in a very boring manner. Most lack showmanship skills and have
not developed appropriate timing and misdirection. Nevertheless, there are a
number of accomplished ones worth watching.
One of the most direct ways to learn magic is to become involved in the local
magic scene. Most moderate size cities have groups of magicians affiliated with
one of the two major national associations (International Brotherhood of
Magicians (IBM) and the Society of American Magicians (SAM)). These groups
usually meet monthly and can be found by contacting the local magic shop. They
are largely composed of amateurs and, as such, most magic performed at the
meetings is low quality. However, these groups often will have a few highly
skilled performers who are well worth watching. In addition, they sometimes
bring in professionals for lectures.
When a person starts to learn about magic, he will find that there are two parts
to every trick. The first and most important is the “effect” (what the spectator
sees). The other is the method. The student who learns only the method without
first seeing the effect will almost certainly be disappointed, because most
tricks are accomplished by surprisingly simple means. In actuality, very, very
few tricks require sophisticated technology or exotic chemicals. The real skill
comes in the presentation: subtle misdirection, control of the audience and the
like. Full appreciation of this cannot be achieved by mere book learning.
Conjuring can have an especially strong impact when the tricks are presented as
though they are or may be real. Reynolds (Benassi, Singer and Reynolds, 1980)
posed as a psychic metal bender for several university psychology classes. His
presentation was so effective that a number of the students thought that he was
in league with the devil! Nor are such results limited to naive college
students. Martin Johnson (1975-1977) had a magician perform during the 1976 PA
convention and a number of parapsychologists suggested that he was actually
using psychic means to perform his tricks!
Literature
The published material on conjuring is quite immense. In fact, there is a
periodical, Epoptica, which is devoted to reviews of current literature and
apparatus; recently it has averaged over 50 pages. In spite of this, the
material is hidden to the casual observer. Even a university library with
several million volumes will usually have only 10 to 20 books on magic. Magic
shops have a generally restricted selection. Nearly all the best, important
books are available only through mail order outlets or from dealers at
conventions. Rarely are they listed in Books in Print. In addition to most magic
publications being quite obscure, they are usually expensive. A small stapled
booklet of thirty pages can easily cost $10. The prices are intentionally
inflated to keep the knowledge to a restricted few. In fact, books and tricks
are frequently criticized in reviews for having too low a price!
I would suggest that all parapsychologists obtain two or three of the large
catalogs of books and apparatus. I have shown several of these to a number of
people who have been quite amazed at the extent of the items available. These
catalogs can serve as a healthy antidote to the notion that tricks are for kids.
Abbott’s and Tannen’s catalogs are only $5 and each contains over 400 pages.
Micky Hades publishes a number of excellent works on mentalism. During a quick
glance through their catalog on mentalism, I found 20 tricks with ESP cards and
J. B. Rhine’s name was mentioned seven times. There are other catalogs which
specialize in electronic devices for fake ESP demonstrations.
The number of books is also quite astonishing. The beginning student should be
aware that there are basically two different categories of books, those for the
beginners or lay public and those for the serious conjuror. Although there are
thousands of different tricks, there is usually considerable overlap in what is
presented in most books (this is especially true for beginners’ books). The
elementary works can be purchased in most well-stocked bookstores; however,
these are unlikely to describe how tricks of top performers of today are
actually done.
The two most important books on mentalism are Practical Mental Magic by Theodore
Annemann and Thirteen Steps to Mentalism by Corinda (the most important). Also
good is The Handbook of Mental Magic by Marvin Kaye. These volumes do provide a
good overview, but one should realize that even $1000 would put only a moderate
dent in the mentalist material available. Attached is a very brief list of
sources of information. Several others have also recently prepared suggested
reading lists (e.g., Auerbach, 1983; Hoener, 1983; Webb, 1983).
In this “hidden” literature there is a considerable amount on the psychology of
deception (especially notable are Bruno, 1978; Fitzkee, 1981; Maskelyne and
Devant, 1946 and Randal, 1982). These works are impressive from a theoretical
standpoint and can give an understanding of the topic. However, a theoretical
knowledge is of little use in recognizing trickery unless one has considerable
experience in direct observation and performance.
While on the topic of books, there is another source outside that of the magic
fraternity, that is Dover Publications. Much to the dismay of many magicians,
Dover has reprinted a number of classic books on conjuring. These are best buys
and easily obtained.
Conclusion
This has been an all too brief outline of a vast area. I do hope that more
researchers will take the time to learn a bit about conjuring. Most
parapsychologists cannot become experts, for that takes years. But with greater
knowledge, investigators would be in a better position to consult magicians. I
hope this paper goes at least a little way toward that end.
A BRIEF ANNOTATED LIST OF SOURCES OF INFORMATION
CATALOGS
Tannen’s (Louis Tannen, Inc., 6 W. 32nd St., 4th Floor, New York, NY 10001). An 800 page catalog of apparatus and books.
Abbott’s (Abbott’s Magic Mfg. Co., Colon, MI 49040). A 450 page catalog of books and apparatus.
Mickey Hades (Mickey Hades International, 110 Union St., No. 500, Box 2242, Seattle, WA 98111). Hades has several catalogs of books and supplies. An especially large selection on mentalism.
Magic Inc. (5082 N. Lincoln Ave., Chicago, IL 60625). Several catalogs available. Good selection on mentalism.
MAGAZINES
Magick (1065 La Mirada St., Laguna Beach, CA 92651). Fortnightly, usually 4 to 6 pages per issue. Devoted exclusively to mentalism.
The New Invocation (Illusions, Ltd., P.O. Box 2530, Chicago, IL 60690). Six times a year, usually 12 pages per issue. Devoted to weird and bizarre magic.
Genii (P.O. Box 36068, Los Angeles, CA 90036). Monthly, approx. 60 pages per issue. This slick magazine covers all areas of magic, contains a number of columns, announcements, etc.
The Linking Ring. This is the monthly magazine of the IBM and is available to members only. It has approx. 150 pages per issue.
M-U-M. This is the monthly publication of the SAM and is available to members only. It has approx. 50 pages per issue.
BIBLIOGRAPHY
Annemann, T. Practical Mental Magic. New York: Dover, 1983. (Originally published 1944).
Auerbach, L. M. “Mentalism for parapsychologists.” ASPR Newsletter, 1983, 9, 1, 4.
Beloff, J. “Research strategies for dealing with unstable phenomena.” Parapsychology Review, Jan.-Feb., 1984, 1-7.
Benassi, V. A., Singer, B. and Reynolds, C. B. “Occult belief: Seeing is believing.” Journal for the Scientific Study of Religion, 1980, 19, 337-349.
Bender, H., Vandrey, R. and Wendlandt, S. “The ‘Geller effect’ in Western Germany and Switzerland: A preliminary report on a social and experimental study.” In J. D. Morris, W. G. Roll and R. L. Morris (Eds.) Research in Parapsychology 1975, Metuchen, N.J.: Scarecrow, 1976.
Bersani, F. and Martelli, A. “Observations on selected Italian mini-Gellers.” Psychoenergetics: The Journal of Psychophysical Systems. 1983, 5, 99-128.
Booth, J. “Memoirs of a magician’s ghost.” Linking Ring, September, 1982, 54-57, 108.
Bruno, J. Anatomy of Misdirection. Baltimore: Stoney Brook Press, 1978.
Carrington, H. The American Seances with Eusapia Palladino. New York: Garrett Publications, 1954.
Collins, H. “Magicians in the laboratory: A new role to play.” New Scientist, June 30, 1983, 929-931.
Corinda, T. Thirteen Steps to Mentalism. New York: Tannen Magic, 1968.
Cox, W. E. “Parapsychology and magicians.” Parapsychology Review, May-June, 1974, 12-14.
Eisenbud, J. The World of Ted Serios. New York: William Morrow, 1967.
Epoptica. Review of The Book of Shadows by Randy L. Clower. 1983, No. 4, 206-207.
Fitzkee, D. Magic by Misdirection. Oakland, California: Magic Limited/Lloyd E. Jones, 1981.
Fuller, U. Confessions of a Psychic. Teaneck, NJ: Karl Fulves, 1975.
Fuller, U. Further Confessions of a Psychic. Teaneck, NJ: Karl Fulves, 1980.
Gardner, M. “Lessons of a landmark PK hoax.” Skeptical Inquirer, Summer, 1983, 16-19.
Gloye, E. E. Institutionalized Secrecy and Mass Communications; An Analysis of Popular Literature on Conjuring. Unpublished manuscript, Whittier College, 1964.
Haraldsson, E. and Osis, K. “The appearance and disappearance of objects in the presence of Sri Sathya Sai Baba.” Journal of the American Society for Psychical Research, 1977, 71, 33-43.
Hasted, J. The Metal Benders. London: Routledge and Kegan Paul, 1981.
Hoener, G. J. “Recommended reading for the psychic entertainer.” Magick, March 11, 1983, 1585-1586.
Johnson, M. “Some reflections after the P. A. convention.” European Journal of Parapsychology, 1975-1977, 1, Part 3, 1-5.
Kaye, M. The Handbook of Mental Magic. New York: Stein and Day, 1975.
Marks, D. and Kammann, R. The Psychology of the Psychic. Buffalo, New York: Prometheus, 1980.
Maskelyne, N. and Devant, D. Our Magic. Berkeley Heights, New Jersey: Fleming Book Co., 1946.
McRae, R. Mind Wars. New York: St. Martin’s, 1984.
Nelms, H. Magic and Showmanship. New York: Dover, 1969.
PA Statement on Magicians. Parapsychology Review, March-April, 1984, 16.
Price, D. “Has magicdom reached maturity?” Genii, May, 1983, 337-338.
Randal, J. The Psychology of Deception (Why Magic Works). Venice, California: Top Secret Publications, 1982.
Randall, J. L. Psychokinesis: A Study of Paranormal Forces Through the Ages. London: Souvenir Press, 1982.
Randi, J. “More card tricks from Susie Cottrell.” Skeptical Inquirer, Spring, 1981, 70-71.
Roll, W. G. and Pratt, J. G. “The Miami disturbances.” Journal of the American Society for Psychical Research, 1971, 65, 409-454.
Ruthchild, M. “Cold Reading: The reality and the illusion.” The New Invocation, October, 1983, 195.
Shafer, M. and Phillips, P. R. “Some investigations of claims of PK effects on metal and film by Masuaki Kiyota; II. The St. Louis experiments.” Journal of the American Society for Psychical Research. 1982, 76, 233-236.
Stebbins, R. A. “Making magic: Production of a variety act.” Journal of Popular Culture, 1982, 16, 116-126.
Truzzi, M. Reflections on Conjuring and Psychical Research. Unpublished manuscript, 1983.
Webb, W. “Starting in mentalism—an unorthodox view.” Linking Ring, October, 1983, 65-66.
Parapsychology Review, March-April, 1985
[9]
What Is Your Counter-Explanation?
A Plea to Skeptics to Think Again
JOHN BELOFF
I say it is a scandal that the dispute as to the reality of these phenomena should
still be going on—that so many competent witnesses should have declared their
belief in them, that so many others should be profoundly interested in having
the question determined, and yet the educated world as a body should still be
simply in the attitude of incredulity.
Henry Sidgwick (Presidential Address to the Society
for Psychical Research, June 17, 1882)
Belief is not a matter of choice. In the end, either one is convinced by the
evidence or the arguments or one is not. Long before that stage is reached, however,
there is ample room for dialogue. Perhaps the evidence on which we based our
belief was inadequate. Perhaps the arguments on which we relied were unsound.
At all events, I think it is important that the dialogue between parapsychologists
and their critics should continue. There will always be those who will say that
controversy is a waste of time, that the two sides will never see eye to eye, that
time spent in this way would be better spent on research. But, while I have some
sympathy with this view, I regard it as short-sighted. So long as incredulity
remains the typical response of the scientific community to parapsychological
claims, parapsychology will be accorded a low level of priority in the competition
for funding and resources, and this, in turn, will retard its progress, thereby
reinforcing the initial incredulity. Is there any way out of this impasse?
This chapter is dedicated to the memory of Piet Hein Hoebens (1951-1984).
There are, I suggest, two assumptions skeptics habitually make that should
not go unchallenged. The first is that only the strict experimental evidence needs
to be taken seriously, that any other kind of evidence can be dismissed as
anecdotal and unscientific. The second assumption is that, given the antecedent
improbability of the phenomena, nothing short of their becoming commonplace,
which means, in effect, finding a way of producing them on demand,
could ever justify our accepting them at face value. The two assumptions go
hand in hand. Thus, if there were, at the present time, some unequivocally
repeatable psi effect, there would be every reason to emphasize the experimental
evidence because it alone would allow us to satisfy ourselves as to its validity
without having to take anyone else’s word for it. But, of course, if such were the
case, parapsychology would no longer be the controversial science that it now
is. The present controversy takes as its point of departure the fact that the
disputed phenomena are too unstable and elusive to permit such an outcome.
We cannot even say at the present time whether such an outcome is even
theoretically possible. In these circumstances the experimental evidence takes on
a very different complexion. For an unrepeatable experiment is just another
unique historical event; it no longer represents a recipe for obtaining similar
results as experiments ordinarily purport to do.
Hence, to rely exclusively on the experimental evidence to settle the question
of the basic existence of psi is to betray a profound misunderstanding of the role
of experimentation in science. Scientists do not carry out experiments with the
aim, primarily, of making converts, though every successful experiment
strengthens the credibility of the phenomenon under investigation. They carry out
experiments in order to test hypotheses and thereby advance our understanding of the
phenomena. And, if a truly repeatable experiment should prove possible, it is
precisely by increasing our knowledge of the phenomena that we shall arrive at
it. That is why most parapsychological researchers at the present time are
avowedly “process-oriented” rather than “proof-oriented” in their work, even
when their original belief in the reality of psi may have sprung from some
personal experience and not from their work in the laboratory. But, however
important such process-oriented research may be in the long run, I do not think
we have yet reached the stage where it can be made to bear the weight of the
controversy directed at the proof issue.
Meanwhile, there is a danger that exclusive preoccupation with the
experimental evidence may lead us to overlook the fact that what we may be getting in
the laboratory is no more than a weak, fitful, or degenerate manifestation of psi.
After all, no one would ever have had recourse to the laboratory in the first
instance had there not been a strong presumption that psi occurs in the outside
world and has done so in every age and in every society of which we have
knowledge. Obviously the laboratory affords a degree of security, rigor, and
control in a way that is hardly possible in the field, but such advantages must be
set against the disadvantages of dealing with psi effects whose very presence can
only be detected by means of statistical analysis. Should the skeptic’s attention,
then, be redirected to the spontaneous evidence?
The snag is that it is then that the second assumption comes into its own.
The argument is essentially that which David Hume first propounded in his
famous essay on miracles.1 A miracle, he pointed out, is, by definition, a
singular exception to some law of nature. But, since the laws of nature are daily
confirmed in our experience, no mere human testimony, however imposing,
could ever suffice to outweigh the reason we have for doubting it. The fatal
weakness of this argument, as I see it, is that it is bound to fail when put to the
test. Thus we would have no difficulty envisaging a hypothetical situation where
we would be left in no doubt whatsoever that a paranormal event had occurred.
To take a concrete, if jocular, example, we could imagine a press conference in
the White House at which the President of the United States, in full view of
audience, security officers, and television cameras, were suddenly to vanish and,
as suddenly, reappear elsewhere. Whether any actual event similarly rules out
the element of doubt is, of course, quite another matter, although Hume, to his
credit, does try to make his argument proof against such a contingency. Thus he
goes out of his way to remind the reader that, only shortly before the time he
was writing, a whole series of miraculous cures had been reported from the
cemetery of St. Medard in Paris. (For an account of these events, which puzzled
not only Hume but even Voltaire, see Dingwall 1947, Chap. 4.) He then
concedes that, other things being equal, the evidence for their authenticity must be
acknowledged as overwhelming. But, precisely because they are miraculous, a
rational person has no option but to reject them as spurious. But, in saying this,
Hume is surely mistaken. A point may be reached, as our hypothetical example
showed, where no one, rational or otherwise, had any option but to believe.
What, in fact, Hume has done, all unwittingly, is to furnish a reductio ad
absurdum of his own argument. If the evidence he cites for the events in question
were indeed so overwhelming, we would have to accept it, miracles or no
miracles. Actually, few skeptics are content these days to base their case on the a
priori impossibility of psi phenomena. We have witnessed so many upheavals in
science that we are much less prone to suppose than were Hume’s
contemporaries that either science or common sense can tell us in advance what can or
cannot be the case. Perhaps we are more ready these days to agree with St.
Augustine when he pointed out that a miracle was not so much contrary to
nature as contrary to what we know of nature.2
Let us assume, anyhow, that we are not all Humean skeptics and that we
are willing to consider the possible existence of phenomena that do not meet the
strongest criterion of validation, production on demand. How then are we to set
about deciding rationally between belief and doubt? The main aim of this paper
is to suggest an appropriate strategy. It is the following. Whenever one is
confronted with a claim, from whatever source, that has certain paranormal
implications, one should ask oneself what normal explanation there could be
that would obviate the necessity of invoking anything of a paranormal nature.
The question, then, is whether this alternative explanation is more or less
plausible than the original paranormal claim. This strategy will not solve the
controversy, if only because what to one person may seem a perfectly reasonable
scenario may strike another as wildly implausible and far-fetched. Nevertheless,
such a strategy can, I maintain, serve to sharpen a controversy that is always
liable to get out of focus.
Conclusions
Are we then justified in dismissing the career of “this extraordinary woman,” as
Feilding called her, with the one simple word “trickery”? Trickery is, of course,
another of those convenient open-ended and slippery concepts that, no less than
the concept of the paranormal itself, can be invoked to explain anything
whatsoever. All the same, it is not unreasonable to ask what conjurer would
agree to perform under the conditions to which Eusapia regularly submitted?
These included (1) performing in a private room that was first searched and then
locked and sealed against intruders who might act as accomplices; (2)
undergoing a thorough body search before the sitting began;16 (3) allowing one’s arms
and legs to be held by sitters whose one duty was never to let go; (4) producing
sometimes phenomena in light sufficient, it was said, to read the small print in
one’s Baedeker. At all events, one leading American magician who observed her
during her American tour was so impressed by her performances that he offered
to donate $1,000 to charity if any of his fellow magicians could do as much
under comparable conditions. There were no takers. (See Rogo 1975, 27.)
Apart from sleight-of-hand trickery, the only other counter-explanation that
was occasionally invoked in connection with Palladino was hallucination,
especially with reference to her materializations and pseudopods.16 It is a
somewhat desperate explanation, especially as there was always more than one sitter;
but, in the nature of the case, it is very difficult to deny. However, skeptics can
always rely on one supreme ally: the poverty of the human imagination. Some
things go so far beyond our familiar experience and are so inherently hard to
credit that even to contemplate them imposes a severe strain on our intellectual
equanimity. It is not, therefore, surprising that people are content to clutch at
any straw as an excuse for not having to take these things seriously and are
pathetically grateful to critics like Hansel whom they can cite in self-justification.
To such people Eusapia’s mismanaged American tour came as a godsend.
But, for those who are willing to make that leap of the imagination that is
required if we are to put ourselves in the shoes of those who came face to face
with her phenomena, who, so to speak, had their noses rubbed in them, there
are a number of useful lessons we can learn from her career. In the first place, it
reminds us that fraud can go hand in hand with genuine psychic ability, so that
it is always risky to generalize from the discovery that cheating has occurred.
There may be all kinds of psychological reasons why certain persons in certain
situations indulge in trickery. We can also learn from her case that the more
fantastic phenomena are not necessarily any less real than those of lesser
magnitude. In particular, we can no longer justify dismissing materialization as
too preposterous to warrant serious consideration.
This last point, in turn, reopens a host of other controversial historical cases
that now demand to be looked at with a fresh eye and the received opinion if
necessary challenged. What about the young Florence Cook, for example? Must
we ignore her because in middle age she took to cheating? Must we continue to
defame the memory of one of Britain’s most illustrious (not to say bravest)
scientists by insisting that he was either the dupe of this 16-year-old girl or else
her lover, who for the sake of sexual favors betrayed his fellow scientists and the
cause of truth?17 And what of those cases that flourished in the decade after
Palladino? What about poor “Eva C,” with her ectoplasmic faces that looked to
all the world so suspiciously two-dimensional? The SPR investigators could
prove nothing against her, though they were reluctant to authenticate her phenomena;
but Gustave Geley, of the Institut Métapsychique in Paris, insists that
he watched the faces taking shape and was able to produce a certain amount of
photographic evidence to support his contention.18 And must we still ignore the
mountain of evidence that exists on the Margery mediumship because of one
compromising incident late in her career? After all, did not the great Houdini
stake his reputation on exposing her and fail ignominiously?19 To take one more
example, the most fantastic of them all, what are we to make of the Brazilian
medium Carlos Mirabelli? If the so-called Santos Report were to be credited, he
would rank as by far the most powerful medium on record, surpassing D. D.
Home.20 Deceased individuals known to the sitters are said to have materialized
in broad daylight while the medium was seated in full view strapped to a chair.
These same entities conversed with the sitters, submitted to a medical examination
on the basis of which they were pronounced anatomically perfect, allowed
themselves to be photographed and later dissolved into nothingness while the sitters
looked on incredulously. The witnesses, moreover, were mainly persons of
professional standing, many of them physicians. The only objection one can raise
in Mirabelli’s case is that he was never tested outside Brazil. Plans to bring him
to Europe fell through. It is, however, embarrassing to have to base one’s
counter-explanation on the assumption that what happens in Brazil need not be
taken seriously and that Brazilians, even medically qualified ones, cannot be
trusted to tell the truth. The point is, however, that, even in such an extreme case
as this, where we have every reason to query the evidence and to express suspicion,
we cannot escape the responsibility of putting forward some counter-explanation.
When cornered, there is always one last trump-card that the skeptic can
brandish in the face of the believer: Where are all these marvels now? The harsh
fact of the matter is that there are no more Palladini. For all I know to the contrary,
the phenomenon of materialization may be extinct and may never recur.
Except in connection with poltergeist cases, we cannot even be sure that there
are any more strong phenomena. Hence, since historical cases can never compete
in credibility with cases that are still open to further investigation, the skeptic
cannot be faulted if, like Podmore, he prefers to suspend judgment pending the
advent of more compelling evidence, always provided he does not invoke spurious
reasons for rejecting what evidence there is. But, equally, if the negative
option is still valid, so is the positive option. It is no less rational, and it is
certainly more adventurous, to adopt an attitude of basic belief as one’s working
hypothesis. If we remain entrenched in a rigidly conservative stance, we are apt
to neglect phenomena that may be both real and important. Moreover, the basic
believer is spared those intellectual contortions to which a skeptic is now driven
when confronted with evidence for which there is no plausible counter-explanation.
Lastly, if in the future new cases of a spectacular nature should arise, the
basic believer will be in a better position and better prepared to deal with them.
Acknowledgment
I am specially grateful to the late Piet Hein Hoebens for criticizing an earlier draft of this
paper, as a result of which the paper has been completely rewritten.
Notes
1. David Hume, An Enquiry Concerning Human Understanding, 1748, Sect. 10,
Of Miracles.
2. St. Augustine, The City of God, 21:8.
3. Prompted by Scott, Piet Hein Hoebens of Amsterdam was able to obtain from
Groningen an earlier version of the report, where, sure enough, this ambiguity was more
apparent.
4. The most comprehensive account known to me is Carrington (1909), but see also
entry under “Paladino” in Fodor (1966 [1934]) and Dingwall (1950, Chap. 5). Dingwall’s
essay is still the best general introduction that I know and has an invaluable bibliography.
5. Most recently by Ruth Brandon, who in her caustic book (Brandon 1983, 135)
speaks of “his will to believe and his disinclination to accept any unpalatable contrary
indications.”
6. See Richet (1922, 35). His long treatise is dedicated to William Crookes and
Frederic Myers. An account of Richet’s involvement in psychical research is given in
Fodor (1966 [1934]) under “Richet.”
7. Much of this is taken from Carrington (1909, Part 3).
8. Lodge (1894) writes: “Any person without invincible prejudice who had the
same experience would have come to the same conclusion, viz: that things hitherto held
impossible do occur.” Lodge never saw any reason to retract this conclusion and, in his
autobiography (Lodge 1931), he amplifies his account of his experiences on the Île
Roubaud.
9. In a footnote to her review of Morselli’s Psicologia e Spiritismo in the Proceedings
of the SPR, 21 (1909): 522. She was even then not yet satisfied as to the authenticity
of the phenomena.
10. This is certainly the view taken by Cassirer (1983a).
11. It is curious how often this point has been misunderstood. Thus Ruth Brandon
(1983) writes: “In all other scientific fields to be caught out just once in fraud is to be
instantly discredited.” But this confuses the experimenter and the subject. An experimenter
who cheats is instantly discredited in parapsychology as in all other sciences and
all his results discounted as suspect. But, if a subject cheats, this shows only that the
experimenter has been careless and should try harder.
12. See Flournoy (1911, Chap. 7). He mentions there that “Myers was this time—as
were all the others—absolutely convinced of the reality of the phenomena” (p. 246).
13. A lengthy review of this report by Count Perovsky-Petrovo-Solovovo appeared
as a supplement to the issue of SPR Proceedings that contains the Feilding Report
(Feilding et al. 1909). Flournoy is rather more brusque with the Courtier Report, which
he accused of prevarication. Since they appeared incapable of saying either oui or non
when it came to the authenticity of the phenomena, Flournoy (1911, 272) suggested that
they ought to reply in chorus: Nouin!
14. This view was first, I believe, put about by the celebrated illusionist, J. N.
Maskelyne. He and his son had been invited to participate in the Cambridge sittings of
1895 and he soon formed a very poor opinion of Eusapia and perhaps an even poorer
one of her investigators. “No class of men can be so readily deceived by trickery as
scientists” he asserted. “Try as they may they cannot bring their minds down to the level
of the subject and are as much at fault as if it were immeasurably above them” (cited in
Brandon 1983, 138). Feilding, on the other hand, insisted that scientists, as a class, are far
more reluctant than conjurers to acknowledge any sort of supernormal force (see Feilding
et al. 1909, Final Note). In fact, Maskelyne, himself, once admitted to a journalist
who interviewed him that, as a result of personal experiences with friends, he had to
admit that there was something we could not explain about “table-turning” phenomena,
though he felt sure it was not the action of spirits (Brandon 1983, 166).
15. Hansel (1980, 61) suggests that Eusapia might have taken advantage of her
female prerogative to refuse such a search. However, the task was usually given to
female sitters, such as the two American ladies who were invited to attend the eighth
séance of the Naples series, but Carrington mentions specifically that at the conclusion
of the very successful sixth séance, Eusapia made no objections to letting the three men
search her (Feilding 1950).
16. Alice Johnson, at one time honorary secretary of the SPR, would occasionally
advance this hypothesis when all else failed; but for a recent discussion of the “pseudopods”
phenomenon, see Cassirer (1983b).
17. This was the thesis of Trevor Hall’s (1962) book and is the one favored by Ruth
Brandon (1983, Chap. 4). Those who are not persuaded that Sir William Crookes, O.M.,
was either an imbecile or an unprincipled blackguard may wish to consult the analysis
of the “Katie King” séances provided by Dr. G. Zorab (1980), a Dutch scholar, although
his book has so far appeared only in Italian.
18. A re-evaluation of the case of Eva C. (Marthe Béraud) is given by Brian Inglis
(1984, Chap. 4).
19. Margery (Mrs. Mina Crandon) was one of the most controversial mediums of
the twentieth century. Inglis (1984, 167) reproduces a photograph of her encased in
Houdini’s fraud-proof box (“like an old-fashioned steam-bath with a hole for her neck
and two for her arms”). Houdini is shown holding one of her hands. The séance had no
sooner started when the lid burst open and the phenomena continued! Mrs. Marian
Nester, of the American SPR, a daughter of Dr. Mark Richardson, Margery’s chief
investigator, is preparing a new book about this medium that should provide new
grounds for reconsidering her case.
20. See Dingwall (1930). I am indebted to Brian Inglis for drawing attention to this
document in his discussion of Mirabelli’s mediumship. See Inglis (1984, 221-227).
Further information on Mirabelli is provided in Playfair (1975, Chap. 3).
References
Baggally, W. W. 1910. Discussion of the Naples Report on Eusapia Palladino. Journal
of the SPR, 14:213-228.
Beloff, John. 1980. Seven evidential experiments. Zetetic Scholar, 6:91-94, 116-120.
Brandon, Ruth. 1983. The Spiritualists: The Passion for the Occult in the 19th and 20th
Centuries. Buffalo, N.Y.: Prometheus Books.
Carrington, Hereward. 1909. Eusapia Palladino and her Phenomena. London: Werner
Laurie.
--------. 1913. Personal Experiences in Spiritualism. London: Werner Laurie.
--------. 1954. The American Séances with Eusapia Palladino. New York: Garrett.
Cassirer, Manfred. 1983a. Palladino at Cambridge. Journal of the SPR, 52:52-58.
--------. 1983b. The fluid hands of Eusapia Palladino. Journal of the SPR, 52:105-112.
Dingwall, E. J. 1930. An amazing case: The mediumship of Carlos Mirabelli. Journal of
the American SPR, 24:296-306.
--------. 1947. Some Human Oddities. London: Home and Van Thal.
--------. 1950. Very Peculiar People. London: Rider.
Feilding, E., W. W. Baggally, and H. Carrington. 1909. Report on a series of sittings
with Eusapia Palladino, Proceedings of the SPR 23:306-569 (reprinted in E. Feilding,
Sittings with Eusapia Palladino and Other Studies [with an introduction by
E. J. Dingwall], University Books, New Hyde Park, N.Y., 1963.)
Flournoy, Theodore. 1911. Mélanges de Métapsychique et de psychologie. Geneva and
Paris.
Fodor, Nandor. 1966 [1934]. An Encyclopaedia of Psychic Science. New Hyde Park,
N.Y.: University Books.
Hall, Trevor H. 1963. The Spiritualists: The Story of Florence Cook and William
Crookes. New York: Garrett (reprinted as The Medium and the Scientist: The
Story of Florence Cook and William Crookes, Prometheus, Buffalo, N.Y., 1985).
Hansel, C. E. M. 1980. ESP and Parapsychology. Buffalo, N.Y.: Prometheus Books.
Inglis, Brian. 1984. Science and Parascience: A History of the Paranormal, 1914-1939.
London: Hodder and Stoughton.
Lodge, Oliver. 1894. Experience of unusual physical phenomena occurring in the presence
of an entranced person (Eusapia Palladino). Journal of the SPR, 6:306-360.
--------. 1931. Past Years. London: Hodder and Stoughton.
Playfair, Guy. 1975. The Flying Cow. London: Souvenir Press.
Podmore, Frank. 1909. The report on Eusapia Palladino. Journal of the SPR, 14:172-176.
--------. 1910. The Newer Spiritualism. London: Fisher and Unwin.
Richet, Charles. 1922. Traité de Métapsychique. Paris: Alcan (English translation:
Thirty Years of Psychical Research, London, 1923).
Rogo, D. Scott. 1975. Eusapia Palladino and the structure of scientific controversy.
Parapsychology Review, 62:23-27.
Scott, Christopher. 1980. Comment on Beloff’s “Seven Evidential Experiments.” Zetetic
Scholar, 6:110-112.
Zorab, George. 1980. Katie King Donna o Fantasma? Milano: Armenia Editore.
[10]
The Appearance and Disappearance of Objects in
the Presence of Sri Sathya Sai Baba
Erlendur Haraldsson and Karlis Osis 1,2
ABSTRACT: During three field trips to India to study claims suggestive of psi
phenomena the investigators were able to observe at close range some unexplained
occurrences which took place in the presence of Sri Sathya Sai Baba. Although no
conclusions can be reached on the phenomena observed and described in this account
because they occurred under informal conditions, it seemed worth while to report the
events because of the challenge they offer to carry out further studies of this
well-known Indian religious leader under well-controlled experimental conditions.
Introduction
Ostensibly paranormal appearances and disappearances of objects
have been reported in various cultures. The phenomenon consists of
an object appearing or disappearing in circumstances where no
physical cause of the event can be detected. In cases where paranormal
creation of the object is assumed, the process is usually
referred to as “materialization.” When an already existing object is
“brought” by paranormal means from one place to another without
visible means of travel, the phenomenon is called “teleportation”
and the object is referred to as an “apport.” Teleportation is said to
occur in poltergeist cases (Bender, 1969; Owen, 1964; Roll, 1974).
Materializations of human forms have been reported in the presence
of mediums (Carrington, 1954; Hannesson, 1924; Richet, 1923;
Schrenck-Notzing, 1920). Indian popular literature describes appearances
of inanimate objects, usually amulets made of precious
materials and said to have magical properties such as providing
protective contact with a guru (Yogananda, 1969).
The appearance and disappearance of objects is of course one of
the favorite illusions created by stage magicians. With the help of
astonishing dexterity, diversion of attention, and some gadgetry,
objects have “appeared” and “disappeared” on the magic show
stage without any detection of the tricks of the trade by the audience.
Enterprising showmen throughout recorded history have produced
“spirits” and “demons” in religious settings in their claim to
demonstrate “supernatural” phenomena.
1 We wish to express our gratitude to Sri Sathya Sai Baba for his kind cooperation in
this investigation.
2 This research was financed through the A.S.P.R.’s James Kidd inheritance fund
and by an anonymous donor in Iceland to whom we are grateful.
On close scrutiny the bulk of the claims for materialization and
teleportation have been explained in quite natural (and sometimes
entertaining) ways (Carrington, 1920). Nevertheless, there are a few
reports which keep the question open, e.g., Crookes’ (1874) report on
Florence Cook and D. D. Home, and the reports on Rudi Schneider by
Lord Hope (1933) and Schrenck-Notzing (1920). More recently,
Eisenbud’s (1967) observations of Ted Serios suggest some kind of
materialization interfering with light in photographic and television
processes.
In spite of considerable research done in this area by psychical
researchers early in this century, claims of materialization and allied
phenomena have generally been frowned upon and rejected by nearly
all present-day parapsychologists (Eisenbud, 1975). We too shared
this point of view and did not give serious consideration to such
phenomena until our encounters with Sri Sathya Sai Baba.
Sathya Sai Baba, age 51, is a religious leader who has a large
following and lives in the State of Andhra Pradesh in southern India.
He is not only credited by his followers and many others with a
variety of psychokinetic powers (such as materializations, teleportations,
and healing) but also with various forms of extrasensory
perception and out-of-body projections collectively perceived. Several
popular books have already been published about him, all of which
deal in part with his paranormal phenomena (Kasturi, 1973-4;
Murphet, 1971; Sandweiss, 1975; Schulman, 1971). We first
encountered reports of these phenomena when researching deathbed
visions in India in 1972-73.
During two subsequent visits to India in late 1973 and early 1975
we met with Sai Baba several times and also had lengthy interviews
with a number of persons, including Indian research scientists, who
had observed or experienced psychic phenomena of various kinds
which they attributed to him. E.H. made another visit to India in
January, 1976, for further observations and interviews with Sai
Baba. In a series of interviews, we had the opportunity to discuss
these matters with Sai Baba himself and to observe him in action. He
speaks English but often prefers to use an interpreter. He tends to
belittle the significance of his own psychic phenomena, calling them
“small items,” and he repeatedly stresses the importance of spiritual
and ethical issues. Our bid for formal experiments was rejected with
the comment that he would only use his paranormal powers for
religious purposes such as helping his devotees when they are in dire
need or for invoking faith in hitherto agnostic persons, but never for
purely demonstrative purposes. During the 11 interviews we had with
Sai Baba, however, he did spontaneously display a number of the
same phenomena for which he has become famous in India.
Interviews with Sai Baba are generally not prearranged. People
who want to meet him (usually several hundred) gather outside his
residence in the ashram. Twice a day he makes his rounds for a short
while and chooses those he wants to see. Many wait for weeks in vain
and are never granted an interview either in a group or in private. Sai
Baba’s interview room, where most of the phenomena we are
reporting occurred, is bare, with concrete walls and floor and without
carpets or any decorations. The only furniture in the room was one
armchair. During our interviews we all sat crosslegged on the floor.
The number of persons present with Sai Baba varied from only E.H.
and K.O. to about nine persons. We observed some 21 appearances
and disappearances of objects at close range, but none under
controlled conditions. We shall describe four instances of these
phenomena and then attempt a very tentative evaluation of their
genuineness.
The Incidents
1. Appearance of a “Rudraksha”
The first of these phenomena concerns the possibly paranormal
appearance of a “rudraksha,” which is similar to an acorn, about an
inch in diameter, and with a fine texture like an apricot stone. First
Sai Baba presented us both with some “vibuti” (holy ash, which is
probably comparable symbolically to bread and wine in Christianity).
He gave us the vibuti after a typical wave of his right hand, palm
down, in small circular movements that lasted two or three seconds.
After a short discussion he presented one of us (K.O.) with a large
gold ring,3 again after having waved his hand in a typical manner.
While we were arguing with Sai Baba about the value of science
and controlled experimentation, he turned the discussion to his
favorite topic, the spiritual life, which in his view should be as
“grown together” with ordinary daily life as a “double rudraksha.”
We did not understand this term nor could the interpreter translate it.
Sai Baba seemed to make several efforts to make its meaning clear to
us until he gave up and with some signs of impatience closed his fist
and waved his hand. He then opened his palm and showed us a
double rudraksha, which we are told by Indian botanists is a rare
specimen in nature like a twin orange or twin apple.
3 A goldsmith later examined this ring and found that it is made of gold. It was
appraised at $100.
We observed Sai Baba closely all the time we sat on the floor. After
we had admired the rudraksha, Sai Baba took it back in his hand and,
turning to E.H., said he wanted to give him a present. He enclosed
the rudraksha between both his hands, blew on it, and opened his
hands toward E.H. In his palm we again saw a double rudraksha, but
it now had a golden ornamental shield on each side of it. These
shields were about an inch in diameter and held together by golden
chains on both sides. On top of the shield was a golden cross with a
small ruby affixed to it. Behind the cross was an opening so that this
ornament4 could be hung on a chain and worn around the neck.
Many, but not all, of the ornaments which Sai Baba presents to
people are said to be made of precious metals and stones.
Sai Baba wears a one-piece robe with sleeves that reach his wrists.
We watched his hands very closely and could not see him take
anything from his sleeves or reach toward his bushy hair, clothing, or
any other hiding place.
It was not possible for us to examine Sai Baba’s clothing. On one
occasion, however, we had an opportunity to examine two robes he
had worn. He reportedly always wears robes of the same sort and
when they start to wear out he gives them away. The two we
examined contained no pockets of any kind or any signs of magician’s
paraphernalia having been attached.
We became acquainted with a former professor of chemistry in
Bangalore, Dr. D. K. Banerji. One day Sai Baba visited him and his
wife unexpectedly and “ produced” some objects for them, as he does
almost everywhere he goes. As he retired for the night in their house,
he asked Mrs. Banerji to wash his robe, which she did. On this
occasion she, Dr. Banerji, and a colleague of his, Dr. P. K.
Bhattacharya (doctorate in chemistry from Illinois), carefully
examined Sai Baba’s robe and found that it had no pockets. Dr.
Banerji was formerly the director of the Department of Organic
Chemistry at the All India Institute of Science, which is a leading
research institute in India. These three persons reported this incident
to us during two independent interviews.
In an interview during his third visit to Sai Baba, E.H. repeatedly
saw the sun shine through Sai Baba’s thin, silken sleeves as he was
sitting on a chair approximately five to six feet away from him. The
late afternoon sun was shining through the window of the interview
room where a few people were sitting on the floor around him. Most
of the interview was spent on a discussion, but Sai Baba also
produced a few objects which he gave to those present. As Sai Baba
was sitting on a chair his arms were approximately at the head-level
of those sitting on the floor close to him. The sun shining through Sai
Baba’s sleeves did not reveal any shadows that might indicate the
presence of hidden objects. Sitting that close to Sai Baba, E.H. could
several times see up his sleeve, which appeared to be empty.
4 A goldsmith later examined this ornament and found that it contains 22-carat gold.
Its value was appraised at $80. The small ruby was examined by the Gem Testing
Laboratory of the London Chamber of Commerce and Industry. Because of the closed
setting behind the stone it was not possible to determine whether it is a natural or a
synthetic ruby. A botanist’s microscopic examination of the rudraksha showed it to be a
genuine example of its species.
2. Disappearance of a Picture from K.O.’s Ring
This episode concerns the gold ring that Sai Baba had presented to
K.O. during our first visit. This ring had a large enameled picture in
color of Sai Baba encased in it. The picture was of oval shape, about 2
cm long and 1½ cm wide, and was framed by the ring. The edges of
the ring above and below the enameled picture, together with four
little notches that protruded over it from the circular golden frame,
kept it fixed in the ring. Thus the picture was set as firmly in the ring
as if it and the ring were one solid article.
In an interview during our second visit when we tried to persuade
Sai Baba to participate in some controlled experiments, he seemed to
become impatient and said to K.O., “Look at your ring.” The picture
had disappeared from it. We looked for it on the floor, but no trace of
it could be found. The frame and the notches that should have held
the picture were undamaged; we examined them afterwards with a
magnifying glass. For the picture to have fallen out of the frame, it
would have been necessary to bend at least one of the notches and
probably also to bend the frame at some point, but neither had been
done. Another alternative would have been to break the picture in the
ring so that it would fall out in pieces.
When Sai Baba made us aware of the picture’s absence we were
sitting on the floor about five or six feet away from him. We had not
shaken hands when we entered the room and he did not reach out to
us or touch us. As we sat cross-legged on the floor, K.O. had his
hands on his thighs and E.H. had noticed the picture in the ring
during the interview and before this incident occurred. E.H.’s first
reaction was that the picture had suddenly become transparent. Two
persons, Dr. D. Sabnani from Hong Kong and Mrs. L. Hirdaramani
from Ceylon, whom we had met for the first time during the interview,
certified that they had observed the large golden ring with Sai
Baba’s picture on K.O.’s left hand before the picture disappeared.
When the picture could not be found, Sai Baba somewhat teasingly
remarked, “This was my experiment.”
During our next interview, which took place two days later, Sai
Baba asked K.O. if he wanted the picture back, to which K.O. replied
that he did. On Sai Baba’s demand, K.O. gave him the ring which he
took in his hand and asked, “Do you want the same picture or a
different one?” “The same,” K.O. replied. Sai Baba then closed his
fingers around the ring in his palm, brought it to about six inches
from his mouth, blew at it lightly, and then stretching his hand
toward us, opened it. In it was a ring. The enameled picture was like
the one that had been framed in the first ring; the ring itself,
however, was different. The first incident, the disappearance of the
picture, was obviously more evidential than was its reappearance,
about which there is not much we can say.
3. Ring and Necklace for Mr. and Mrs. Krystal
During the aforementioned interview we observed an interesting
phenomenon. A lawyer from Los Angeles and his wife, Mr. and Mrs.
Krystal, were present with us. Their 33rd wedding anniversary was
around that day and Sai Baba seemed to be happy about the occasion.
He waved his hand and as he opened his fist we saw a golden ring. He
handed it to Mrs. Krystal, telling her to put it on one of her husband’s
fingers, as is customary for the bride to do at a traditional Indian
wedding. Sai Baba’s open hand was still stretched out in the air
without having touched his clothing or any object. We watched
closely. Immediately thereafter Sai Baba waved his hand again for
two or three seconds, turned palm down, and quickly closed it. His
arm was approximately horizontal to the ground, which was not a
position favorable for slipping something out of his sleeve by means
of gravity. We observed at close range as Sai Baba loosened the grip
of his fist so that he could hold a large, bulky necklace in his hand. Its
double length was about 20 to 29 inches and it contained a variety of
different kinds of stones interspaced by small golden pieces.
Attached to it was a picture of Sai Baba surrounded by a golden
rosette frame about two inches in diameter. This necklace was
presented to Mrs. Krystal.
4. Appearance of Vibuti (Holy Ash)
The fourth incident of possible materialization that we observed
and will report upon here occurred in the open. We sat cross-legged
on the ground in a long line of people as Sai Baba walked by. He
stopped in front of Professor Hasra, a friend of Dr. Banerji, whom we
mentioned above. Professor Hasra was sitting second to the left of
K.O. and third from E.H. Sai Baba waved his right hand. As we were
sitting on the ground and he was standing, his hand was slightly
above the level of our eyes.
His palm was open and turned downwards, and his fingers were
stretched out as he waved his hand in a few quick, small circles. As he
did this, we observed a gray substance appearing close to his palm.
This substance appeared just below and at his palm, and Sai Baba
seemed to grasp it into his fist with a quick downward movement of
his hand as if to prevent it from falling to the ground. K.O., who sat
slightly closer to Sai Baba than did E.H., observed that this material
first appeared entirely in the form of granules, like very rough-grained
sand. Sai Baba then poured the granules into the palms of
Drs. Hasra and Banerji and most of them disintegrated into
amorphous ash which they smeared on their foreheads. The point is
that the granules were very fragile and would have lost their structure
if produced by the magician’s art of quick movements (“the hand is
faster than the eye”) which were invisible to us. When K.O. first saw
the vibuti (holy ash), the granules were intact. This ostensible
materialization of vibuti is a frequent occurrence and Sai Baba
produces it several times as he walks among the crowd. We observed
many such incidents, but only this one at so short a distance.
Discussion
The alleged paranormal appearance and disappearance of objects
has been a tough problem for psychical research in the sense that
observations are rarely permitted under conditions which would
exclude all possible normal causes. We were prepared to make
instrumented observations with movie cameras and small sealed or
locked enclosures wherein we hoped the objects would appear.
Unfortunately, we were told not to use these in Sai Baba’s interview
room. We filmed him outdoors, waving his hand and producing holy
ash, but not at close enough range for decisive analysis. All we have
are observations made under semi-spontaneous conditions. Therefore,
all our conclusions have to be extremely tentative.
Let us spell out some hypothetical normal explanations for the
incidents we observed:
1. We might have been in altered states of consciousness,
like mass hypnosis, and have responded to skillful suggestion
techniques by “seeing” what was not there and overlooking actual,
observable events. For example, the late Carl Vett (personal
communication) explained his observations of the Indian rope trick in this
way. We are both psychologists and can state with confidence that we
did not undergo any altered states during our interviews with Sai
Baba. We were very much on our guard at all times. Moreover, the
objects produced (the double rudraksha and the gold ring with the
enamel picture) are still in our possession.
2. The objects might have been provided by an accomplice in the
interview room. This is not possible because objects also appeared
when we were alone in the room with Sai Baba. Moreover, the seating
positions often excluded such a possibility, e.g., when he was seated
at some distance from the other persons. Those present were visitors
who varied from interview to interview. Only at our first visit in 1973
was an interpreter used who was also an “officebearer” of Sai Baba’s
organization.
3. The interview room might have contained concealed devices
which somehow ejected the objects we observed. The room was
barren of anything which could be so used. Sai Baba usually sat
cross-legged on the concrete floor out of reach of any possible
containers, such as a shopping bag on a windowsill, in which
packages of vibuti or other small objects might be concealed. The
place where he sat varied from interview to interview, and he was not
positioned at one particular spot when the incidents occurred. He also
produced objects outdoors and in a private room.
4. Sai Baba might have concealed the objects on his person and
produced them by sleight-of-hand. We heard rumors about this
possibility which suggested hiding places such as the sleeves of his
robe, hidden pockets, and even his hair. However, we found no one
who could offer firsthand observations or who could name someone
who had made firsthand observations supporting this hypothesis.
We consider hypotheses 1-3 to be unreasonable and not worth
further discussion. However, the sleight-of-hand hypothesis needs
careful consideration because magicians do make objects seem to
appear and disappear by this method.
Now back to our experiences with Sai Baba. We made some 20
observations of ostensibly paranormal appearances of objects in his
hand. None of these occurred under controlled conditions and we
were not able to examine him physically or to take other necessary
precautions. Therefore, at this stage we obviously do not have
sufficient grounds for accepting the claims made about the genuineness
of the reported phenomena. It must also be stated that under the
given conditions we were not able to detect any evidence of fraud.
Consideration of the following points leads us to regard Sai Baba’s
phenomena as possibly paranormal:
1. Lengthy history without clear detection of fraud. According to
those who have had a long association with Sai Baba, the seemingly
paranormal flow of objects has lasted for some 40 years, or since his
childhood. Most of the persons we met who had had even just one
meeting with him reported having observed some ostensible
materialization phenomena. We did not meet anyone who claimed
personal observations indicative of Sai Baba having produced the
objects by normal means.
2. Reports of the occurrence of other psi phenomena, such as ESP
over distance, giving messages in dreams, healing, out-of-body
projections collectively perceived, and PK of heavy objects.
A. Unwitnessed: Subject Working Alone

Subject  Method   Runs   Dev.   Av.   Dev./S.D.
HP       PDT       223    469   7.1     15.4
GZ       PDT       124     21   5.2      0.9
SO       PDT       260    549   7.1     16.7
MHP      PDT       642    229   5.4      4.4
TCC      PDT       113    117   6.0      5.4
LHG      PM         90    110   6.2      5.7
Total             1452   1495   6.0     19.1

B. Witnessed by Experimenter; Shuffled by Subject

Subject  Method   Experimenter(s)   Runs   Dev.   Av.   Dev./S.D.
HP       PDT      JBR, SOZ, JGP      212    287   6.3      9.7
GZ       PDT      SOZ, MHP            31     33   6.1      2.9
SO       PDT      GZ, MHP            142     93   5.7      3.8
TCC      PDT      JBR                 40     47   6.1      3.7
Total                                425    460   6.1     11.0

C. Witnessed and Shuffled by Experimenter; No Contact

Subject  Method   Experimenter(s)   Runs   Dev.   Av.   Dev./S.D.
HP       PDT      JBR, SOZ, JGP      482     85   5.2      1.9
GZ       PDT      SOZ                654    211   5.3      4.0
SO       PDT      MHP, SOZ            59     18   5.3      1.1
MHP      PDT      SOZ                 66     -6   4.9      0.4
TCC      PDT      SOZ, GZ, JBR       228    -12   4.9      0.4
LHG      PM       EPG                114     43   5.4      2.0
Total                               1603    339   5.2      4.2

*This table is complete for those subjects who performed under at least two of the three general conditions
of witnessing. For the complete block of results of the third section (C), see Table III.
No.    Date       Subject      Experimenter(s)   Method           Runs   Dev.   S.D.    Av.    Dev./S.D.
       (“Post” period)                                              93     56   19.7    5.6      2.8
8      1935       MHP          SOZ               PDT                66     -6   16.6    4.9      0.4
9      1935       TCC          SOZ, JBR, GZ      PDT               228    -12   30.8    4.9      0.4
10     1935       EJG          JBR               PDT               948    -59   62.9    4.9      0.9
11     1935       MB           GEB, JLM          PDT                35     30   12.1    5.9      2.5
12     1935-1936  LHG+EPG      EPG               POM               154     49   25.3    5.3      1.9
13     1936       FM           JGP               PDT, POM, PSTM   1146     41   69.0    5.0      0.6
Total  1934-1937  49 subjects  11 experimenters  PDT, POM, PSTM   4523    614  137.2    5.14     4.5
*The initials of the experimenters represent names as follows: SOZ, Sara Ownbey Zirkle; JGP, J. G. Pratt;
MHP, Margaret H. Pegram; GZ, George Zirkle; JBR, J. B. Rhine; GEB, G. E. Buck; JLM, J. L. Michaelson;
EPG, E. P. Gibson; JDMcF, J. D. McFarland; LMcC, L. McCartney; ES, Earl Stephenson.
GZ’s work for this phase falls into two periods separated by a gap of
about a month. The first block of 97 runs averaged 6.4, the highest he
had done for PDT, and the second block of 557 runs averaged only 5.1.
This later period was a poor one for all test procedures tried.
Likewise this later condition (i.e., experimenter shuffling) coincided
with a general decline in scoring in the case of subject HP also.
However, the fact remains that the other four subjects declined
when the experimenter took over the shuffling. The first thought
occurring to the cautious investigator is that this may be due to the
additional safeguarding and that conversely the work under the other
two sections, A and B, was experimentally defective. But while this
makes little difference, since the work of Table II alone is offered for
consideration in connection with the precognition hypothesis, it should
be said in fairness that there are other possible explanations for this
decline. Some subjects such as HP believed they were helped by some
contact with the cards and liked to be allowed to shuffle the pack in the
usual test procedures. (Although HP’s best ESP scoring was done
without this—with distance and with no sensory card-contact at all.)
The most marked exception to this was GZ who also did his best (6.4)
50 The Journal of Parapsychology
call the cards for him. But whereas I had obtained positive scoring as
experimenter with subjects HP and AJL, the averages were approximately
chance for subjects TCC and EJG. Likewise Mrs. Zirkle, as
experimenter, obtained marked positive deviations from subjects GZ
and HP but not from MHP and TCC. (However, Mrs. Zirkle obtained
high averages from these subjects on other tests as did I also in
the instances given above.) Other experimenters had similar differences.
It is, of course, remotely possible that a difference could be
attributed to particular subject-experimenter combinations even under
the “ESP shuffle” hypothesis, and the above argument is far from
conclusive.
A second and more urgent consideration lies in the fact that in
most instances where comparative data are available, the scores in
precognition tests paralleled the course of scoring of the subject on
other test procedures. For example, in the instance of subject FM,
when her first 20 runs in each of four types of precognition tests7 were
grouped, the average for the 80 runs was 5.71 hits per 25. And when
the first 20 runs made by FM on the comparable methods of DT and
DTSTM (actually 19 and 24 runs because of irregular natural grouping)
were pooled, the total 43 runs averaged 5.75 hits. The comparison
continues through the decline which follows in all six methods: The
remaining precognition tests given this subject averaged only 5.0; and
for the plain DT and DTSTM, 4.7. This strongly suggests that the
subject is following a trend peculiar to herself, and that the shuffling
by the experimenter, Dr. Pratt, in the precognition tests was not the
determiner of the results.
Again, subjects GZ and HP also underwent a great decline in
scoring ability during the period of their precognition tests, although
at a much slower rate of decline than FM. This decline appears in
both the PDT and in the usual ESP tests and suggests strongly that
their scoring ability in PDT was contributed by the subject himself
rather than the shuffler of the cards. For example, GZ declined in the
BT tests during the period from a score average of 8 per 25 for 46
runs to an average of 6 per 25 for 101 runs. The PDT tests began
with 97 runs at 6.4 hits per 25 and after a break fell in the 557 later
runs to 5.1 per 25. The case of HP, though it would require more
detail to describe, is no less marked in its parallel decline in DT and
PDT. All of these cases leave the impression that the three different
experimenters involved were not causing the drop in deviation, but
that the subject himself actually produced the change.
7 PDT, POM, PSTM and a special form of POM + STM. See above for
account of these procedures.
Experiments Bearing on the Precognition Hypothesis 53
Subject          Procedure   Runs   Av.   Procedure   Runs   Av.   Remarks
HP               DT           360   5.3   PDT          482   5.2   Total work of both conditions.
SO               DT            30   5.8   PDT          142   5.7   Not same period. No DT taken in same
                                                                   period as PDT.
HJ               DT            39   5.3   PDT           98   5.2   Approximately same period.
AJL              DT            40   6.0   PDT           45   5.5   Approximately same period.
MHP**            DT          5931   5.3   PDT          642   5.4   Approximately same period.
                 DT          1468   4.3   PDT           61   4.3   Approximately same period.
                                                                   (Low-aim series, both DT and PDT.)
EJG              DT           152   5.3   PDT          948   4.9   Two methods used, a year apart.
MB               DT            24   6.6   PDT           35   5.9
LHG              OM            96   7.0   POM          114   5.4   Different periods. Precognitive
                 DT            46   5.6                            matching used.
FM               DT            43   5.7   PDT           80   5.7   Approximately 1st 20 runs of each of 2 DT
                                                                   methods and of 4 precognition test
                                                                   methods used with this subject.
                 DT, DTSTM    200   4.7   PDT, POM    1066   5.0   Remainder of tests, both conditions.
School children  DT           218   5.3   PDT          335   5.5

*Subjects of Table II omitted from this table are those who did not perform on the ESP test most
comparable; hence there is no basis for comparison. The school children in No. 10 series were not the same since
both sets of tests, DT and PDT, were not given either group. But both groups are from the same school, are
approximately of the same age and status, and were tested by the same experimenters.
**Unwitnessed work, but this subject was a psychology assistant.
8 In the case of subject LHG the methods were OM and POM and this also
shows the greatest difference of any series in the table. This subject’s DT average
is entered for comparison.
A general inspection of this table will undoubtedly leave the
impression upon those familiar with the score ranges obtained in other
ESP tests (varying from below 3 hits per 25 to above 18) that there is
some unusual similarity in these DT and PDT scores, namely that the
subject’s ability in PDT is essentially that of DT scoring, and that the
temporal factor is not inhibiting. But again, it could not be regarded as
more than strongly suggestive of the occurrence of precognition.
The research may be credited, however, with having produced some
considerable presumptive probability of the occurrence of precognitive
ESP and with having raised a new alternative hypothesis which points
to a new line of experimentation. Whichever hypothesis proves to be
correct, precognition or the “ESP shuffle,” a new advance may properly
be claimed for the research.
Summary
There has been reported, above, the work of eleven investigators
in testing 49 subjects for ability to call cards in advance of the supposedly
random rearrangement of simple shuffling by the experimenters.
In the 4,523 runs a total score of 614 above mean chance expectation
was obtained, which is 4.5 times the standard deviation. The odds are
over 400,000 to one against so great a deviation occurring by
chance alone.
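The arithmetic behind these figures can be retraced in a few lines. The following is only an illustrative sketch, not part of the original report: it assumes runs of 25 trials with five equally likely symbols and a plain binomial model, and the function names `critical_ratio` and `one_tailed_p` are hypothetical. Because the binomial standard deviation may differ slightly from the one used in the report, the ratio it prints need not agree with the published 4.5 to the last digit.

```python
import math

def critical_ratio(runs, deviation, trials_per_run=25, p_hit=0.2):
    """Return (mean hits per run, deviation divided by its binomial SD)."""
    # Standard deviation of the total score under the chance (binomial) model.
    sd_total = math.sqrt(runs * trials_per_run * p_hit * (1 - p_hit))
    mean_per_run = trials_per_run * p_hit + deviation / runs
    return mean_per_run, deviation / sd_total

def one_tailed_p(z):
    """Upper-tail probability of a standard normal deviate."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# Totals quoted in the Summary: 4,523 runs, 614 hits above chance expectation.
mean_hits, ratio = critical_ratio(runs=4523, deviation=614)
p_value = one_tailed_p(ratio)
print(f"mean hits per run: {mean_hits:.2f}")
print(f"deviation / SD:    {ratio:.2f}")
print(f"odds against chance (one-tailed): about {1 / p_value:,.0f} to 1")
```

The normal approximation is used here for convenience; with more than 100,000 trials it is a reasonable stand-in for the exact binomial tail.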
With the chance hypothesis ruled out, with sensory cues impossible
from an arrangement of the cards not yet in existence, and the shuffling
safeguarded and handled entirely by the experimenter who also did
all the checking, there is left no serious alternative to the precognitive
hypothesis except that of conceivable unreliability of research personnel
or the possibility (not an established one) of the experimenter
unwittingly shuffling the pack, guided by ESP, so as to match to a significant
degree the predictive calls made by the subject. The first alternative
will have to be weighed non-statistically by the individual readers, whose
criteria for tentative acceptance will vary greatly. The second hypothesis,
the “ESP shuffle,” is the subject of another research and a succeeding
report.
[14]
Methodological Criticisms of
Parapsychology
Charles Akers
1. Introduction
In his presidential address to the Parapsychological Association, Rao (1978)
claimed that the evidence for psi was “inescapable” and that criticisms of the
field were “unfair” and “false” (p. 278). Yet the field of parapsychology has
come under increasing criticism in recent years. Some of the sharpest criticism
has come from Fellows or Consultants of the Committee for the Scientific
Investigation of Claims of the Paranormal. Hansel, one of the Fellows, has
written a revision (1980) of his widely cited 1966 work. Other
volumes of criticism have been contributed by Alcock (1981), Gardner (1981),
Marks and Kammann (1980), and Randi (1980). Frazier (1981) has edited a
collection of articles from the Skeptical Inquirer, many of which deal with
parapsychology. Psychologist Hyman (1976, 1977a,b, 1981a,b, 1983) has written a
number of serious reviews of parapsychological literature.
The recent spate of criticism, greater than at any time since the 1930s, has
not been confined to the Committee. Girden (1978) has updated his 1962
review of psychokinesis to cover parapsychology as a whole. Diaconis (1978),
Moss and Butler (1978), and Zusne and Jones (1982) have published other major
critiques. There have been many important articles and volumes of criticism,
too numerous to cite here.
The response by parapsychologists to this barrage of criticism cannot be
readily summarized. There are serious disagreements within the field over the
strength of the evidence for psi (McConnell & Clark, 1980; Truzzi, 1980).
However, there does seem to be agreement that the critics in general, and especially
Hansel, have wrongly focused on the results from “critical experiments” or from
studies with famous “psychics.” There has, for example, been an enormous
amount written on psychic showman Uri Geller, whom Morris (1980b) refers to
as “a tired old favorite of the critics” (p. 435). Honorton (1981), Morris
(1980b), Palmer (1978), Rao (1980) and other parapsychologists have argued
that the case for psi depends not on isolated experiments with selected psychics,
but on repeatable experiments with more-or-less unselected subjects. Claims of
repeatable findings are discussed in the Wolman (1977) Handbook of
Parapsychology (especially in chapters by Honorton and Palmer) and in volumes of the
present series.
3.1. Introduction
A weakness in a number of “classic” ESP experiments was the employment
of target sequences that were severely nonrandom. One such case was the
Groningen telepathy series, where the targets were letter/number combinations.
Schouten and Kelly (1978) examined the sequences, and found that the target
letters were not equally frequent. Letters (A through H) were distributed as
follows: 103, 102, 74, 43, 73, 71, 98, 23, with χ²(7) = 78.6, p < .001. There were
similar inequalities in the distribution of target numbers, and in the
letter/number combinations. Schouten and Kelly claimed that the nonrandom features
could not account for the ESP scoring. Whether or not a nonrandom feature
can “account for” an ESP effect, it is a disquieting situation when the departures
from randomness are severe, and of unknown origin. If a critical procedure such
as randomization broke down, this raises the possibility that there were other
breakdowns, or failures to carry out the experimental plan. Each such failure
lessens one’s confidence in the competence of the investigation as a whole. In
the usual ESP investigation, only one peculiarity of the results should emerge: a
deviation from the number of “hits” predicted by probability theory.
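The chi-square figure quoted for the Groningen letters can be checked directly from the eight frequencies. The sketch below is an illustration added for that purpose; it assumes the null hypothesis of equally frequent letters, and the helper name `chi_square_uniform` is hypothetical. The critical value is the standard tabled chi-square for 7 degrees of freedom at the .001 level.

```python
def chi_square_uniform(counts):
    """Pearson chi-square statistic against equal expected frequencies."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)

letter_counts = [103, 102, 74, 43, 73, 71, 98, 23]  # letters A through H
statistic = chi_square_uniform(letter_counts)
df = len(letter_counts) - 1
CRITICAL_001_DF7 = 24.32  # tabled chi-square value for p = .001 with 7 df

print(f"chi-square({df}) = {statistic:.1f}")
print("exceeds the .001 critical value:", statistic > CRITICAL_001_DF7)
```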
Nonrandomness was also found in the Reiss series, regarded by Pratt, Rhine,
Smith, Stuart, & Greenwood (1940/1966) as one of six conclusive ESP
experiments. The target sequence from the series was deficient in target doubles
(Nicol, 1959). This deficiency could not account for the scores achieved, but
the origin of the nonrandom feature was again unknown. Also discussed by
Nicol (1959) was a highly significant deficiency of certain triplet combinations
in target sequences from the classic Soal-Goldney experiments. The targets had
supposedly been derived from tables of logarithms, but the source could not be
verified. Evidence eventually emerged that Soal, the chief experimenter, may
have altered some targets to create false extra hits (Hansel, 1980; Markwick,
1978; Scott & Haskell, 1974). The evidence is strong, though circumstantial.
In this case, the nonrandom sequences were a clue to experimenter error. This
possibility must always be considered in cases where randomization breaks
down.
stated” (p. 393). Davis and Akers (1974) came to a similar conclusion but were
more open towards the use of informal methods in preliminary research. Yet,
shuffling and other informal techniques of randomization are often used in
research beyond the preliminary phase. Stokes (1977) criticized Targ and
Puthoff’s (1974) experiments on this basis. On some trials the investigators had
selected targets for the subject, Uri Geller, by “randomly” opening a dictionary
and interpreting the first drawable word in whatever manner they saw fit. The
use of such a procedure raises questions about the extent to which the Geller
experiments were preplanned. If the experiments were not preplanned, then
there may indeed be grounds for suspecting reporting errors, as Marks and
Kammann (1980) allege.
The Targ and Puthoff research is not an isolated example. Hyman (1983)
surveyed 42 ganzfeld ESP experiments (most of which had not been fully
published), and found numerous failures to describe randomization procedures.
Where the procedures were described, they were often informal (e.g., shuffling).
Often, researchers mentioned the use of “random numbers,” but with no
account of where the numbers came from, or how they were employed (personal
communication, Hyman, 1983).
from sessions where no feedback was given. This leaves 14 experiments where
feedback effects were a hypothetical possibility. Of these experiments, eight
were “blind matching” card tests, in which subjects place cards from a “target
deck” beneath whichever of several “key cards” they think will yield a match.
Presumably, neither target cards nor key cards are known to the subjects. To
succeed at the task through inferential strategies, subjects would need to
remember the orders of both target cards and key cards from the previous run, and use
this information to predict their future joint distribution. This is a rather
involved counterhypothesis, which I cannot consider except as a remote possibility
(with these unselected subjects).
If the blind matching studies are not considered as seriously flawed, this still
leaves six experiments where inferential strategies would seem to be a real
possibility (Bevan, 1947; Casler, 1962; Casper, 1952; Grela, 1945; Johnson &
Kanthamani, 1967, Exp. 1; Shields, 1962, Exp. 1). With the exception of Casler (1962)
these studies appear to be flawed (pending study of the actual sequences of calls
and targets). Casler (1962) used four different decks, and feedback was given
only after four runs, so I chose not to assign a flaw (that is, not to consider the
experiment flawed in this respect). In the remaining cases there was feedback
after each run (Bevan, 1947; Casper, 1952; Grela, 1945; Shields, 1962, Exp. 1)
or after each trial (Johnson & Kanthamani, 1967, Exp. 1). In none of these
cases was it reported that the target deck was changed between runs. If a single
deck was used repeatedly, with run-by-run feedback, then subjects may well
have discovered a guessing strategy. This is even more likely, as Diaconis (1978)
has argued, where subjects obtain trial-by-trial feedback. However, that
situation existed only in the study by Johnson and Kanthamani.
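How much a guessing strategy can gain from trial-by-trial feedback is easy to explore by simulation. The sketch below is a hypothetical illustration, not a reconstruction of any cited study: it assumes a balanced closed deck of 25 cards (five each of five symbols), full feedback after every call, and a guesser who always names whichever symbol is most plentiful among the cards not yet seen. The symbol names, the seed, and the function `run_with_feedback` are all assumptions made for the example.

```python
import random
from collections import Counter

SYMBOLS = ["circle", "cross", "waves", "square", "star"]

def run_with_feedback(rng):
    """One 25-card run through a balanced closed deck. The guesser learns
    each target after calling it and always names the most plentiful
    remaining symbol."""
    deck = SYMBOLS * 5
    rng.shuffle(deck)
    remaining = Counter(deck)
    hits = 0
    for target in deck:
        guess = max(SYMBOLS, key=lambda s: remaining[s])
        hits += guess == target
        remaining[target] -= 1  # feedback: this card is now out of the deck
    return hits

rng = random.Random(1)
scores = [run_with_feedback(rng) for _ in range(2000)]
mean_score = sum(scores) / len(scores)
print(f"mean hits per run with card counting: {mean_score:.2f} (chance = 5)")
```

No ESP is modelled anywhere in the simulation, so any excess over five hits per run comes from the inferential strategy alone.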
It could be argued that informal randomization invalidates any ESP
experiment. Even if subjects are kept from seeing past target orders, their calling
preferences might match patterns in the target sequence in such a way that the
theoretical variance of ESP scores increases (Feller, 1940). If this happened,
the usual statistical procedures would be invalidated. I would agree that this is a
theoretical possibility, but I know of not a single ESP study (excluding the
multiple-calling situation) where such an increase in variance has actually been
observed. If unbalanced decks are used, as in the examples provided by Zusne
and Jones (1982), then psi artifacts can easily emerge. However, all the card
experiments cited above involved balanced decks.
There were 27 cases in the sample where random number tables had been
employed, and one case where an ESP test machine was used (Haraldsson, 1978).
Since there was no description of the test machine, nor any report of
randomness tests, the experiment by Haraldsson must be considered flawed, until the
situation can be clarified. In the cases where random numbers were used, it was
not always clear exactly how a target sequence was finally generated. However,
the application of random number tables, to obtain an open-deck target
sequence, is a procedure that is familiar to most parapsychologists. In one study
by Braud and Braud (1973, Exp. 2) the tables were applied by untrained agents,
and a misapplication is possible; the study appears to be flawed in this respect.
There were other studies (e.g., Krippner, 1968) where the target selection
procedures were assigned to experimental assistants, whose degree of training and/or
supervision is uncertain. However, I did not assign a flaw in those cases.
In summary, there were 20 cases in the sample where informal randomization
was used, and six cases where randomization was poorly described. Five of the
former cases were considered flawed, as were all of the latter. Flaws were also
assigned to studies by Braud and Braud (1973, Exp. 2) and Haraldsson (1978),
bringing the total number of such experiments to 13 out of the 54. This may
well understate the problem, the extent of which can only be known by
checking the randomness of the target sequences (Davis & Akers, 1974). This was
done in only one instance (Schmeidler & McConnell, Group Exps.). When
Schmeidler (1959) checked her sequences, she found that an assistant had
inadvertently used many closed-deck target orders, rather than the open-deck orders
that she had requested. Fortunately, this was an error that required only minor
revisions in her data analyses. One wonders whether similar errors may have
occurred in other experiments from the sample, with more serious consequences.
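The kind of check Schmeidler made can be sketched in code. The example below is an assumption-laden illustration rather than her actual procedure: it takes an open-deck order to be one in which each of 25 targets is drawn independently from five symbols, and a closed-deck order to be a shuffle of a deck holding exactly five of each. The letters standing in for the symbols, the seed, and the function `looks_closed_deck` are hypothetical.

```python
import random
from collections import Counter

SYMBOLS = "ABCDE"  # stand-ins for the five card symbols

def looks_closed_deck(run, per_symbol=5):
    """True if every symbol occurs exactly `per_symbol` times in the run."""
    counts = Counter(run)
    return all(counts[s] == per_symbol for s in SYMBOLS)

rng = random.Random(7)

# Closed deck: fixed composition, only the order is random.
closed_run = list(SYMBOLS * 5)
rng.shuffle(closed_run)

# Open deck: every target is an independent draw.
open_runs = [[rng.choice(SYMBOLS) for _ in range(25)] for _ in range(1000)]

balanced = sum(looks_closed_deck(r) for r in open_runs)
print("closed-deck run balanced:", looks_closed_deck(closed_run))
print(f"open-deck runs that happen to be balanced: {balanced} of {len(open_runs)}")
```

A set of target orders in which nearly every run is perfectly balanced is therefore a sign that closed decks were used, whatever the experimenter intended.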
randomness were found (Gatlin, 1979; Goldman, Stein, & Weiner, 1977;
Kennedy, 1980b). Goldman, Stein, and Weiner (1977) concluded that “until the
experiment is done again, we are in the position of a chemist who at the end of
an experiment discovers that his test tube was dirty” (p. 37).
In reply to such criticism Tart (1977a, 1978, 1979a, 1980) acknowledged the
deviations from randomness, but saw these as small relative to the presumed ESP
effects. He was able to explain the initial finding by Goldman, Stein, and Weiner
(of a deficiency in target repetitions) in terms of an innocent error by his
undergraduate experimenters (Tart, 1980, p. 217). Tart (1980) speculated that other
findings of nonrandomness, for which he did not have a ready explanation,
could have arisen from a PK influence on the test machine.
Kennedy (1980a,b) agreed that PK was a possible explanation for the
nonrandom features. However, he saw experimenter error as an alternative
explanation. Kennedy pointed out a fundamental weakness in the procedure: The
Ten-Choice Trainer was not completely automated, since the experimenter had to
manually enter targets by setting a switch. If the experimenter entered a target
other than that chosen by the REG, then nonrandom artifacts could result.
Kennedy stressed the fact that only one experimenter out of six had achieved
significant results with the Ten-Choice Trainer, and the nonrandom effects were
strongest in his data. Hence, there was no need to assume ESP. The scoring
might have arisen from errors by the experimenters (e.g., entering a target which
the subject was about to choose). If there were no such errors, it was still the
case that the targets were nonrandom. Hence the subjects, who were highly
selected, might have used guessing strategies, rather than ESP.
For purposes of the present discussion, I will focus on the possibilities for
guessing strategies (Tart’s subjects did have trial-by-trial feedback). To answer
this criticism Tart and Dronek (1980) developed a “Probabilistic Predictor
Program.” This was a computer program which mimicked ESP by utilizing all
information from previous targets to predict the next target in the sequence. Despite
its advantage of perfect memory, the computer program was unable to match
the success of Tart’s ESP subjects. On this basis Tart concluded that only a
small portion of the ESP scoring could be explained by inferential strategies.
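The general idea of a program that predicts the next target from earlier ones can be illustrated with a toy example. What follows is not Tart and Dronek's program, whose details are not given here; it is a minimal sketch under stated assumptions. The target generator is hypothetical and deliberately exaggerated (a ten-choice sequence that never repeats the previous target), and the predictor simply tallies which target has most often followed the current one.

```python
import random

N_CHOICES = 10

def biased_targets(n, rng):
    """Targets that never repeat the previous one (an exaggerated bias)."""
    seq = [rng.randrange(N_CHOICES)]
    while len(seq) < n:
        nxt = rng.randrange(N_CHOICES)
        if nxt != seq[-1]:
            seq.append(nxt)
    return seq

def predictor_hit_rate(targets):
    """Guess each target from first-order transition counts seen so far."""
    counts = [[0] * N_CHOICES for _ in range(N_CHOICES)]
    hits = 0
    for prev, target in zip(targets, targets[1:]):
        row = counts[prev]
        guess = row.index(max(row))  # most frequent successor of prev so far
        hits += guess == target
        row[target] += 1             # trial-by-trial feedback updates the tally
    return hits / (len(targets) - 1)

rng = random.Random(3)
rate = predictor_hit_rate(biased_targets(50_000, rng))
print(f"predictor hit rate: {rate:.3f} (chance = {1 / N_CHOICES:.3f})")
```

Even this crude predictor beats the nominal chance rate on a sequence deficient in repetitions, because it learns never to call the target that has just appeared.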
At first sight, Tart’s approach seems reasonable. As Wilson (1966) has argued,
all ESP test machines will exhibit small biases. Hence, it is always a matter of
taking bias into account when evaluating an ESP effect, and this is what Tart has
tried to do. However, Wilson was referring to cases where (a) the bias was a
more-or-less unvarying characteristic of the machine, and (b) the source of the
bias was well understood. Neither of these criteria was met in the case of the
Ten-Choice Trainer. Moreover, the biases in the target sequences obtained from
the Trainer, in the first training study, were much larger than those envisioned
by Wilson for a well-designed machine. The biases were large enough that Tart’s
highly selected subjects may have taken advantage of them, even though his
computer program could not do so very effectively. There are still many human
talents that computers cannot match. Otherwise, we should conclude that a
person is psychic whenever he or she defeats a computer program.
Tart (1980) regards the target biases as “slight deviations from randomicity”
which “have no real empirical consequences” (p. 219). This is a recurrent
theme in the psi controversy: Where critics see a “dirty test tube,” the original
investigator sees a minor contaminant. In the first training study the
contaminant was a nonrandom target sequence. However, none of Tart’s critics were
able to show that this could account for the extrachance scores achieved by his
subjects. Since the scores remained unexplained, Tart felt free to attribute them
to ESP. In doing so he made a shift from the usual logic of an ESP experiment.
The usual logic is to suppose an ESP effect only when all other alternatives have
been ruled out—not by post hoc analyses, but by the design of the experiment.
The reason for this is that ESP is not an established process; it is only a label that
we assign to certain anomalous findings. To verify the anomaly it is necessary to
absolutely rule out sources of error, as Humphreys (1968) has argued.
Humphreys’ approach is the conservative one adopted in this chapter.
were not introduced in any systematic fashion. Hyman (1981a) agreed with
Hansel, that this was a weakness in the Schmidt experiment, but he did not see
it as a fundamental failure. I am inclined to agree with Hyman. Schmidt did,
at least, conduct extensive control runs at the time of the experiment.
Hyman (1981a) observed that PK research by Schmidt and others was still
mired in its preliminary phases, since (a) the generators were “neither
standardized nor debugged,” and (b) results were “far from lawful, systematic, or
independently replicable” (p. 39). A survey of 214 PK experiments, conducted by
May, Humphrey, and Hubbard (1980) appears to substantiate Hyman’s claim.
The authors found that none of the experiments had been adequately designed
and written up. Their four summary points are reproduced below:
(1) No control tests were reported in more than 44% of the references. Of
those that did, most did not check for temporal stability of the random
sources during the course of the experiment.
(2) There were insufficient details about the physics and constructed
parameters of the experimental apparatus to assess the possibility of
environmental influences.
(3) The raw data were not saved for later and independent analysis in
virtually any of the experiments.
(4) None of the experiments reported controlled and limited access to the
experimental apparatus [p. 8].
As a check on the authors’ first conclusion I examined 27 PK experiments
cited by Honorton (1978b) which had (a) achieved significant results (by Hon-
orton’s criteria), and (b) been published in full. In only 17 of the 27 were ran
domness tests reported at the time of the experiment, and, as claimed by May,
Humphrey, and Hubbard, these tests were for overall bias, and did not (with the
exception of the study by Schmidt, 1970a) ensure detection of short-term gen
erator bias. The 17 experiments were all authored (or coauthored) by Schmidt
or Braud, except for one by Bierman and Hootkooper (1975-1977). In none of
the studies were randomness tests introduced in the systematic fashion advo
cated by critics, though this has been the case in more recent research (e.g., Bier
man & Houtkooper, 1981; Broughton, 1979, 1982). It was generally unclear
whether the REGs had been tested in the actual experimental environment, with
all peripheral equipment attached. This is a fundamental requirement for a
randomness test (Davis & Akers, 1974).
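
The difference between a test for overall bias and a test for short-term
bias can be illustrated with a minimal sketch. It is not a procedure taken
from any of the cited studies: the faulty generator model, the block size of
500 trials, and all function names are assumptions made for illustration. The
simulated generator reverses its bias every block, so the overall proportion
of 1s stays near chance while each block is clearly biased.

```python
import math
import random

def overall_z(bits):
    """z-score of the count of 1s against the chance proportion 0.5."""
    n = len(bits)
    return (sum(bits) - n / 2) / math.sqrt(n / 4)

def blockwise_chi2(bits, block_size):
    """Sum of squared per-block z-scores. Under the null hypothesis this is
    approximately chi-square with one degree of freedom per block."""
    starts = range(0, len(bits) - block_size + 1, block_size)
    blocks = [bits[s:s + block_size] for s in starts]
    return sum(overall_z(b) ** 2 for b in blocks), len(blocks)

def drifting_generator(n, block_size, bias, rng):
    """Hypothetical faulty REG: p(1) alternates between 0.5 + bias and
    0.5 - bias from one block to the next."""
    bits = []
    for i in range(n):
        sign = 1 if (i // block_size) % 2 == 0 else -1
        bits.append(1 if rng.random() < 0.5 + sign * bias else 0)
    return bits

rng = random.Random(0)  # fixed seed so the sketch is repeatable
bits = drifting_generator(n=20000, block_size=500, bias=0.1, rng=rng)
z = overall_z(bits)
chi2, df = blockwise_chi2(bits, block_size=500)
print(f"overall z = {z:.2f}")
print(f"blockwise chi-square = {chi2:.1f} on {df} df")
```

In this sketch the overall z stays small while the blockwise statistic is far
above its degrees of freedom. A real check would also have to be run in the
actual experimental environment, as noted above, and the block size would
need to be chosen before looking at the data.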
One means of systematically controlling for generator bias is to randomly pair
control and experimental trials, as suggested by Hansel (1981). This is a
procedure that Broughton began routinely applying several years ago (e.g., Broughton,
1979), and it is quite easily accomplished when one has an REG interfaced to a
computer (Broughton, 1982). However, this is not to imply that there are no
other means of accomplishing such control. Where several levels of a
“psi-conducive variable” are manipulated, the sequence of conditions could be
determined by randomly selecting Latin square patterns (Cochran & Cox, 1957).
However, this would have to be carefully planned out, since one needs to control
not just for some linear trend in the generator, but also for short-term biases,
and sequential dependencies. If a single Latin square pattern was used repeatedly,
it might coincide with a pattern that the generator exhibited. It should be
mentioned, however, that some Schmidt generators (e.g., Davis & Akers, 1974, pp.
403-406) have been tested by generating sequences of over a million trials, and
have shown no evidence of either short- or long-term bias. Hence, the problem is
not a severe one with a well-designed generator which has been thoroughly
tested.
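
The random pairing of control and experimental trials can be sketched as
follows. This is an illustration under stated assumptions, not Broughton's
or Hansel's actual procedure: the generator is simulated with a constant bias
(p_one), no psi effect is modelled, Python's pseudorandom generator stands in
for the independent source that assigns trials to conditions, and all names
are hypothetical. Because a coin flip decides which trial of each pair counts
as experimental, any generator bias falls on both conditions equally, and the
comparison of interest becomes experimental versus control rather than
experimental versus theoretical chance.

```python
import math
import random

def run_paired_session(n_pairs, p_one, rng):
    """Simulate n_pairs of REG trials with no psi effect. Within each pair a
    fair coin decides which of the two trials is labelled 'experimental'."""
    exp_hits = ctl_hits = 0
    for _ in range(n_pairs):
        first = 1 if rng.random() < p_one else 0
        second = 1 if rng.random() < p_one else 0
        if rng.random() < 0.5:
            exp, ctl = first, second
        else:
            exp, ctl = second, first
        exp_hits += exp
        ctl_hits += ctl
    return exp_hits, ctl_hits

def z_vs_chance(hits, n):
    """z-score of a hit count against the theoretical chance rate of 0.5."""
    return (hits - n / 2) / math.sqrt(n / 4)

def z_difference(exp_hits, ctl_hits, n):
    """Two-sample z-test for experimental versus control hit rates."""
    pooled = (exp_hits + ctl_hits) / (2 * n)
    se = math.sqrt(2 * pooled * (1 - pooled) / n)
    return (exp_hits - ctl_hits) / n / se

rng = random.Random(0)  # fixed seed so the sketch is repeatable
n = 20000
exp_hits, ctl_hits = run_paired_session(n, p_one=0.53, rng=rng)
z_exp = z_vs_chance(exp_hits, n)
z_diff = z_difference(exp_hits, ctl_hits, n)
print(f"experimental vs chance: z = {z_exp:.2f}")
print(f"experimental vs paired control: z = {z_diff:.2f}")
```

With a biased generator the first statistic looks like strong "psi-hitting,"
while the paired comparison does not. A constant bias is the simplest case;
the concern raised above about short-term biases and sequential dependencies
is the reason the assignment within each pair is randomized rather than fixed.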
4. Sensory Leakage
4.1. Introduction
In an ESP experiment the design should exclude cues from the target itself,
from an agent who “sends” the target (in a telepathy test), or from the
experimenters who have prepared the target. If judges are employed, as in many
free-response ESP tests, the design must also exclude leakage to them. These issues
will be discussed in turn.
1977) and that the container be kept well out of the subject’s reach. An
alternative would be to allow the subject unlimited access to the container, which was
made fraudproof. Price (1955) suggested that a metal container be used with a
cover welded on and photomicrographs taken of the welds. But perhaps such a
container could be accessed today by the same technology that is used to
examine oil pipelines (advanced imaging techniques). Rather than attempt to devise
the perfect container (which would in any case be impractical for day-to-day
research), an experimenter can simply take precautions to ensure that the
subject has no visual, tactile, or other access to the target container at any time
during the experiment. This may not be as easy as it sounds, if the target must be
prepared in advance, and the subject is a clever trickster.
The same principle, of limiting the subject’s access, applies to computer
storage of targets. Davis (1974) has observed that a subject who has access to the
computer, knows the data format, and has sufficient programming knowledge,
might subvert experimental precautions. Davis also noted that software “bugs”
might allow easy access to computer targets, in a system which was supposedly
secure.
doubtful of whether sensory cues were excluded. This depends on the
dimensions of the box, and on the relative locations of box and subject, but such
information is lacking. With one subject the investigators used the safer “down
through” technique, in which the deck of cards is left in an undisturbed pile as
the subject makes the calls. Unfortunately, it was not stated where the shuffling
and cutting of the card deck took place. Perhaps the subject was able to glimpse
the bottom card, on occasion. At any rate, the study as a whole is hard to
evaluate, given the reporting deficiencies. Hence I assigned a flaw.
There were similar reporting deficiencies in studies by Casler (1962), Nash
and Nash (1961), and Wilson (1964, Exp. 1). In none of these studies was the
relative location of subject and target(s) described. In the first two studies the
“down through” technique was used exclusively, and the bottom card was
shielded from view. I assumed that subjects were not permitted to handle the
decks in either of these studies, and on that basis found the procedure secure.
In the third study (Wilson, 1964, Exp. 1) the target was enclosed in an envelope
which was apparently unsealed. Again, it was unclear whether the subjects were
allowed to handle the envelope. Somewhat arbitrarily, I did assign a flaw in this
study, since the procedure is less well-defined than the “down through”
technique. Obviously, there are subjective judgments involved in assessing a “flaw”
as opposed to a “minor contaminant” or a “procedural weakness.” (It should be
noted, in passing, that Wilson himself never claimed to have found evidence for
ESP.)
In summary of the clairvoyance experiments, there were five cases (out of 33)
which appear to have allowed possibilities for sensory leakage (Fahler & Cadoret,
1958, Exp. C; Johnson & Kanthamani, 1967, Exp. 1; Shields, 1962, Exps. 1 & 2;
Wilson, 1964, Exp. 1). Three of these experiments had already been assigned
flaws for their randomization procedures (Johnson & Kanthamani, 1967, Exp. 1;
Shields, 1962, Exp. 1) or for a failure to describe randomization (Shields, 1962,
Exp. 2). Hence the proportion of flawed experiments increases from 13/54 (in
Section 3) to 15/54.
In the discussion which follows, I will consider only inadvertent leakage from
agent to percipient, leaving the question of deliberate cheating to Section 5.
In about half (12/22) of these experiments the percipient and agent were in
separate rooms, with one of the rooms soundproofed in some manner. I have
included a study by Krippner (1968) in this group, on the assumption that the
Maimonides “sleep room,” which was used in this study, was soundproofed. In
five of the 12 cases rooms are reported as “soundproofed” but are otherwise
undescribed (Braud & Braud, 1974, Exp. 2; Sargent, 1980b, Exps. 2, 3, & 5;
Sargent, Bartlett, & Moss, 1982). In Sargent’s experiments the agent moved
at the end of the session from the soundproof room to a “quiet room” that was
apparently unshielded. Possibly, the percipient had not at this time completed
the judging. However, the “quiet room” was in a building which did not adjoin
the percipient’s building. Moreover, the agent in the “quiet room” was usually
one of the experimenters. Hence there would seem to have been no possibility
for inadvertent cueing. There was a change from these conditions in the fifth
experiment from Sargent (1980b), where half the trials were conducted in
unshielded rooms within the same building. Yet, results were marginally significant
in the half conducted between two buildings. In all six of the cases cited above,
inadvertent cueing appears to have been eliminated, despite a vagueness in
the experimental reports.
In six of the twelve experiments commercial sound-isolation rooms were used
(Honorton & Harper, 1974; Moss & Gengerelli, 1968; Moss et al., 1970; Sondow,
1979; Terry & Honorton, 1976, Exps. 1 & 2). In four of these experiments this
was not apparent from the reports (Honorton & Harper, 1974; Sondow, 1979;
Terry & Honorton, 1976, Exps. 1 & 2). My information is based on a personal
communication from Honorton (1981), and on a phone conversation with
Sondow (1983). Auditory cueing was presumably eliminated, though this was not
objectively assessed.
This leaves another 10 cases where percipient and agent were not in shielded
rooms. In eight of these cases the percipient and agent were, at least, in
nonadjoining rooms which were separated by distances ranging from 30 to 78 feet
(where reported). These distances were probably sufficient, but I would feel
more secure if tests for auditory transmission had been conducted. My concern
is greatest in two experiments (Braud & Braud, 1973, Exp. 2; Braud & Wood,
1977) where the agents were someone other than the experimenters, and where
the percipients made oral (as opposed to written or push-button) responses. The
oral responses make the notion of a subtle guidance by the agent more plausible.
In the Braud and Wood (1977) study, the percipient’s verbalizations were actually
fed to the agent through an intercom. However, the intercom operated only
one-way. For the agent to have cued the percipient, any sounds would have had
to travel a distance of 70 feet, through two closed doors, and would have also
had to overcome the masking sounds of the “pink noise” which was fed through
the percipient’s earphones. In Braud and Braud (1973, Exp. 2) there was a
distance of 78 feet separating agent and percipient. There was no intercom used in
the latter study. In this study, as well as the last, it would appear that auditory
cues were eliminated. However, it would have been preferable if the
experimenters had objectively assessed the extent to which such cues had been eliminated.
In some buildings sounds travel quite readily between distant rooms.
Despite these reservations, I chose to assign a flaw only in studies by Casler
(1964) and Shields (1962, Exp. 1), where there were obvious possibilities for
sensory leakage. In Casler (1964) the agent and percipient were in adjoining
rooms with both doors open, and the participants in visual contact with the
experimenter (who remained in the hallway). Auditory cues were excluded by
the dubious means of playing a phonograph record during the trials. In Shields
(1962, Exp. 1) agent and percipient were in the same room directly across from
one another. The cards were screened from the subject, but it is unclear whether
the experimenter/agent’s face was screened.
On a cumulative basis, this increases the proportion of flawed experiments to
16/54 (Shields, 1962, Exp. 1, having already been cited).
cally, Hansel alleged that an experimenter may have been with the agent when
the latter opened the target envelope for a night’s session. If an experimenter
did know the target, he or she could have inadvertently biased the subject’s
dream report in the direction of the target content.
Published accounts of the experiment by Ullman & Krippner (1970) and
Ullman, Krippner, & Vaughan (1973) were ambiguous. However, I found a
preliminary report of the experiment (Ullman & Krippner, 1969) which resolves
the ambiguity. It is clear that the experimenter had no contact with the agent,
after the agent had selected the target envelope. How then did the agent receive
instructions for the night? (This was Hansel’s question.) It turns out that the
instructions were on a written form, enclosed in the envelope. If the published
accounts had provided these facts, the controversy would never have arisen.
It appears, however, that the experiment can be criticized from another
perspective. Generally, the experimenters were well aware of the need to keep
everyone blind (e.g., Ullman & Krippner, 1970, pp. 67 & 83). However, on one
trial from the experiment in question, there was a clear violation of this
experimental requirement: An experimenter, possibly a research assistant, personally
selected the last target after the agent had rejected two blind selections as “too
gruesome” (Ullman, Krippner, & Vaughan, 1973, p. 151). Possibly, the
experimenter did not see the target selected, but he at least saw which targets had been
rejected. The trial should clearly have been excluded from the series, on the
basis of nonrandom target selection, and nonblind conditions.
If there were no other violations of the experimental protocol, then Hansel’s
hypothesis (of leakage from the experimenter) can be rejected, since the results
would remain significant with the questionable trial eliminated. Yet, the
incident described leaves doubts as to the rigor with which the experiment was
conducted.
Targ and Puthoff acknowledged the existence of the cues, but denied that
this could account for the more striking correspondences that they had obtained.
To prove their point, they had the transcripts edited with the help of Tart, and
enlisted their own judge (who had no knowledge of the Price series). Using all
nine transcripts and targets, the new judge was able to match seven out of the
nine transcripts to their correct targets, a highly significant finding. Marks
(1981b) objected to the procedures of the rejudging. He speculated that Tart
might have been biased in his editing, or that the “blind” judge might have at
some time encountered published reports of the Price series.
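
The phrase "highly significant" can be made concrete with a worked
calculation. The sketch below is an illustration, not the analysis used in
the original rejudging: it assumes that the judge assigns each of the nine
targets to exactly one transcript and that, without ESP or cues, every such
assignment is equally likely. Under those assumptions the number of correct
matches equals the number of fixed points of a random permutation, and the
tail probability follows from counting derangements. The function names are
hypothetical.

```python
from fractions import Fraction
from math import comb, factorial

def derangements(n):
    """Number of permutations of n items that leave no item in its own place."""
    d = [1, 0]  # d[0] = 1, d[1] = 0
    for k in range(2, n + 1):
        d.append((k - 1) * (d[k - 1] + d[k - 2]))
    return d[n]

def p_at_least(n, k_min):
    """Probability of k_min or more correct matches among n one-to-one matches."""
    favourable = sum(comb(n, k) * derangements(n - k) for k in range(k_min, n + 1))
    return Fraction(favourable, factorial(n))

p = p_at_least(9, 7)
print(p, float(p))
```

Under these assumptions the probability of seven or more correct matches is
37/362880, or about 0.0001. The figure is only as good as the assumption that
the assignments are equally likely when no ESP is operating.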
Readers who are interested in further details may consult the cited sources.
The point to be made, however, is that no amount of editing can rectify the
critical flaw in the procedure, which is the use of trial-by-trial feedback with a
“closed” target pool. This problem was initially explained by Hyman (1977b):
The statistics and judging procedure assume independence of descriptions for
each target site. But this is obviously violated by the experimental procedure.
Immediately after the subject generates his description, he is taken to the
target site to be given feedback on how well he had done. Although the reason
for doing this may be understandable, it makes his next description no longer
independent of the first target site. To give one example of how this might
generate false hits, assume that the first site is a municipal swimming pool.
The next day the subject will probably avoid describing features that
obviously belong to a swimming pool. If the second site, say, is a marina, the
subject, in the third protocol, would avoid describing things that obviously belong
to a swimming pool or a marina, and so on. Such a situation, in principle,
could suffice to give a judge sufficient information to make perfect matches at
each site from the descriptions ... [p. 20].
Obviously, this is not a problem that can be solved by editing. Kennedy (1979b)
has described two ways of avoiding the problem. The first is to use open-deck
targets, and have judges evaluate transcripts in the order in which they were
produced. The second solution is the one commonly applied, which is to have
subjects do their own evaluations after each trial (again with open-deck targets).
Even though a wrong procedure was followed, the Price series may yet be
salvaged. It should be possible to conduct a rejudging in which nine independent
judges each evaluate one transcript against whichever targets the subject had not
yet visited at the time of that transcript. This would be cumbersome but feasible.
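
Hyman's argument about trial-by-trial feedback with a closed target pool can
be sketched as a small simulation. It is a deliberately idealized model, not
a reconstruction of the Price series: each transcript is reduced to the set
of sites whose features the subject avoids (every site already visited), the
subject has no ESP, and the site labels and function names are hypothetical.
A judge who is shown the transcripts in scrambled order can still recover
every match by elimination.

```python
import random

def simulate_series(sites, rng):
    """Closed-pool series with feedback and no ESP. Transcript i is modelled
    only by the set of sites it avoids: those visited before trial i."""
    order = list(sites)
    rng.shuffle(order)  # order of target visits, unknown to the judge
    avoided = [frozenset(order[:i]) for i in range(len(order))]
    return order, avoided

def judge_by_elimination(sites, avoided_sets):
    """Match each transcript to a site using only what the transcript avoids."""
    remaining = set(sites)
    guesses = [None] * len(avoided_sets)
    # Start with the transcript that avoids the most sites.
    by_size = sorted(range(len(avoided_sets)),
                     key=lambda i: len(avoided_sets[i]), reverse=True)
    for i in by_size:
        candidates = remaining - avoided_sets[i]
        site = candidates.pop()  # exactly one candidate is left at each step
        guesses[i] = site
        remaining.discard(site)
    return guesses

def unvisited_pool(order, trial):
    """Targets not yet visited when transcript `trial` (0-based) was produced."""
    return order[trial:]

sites = [f"site_{k}" for k in range(9)]  # hypothetical labels
rng = random.Random(1)
order, avoided = simulate_series(sites, rng)
presented = list(range(len(sites)))
rng.shuffle(presented)  # the judge sees the transcripts in scrambled order
guesses = judge_by_elimination(sites, [avoided[t] for t in presented])
hits = sum(guesses[pos] == order[t] for pos, t in enumerate(presented))
print(f"{hits} of {len(sites)} transcripts matched with no ESP in the model")
print("rejudging pool for transcript 3:", unvisited_pool(order, 3))
```

In the model all nine transcripts are matched, which is the "in principle"
case Hyman describes. The helper unvisited_pool corresponds to the rejudging
suggested above: a judge who compares one transcript only with the targets
not yet visited at that time gains nothing from the avoided features. Real
transcripts would carry this information far less cleanly than the model
assumes.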
basis of these cues, despite having no ESP ability at all. In a more realistic
example the cues would be something more subtle, such as a slight smudge or a crease
on the target.
Hyman (1977a) saw the placing of such cues, which could be either
inadvertent or deliberate, as a “clear-cut possibility” (p. 48) in a ganzfeld study by
Honorton and Harper (1974). The targets in that study were Viewmaster reels,
which the agent removed from the viewer, at the end of a session, to mix with
control reels. Honorton (1979) addressed this question two years later, when
the same issue was raised by Kennedy (1979a,b). (See also Stokes, 1978.)
Honorton thought that the handling cue hypothesis was overly speculative. Even if
such cues were present (and they were not known to be), they would not
necessarily have been detected by subjects. To further support his case, Honorton did
a comparison between studies that allowed handling cues, and studies that did
not. He found no difference in their success rates (see Section 11.2, for a
critique of this argument).
I believe that Honorton could more directly respond to Hyman and Kennedy
by conducting a rejudging of this material (if the subjects’ imagery reports are
still available). Outside judges could compare the subjects’ imagery with an
unhandled set of Viewmaster reels, in the manner recommended by Kennedy
(1979b). Until then, the study must be considered flawed, since the existence
of such cues is more than just a remote possibility. If such cues did exist, they
were not necessarily “subtle” or “subliminal,” as assumed by Honorton; they
may have been so blatant as to be readily detected by subjects. This is obviously
the case if the cues were deliberately introduced (see Section 5.2). If the
handling cue hypothesis is speculative, it is no more so than the ESP hypothesis.
The Honorton and Harper study was within my 54-experiment sample.
Another such study was by Sondow (1979), where the handling cue hypothesis is
also applicable. This possibility was noted by Sargent (1980a), an English
parapsychologist. In reply to Sargent, Sondow (1980) did find holes in some (but
not all) of his argument. She concluded that handling cues could not explain
(a) the “dramatic quality of the correspondences between target and response”
(p. 269), nor (b) the significant matching of targets and responses by outside
judges. However, the outside judges were not always successful. In particular,
they succeeded only when they had access to the subjects’ “associations.” The
subjects’ “associations” were obtained after they had viewed both the target and
control pictures; hence these could have been influenced by handling cues.
The appeal to “dramatic correspondences” is one that was also made by Tart,
Puthoff, and Targ (1980). I find it an uncomfortably subjective argument,
particularly when it is based, as with Sondow (1980), on the publication of
selected excerpts from subjects’ reports. The assumption is that subjects have
no knowledge of the target pool, but in Sondow’s experiment such information
could have been obtained from previous subjects. In any case, some
“remarkable correspondences” can arise purely as a result of chance coincidence (Child
& Levi, 1980). It is for this reason, of course, that parapsychologists typically
rely on objective techniques of assessment.
5. Subject Cheating
5.1. Introduction
Many claims for paranormal phenomena have been based on tests with highly
selected subjects (e.g., psychics, sensitives, or mediums). Obviously, a first
requirement of such tests is that they rule out trickery by the subjects. Hansel
(1980, pp. 29-33) cites several early experiments where trickery was not ruled
out, and where the subjects later confessed to having used tricks.
One of the most remarkable tricksters was Margery Crandon, a medium who
formed me that the room used in her 1979 study had been modified. Of course,
a trickster could introduce his or her own modifications.
It should be mentioned that possibilities for cheating were considerably
lessened, in seven telepathy experiments, by using an experimenter or an
experimental assistant as the agent (Bevan, 1947; Braud & Braud, 1974, Exps. 1 & 2;
Braud, Wood, & Braud, 1975; Grela, 1945; Krippner, 1968; Shields, 1962, Exp.
1). There were six other experiments where an experimenter served as agent for
at least some trials (Honorton & Harper, 1974; Sargent, 1980b, Exps. 2, 3, & 5;
Sargent, Bartlett, & Moss, 1982; Sondow, 1979). The results in two of these
experiments (Honorton & Harper, 1974; Sondow, 1979) remain significant when
one excludes trials where a friend of the subject served as agent. (See Rogo,
1979a, with respect to Honorton & Harper, 1974.) In the remaining
experiments, authored by Sargent, the results are not broken down according to who
the agent was.
In all of the above experiments, scenarios for fraud could be imagined,
particularly where the subject was left unattended as in studies by Bevan (1947),
Braud and Braud (1974), Braud, Wood, and Braud (1975), and Grela (1945).
However, this assumes that tricksters were present in the samples. In my own
experience, I have rarely encountered sophisticated trickery, even among
subjects claiming psychic skills. On that basis I did not assign flaws. Braud and
Braud (1974) show some awareness of the problem, since they locked the
subjects into their rooms during the experiment.
Trickery may also have been possible in many of the clairvoyance
experiments, though it is difficult to judge. A trickster might, for example, prepare
duplicate envelopes, similar to those used to enclose the target. As the
experiment was about to begin, he or she would find a means of misdirecting the
experimenter’s attention, and substitute the duplicate envelope for the
experimenter’s. The trickster would then need to open the target envelope and reseal
it in a new container, all without the experimenter’s noticing. Yet, an
experienced magician could probably accomplish this. The method would fail if the
target envelope had been sealed or coded in such a manner that the
substitution would be detected.
I do not know whether the trick described, or a variety of other tricks, could
have been used in the sample experiments. But in general, the sample
experiments seem to have been designed under the (reasonable) assumption that a
trickster would not be present. Usually, this would be a safe assumption. Hence,
in assigning flaws, I was only concerned with cases where ordinary subjects
might have cheated spontaneously, without much forethought. On that basis,
I found eight experiments where cheating may have been an easy task (Braud &
Braud, 1973, Exp. 2; Braud & Braud, 1977; Casler, 1964; Casper, 1952; Johnson
& Kanthamani, 1967, Exp. 1; Schmeidler, 1970; Terry & Honorton, 1976, Exps.
1 & 2).
In the study by Braud and Braud (1973, Exp. 2), which was less controlled
than their later research, the agents were allowed to select their own targets.
Apparently, they were unsupervised, and could have made selections which
were not random, but based on the percipients’ preferences. (The percipients
were their spouses.)
In one of the telepathy experiments (Casler, 1964) the agent and percipient
were at close range (see Section 4.4) and could easily have exchanged auditory
signals. In three other telepathy experiments (Casper, 1952; Terry & Honorton,
1976, Exps. 1 & 2) agent and percipient were in separate rooms, but one or the
other was unsupervised. In the study by Casper (1952) the agent could have
easily signaled the percipient by varying the manner of operating a “ready”
light. Where such “ready” signals are used, they should operate only from the
percipient’s room to the agent’s room, so that cueing is eliminated. There were
also obvious possibilities for cheating in the first experiment by Terry and
Honorton (1976), where undergraduate subjects took turns testing each other. The
senior investigators did supervise target selection, in some trials, but they
apparently had no further role in controlling percipients, agents, and student
experimenters. In the second study by Terry and Honorton, Terry was the sole
experimenter (see Terry, 1975). He could not have simultaneously controlled both
agent and percipient. In both of the Terry and Honorton studies, there was
another possibility—that an agent deliberately introduced handling cues on the
target (see Section 4.9).
Cheating was also a clear possibility in three clairvoyance experiments (Braud
& Braud, 1977; Johnson & Kanthamani, 1967, Exp. 1; Schmeidler, 1970). In
the study by Braud and Braud (1977) there was again the possibility that
subjects would deliberately introduce handling cues. The authors advanced five
arguments against that hypothesis, but none of these is entirely convincing.
The possibilities for cheating were more direct in the study by Johnson and
Kanthamani (1967, Exp. 1), where the subjects could have peeked into the unsealed
envelopes while the experimenter was occupied with her recording task (see
Section 4.3). There was also a direct avenue for cheating in an experiment by
Schmeidler (1970), where subjects recorded the targets before having submitted
their guesses. Perhaps they made small “revisions” in their guesses after learning
the targets. The “revisions” might correspond with a subject’s “second guess”;
they would not have to be outright fabrications.
In addition to the eight experiments cited above, there were another four
cases where it was difficult to judge whether subject fraud could have been easily
accomplished (Haraldsson, 1978; Fahler & Cadoret, 1958, Exp. C; Shields, 1962,
Exp. 2; Wilson, 1964, Exp. 1). In the Shields (1962, Exp. 2) and Fahler and
Cadoret (1958, Exp. C) studies, the experimental apparatus is only vaguely
described; it is unclear whether “peeking” was excluded. Likewise, it is unclear
from Wilson’s (1964, Exp. 1) report whether the subjects may have had an
opportunity to open the target envelope (which was apparently unsealed).
Haraldsson’s (1978) test machine was undescribed, so there is no way of knowing
whether subjects may have reset counters, or employed some other trick (a
long-playing record was awarded for high scores). All four experiments may have
failed to exclude cheating.
In summary, there were 12 experiments where subject fraud was not at all
excluded (or where this judgment could not be made). Since 11 of the
experiments had already been cited for other flaws, the cumulative proportion of
flawed experiments is increased only from 28/54 to 29/54 (by the inclusion of
Schmeidler, 1970).
6. Recording Errors
placed cards in piles under key cards, according to where they thought a match
might occur. In such an experiment it is important to carefully segregate the
piles before exposing the key cards. Otherwise, there may be some ambiguously
placed cards whose status as hits or misses must be decided on a post hoc basis.
The result could be an artifactual psi-hitting or psi-missing effect, depending on
the direction of the experimenter’s bias. This is a possibility in experiments by
Carpenter (1971) and Rao, Dukhan, and Rao (1978, Pilot, Exps. 1 & 2). Rao
has informed me that he observed some of the initial sessions conducted by
Dukhan (his student). If it can be established that these witnessed results were
independently significant, then this counterhypothesis will be eliminated. My
concern is partially derived from the fact that Dukhan collected an enormous
amount of data over a surprisingly short time span (see Dukhan & Rao, 1973).
There may have been a temptation to take shortcuts in recording, as well as in
other aspects of the procedure (which are not described).
In a blind-matching experiment by Shields (1962, Exp. 2) it is unclear whether
the individual target cards were recorded at all. Possibly, the experimenter
simply made a visual search for matches, and recorded only the total hits from each
run. Obviously, this procedure would enhance the possibilities for errors. Such
errors may also have inflated results in a study by Casler (1962), where there is a
similar failure to describe recording. A pertinent fact is that these experiments
were the first published by Casler and Shields; the two investigators were
relatively inexperienced.
Haraldsson (1978) used student experimenters who would also have been
inexperienced. They used an ESP test machine, but probably had to hand-copy
readings from digital counters on the machine (which was not described). It
may be that subjects were allowed access to reset buttons. When this is allowed,
some subjects will reset to zero before the experimenter has had a chance to
make careful observations of the counters. Suppose, however, that recording
was completely automated (as advocated by Gardner, 1975, and Hansel, 1980).
It is still the case that no control data were collected during the experiment (or
at least none were reported). There is no assurance that the recording
equipment functioned as it was supposed to function.
In summary, there are nine experiments which may have been flawed on the
basis of recording procedures: Carpenter (1971, Exp. 2); Casler (1962);
Haraldsson (1978); Johnson and Kanthamani (1967, Exp. 1); Rao, Dukhan, and Rao
(1978, Pilot, Exps. 1 & 2); and Shields (1962, Exps. 1 & 2).
There was one borderline case, a card experiment by Sargent (1978). Sargent
conducted a clairvoyance test in which he removed cards one at a time from a
pile after the subject made the call. This is the usual “BT technique” (Rhine &
Pratt, 1962), except that Sargent recorded each card as he proceeded through
the deck. The usual procedure is to record the deck of cards only after each run
is completed. There are problems that can arise with Sargent’s modified BT
procedure: Trying this myself, I found that it was easy to get out of step with a
subject who made rapid calls (the trials were not timed). Sometimes I would
turn a card over before the subject made his call. When this happened I had
changed the conditions from clairvoyance to telepathy, which made cueing more
likely (though Sargent’s subjects were screened from both cards and
experimenter). More important, I was no longer blind to the target while recording the
subject’s call. Assuming that Sargent was more practiced than I, in this
technique, he may not have encountered the same difficulty. Nonetheless, I have
chosen to assign a flaw here. There was no fixed intertrial interval in Sargent’s
experiment, and in such cases it is much preferred that recording take place at
the end of the run, as is usual.
Of the 10 experiments discussed above, four had already been cited for flaws
in previous sections. Hence the overall proportion of flawed experiments is
increased from 29/54 to 35/54.
and Hubbard (see Section 3.6) indicates that parapsychologists have generally
not provided for such checks. There are, however, signs that the situation is
improving, with some experimenters introducing control trials, generated at the
same time as experimental trials (e.g., Broughton, 1979, 1982).
(where subjects could not be readily classified as either sheep or goats).
Ambiguous cases may have been common, since the criteria for classification were loose.
Schmeidler and Murphy (1946) described sheep as those who accept the
theoretical possibility of ESP. In another report of the same experiments
(Schmeidler, 1945) sheep are described as those who accept that ESP could occur in the
actual experimental situation. However, Table 1 of the latter source described
sheep as those who hope to succeed in the experiment. Schmeidler and Murphy
(1946) wrote that “a rigid phrasing of the question was not considered
necessary” (p. 271). There were no fixed response alternatives, nor a preplanned
scoring system.
In Schmeidler’s group experiments the assignment of subjects to the sheep or
goat categories depended on integrating several sources of information. This
varied from one experiment to the next. In all experiments the major datum
was the subject’s self-assessment as a sheep or a goat, but this might change
during the testing session. In some experiments, subjects were asked to give an
open-ended explanation of why they considered themselves sheep or goats. In
other experiments they were asked to mark their degree of belief on a line
representing a continuum from belief to disbelief. Precisely how these sources were
integrated, from one experiment to the next, is unclear from the published
reports. McConnell (1959) claims that the sheep-goat forms from a September
1949 series were such that no ambiguities arose. Unfortunately, that particular
series may have been one where subjects obtained ESP feedback before having
submitted the sheep-goat forms. They may have made last-minute changes in
their self-assignment, after having learned their scores (see Schmeidler &
McConnell, 1958, Appendix B).
In many of the group series, the ESP data were recorded on sheets which
were separate from the sheep-goat forms. On this basis, Schmeidler (1959)
rejects the possibility of post hoc assignment. However, all records had
identifying names or numbers, and it would not be surprising if the experimenter
associated certain names or numbers with unusually high or low ESP scores. Where
Schmeidler definitely knew the ESP scores, she took the precaution of
submitting the sheep-goat form to a third party, such as Murphy, for evaluation
(Schmeidler & McConnell, 1958, pp. 48, 113). However, Scott notes that the
decision to submit or not submit an ambiguous form could itself have been
inadvertently biased by knowledge of the ESP score.
by Schmeidler, using Lowy’s notes. It is unclear from the report whether Lowy
or Schmeidler had seen the subjects’ ESP scores, prior to classification. A
similar question could be raised with respect to Bhadra’s (1966) experiment, in
which a division between sheep and goats was accomplished only after data
collection. However, there were, at least, fixed response alternatives in Bhadra’s
study, and the scoring system devised appears to be a reasonable one. Hence I
chose not to assign a flaw in this borderline case.
My sample also included two studies of personality correlates where the
classification into criterion groups was nonblind (Johnson & Kanthamani, 1967,
Exp. 1; Shields, 1962, Exp. 1). In Shields’ experiment, the division into
“withdrawn” and “not-withdrawn” groups was based in part on projective
personality tests which were administered and scored after the experimenter knew
the ESP scores. There was a similar problem in a study by Johnson and Kanthamani
(1967, Exp. 1), and this was acknowledged by the authors. In a second
experiment, they took precautions to ensure that scoring of the projective
measure was blind. A .05-level effect emerged in the same direction, but it did
not satisfy Palmer’s (1977) two-tailed criterion, and hence that study was not
included in my sample (see Section 2).
A study by Sargent and Harley (1981) represents a borderline case. The
authors conducted a pilot and two confirmatory experiments. They then
conducted some “preplanned” analyses of the combined data from all three
experiments. Those subjects who were high on extraversion (scores of 7 or 8)
obtained higher ESP scores than subjects who were low on extraversion (scores of
0, 1, or 2). Subjects with intermediate scores were left out of the analysis.
The critical question is whether the cut-off points were selected prior to any
collection of data. The analysis was “preplanned,” but how far in advance of
the analysis? The planning may have occurred before the pilot experiment, but
it may also have taken place before the confirmatory experiments, or before the
combining of data from all three experiments. Only if the planning preceded
the pilot was bias in the choice of cut-off points eliminated. Yet, the pilot was
described simply as a “preliminary examination of the value of the
questionnaire” where the authors “did not expect any significant effects” (p. 202). For
this reason, I am inclined to assume that the planning of the overview analysis
took place at a later time, and that bias was not eliminated. Hence the study
must tentatively be considered flawed.
In contrast with the personality measures, the ESP variables could usually be
scored in a straightforward fashion. Dale (1943) found that the most common
error, in fixed-choice experiments, is a failure to circle hits, resulting in a
psi-missing effect. I suspected such errors in two experiments which showed overall
psi-missing (Kanthamani & Rao, 1971, Exp. C; Ryzl, 1968). At my request, Rao
supervised a rescoring of Kanthamani & Rao’s data. Although a few such errors
were found, these did not account for the results. Until Ryzl’s (1968) data can
be rechecked, I will be inclined to suspect such errors. It was necessary to
hand-score 40,000 trials, and the scorer quite possibly knew which group (sheep or
goats) a given record sheet belonged to.
In summary of results from the sample, there were nine experiments with
possibilities for classification or scoring errors (Carpenter, 1971, Exp. 2; Johnson
& Kanthamani, 1967, Exp. 1; Moss & Gengerelli, 1968; Ryzl, 1968; Sargent &
Harley, 1981; Schmeidler, 1971; Schmeidler & McConnell, 1958, Individual &
Group Exps.; Shields, 1962, Exp. 1). Four of the experiments had already been
cited for flaws in previous sections. Hence the overall proportion of flawed
experiments is increased from 35/54 to 40/54.
8. Statistical Violations
but only to have lessened its chance for expression, in the control condition. In
hypnosis research, for example, a control group might include subjects who had
not been hypnotized. When I examined my sample studies, I found support for
Morris’ argument; in about two-thirds of the cases, the significance of the results
did depend on comparisons between two experimental conditions or between
two subject populations who were expected to score differently. Both Morris
(1982) and Palmer (1981) see the inclusion of control or comparison groups as
desirable in process-oriented research. Alcock sees their inclusion as an absolute
necessity. From a practical standpoint this is not such a large difference in
opinion.
Based on Palmer’s (1981) arguments, I have not assessed a flaw in sample
studies which lacked an empirical control group. As Palmer observes, the
inclusion of a control group does not in any way reduce one’s reliance on
probability theory. Tests of the difference between experimental and control
groups are necessarily grounded in probability assumptions. These assumptions
were often checked in early ESP research, by comparing calls against targets for
which they were not intended (Pratt et al., 1940/1966). The results nearly always
conformed to what one would expect from probability theory.
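The cross-check described above can be illustrated with a minimal sketch. It assumes five equally likely symbols, runs of 25 trials, and independently generated calls and targets; the function names and the simulated data are hypothetical and are not taken from Pratt et al. Each run of calls is scored against the target order that was intended for a different run, so any hits can only reflect chance.

```python
import random


def count_hits(calls, targets):
    """Count positions where the call matches the target."""
    return sum(c == t for c, t in zip(calls, targets))


def cross_check(call_runs, target_runs):
    """Score each run of calls against the targets of the NEXT run.

    Because the calls were never aimed at these targets, the hit
    counts should follow chance expectation.
    """
    n = len(call_runs)
    return [count_hits(call_runs[i], target_runs[(i + 1) % n]) for i in range(n)]


# Simulated data (an assumption for illustration only).
rng = random.Random(0)
symbols = ["circle", "cross", "waves", "square", "star"]
target_runs = [[rng.choice(symbols) for _ in range(25)] for _ in range(200)]
call_runs = [[rng.choice(symbols) for _ in range(25)] for _ in range(200)]

mismatched_hits = cross_check(call_runs, target_runs)
mean_hits = sum(mismatched_hits) / len(mismatched_hits)
print(f"mean hits per mismatched run: {mean_hits:.2f} (chance expectation: 5)")
```

If the mean of the mismatched scores departed markedly from 5 hits per run, that would point to a problem with the probability model rather than to ESP.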
If probability theory is flawed, and psi effects are some kind of statistical
fluke, it then becomes difficult to account for the “experimenter effect”
(Kennedy & Taddonio, 1976; Parker, 1978; White, 1977). This “effect” is simply a
dependence of psi results on the identity of the person conducting (or
supervising) the research. Many researchers, such as myself, obtain almost
uniformly chance results, while others obtain extrachance results on a routine
basis (e.g., Sargent, in his ganzfeld research). So far as I can tell, the
successful and unsuccessful experimenters employ the same statistical techniques,
as described by Burdick and Kelly (1977).
This is not to say that the techniques are always correctly applied. Some
common violations are discussed below.
al. (1970) study, if control target slides were sometimes borrowed from
“episodes” that the subject’s testing partner had already seen. (The description of
the experiment is unclear on this point.) There was a similar possibility for
indirect feedback in experiments by Honorton and Harper (1974) and Terry and
Honorton (1976). However, in the latter studies any feedback effect would be
too small to account for the observed results.
Diaconis (1978) also refers to the problems that can arise when multiple
responses to a single target are treated as if they were independent. Suppose, for
example, that “flower” is always the first target in a free-response experiment,
and that this corresponds with the response preference of the subjects. There
may be an enormous number of “hits” on the first trial, but this is obviously no
evidence of ESP. This is only evidence that the subjects agree among themselves.
Goodfellow (1938) made this argument long ago, in a critique of the Zenith
Radio mass-broadcast ESP tests. Often, a similar problem has arisen in remote
viewing research, where multiple judgments of a transcript/target pair have been
treated as if they were independent. This error has been discussed by Child and
Levi (1980), Marks (1982), and Stokes (1980).
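The “flower” example can be made concrete with a small simulation. This is only a sketch under stated assumptions: five response options, a shared preference that leads half of all subjects to call “flower,” and 400 subjects responding to one common target. None of these figures comes from the studies cited. The naive z-score treats every call as an independent trial with a hit probability of 1/5.

```python
import math
import random


def naive_z(hits, n, p):
    """Binomial z-score that (wrongly) treats all calls as independent trials."""
    return (hits - n * p) / math.sqrt(n * p * (1 - p))


rng = random.Random(1)
options = ["flower", "house", "boat", "tree", "dog"]
weights = [0.5, 0.125, 0.125, 0.125, 0.125]  # assumed shared response preference
n_subjects = 400

# Every subject responds to the same single target; no ESP is simulated.
calls = rng.choices(options, weights=weights, k=n_subjects)

z_preferred = naive_z(sum(c == "flower" for c in calls), n_subjects, 0.2)
z_unpopular = naive_z(sum(c == "boat" for c in calls), n_subjects, 0.2)

print(f"target = 'flower': naive z = {z_preferred:.1f}")
print(f"target = 'boat':   naive z = {z_unpopular:.1f}")
```

The same set of calls yields a large positive z when the target happens to be the popular response and a large negative z when it is an unpopular one. The “significance” therefore reflects agreement among subjects, not correspondence with the target.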
Multiple-call experiments can be appropriately analyzed by applying either a
majority-vote method, or the more sensitive “Greville technique” (Burdick &
Kelly, 1977; Greville, 1944). Within my 54-experiment sample, there were five
multiple-call experiments (Bhadra, 1966; McBain et al., 1970; Musso, 1965;
Schmeidler & McConnell, 1958, Group Exps.; Wilson, 1964, Exp. 1).
Surprisingly, in none of these cases were the data analyzed appropriately. Musso
(1965) realized the problem and “corrected” his z-score, but in a manner that
Burdick and Kelly (1977, Footnote 5) have rejected as inappropriate.
The multiple-calling problem in Schmeidler’s research was pointed out by
Scott (1951a). In response to Scott, Schmeidler (1951) acknowledged the
difficulty, but questioned whether it was a matter of practical concern in such
large-scale experiments. Presumably, the dependence among subjects’ calls is
rather small in an experiment with so many target lists. Scott (1951b) saw some
logic to Schmeidler’s argument and withdrew his criticism (though he later found
other reasons to question Schmeidler’s results, as discussed in Section 7.2).
Empirical comparisons (Davis, 1978; Humphrey, 1949) confirm that in
multiple-call data from typical forced-choice ESP tests, it makes little
difference whether the results are analyzed by the usual binomial formula, or by
the more appropriate Greville method. For this reason, I decided to assign a flaw
only in the case of a study by McBain et al. (1970), where the results were
marginally significant. Hopefully, these data are still available and can be
reanalyzed.
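A majority-vote analysis of multiple-call data might be sketched as follows. The data, the function names, and the handling of ties (tied targets are dropped) are assumptions made for illustration; the Greville technique is not attempted here. The calls of all subjects for a given target are reduced to one composite call, so that each target contributes at most one trial to the binomial test.

```python
import math
from collections import Counter


def majority_call(calls):
    """Return the most frequent call, or None if the top two are tied."""
    counts = Counter(calls).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]


def majority_vote_hits(targets, calls_by_subject):
    """Reduce all subjects' calls to one composite call per target."""
    hits = trials = 0
    for i, target in enumerate(targets):
        vote = majority_call([subject[i] for subject in calls_by_subject])
        if vote is None:  # assumed rule: tied targets are not scored
            continue
        trials += 1
        hits += vote == target
    return hits, trials


def binomial_upper_tail(hits, trials, p):
    """Exact one-tailed probability of at least `hits` successes."""
    return sum(
        math.comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(hits, trials + 1)
    )


# Hypothetical data: three subjects call the same five targets (three symbols).
targets = ["A", "B", "C", "A", "B"]
calls_by_subject = [
    ["A", "B", "A", "C", "A"],
    ["A", "C", "A", "C", "B"],
    ["B", "B", "C", "A", "C"],
]
hits, trials = majority_vote_hits(targets, calls_by_subject)
p_value = binomial_upper_tail(hits, trials, 1 / 3)
print(hits, trials, round(p_value, 4))
```

Because only one composite call is scored per target, agreement among the subjects can no longer inflate the number of nominally independent trials.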
It should be noted that the situation is quite different in remote-viewing
experiments, where the number of targets is generally small. It then becomes
critical to control for response bias (Marks, 1982). In the remote-viewing
experiments, violations of independence have arisen even where there was only one
response per target. In several such studies a single judge has evaluated the
correspondences between targets and responses. The judge may, under these
conditions, be influenced in assigning a rank or rating to a given target by the
memory of how he or she assigned ranks or ratings to other targets (Kennedy,
1979a, b; Morris, 1972; Scott, 1972). The data from such experiments can still be
analyzed, but it is necessary to apply the method described by Scott (1972).
Scott’s (1972) article does not seem to have been widely read, since researchers
have often made false assumptions of independence, and applied a binomial
analysis. Kennedy (1979a, b) has cited such cases, three of which were in my
sample (Honorton, 1972; Honorton & Stump, 1969; Parker & Beloff, 1970, Exp. 1).
Kennedy’s reanalyses show that Honorton’s (1972) results are actually
nonsignificant, while results from the other two studies would remain significant
when maximum dependence was assumed.
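One way to avoid assuming independence among a single judge's ratings is an exact permutation test on the full matrix of ratings. The sketch below is a generic illustration and is not claimed to reproduce Scott's (1972) method, whose details are not given here. It assumes that each transcript has exactly one correct target, that targets were assigned at random without replacement, and that the ratings matrix shown is hypothetical.

```python
from itertools import permutations


def permutation_p_value(ratings):
    """Exact one-tailed permutation test for a judge's ratings matrix.

    ratings[i][j] is the judge's rating of transcript i against target j;
    the correct target for transcript i is assumed to be target i.
    """
    n = len(ratings)
    observed = sum(ratings[i][i] for i in range(n))
    as_extreme = total = 0
    for perm in permutations(range(n)):
        total += 1
        if sum(ratings[i][perm[i]] for i in range(n)) >= observed:
            as_extreme += 1
    return observed, as_extreme / total


# Hypothetical ratings from one judge for three transcripts and three targets.
ratings = [
    [9, 2, 1],
    [3, 8, 2],
    [1, 4, 7],
]
observed, p_value = permutation_p_value(ratings)
print(observed, round(p_value, 4))
```

The reference distribution is built from the judge's own ratings, so whatever dependence exists among the entries is carried into every permuted total. Full enumeration is practical only for small numbers of targets, which is the usual situation in remote-viewing series.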
Questions of statistical dependence can also be raised with respect to a
hypnosis study by Krippner (1968). In this study judges’ ratings were entered into
matrices of targets by transcripts, and evaluated by an analysis of variance.
However, entries in such a matrix cannot be assumed to be independent. As
Kennedy (1979a, Footnote 4) observes, it is unclear how the analyses were
conducted, or how the dependencies were taken into account. Until these data
(which may no longer be available) can be reanalyzed, their significance will
remain in doubt.
In summary of results from the sample, there were five studies with possible
violations of independence (Braud & Braud, 1973, Exp. 2; Honorton, 1972;
Krippner, 1968; McBain et al., 1970; Moss et al., 1970). The studies by
Honorton, Krippner, and McBain et al. were not previously cited for flaws. Hence
their inclusion increases the proportion of flawed experiments from 40/54 to
43/54, in the overall sample.
set figure. I decided to assume that the number had been preset, unless there
was some reason for doubt. I was especially concerned in cases where an
experimenter obtained intermittent feedback and stopped the experiment at a time
when marginally significant results had been achieved. There were two such
cases in the sample (Honorton & Stump, 1969; Stanford & Mayer, 1974). In
the former case, the experiment was stopped when the two experimenters were
suddenly forced to relocate. In the latter case, the experiment was stopped so
that the experimenters could prepare a convention report. Since the stopping
point was less well-defined in the latter case (Stanford & Mayer, 1974) I have
chosen to assess a flaw; it is difficult to exclude the possibility that Stanford
and Mayer were unconsciously influenced by the trend of the results (which
yielded a p-value of .03, one-tailed). The Stanford and Mayer study, which was
generally well-controlled, also allowed a minor possibility for handling cues (see
Section 4.9).
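The concern about stopping at a favorable moment can be illustrated by simulation. The sketch below assumes a four-choice task (hit probability .25), a maximum of 200 trials, a one-tailed criterion of z = 1.645, and an experimenter who checks the running score every 10 trials. These settings are hypothetical and do not describe any study in the sample. No ESP is simulated, so every “significant” outcome is a false positive.

```python
import math
import random


def z_score(hits, n, p):
    """Normal-approximation z for `hits` successes in `n` trials."""
    return (hits - n * p) / math.sqrt(n * p * (1 - p))


def is_significant(rng, max_trials, p, z_crit, check_every=None):
    """Run one chance-only experiment; optionally stop early on a favorable look."""
    hits = 0
    for n in range(1, max_trials + 1):
        hits += rng.random() < p
        if check_every and n % check_every == 0 and z_score(hits, n, p) >= z_crit:
            return True  # experimenter stops as soon as the result looks significant
    return z_score(hits, max_trials, p) >= z_crit


def false_positive_rate(seed, n_experiments, **kwargs):
    rng = random.Random(seed)
    return sum(is_significant(rng, **kwargs) for _ in range(n_experiments)) / n_experiments


settings = dict(max_trials=200, p=0.25, z_crit=1.645)
rate_fixed = false_positive_rate(0, 2000, **settings)
rate_peeking = false_positive_rate(1, 2000, check_every=10, **settings)
print(f"preset number of trials: {rate_fixed:.3f}")
print(f"stop when significant:   {rate_peeking:.3f}")
```

With a preset number of trials the false-positive rate stays near the nominal .05, whereas stopping at the first favorable look raises it well above that level. This is why an undefined stopping point matters most when the final result is only marginally significant.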
the results were not so good, they were ‘demonstrations’ ” (p. 38). Marks and
Kammann provided no firm evidence of selection, and were forced to withdraw
one such allegation (Marks, 1981a). They did, however, provide evidence of
selection in Targ and Puthoff’s (1974) ESP studies with Uri Geller. Comparing
the 1974 report with a “daily log” of the same experiments (Puthoff & Targ,
1976), Marks and Kammann found discrepancies with respect to (a) whether a
trial had been designated a “pass” or a “real attempt,” and (b) whether
completed drawings were or were not submitted to judges. The ambiguities suggest
possibilities for an accidental selection bias. It may be that Targ and Puthoff can
answer this allegation, by distinguishing between two varieties of “passes,” only
one of which was to be excluded from the results. However, they have not yet
published a formal rebuttal.
observes, first, that many of the ganzfeld experiments have been significant on a
straightforward binary or direct hit analysis. Second, the obtained p-values in
many such studies are low enough to survive corrections for multiple analysis.
In my sample, there were 11 ganzfeld experiments which had been published
in full. (Hyman’s survey included many convention papers or other minor
publications.) Among the 11 studies, the results were significant on a direct-hit
analysis in seven cases, and on a binary-hit analysis in two cases. There was one
study (Sargent, Bartlett, & Moss, 1982) where significance was claimed on the
basis of a preplanned sum-of-ranks test. In the remaining study (Braud & Wood,
1977) there were, as Hyman notes, a bewildering variety of analyses. The
investigators used the Maimonides slides as their target material, and on that
basis, one can assume that they intended to apply the binary coding system which
Honorton devised for the slides. I am impressed by the results of Table 3 (p.
420), on the basis of which I compute an overall t of 3.88, with df = 29, for an
analysis based on binary coding. Since this result would withstand corrections
for multiple analysis, I do not believe that a flaw should be assigned.
Both Hyman (1983) and Sargent (1980b) have questioned whether the binary
hit analysis in Braud, Wood, and Braud’s (1975) study was preplanned. Stanford
(1981d) argues that it was, since the Brauds used the same analysis in previous
research (e.g., Braud & Braud, 1973; Braud & Braud, 1974). In the ganzfeld
condition from Braud, Wood, and Braud, the investigators obtained 10 binary
hits and 0 misses, a result which would again withstand some correction for
multiple analysis. The only other analysis which the Brauds had used was an
analysis for direct hits, as in their earlier research. On that basis, I saw no
reason to assign a flaw.
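The claim that 10 binary hits in 10 trials would survive a correction for multiple analysis can be checked with simple arithmetic. The sketch assumes that a binary hit has a chance probability of .5 on each trial, that the trials are independent, and that two analyses (binary hits and direct hits) were candidates. The Bonferroni adjustment is used here only as a conservative illustration; the text does not say which correction was intended.

```python
import math


def binomial_upper_tail(hits, trials, p):
    """Exact one-tailed probability of at least `hits` successes."""
    return sum(
        math.comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(hits, trials + 1)
    )


def bonferroni(p_value, n_analyses):
    """Conservative correction: multiply p by the number of analyses tried."""
    return min(1.0, p_value * n_analyses)


p_uncorrected = binomial_upper_tail(10, 10, 0.5)  # (1/2)**10 = 1/1024
p_corrected = bonferroni(p_uncorrected, 2)        # two candidate analyses assumed

print(p_uncorrected)  # 0.0009765625
print(p_corrected)    # 0.001953125
```

Even after doubling, the probability remains far below .05, which is consistent with the judgment that the result withstands some correction.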
Finally, mention should be made of the binary hit analysis used by Terry and
Honorton (1976, Exp. 1). Kennedy (1979a) raised some legitimate questions as
to whether the analysis had been preplanned. In his previous ganzfeld research,
Honorton had used only a direct-hit analysis (Honorton & Harper, 1974). If
direct hits had been the measure of significance in the Terry and Honorton
(1976, Exp. 1) study, it would have been deemed nonsignificant. (The direct hit
rate just misses significance, with p = .053, one-tailed.) In a further study Terry
and Honorton (1976, Exp. 2) reverted to a direct-hit analysis. Yet, Honorton
(1979) has provided strong assurances that the binary-hit measure was
preplanned, and on that basis I chose not to assign a flaw. However, it would have
been preferable if these assurances had been included in the original
experimental report.
Later, investigators began using measures which ought to be more sensitive
than simply counting the number of hits. Sargent (1980b) has used a
sum-of-ranks analysis (in addition to counts of direct hits). Palmer (e.g.,
Palmer & Vassar, 1974) and Stanford (e.g., Stanford & Mayer, 1974) have used
measures based on subject ratings. Obviously, the danger of overanalysis does
exist. Yet, so long as each investigator prespecifies a method, the error rate
remains at alpha, even though this may differ from what he or she used
previously, or from what others are using. It is not surprising that
investigators differ in their methods, since all of the free-response research is
in its preliminary stages; there are disagreements on many aspects of the
experimental designs.
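The danger of overanalysis can be quantified under a simplifying assumption. If an investigator tries k analyses and reports whichever one reaches significance, and if the analyses were statistically independent, the chance of at least one false positive would be 1 - (1 - alpha)^k. Alternative scorings of the same data are in fact correlated, so the figures below are an illustrative upper range and not an exact rate.

```python
def familywise_error_rate(alpha, n_analyses):
    """Chance of at least one false positive among independent analyses."""
    return 1 - (1 - alpha) ** n_analyses


for k in (1, 3, 5, 10):
    print(k, round(familywise_error_rate(0.05, k), 3))
# 1 0.05
# 3 0.143
# 5 0.226
# 10 0.401
```

With a single prespecified method (k = 1) the rate stays at alpha, which is the point made above.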
9. Reporting Failures
Many of the sample studies have already been cited for failures to describe
such procedural details as randomization or recording. Nonetheless, the sample
studies as a whole were well-described, at least in comparison with Rhine’s
(1934/1964) early ESP studies. Description of Rhine’s initial work was so bare
that no assessment was possible. Dingwall (1937) concluded that “the
experimenters have but slight idea of the kind of report which is necessary for
scientific men to be able to understand the work being accomplished” (pp. 140-141).
In response to a similar assessment by Hansel (1966, 1980), Rao (1981) has
defended Rhine, arguing that “details of the sort that we now require . . . were
simply not found necessary then” (p. 192). There is some truth in Rao’s
assessment. It should be emphasized that Rhine was not attempting to “prove ESP”
in the early research. He thought that there was already ample evidence for the
phenomenon (Rhine, 1934/1964, Chapter 2). It was only in the late 1930s, as
the research did become more proof-oriented, that reporting standards improved
(e.g., Pratt & Woodruff, 1939).
ly followed, or that there was a critical reporting omission. For that reason,
Morris believes that isolated experiments can never provide evidence for ESP; he
argues that the evidence can only come from repeatable experiments, such as
those discussed in the Wolman Handbook and in volumes of the present series.
Earlier, Crumbaugh (1966) made a similar argument. He noted that what an
experimenter has actually done “may deviate in unrecognized ways from what
he thinks he has done” (p. 52). Hence, only a repeatable experiment can be
adjudged as proof of ESP. However, Crumbaugh called for total repeatability.
Morris argues that repeatability need only be clearly above the chance level, and
sufficient for experimental progress. I agree with Morris that critics should not
insist on total repeatability at this early stage of parapsychological research. If
the repeatability is low, however, then critics can reasonably insist on unusually
high standards for the experimental reports. This is especially the case if “low
repeatability” means that some experimenters can never elicit the effects, even
under “psi conducive” conditions. (Hopefully, the repeatability problem is not
quite this severe.)
Deficiencies in reports from my sample have already been discussed, for the
most part, under appropriate headings. In some cases these deficiencies or
omissions appear to indicate a lack of planning. If an experiment has not been
planned, and there was no written protocol, then our knowledge of how it was
conducted depends entirely on the investigator’s ability to accurately recall what
he or she did. This recalled procedure is likely to differ from the actual
procedure, since it is based only on human testimony (e.g., Loftus, 1979).
On the other hand, a reporting omission does not always imply lack of
planning. A ganzfeld researcher might, for example, focus so closely on details of
the ganzfeld induction, seeing this as crucial to a successful replication, that he
or she forgets to report on the method of target randomization. The
randomization procedure might have been carefully planned, but simply left out of
the experimental report. Keeping this possibility in mind, I do encourage a
further “information exchange” with authors, as Morris (1980b) advocates. It is
preferable if such exchanges take place in print, rather than in private
correspondence; otherwise, there may be no public benefit. In the controversy
surrounding Targ and Puthoff’s (1974) research, there has been a voluminous
private correspondence (see Marks, 1981a; Morris, 1980b, 1981) which apparently
has left several key questions unresolved.
It is the responsibility of scientists to provide most of the relevant details in
their initial reports. A seriously flawed experimental report cannot be salvaged
by “information exchange.” Among the studies in my sample, procedural
details were especially lacking in studies by Haraldsson (1978); Honorton and
Harper (1974); Musso (1965); Rao, Dukhan, and Rao (1978); Shields (1962); and
Schmeidler and McConnell (1958). Rao, Dukhan, and Rao (1978) wrote a
report of a large study (92,600 card trials) that contains virtually no details on
recording and safeguarding of data. Musso (1965) conducted another very large
experiment, in which 302 subjects were tested by 11 different experimenters.
Yet, the description of the procedure occupies less than one page of text.
Haraldsson conducted a smaller study, but he used 14 undergraduate experimenters;
details were lacking on their training and/or supervision, nor was there a
description of their ESP test machine. Honorton and Harper (1974) failed to clarify
the locations of their agents, percipients, and experimenters. Shields (1962)
provided almost no procedural details in a report on personality correlates of ESP.
Schmeidler and McConnell (1958) described “typical” sessions, but did not
explain changes in the sheep-goat criteria from one session to the next. These
details are rarely available from earlier reports of the same experiments (see
Section 7.2).
With the exception of the study by Musso (1965), these experiments have
already been critiqued under previous subheadings. The inclusion of Musso
(1965) increases the proportion of flawed experiments, in the sample as a whole,
from 45/54 to 46/54.
dictum that “the charge of fraud . . . is leveled only at the peril of the accuser”
(p. 33). Price had focused on the research of Rhine in America and Soal in
England. He later corresponded with these investigators, and was influenced to
withdraw the allegations (Price, 1972). Yet, evidence accumulated in later years
that Soal may in fact have been guilty of data fabrication (Hansel, 1980;
Markwick, 1978; Scott & Haskell, 1974). Markwick’s argument seems fairly
conclusive, but her supporting analyses are involved, and have not yet been
independently checked.
Experimenter fraud has frequently been raised as a counter-hypothesis to psi
(e.g., Girden, 1978; Hansel, 1966, 1980; Markwick, 1978; Medhurst & Scott,
1974; Price, 1955; Scott & Haskell, 1974). Yet the actual incidence of fraud is
unknown. Broad and Wade (1982) reviewed many actual or apparent instances
of scientific fraud, and concluded, “fraud is endemic” (p. 224). Their review
does at least establish that fraud is much more common in science than had
previously been supposed. If faking of data is common, how much more
common are “lesser” violations of scientific ethics, e.g., data “massage,” data
suppression, distortions in experimental reports, and the use of “hired-hand”
assistants (Roth, 1966) who approach their experimental tasks with less than the
most lofty motives? These “lesser” violations generally pose little risk to senior
investigators. If they are found out, they can generally plead guilty to
carelessness; there will be no evidence of deliberate deception.
The general opinion seems to be that fraud or fudging cannot be completely
controlled. Since it cannot be controlled for, it will always represent a
“last-ditch” alternative for the skeptic, when he or she is presented with strong
evidence for psi. In short, experimenter fraud is an unfalsifiable hypothesis (e.g.,
Honorton, 1981; Morris, 1980b).
I would agree that experimenter fraud is difficult to control for, but I do not
agree that controls are impossible. Johnson (1975) and Schmidt (1980) have
proposed models which I believe would satisfy all but the most hardened
skeptic, on this point. The models are too complex to be more than briefly
mentioned here. In Johnson’s “Model 3” the experimenter cannot fake data,
because he or she does not have access to the targets until after the subject’s
calls have been recorded by a distant computer center. In Schmidt’s model, fraud is
eliminated because the experimenter does not know, at the time the data are
collected, which trials are experimental and which are controls.
The ease with which such methods can be introduced depends on the nature
of the experimental design. It might be difficult to introduce these controls into
a telepathy experiment, for example. However, the choice is not between
complete control and no control at all. There is some value in introducing partial
controls against fraud. The situation is analogous to classroom cheating.
Certainly, the standard procedure of monitoring students during an exam does not
prevent cheating. A determined trickster might, for example, use a miniature
electronic receiver in place of crib sheets. Yet monitoring of the classroom does
cut down on fraud, simply because it makes fraud more difficult. Presumably,
scientists as well as students cheat, in many cases, simply because cheating is
11. Summary
the procedure was the placing of an (unsealed?) target envelope in close
proximity to the subject (who was, however, not permitted to handle it).
Kanthamani and Rao (1971, Exp. C): This was a study of personality
correlates of ESP. The description of the procedure is skeletal, though further
details are available in Kanthamani’s (1969) dissertation. Again, randomization was
informal (shuffling). One-half of the ESP data were collected by Rao, an
experienced investigator. However, the other half were collected by an experimental
assistant who was relatively inexperienced. It is unclear whether the personality
differences would remain significant if Rao’s data were considered separately.
Sargent (1980b, Exps. 2, 3, & 5); Sargent, Bartlett & Moss (1982): These four
ganzfeld studies are considered together, since they all employed a similar
methodology. The weakest of these was Sargent’s (1980b) second experiment, where
the results were only marginally significant, and where there was some question
as to what defined the beginning and end of the experiment. This experiment
overlapped in time with Sargent’s (1980b) first experiment. In fact, the first
four sessions of the second experiment could, it would seem, have equally well
been assigned to the first experiment.
The Sargent experiments seem to have been well-controlled, on the whole.
Stanford (1981d) notes that there is a certain vagueness in Sargent’s
descriptions of his randomization procedures; this was the case in two of the sample
studies (Sargent, 1980b, Exps. 3 & 5). I would assume that the procedure was
similar to that in his initial experiments.
My main concern, in the Sargent experiments, is with possibilities for subject
cheating. In some trials (Sargent does not say how many) the agent is someone
other than an experimenter. This agent is in a shielded room, during part of the
experiment, but the room is not electrically shielded. Moreover, the agent leaves
the shielded room before there is any assurance that the percipient has completed
all the rankings. Perhaps the agent would have an opportunity, after leaving the
shielded room, to pass information to a confederate, who would then signal the
percipient. An argument against that hypothesis is that Sargent has obtained
significant results with naive subjects (Stanford, 1981d). On the other hand, his
strongest results have been with experienced subjects who have had a close
relationship with the agent.
In conclusion, eight experiments were conducted with reasonable care, but
none of these could be considered as methodologically ideal. When all 54
experiments are considered, it can be stated that the research methods are too
weak to establish the existence of a paranormal phenomenon.
the psi scoring rates differ; or (b) they can “repeat the experiments,
systematically varying the presence or absence of the suspected flaw” (p. 159). If psi
scoring rates are the same in flawed and unflawed samples, or if the experimental
introduction of the flaw makes no difference, then the “flaw” can be seen to
have no real-life consequences.
Honorton (1979) applied the first approach to the issue of handling cues
(which were discussed in Sections 4.8 and 4.9). He found that ESP scoring rates
were similar in studies which allowed, or did not allow handling cues. Thus, it
apparently made no difference whether or not this variable was controlled for.
Honorton’s implicit assumption is that the two samples were comparable in all
respects, except for the presence or absence of the handling cue possibility.
However, this assumption is one that most skeptical outsiders would be
unwilling to make. If a study is “unflawed” with respect to handling cues, and
yielded strong evidence of ESP, then the natural suspicion of the skeptic is that
the study may be flawed in some other respect, such as randomization or recording.
The skeptic may be wrong, but certainly this possibility must be taken into
account. Hence, as Kennedy (1979b) and Hansel (1980) have argued, the only
meaningful analysis is one in which all the methodological variables are
considered together.
Other variables can be held constant only if the presence or absence of the
suspected flaw is experimentally manipulated (Honorton’s second alternative).
This is a more viable choice, but it presents practical difficulties. To do this, one
needs to know the precise nature of the flaw (e.g., what the handling cues were).
This information is generally unavailable, because the flaw resulted from
uncontrolled variables. These variables were never measured. For this reason, they are
usually too poorly defined to allow the experimental manipulation that
Honorton desires. In this sense, a “potential flaw” is worse than an “actual flaw”
(whose effects have been identified and can be objectively assessed).
An analogy can again be made to the proverbial “dirty test tube” (Section
3.5). No one knows whether the dirt introduced an artifact, because the nature
of the dirt is unknown (or it would not have been allowed in the test tube to
begin with). The dirty test tube represents no more than a “potential artifact.”
Nevertheless, the investigators have no choice but to repeat the experiment with
a clean test tube. The results cannot be salvaged.
Since critics of parapsychological research identify “potential flaws,” they do
not usually succeed in establishing an alternative explanation for psi effects.
According to Honorton (1975a, 1979) their case is thereby weakened. However,
the critics are not ordinarily pushing for acceptance of an alternative hypothesis.
They are usually asking only that claims for psi be suspended until properly
controlled studies can be carried out. The burden of proof lies not with the
critics, but with the parapsychologists. That situation is not peculiar to
parapsychology. In science generally, new claims (and especially startling new
claims such as a cure for cancer) must withstand the critical scrutiny of
skeptical scientists.
On the other hand, there is a danger that critics will overstep their bounds,
and engage in speculative criticism long after the major methodological problems
304 Parapsychology
Methodological Criticisms of Parapsychology 163
have been solved. Honorton (1975a, 1979) believes that this is already the situation
in parapsychology. However, the present survey suggests that there are
methodological issues still to be solved, and that the critics have something positive
to offer. Indeed, I feel that critics have a responsibility to do more than just
criticize; they must offer constructive suggestions, and work with parapsychologists
in the design of new experiments, which exclude all counterhypotheses.
Table 1
Summary of Maimonides Results on Tendency for Dreams to Be Judged More Like Target
Than Like Nontargets in Target Pool

                        Judges' score   Subjects' score   z or t resulting from judgments
Series                  Hit     Miss    Hit     Miss      Judges       Subjects     Source

GESP: Dreams monitored and recorded throughout night; agent “transmitting” during each REM period
A. 1st screening        7       5       10      2         z = 0.71b    z = 1.33b    Ullman, Krippner, & Feldstein (1966)
B. 1st Erwin            5       2       6       1         z = 2.53b    z = 1.90b    Ullman et al. (1966)
C. 2nd screening        4       8       9       3         z = -0.25b   z = 1.77b    Ullman (1969)
D. Posin                6       2       6       2         z = 1.05c    z = 1.05c    Ullman (1969)
E. Grayeb               3       5       5       3         z = -0.63c   z = 0.63c    Ullman, Krippner, & Vaughan (1973)
F. 2nd Erwin            8       0                         t = 4.93a                 Ullman & Krippner (1969)
G. Van de Castle        6       2       8       0         t = 2.81a    t = 2.74a    Krippner & Ullman (1970)
H. Pilot sessions       53      14      42      22        z = 4.20b    z = 2.21b    Ullman et al. (1973)

Precognition: Dreams monitored and recorded throughout night; target experience next day
I. 1st Bessent          7       1                         t = 2.81a                 Krippner, Ullman, & Honorton (1971)
J. 2nd Bessent          7       1                         t = 2.27a                 Krippner, Honorton, & Ullman (1972)
K. Pilot sessions       2       0                         z = 0.67c                 Ullman et al. (1973)

GESP: Dreams monitored and recorded throughout night; agent active only at beginning or sporadically
L. Sensory bombardment  8       0       4       4         z = 3.11b    z = 0.00c    Krippner, Honorton, Ullman, Masters, & Houston (1971)
M. Grateful Dead        7       5       8       4         z = 0.61c    z = 0.81c    Krippner, Honorton, & Ullman (1973)

Clairvoyance: Dreams monitored and recorded throughout night; concealed target known to no one
N. Pilot sessions       5       3       4       5         z = 0.98b    z = 0.00b    Ullman et al. (1973)

Note. GESP = general extrasensory perception. Italics identify results obtained with
procedures that preserve independence of judgments in a series. For some series, the
published source does not use the uniform measures entered in this table, and
mimeographed laboratory reports were also consulted. Superscripts indicate which
measure was available, in order of priority.
a Ratings. b Rankings. c Score (count of hits and misses).
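The article later combines the three judge series whose judgments are independent (Lines H, L, and N) by the procedure ascribed to Stouffer, in which the z values are summed and divided by the square root of their number. What follows is only a minimal sketch of that calculation, not the author's own computation. It assumes the z values as read from Table 1 and an upper-tail normal probability; the source does not say whether its joint p is one- or two-tailed, and the function names are illustrative.

```python
from math import erfc, sqrt


def stouffer_z(z_values):
    """Combine independent z scores by Stouffer's method: sum(z) / sqrt(k)."""
    z_values = list(z_values)
    if not z_values:
        raise ValueError("need at least one z value")
    return sum(z_values) / sqrt(len(z_values))


def upper_tail_p(z):
    """Upper-tail probability of a standard normal deviate."""
    return 0.5 * erfc(z / sqrt(2))


# Judges' z values for Lines H, L, and N, as read from Table 1.
judges_z = {"H": 4.20, "L": 3.11, "N": 0.98}

combined_z = stouffer_z(judges_z.values())
joint_p = upper_tail_p(combined_z)  # assumption: one-tailed probability

print(f"combined z = {combined_z:.2f}, upper-tail p = {joint_p:.1e}")
```

Under these assumptions the combined z is about 4.79, and the tail probability, whether taken as one-tailed or doubled for a two-tailed test, falls below the bound of .000002 reported in the text.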
plan to merge the outcomes for judges and subjects. Moreover, the various series could
be split up in other ways. Although I think my organization of the table is very
reasonable (and I did not notice this outcome until after the table was constructed),
it is not the organization selected by Ullman et al. (1973); their table, if evaluated
statistically in this same way, would not yield so striking a result. What is clear is
that the tendency toward hits rather than misses cannot reasonably be ascribed to
chance. There is some systematic (that is, nonrandom) source of anomalous resemblance
of dreams to target.

Despite its breadth, this “hitting” tendency seems to vary greatly in strength. The
data on single dreams (Line O) suggest no consistency. At the other extreme, some
separate lines of the table look impressive. I will next consider how we may
legitimately evaluate the relative statistical significance of
1222 November 1985 • American Psychologist
separate parts of the data on all-night sessions. (I will not try to take exact account
here of the fact that the single-dream data are not significant, though it is wise to
have in mind that the exact values I cite must be viewed as slightly exaggerated, in
the absence of any explicit advance prediction that the results for all-night sessions
and for single dreams would differ greatly.)

Two difficulties, one general and one specific, stand in the way of making as thorough
an evaluation as I would wish. The general difficulty is that the researchers turned
the task of statistical evaluation over to various consultants (for the most part,
different consultants at various times), and some of the consultants must also have
influenced the choice of procedures and measures. The consultants, and presumably the
researchers themselves, seem not to have been at that time very experienced in working
with some of the design problems posed by this research nor in planning how the
research could be done to permit effective analysis. Much of the research was not
properly analyzed at the time, and for much of it the full original data are no longer
available. (The researchers have been very helpful in supplying me with material they
have been able to locate despite dispersal and storage of the laboratory’s files.
Perhaps additional details may be recovered in the future.) The result is that
completely satisfactory analysis is at present possible only for some portions of the
data.

The specific difficulty results from a feature of the research design employed in most
of the experimental series, a feature whose implications the researchers did not fully
appreciate at the time. If a judge is presented with a set of transcripts and a set of
targets and is asked to judge similarity of each target to each transcript, the various
judgments may not be completely independent. If one transcript is so closely similar to
a particular target that the judge is confident of having recognized a correct match,
the judge (or percipient, of course) may minimize the similarity of that target to the
transcripts judged later. Instructions to judges explicitly urged them to avoid this
error, but we cannot tell how thoroughly this directive was followed. Nonindependence
would create no bias toward either positive or negative evidence of correspondence
between targets and transcripts, but it would alter variability and thus render
inappropriate some standard tests of significance. I have entered in the two succeeding
columns of the table a t or a z that can be used in evaluating the statistical
significance of the departure from chance expectancy (t is required when ratings are
available, and z must be used when only rankings or score counts are available, because
sample variability in the former case is estimated from the data but in the latter case
must be based conservatively on a theoretical distribution). If ratings were available,
they were used; if not, rankings were used if available; otherwise, score count was
used.

Is there likely to have been much of this nonindependence in the series where it was
possible? A pertinent fact is that the hits were not generally direct hits. That is,
there was no overwhelming tendency for the correct target to be given first place
rather than just being ranked in the upper half of the target pool. This greatly
reduces the strength of the argument that ordinary significance tests are grossly
inaccurate because of nonindependence. Because certainty is not possible, however, we
need to separate results according to whether the procedures permitted this kind of
nonindependence. In the table, I have italicized results that cannot have been
influenced by this difficulty (either because each night’s ratings were made by a
different person or because each night in a series had, and was judged in relation to,
a separate target pool) or that closely approximate this ideal condition.

The outcome is clear. Several segments of the data, considered separately, yield
significant evidence that dreams (and associations to them) tended to resemble the
picture chosen randomly as target more than they resembled other pictures in the pool.
In the case of evaluation by outside judges, two of the three segments that are free of
the problem of nonindependence yield separately significant results: The pilot sessions
(Line H) yield a z of 4.20, and thus a p of .00002. An experiment with distant but
multisensory targets (Line L) yields a z of 3.11 and a p of .001. If we consider
segments in which judgments may not be completely independent of each other and analyze
them in the standard way, we find that the two series with psychologist William Erwin
as dreamer are also significant (if nonindependence of judgments does not seriously
interfere), Line B with a z of 2.53 (p < .01) and Line F with a t of 4.93 and 7 df
(p < .01). The two precognitive series (Lines I and J), each with 7 df, yield ts of
2.81 and 2.27, with p values slightly above and below .05, respectively.

Segment results based on the subjects’ own judgments of similarity are less significant
than those based on judgments by outside judges. Only two segments reach minimal levels
of statistical significance: Line G, where the t of 2.74 with 7 df is significant at
the .05 level, and Line H, where the z of 2.21 is significant at the .05 level.

The statistical evaluation of the separate segments of the Maimonides experiments also
permits a more adequate evaluation of their overall statistical significance. For
judgments by outside judges, three segments are free of the potential nonindependence
of successive judgments (Lines H, L, and N). Putting these three together by the
procedure Mosteller and Bush (1954, pp. 329-330) ascribed to Stouffer (recommended by
Rosenthal [1984, p. 72] as the “simplest and most versatile” of the possible
procedures), the joint p value is <.000002. For the subjects’ own judgments, six
segments are available (Lines A, C, G, H,
L, and N), and their joint p value is less than .002. The other segments of the data
have the problem of potential nonindependence of successive judgments, and even if the
exaggeration of significance may be small for a single line, I would not want to risk
compounding it in an overall p. Their prevailing unity of direction, however (direction
not being subject to influence by the kind of nonindependence involved here), and the
substantial size of some of the differences, justify the inference that the overall
evidence of consistency far exceeds that indicated by only those selected segments for
which a precise statistical statement is possible. The impression given by the mere
count of hits and misses is thus fully confirmed when more sensitive measures are used.

Parapsychological experiments are sometimes criticized on the grounds that what
evidence they provide for ESP indicates at most some very small effects detectable only
by amassing large bodies of data. Those to whom this criticism has any appeal should be
aware that the Maimonides experiments are clearly exempt from it. The significant
results on Lines F and G of the table, for example, are each attributable basically to
just eight data points.

If replications elsewhere should eventually confirm the statistically significant
outcome of the Maimonides experiments, would the fact of statistical significance in
itself establish the presence of the kind of anomaly called ESP? Of course not.
Statistical significance indicates only the presence of consistency and does not
identify its source. ESP, or the more general term psi, is a label for consistencies
that have no identifiable source and that suggest transfer of information by channels
not familiar to present scientific knowledge. A judgment about the appropriateness of
the label, and thus about the “ESP hypothesis,” is complex. It depends on a variety of
other judgments and knowledge: how confidently other possible sources of the consistent
effect can be excluded, whether other lines of experimentation are yielding results
that suggest the same judgment, and so on.

I believe many psychologists would, like myself, consider the ESP hypothesis to merit
serious consideration and continued research if they read the Maimonides reports for
themselves and if they familiarized themselves with other recent and older lines of
experimentation (e.g., Jahn, 1982, and many of the chapters in Wolman, 1977).

Some parapsychological researchers, among them the Maimonides group, have written at
times as though a finding of statistical significance sufficiently justified a
conclusion that the apparent anomaly should be classified as ESP. I can understand
their choice of words, which is based on their own confidence that their experiments
permitted exclusion of other interpretations. But perhaps psychologists who in the
future become involved in this area may prefer to use a term such as anomalies, so as
to avoid variable and possibly confusing connotations about the origin of the
anomalies. Zusne and Jones (1982) wisely prepared the way for this usage in speaking of
anomalistic psychology. But meanwhile, psychologists need not cut themselves off from
knowledge of relevant facts because of dissatisfaction with the terminology surrounding
their presentation.

Attempted Replications Elsewhere

The Maimonides pattern of controlled experiment in a sleep laboratory, obviously, is
extremely time consuming and expensive, and replication seems to have been attempted so
far at only two other sleep laboratories. At the University of Wyoming, two experiments
yielded results approximately at mean chance expectation: slightly below in one study
(Belvedere & Foulkes, 1971), slightly above in the other (Foulkes et al., 1972). In a
replication at the Boston University School of Medicine (Globus, Knapp, Skinner, &
Healey, 1968), overall results were not significantly positive, though in this instance
encouragement for further exploration was reported. The researchers had decided in
advance to base their conclusions on exact hits (that is, placing the target first,
rather than just in the upper half); by this measure, the results were encouraging,
though not statistically significant. Moreover, to quote the researchers, “Post hoc
analysis revealed that the judges were significantly more correct when they were more
‘confident’ in their judgments. . . . Further conservatively designed research does
seem indicated because of these findings” (Globus et al., 1968, p. 365).

A study by Calvin Hall (1967) is sometimes cited as a replication that confirmed the
Maimonides findings; in truth, however, although it provided impressive case material,
it was not done in a way that permits evaluation as a replication of the Maimonides
experiments. Several small-scale studies, done without the facilities of a sleep
laboratory, have been reported that are not replications of even one of the more
ambitious Maimonides experiments but each of which reports positive results that might
encourage further exploration (Braud, 1977; Child, Kanthamani, & Sweeney, 1977;
Rechtschaffen, 1970; Strauch, 1970; Van de Castle, 1971). In the case of these minor
studies, unlike the Maimonides studies and the three systematic replications, one must
recognize the likelihood of selective publication on the basis of interesting results.
Taken all together, these diverse and generally small-scale studies done elsewhere do,
in my opinion, add something to the conviction the Maimonides experiments might
inspire, that dream research is a promising technique for experimental study of the ESP
question.

The lack of significant results in the three systematic replications is hardly
conclusive evidence
against eventual replicability. In the Maimonides series, likewise, three successive
replications (Lines C, D, and E in Table 1) yielded no significant result, yet they are
part of a program yielding highly significant overall results.

If results of such potentially great interest and scientific importance as those of the
Maimonides program had been reported on a more conventional topic, one might expect
them to be widely and accurately described in reviews of the field to which they were
relevant, and to be analyzed carefully as a basis for sound evaluation of whether
replication and extension of the research were indicated, or of whether errors could be
detected and understood. What has happened in this instance of anomalous research
findings?

Representation of the Maimonides Research in Books by Psychologists

It is appropriate to begin with C. E. M. Hansel’s 1980 revision of his earlier critical
book on parapsychology. As part of his attempt to bring the earlier book up to date, he
included an entire chapter on experiments on telepathy in dreams. One page was devoted
to a description of the basic method used in the Maimonides experiments; one paragraph
summarized the impressive outcome of 10 of the experiments. The rest of the chapter was
devoted mainly to a specific account of the experiment in which psychologist Robert Van
de Castle was the subject (the outcome is summarized in Line G of my Table 1) and to the
attempted replication at the University of Wyoming (Belvedere & Foulkes, 1971), in
which Van de Castle was again the subject. Another page was devoted to another of the
Maimonides experiments that was also repeated at the University of Wyoming (Foulkes et
al., 1972). Hansel did not mention the replication by Globus et al. (1968), whose
authors felt that the results encouraged further exploration. Hansel gave more weight
to the two negative outcomes at Wyoming than to the sum of the Maimonides research,
arguing that sensory cues supposedly permitted by the procedures at Maimonides, not
possible because of greater care taken by the Wyoming experimenters, were responsible
for the difference in results. He did not provide, of course, the full account of
procedures presented in the original Maimonides reports that might persuade many
readers that Hansel’s interpretation is far from compelling. Nor did he consider why
some of the other experiments at Maimonides, not obviously distinguished in the care
with which they were done from the two that were replicated (e.g., those on Lines E, M,
and O of Table 1) yielded a close-to-chance outcome such as Hansel might have expected
sensory cuing to prevent.

Hansel exaggerated the opportunities for sensory cuing, that is, for the percipient to
obtain by ordinary sensory means some information about the target for the night. He
did this notably by misinterpreting an ambiguous statement in the Maimonides reports,
not mentioning that his interpretation was incompatible with other passages; his
interpretation was in fact erroneous, as shown by Akers (1984, pp. 128-129).
Furthermore, Hansel did not alert the reader to the great care exerted by the
researchers to eliminate possible sources of sensory cuing. Most important is the fact
that Hansel did not provide any plausible account, other than fraud, of how the
opportunities for sensory cuing that he claimed existed would be likely to lead to the
striking findings of the research. For example, he seemed to consider important the
fact that at Maimonides the agent could leave his or her room during the night to go to
the bathroom, whereas in Wyoming the agent had a room with its own bathroom, and the
outer door to the room was sealed with tape to prevent the agent from emerging. Hansel
did not attempt to say how the agent’s visit to the bathroom could have altered the
details of the percipient’s dreams each night in a manner distinctively appropriate to
that night’s target. The only plausible route of influence on the dream record seems to
be deliberate fraud involving the researchers and their subjects. The great number and
variety of personnel in these studies (experimenters, agents, percipients, and judges)
makes fraud especially unlikely as an explanation of the positive findings; but Hansel
did not mention this important fact.

It appears to me that all of Hansel’s criticisms of the Maimonides experiments are
relevant only on the hypothesis of fraud (except for the mistaken criticism I have
mentioned above). He said that unintentional communication was more likely but provided
no evidence either that it occurred or that such communication, in any form in which it
might have occurred, could have produced such consistent results as emerged from the
Maimonides experiments. I infer that Hansel was merely avoiding making explicit his
unsupported accusations of fraud. Fraud is an interpretation always important to keep
in mind, and it is one that could not be entirely excluded even by precautions going
beyond those used in the Wyoming studies. But the fact that fraud was, as always,
theoretically possible hardly justifies dismissal of a series of carefully conducted
studies that offer important suggestions for opening up a new line of inquiry into a
topic potentially of great significance. Especially regrettable is Hansel’s description
of various supposed defects in the experiments as though they mark the experiments as
being carelessly conducted by general scientific criteria, whereas in fact the supposed
defects are relevant only if one assumes fraud. A reader who is introduced to the
Maimonides research by Hansel’s chapter is likely to get a totally erroneous impression
of the care taken by the experimenters to avoid various possible sources of error. The
one thing they could
not avoid was obtaining results that Hansel considered a priori impossible, hence
evidence of fraud; but Hansel was not entirely frank about his reasoning.

An incidental point worth noting is that Hansel did not himself apply, in his critical
attack, the standards of evidence he demanded of the researchers. His conclusions were
based implicitly on the assumption that the difference of outcome between the
Maimonides and the Wyoming experiments was a genuine difference, not attributable to
random variation. He did not even raise the question, as he surely would have if, in
some parallel instance, the Maimonides researchers had claimed or implied statistical
significance where it was questionable. In fact, the difference of outcome might well
have arisen from random error; for the percipient’s own judgments the difference is
significant at the 5% level (2-tailed), but for the outsiders’ judgments it does not
approach significance.

Another 1980 book is The Psychology of Transcendence, by Andrew Neher, in which almost
100 pages are devoted to “psychic experience.” Neher differed from the other authors I
refer to in describing the Maimonides work as a “series of studies of great interest”
(p. 145), but this evaluation seems to be negated by his devoting only three lines to
it and four lines to unsuccessful replications.

A third 1980 publication, The Psychology of the Psychic, by David Marks and Richard
Kammann, provides less of a general review of recent parapsychology than Hansel’s book
or even Neher’s one long chapter. It is largely devoted to the techniques of mentalists
(that is, conjurors specializing in psychological rather than physical effects) and can
be useful to anyone encountering a mentalist who pretends to be “psychic.” Most readers
are not likely to be aware that parapsychological research receives only limited
attention. The jacket blurbs give a very different view of the book, as do the authors
in their introductory sentences:

    ESP is just around the next corner. When you get there, it is just around the next
    corner. Having now turned over one hundred of these corners, we decided to call it
    quits and report our findings for public review. (Marks & Kammann, 1980, p. 4)

Given this introduction to the nature of the book, readers might suppose it would at
least mention any corner that many parapsychologists have judged to be an impressive
turning. But the Maimonides dream experiments received no mention at all.

Another volume, by psychologist James Alcock (1981), quite clearly purports to include
a general review and evaluation of parapsychological research. Alcock mentioned (p. 6)
that Hansel had examined the Maimonides experiments, but the only account of them that
Alcock offered (on p. 163) was incidental to a discussion of control groups. By
implication he seemed to reject the Maimonides experiments because they included no
control groups. He wrote that “a control group, for which no sender or no target was
used, would appear essential” (p. 163). Later he added, “One could, alternatively,
‘send’ when the subject was not in the dream state, and compare ‘success’ in this case
with success in dream state trials” (p. 163). The first of these statements suggests a
relevant use of control groups but errs in calling it essential; in other psychological
research, Alcock would have doubtless readily recognized that within-subject control
can, where feasible, be much more efficient and pertinent than a separate control
group. His second statement suggests a type of experiment that is probably impossible
(because in satisfactory form it seems to require the subject to dream whether awake or
asleep and not to know whether he or she was awake or asleep). This second kind of
experiment, moreover, has special pertinence only to a comparison between dreaming and
waking, not to the question of whether ESP is manifested in dreaming.

Alcock, in short, did not seem to recognize that the design of the Maimonides
experiments was based on controls exactly parallel to those used by innumerable
psychologists in other research with similar logical structure (and even implied,
curiously enough, in his own second suggestion). He encouraged readers to think that
the Maimonides studies are beyond the pale of acceptable experimental design, whereas
in fact they are fine examples of appropriate use of within-subject control rather than
between-subjects control.

The quality of thinking with which Alcock confronted the Maimonides research appeared
also in a passage that did not refer to it by name. Referring to an article published
in The Humanist by Ethel Grodzins Romm, he wrote,

    Romm (1977) argued that a fundamental problem with both the dream telepathy
    research and the remote viewing tests is that the reports suffer from what she
    called “shoe-fitting” language; she cited a study in which the sender was installed
    in a room draped in white fabric and had ice cubes poured down his back. A receiver
    who reported “white” was immediately judged to have made a “hit” by an independent
    panel. Yet, as she observed, words such as “miserable”, “wet”, or “icy” would have
    been better hits. . . . Again, the obvious need is for a control group. Why are
    they not used? (p. 163)

What Romm described as “shoe fitting” (misinterpreting events to fit one’s
expectations) is an important kind of error that is repeatedly made in interpretation
of everyday occurrences by people who believe they are psychic. But the dream telepathy
research at Maimonides was well protected against this kind of error by the painstaking
controls that Alcock seemed not to have noticed. Surely Romm must be referring to some
other and very sloppy dream research?
Not at all. The details in this paragraph, and even more in Romm’s article, point
unmistakably, though inaccurately, to the fifth night of the first precognitive series
at Maimonides. The actual details of target and response would alone deprive it of much
of its value as an example of shoe fitting. As reported by Krippner, Ullman, & Honorton
(1971), the target was a morning experience that included being in a room that was
draped with white sheets. The subject’s first dream report had included the statement,
“I was just standing in a room, surrounded by white. Every imaginable thing in that
room was white” (p. 201). There is more similarity here than Romm and Alcock
acknowledged in mentioning from this passage only the single word “white.”

More important, however, is the fact that the experiment they were referring to
provided no opportunity for shoe fitting. The procedures followed in the experiment
were completely misrepresented in a way that created the illusion that the possibility
existed. There was no panel, in the sense of a group of people gathered together and
capable of influencing each other. The judges, operating independently, separately
judged every one of the 64 possible combinations of target and transcript yielded by
the eight nights of the experiment, not just the eight correct pairings, and they had
no clues to which those eight were. Their responses are hardly likely to have been
immediate, as they required reading the entire night’s transcript. Because each judge
was working alone and was not recording times, there would have been no record if a
particular response had been immediate, and no record of what particular element in the
transcript led to an immediate response.

I looked up in a 1977 issue of The Humanist the article by Romm that Alcock cited. The
half page on shoe-fitting language gave as examples this item from the Maimonides
research and also the SRI remote viewing experiments (Puthoff & Targ, 1976) done at SRI
International. In both cases what was said was pure fiction, based on failure to note
what was done in the experiments and in particular that the experimenters were well
aware of the danger of shoe-fitting language and that the design of their experiments
incorporated procedures to ensure that it could not occur. Romm’s ignorance about the
Maimonides research and her apparent willingness to fabricate falsehoods about it
should be recognized by anyone who had read any of the Maimonides research
publications. Yet Alcock accepted and repeated the fictions as though they were true.
His presentation in the context of a book apparently in the scientific tradition seems
to me more dangerous than Romm’s original article, for anyone with a scientific
orientation should be able to recognize Romm’s article as propaganda. Its title, for
example, is “When You Give a Closet Occultist a PhD, What Kind of Research Can You
Expect?” and it repeatedly speaks of “cult phuds,” meaning people with PhDs who are
interested in parapsychological problems. Alcock’s repetition of Romm’s misstatements
in a context lacking these clues may well be taken by many a reader as scholarly
writing based on correct information and rational thought. Paradoxically, both Alcock’s
paragraph and Romm’s article are excellent examples of the shoe-fitting error that both
decry in others who are in fact carefully avoiding it.

The last of the five books that bring, or fail to bring, the Maimonides research to the
attention of psychologists and their students is Anomalistic Psychology: A Study of
Extraordinary Phenomena of Behavior and Experience, a 1982 volume by Leonard Zusne and
Warren H. Jones. This is in many ways an excellent book, and it is also the one of the
five that comes closest to including a general review of important recent research in
parapsychology. Its brief account of the Maimonides dream experiments, however,
misrepresented them in ways that should seriously reduce a reader’s interest in
considering them further.

Zusne and Jones’s description of the basic procedure made three serious errors. First,
it implied that one of the experimenters had a chance to know the identity of the
target. (“After the subject falls asleep, an art reproduction is selected from a large
collection randomly, placed in an envelope, and given to the agent” p. 260). In fact,
precautions were taken to ensure that no one but the agent could know the identity of
the target. Second, the authors stated that “three judges . . . rate their confidence
that the dream content matches the target picture” (p. 260), leading the reader to
suppose that the judges were informed of the identity of the target at the time of
rating. In fact, a judge was presented with a dream transcript and a pool of potential
targets and was asked to rate the degree of similarity between the transcript and each
member of the pool, while being unaware of which member had been the target. Third,
there was a similarly, though more obscurely, misleading description of how ratings
were obtained from the dreamer.

This misinformation was followed by even more serious misrepresentation of the research
and, by implication, of the competence of the researchers. Zusne and Jones (1982) wrote
that Ullman and Krippner (1978) had found that dreamers were not influenced
telepathically unless they knew in advance that an attempt would be made to influence
them. This led, they wrote, to the subject’s being “primed prior to going to sleep”
through the experimenter’s

    preparing the receiver through experiences that were related to the content of the
    picture to be telepathically transmitted during the night. Thus, when the picture
    was Van Gogh’s Corridor of the St. Paul Hospital, which depicts a lonely figure in
    the hallways of a mental hospital, the receiver: (1)
Daryl J. Bem, Department of Psychology, Cornell University; Charles Honorton, Department of Psychology, University of Edinburgh, Edinburgh, Scotland.

Sadly, Charles Honorton died of a heart attack on November 4, 1992, 9 days before this article was accepted for publication. He was 46. Parapsychology has lost one of its most valued contributors. I have lost a valued friend.

This collaboration had its origins in a 1983 visit I made to Honorton’s Psychophysical Research Laboratories (PRL) in Princeton, New Jersey, as one of several outside consultants brought in to examine the design and implementation of the experimental protocols.

Preparation of this article was supported, in part, by grants to Charles Honorton from the American Society for Psychical Research and the Parapsychology Foundation, both of New York City. The work at PRL summarized in the second half of this article was supported by the James S. McDonnell Foundation of St. Louis, Missouri, and by the John E. Fetzer Foundation of Kalamazoo, Michigan.

Helpful comments on drafts of this article were received from Deborah Delanoy, Edwin May, Donald McCarthy, Robert Morris, John Palmer, Robert Rosenthal, Lee Ross, Jessica Utts, Philip Zimbardo, and two anonymous reviewers.

Correspondence concerning this article should be addressed to Daryl J. Bem, Department of Psychology, Uris Hall, Cornell University, Ithaca, New York 14853. Electronic mail may be sent to d.bem@cornell.edu.

The term psi denotes anomalous processes of information or energy transfer, processes such as telepathy or other forms of extrasensory perception that are currently unexplained in terms of known physical or biological mechanisms. The term is purely descriptive: It neither implies that such anomalous phenomena are paranormal nor connotes anything about their underlying mechanisms.

Does psi exist? Most academic psychologists don’t think so. A survey of more than 1,100 college professors in the United States found that 55% of natural scientists, 66% of social scientists (excluding psychologists), and 77% of academics in the arts, humanities, and education believed that ESP is either an established fact or a likely possibility. The comparable figure for psychologists was only 34%. Moreover, an equal number of psychologists declared ESP to be an impossibility, a view expressed by only 2% of all other respondents (Wagner & Monnet, 1979).

We psychologists are probably more skeptical about psi for several reasons. First, we believe that extraordinary claims require extraordinary proof. And although our colleagues from other disciplines would probably agree with this dictum, we are more likely to be familiar with the methodological and statistical requirements for sustaining such claims, as well as with previous claims that failed either to meet those requirements or to survive the test of successful replication. Even for ordinary claims, our conventional statistical criteria are conservative. The sacred p = .05 threshold is a constant reminder that it is far more sinful to assert that an effect exists when it does not (the Type I error) than to assert that an effect does not exist when it does (the Type II error).

Second, most of us distinguish sharply between phenomena whose explanations are merely obscure or controversial (e.g., hypnosis) and phenomena such as psi that appear to fall outside our current explanatory framework altogether. (Some would characterize this as the difference between the unexplained and the inexplicable.) In contrast, many laypersons treat all exotic psychological phenomena as epistemologically equivalent; many even consider deja vu to be a psychic phenomenon. The blurring of this critical distinction is aided and abetted by the mass media, “new age” books and mind-power courses, and “psychic” entertainers who present both genuine hypnosis and fake “mind reading” in the course of a single performance. Accordingly, most laypersons would not have to revise their conceptual model of reality as radically as we would in order to assimilate the existence of psi. For us, psi is simply more extraordinary.

Finally, research in cognitive and social psychology has sensitized us to the errors and biases that plague intuitive attempts to draw valid inferences from the data of everyday experience (Gilovich, 1991; Nisbett & Ross, 1980; Tversky & Kahneman, 1971). This leads us to give virtually no probative weight to anecdotal or journalistic reports of psi, the main source cited by
330 Parapsychology
ANOMALOUS INFORMATION TRANSFER 5
our academic colleagues as evidence for their beliefs about psi (Wagner & Monnet, 1979).

Ironically, however, psychologists are probably not more familiar than others with recent experimental research on psi. Like most psychological research, parapsychological research is reported primarily in specialized journals; unlike most psychological research, however, contemporary parapsychological research is not usually reviewed or summarized in psychology’s textbooks, handbooks, or mainstream journals. For example, only 1 of 64 introductory psychology textbooks recently surveyed even mentions the experimental procedure reviewed in this article, a procedure that has been in widespread use since the early 1970s (Roig, Icochea, & Cuzzucoli, 1991). Other secondary sources for nonspecialists are frequently inaccurate in their descriptions of parapsychological research. (For discussions of this problem, see Child, 1985, and Palmer, Honorton, & Utts, 1989.)

This situation may be changing. Discussions of modern psi research have recently appeared in a widely used introductory textbook (Atkinson, Atkinson, Smith, & Bem, 1990, 1993), two mainstream psychology journals (Child, 1985; Rao & Palmer, 1987), and a scholarly but accessible book for nonspecialists (Broughton, 1991). The purpose of the present article is to supplement these broader treatments with a more detailed, meta-analytic presentation of evidence issuing from a single experimental method: the ganzfeld procedure. We believe that the replication rates and effect sizes achieved with this procedure are now sufficient to warrant bringing this body of data to the attention of the wider psychological community.

The Ganzfeld Procedure

By the 1960s, a number of parapsychologists had become dissatisfied with the familiar ESP testing methods pioneered by J. B. Rhine at Duke University in the 1930s. In particular, they believed that the repetitive forced-choice procedure in which a subject repeatedly attempts to select the correct “target” symbol from a set of fixed alternatives failed to capture the circumstances that characterize reported instances of psi in everyday life.

Historically, psi has often been associated with meditation, hypnosis, dreaming, and other naturally occurring or deliberately induced altered states of consciousness. For example, the view that psi phenomena can occur during meditation is expressed in most classical texts on meditative techniques; the belief that hypnosis is a psi-conducive state dates all the way back to the days of early mesmerism (Dingwall, 1968); and cross-cultural surveys indicate that most reported “real-life” psi experiences are mediated through dreams (Green, 1960; Prasad & Stevenson, 1968; L. E. Rhine, 1962; Sannwald, 1959).

There are now reports of experimental evidence consistent with these anecdotal observations. For example, several laboratory investigators have reported that meditation facilitates psi performance (Honorton, 1977). A meta-analysis of 25 experiments on hypnosis and psi conducted between 1945 and 1981 in 10 different laboratories suggests that hypnotic induction may also facilitate psi performance (Schechter, 1984). And dream-mediated psi was reported in a series of experiments conducted at Maimonides Medical Center in New York and published between 1966 and 1972 (Child, 1985; Ullman, Krippner, & Vaughan, 1973).

In the Maimonides dream studies, two subjects (a “receiver” and a “sender”) spent the night in a sleep laboratory. The receiver’s brain waves and eye movements were monitored as he or she slept in an isolated room. When the receiver entered a period of REM sleep, the experimenter pressed a buzzer that signaled the sender (under the supervision of a second experimenter) to begin a sending period. The sender would then concentrate on a randomly chosen picture (the “target”) with the goal of influencing the content of the receiver’s dream.

Toward the end of the REM period, the receiver was awakened and asked to describe any dream just experienced. This procedure was repeated throughout the night with the same target. A transcription of the receiver’s dream reports was given to outside judges who blindly rated the similarity of the night’s dreams to several pictures, including the target. In some studies, similarity ratings were also obtained from the receivers themselves. Across several variations of the procedure, dreams were judged to be significantly more similar to the target pictures than to the control pictures in the judging sets (failures to replicate the Maimonides results were also reviewed by Child, 1985).

These several lines of evidence suggested a working model of psi in which psi-mediated information is conceptualized as a weak signal that is normally masked by internal somatic and external sensory “noise.” By reducing ordinary sensory input, these diverse psi-conducive states are presumed to raise the signal-to-noise ratio, thereby enhancing a person’s ability to detect the psi-mediated information (Honorton, 1969, 1977). To test the hypothesis that a reduction of sensory input itself facilitates psi performance, investigators turned to the ganzfeld procedure (Braud, Wood, & Braud, 1975; Honorton & Harper, 1974; Parker, 1975), a procedure originally introduced into experimental psychology during the 1930s to test propositions derived from gestalt theory (Avant, 1965; Metzger, 1930).

Like the dream studies, the psi ganzfeld procedure has most often been used to test for telepathic communication between a sender and a receiver. The receiver is placed in a reclining chair in an acoustically isolated room. Translucent ping-pong ball halves are taped over the eyes and headphones are placed over the ears; a red floodlight directed toward the eyes produces an undifferentiated visual field, and white noise played through the headphones produces an analogous auditory field. It is this homogeneous perceptual environment that is called the Ganzfeld (“total field”). To reduce internal somatic “noise,” the receiver typically also undergoes a series of progressive relaxation exercises at the beginning of the ganzfeld period.

The sender is sequestered in a separate acoustically isolated room, and a visual stimulus (art print, photograph, or brief videotaped sequence) is randomly selected from a large pool of such stimuli to serve as the target for the session. While the sender concentrates on the target, the receiver provides a continuous verbal report of his or her ongoing imagery and mentation, usually for about 30 minutes. At the completion of the ganzfeld period, the receiver is presented with several stimuli (usually four) and, without knowing which stimulus was the target, is asked to rate the degree to which each matches the imagery and mentation experienced during the ganzfeld period. If the receiver assigns the highest rating to the target stimulus, it
6 DARYL J. BEM AND CHARLES HONORTON
is scored as a “hit.” Thus, if the experiment uses judging sets containing four stimuli (the target and three decoys or control stimuli), the hit rate expected by chance is .25. The ratings can also be analyzed in other ways; for example, they can be converted to ranks or standardized scores within each set and analyzed parametrically across sessions. And, as with the dream studies, the similarity ratings can also be made by outside judges using transcripts of the receiver’s mentation report.

Meta-Analyses of the Ganzfeld Database

In 1985 and 1986, the Journal of Parapsychology devoted two entire issues to a critical examination of the ganzfeld database. The 1985 issue comprised two contributions: (a) a meta-analysis and critique by Ray Hyman (1985), a cognitive psychologist and skeptical critic of parapsychological research, and (b) a competing meta-analysis and rejoinder by Charles Honorton (1985), a parapsychologist and major contributor to the ganzfeld database. The 1986 issue contained four commentaries on the Hyman-Honorton exchange, a joint communique by Hyman and Honorton, and six additional commentaries on the joint communique itself. We summarize the major issues and conclusions here.

Replication Rates

Rates by study. Hyman’s meta-analysis covered 42 psi ganzfeld studies reported in 34 separate reports written or published from 1974 through 1981. One of the first problems he discovered in the database was multiple analysis. As noted earlier, it is possible to calculate several indexes of psi performance in a ganzfeld experiment and, furthermore, to subject those indexes to several kinds of statistical treatment. Many investigators reported multiple indexes or applied multiple statistical tests without adjusting the criterion significance level for the number of tests conducted. Worse, some may have “shopped” among the alternatives until finding one that yielded a significantly successful outcome. Honorton agreed that this was a problem.

Accordingly, Honorton applied a uniform test on a common index across all studies from which the pertinent datum could be extracted, regardless of how the investigators had analyzed the data in the original reports. He selected the proportion of hits as the common index because it could be calculated for the largest subset of studies: 28 of the 42 studies. The hit rate is also a conservative index because it discards most of the rating information; a second place ranking (a “near miss”) receives no more credit than a last place ranking. Honorton then calculated the exact binomial probability and its associated z score for each study.

Of the 28 studies, 23 (82%) had positive z scores (p = 4.6 × 10^-4, exact binomial test with p = q = .5). Twelve of the studies (43%) had z scores that were independently significant at the 5% level (p = 3.5 × 10^-9, binomial test with 28 studies, p = .05, and q = .95), and 7 of the studies (25%) were independently significant at the 1% level (p = 9.8 × 10^-9). The composite Stouffer z score across the 28 studies was 6.60 (p = 2.1 × 10^-11).1

A more conservative estimate of significance can be obtained by including 10 additional studies that also used the relevant judging procedure but did not report hit rates. If these studies are assigned a mean z score of zero, the Stouffer z across all 38 studies becomes 5.67 (p = 7.3 × 10^-9).

Thus, whether one considers only the studies for which the relevant information is available or includes a null estimate for the additional studies for which the information is not available, the aggregate results cannot reasonably be attributed to chance. And, by design, the cumulative outcome reported here cannot be attributed to the inflation of significance levels through multiple analysis.

Rates by laboratory. One objection to estimates such as those just described is that studies from a common laboratory are not independent of one another (Parker, 1978). Thus, it is possible for one or two investigators to be disproportionately responsible for a high replication rate, whereas other, independent investigators are unable to obtain the effect.

The ganzfeld database is vulnerable to this possibility. The 28 studies providing hit rate information were conducted by investigators in 10 different laboratories. One laboratory contributed 9 of the studies, Honorton’s own laboratory contributed 5, 2 other laboratories contributed 3 each, 2 contributed 2 each, and the remaining 4 laboratories each contributed 1. Thus, half of the studies were conducted by only 2 laboratories, 1 of them Honorton’s own.

Accordingly, Honorton calculated a separate Stouffer z score for each laboratory. Significantly positive outcomes were reported by 6 of the 10 laboratories, and the combined z score across laboratories was 6.16 (p = 3.6 × 10^-10). Even if all of the studies conducted by the 2 most prolific laboratories are discarded from the analysis, the Stouffer z across the 8 other laboratories remains significant (z = 3.67, p = 1.2 × 10^-4). Four of these studies are significant at the 1% level (p = 9.2 × 10^-6, binomial test with 14 studies, p = .01, and q = .99), and each was contributed by a different laboratory. Thus, even though the total number of laboratories in this database is small, most of them have reported significant studies, and the significance of the overall effect does not depend on just one or two of them.

Selective Reporting

In recent years, behavioral scientists have become increasingly aware of the “file-drawer” problem: the likelihood that successful studies are more likely to be published than unsuccessful studies, which are more likely to be consigned to the file drawers of their disappointed investigators (Bozarth & Roberts, 1972; Sterling, 1959). Parapsychologists were among the first to become sensitive to the problem, and, in 1975, the Parapsychological Association Council adopted a policy opposing the selective reporting of positive outcomes. As a consequence, negative findings have been routinely reported at the association’s meetings and in its affiliated publications for almost two decades. As has already been shown, more than half of the ganzfeld studies included in the meta-analysis yielded outcomes whose significance falls short of the conventional .05 level.

A variant of the selective reporting problem arises from what

1 Stouffer’s z is computed by dividing the sum of the z scores for the individual studies by the square root of the number of studies (Rosenthal, 1978).
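The combined statistics quoted in the Replication Rates section can be checked with a short sketch. The Python below is an illustration under stated assumptions, not the procedure Honorton actually ran: the helper names (`stouffer_z`, `binomial_tail`, `upper_tail_p`) are illustrative, the `toy` list of z scores is hypothetical because the individual study z scores are not listed here, and the 38-study figure is recovered from the published aggregate (Stouffer z = 6.60 across 28 studies) rather than from raw data.

```python
import math

def stouffer_z(z_scores):
    """Sum of the study z scores divided by the square root of their count."""
    return sum(z_scores) / math.sqrt(len(z_scores))

def binomial_tail(successes, n, p):
    """Exact probability of at least `successes` hits in n trials with hit probability p."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i)
               for i in range(successes, n + 1))

def upper_tail_p(z):
    """One-tailed p value for a standard normal deviate z."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# Hypothetical z scores, used only to show the mechanics of the formula.
toy = [1.0, 2.0, 0.5, 1.5]
print(stouffer_z(toy))

# Replication-rate tests quoted in the text.
print(binomial_tail(23, 28, 0.5))    # 23 of 28 studies with positive z scores
print(binomial_tail(12, 28, 0.05))   # 12 of 28 studies significant at the .05 level

# Adding 10 studies with z = 0 leaves the sum unchanged but enlarges the divisor.
z_sum = 6.60 * math.sqrt(28)         # sum of z scores implied by the published aggregate
z_38 = z_sum / math.sqrt(38)
print(round(z_38, 2), upper_tail_p(z_38))
```

Working from the rounded aggregate means the last digits may differ slightly from values computed on the original study-level data.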
Hyman (1985) has termed the “retrospective study.” An investigator conducts a small set of exploratory trials. If they yield null results, they remain exploratory and never become part of the official record; if they yield positive results, they are defined as a study after the fact and are submitted for publication. In support of this possibility, Hyman noted that there are more significant studies in the database with fewer than 20 trials than one would expect under the assumption that, all other things being equal, statistical power should increase with the square root of the sample size. Although Honorton questioned the assumption that “all other things” are in fact equal across the studies and disagreed with Hyman’s particular statistical analysis, he agreed that there is an apparent clustering of significant studies with fewer than 20 trials. (Of the complete ganzfeld database of 42 studies, 8 involved fewer than 20 trials, and 6 of those studies reported statistically significant results.)

Because it is impossible, by definition, to know how many unknown studies (exploratory or otherwise) are languishing in file drawers, the major tool for estimating the seriousness of selective reporting problems has become some variant of Rosenthal’s file-drawer statistic, an estimate of how many unreported studies with z scores of zero would be required to exactly cancel out the significance of the known database (Rosenthal, 1979). For the 28 direct-hit ganzfeld studies alone, this estimate is 423 fugitive studies, a ratio of unreported-to-reported studies of approximately 15:1. When it is recalled that a single ganzfeld session takes over an hour to conduct, it is not surprising that, despite his concern with the retrospective study problem, Hyman concurred with Honorton and other participants in the published debate that selective reporting cannot plausibly account for the overall statistical significance of the psi ganzfeld database (Hyman & Honorton, 1986).2

Methodological Flaws

If the most frequent criticism of parapsychology is that it has not produced a replicable psi effect, the second most frequent criticism is that many, if not most, psi experiments have inadequate controls and procedural safeguards. A frequent charge is that positive results emerge primarily from initial, poorly controlled studies and then vanish as better controls and safeguards are introduced.

Fortunately, meta-analysis provides a vehicle for empirically evaluating the extent to which methodological flaws may have contributed to artifactual positive outcomes across a set of studies. First, ratings are assigned to each study that index the degree to which particular methodological flaws are or are not present; these ratings are then correlated with the studies’ outcomes. Large positive correlations constitute evidence that the observed effect may be artifactual.

In psi research, the most fatal flaws are those that might permit a subject to obtain the target information in normal sensory fashion, either inadvertently or through deliberate cheating. This is called the problem of sensory leakage. Another potentially serious flaw is inadequate randomization of target selection.

Sensory leakage. Because the ganzfeld is itself a perceptual isolation procedure, it goes a long way toward eliminating potential sensory leakage during the ganzfeld portion of the session. There are, however, potential channels of sensory leakage after the ganzfeld period. For example, if the experimenter who interacts with the receiver knows the identity of the target, he or she could bias the receiver’s similarity ratings in favor of correct identification. Only one study in the database contained this flaw, a study in which subjects actually performed slightly below chance expectation. Second, if the stimulus set given to the receiver for judging contains the actual physical target handled by the sender during the sending period, there might be cues (e.g., fingerprints, smudges, or temperature differences) that could differentiate the target from the decoys. Moreover, the process of transferring the stimulus materials to the receiver’s room itself opens up other potential channels of sensory leakage. Although contemporary ganzfeld studies have eliminated both of these possibilities by using duplicate stimulus sets, some of the earlier studies did not.

Independent analyses by Hyman and Honorton agreed that there was no correlation between inadequacies of security against sensory leakage and study outcome. Honorton further reported that if studies that failed to use duplicate stimulus sets were discarded from the analysis, the remaining studies are still highly significant (Stouffer z = 4.35, p = 6.8 × 10^-6).

Randomization. In many psi experiments, the issue of target randomization is critical because systematic patterns in inadequately randomized target sequences might be detected by subjects during a session or might match subjects’ preexisting response biases. In a ganzfeld study, however, randomization is a much less critical issue because only one target is selected during the session and most subjects serve in only one session. The primary concern is simply that all the stimuli within each judging set be sampled uniformly over the course of the study. Similar considerations govern the second randomization, which takes place after the ganzfeld period and determines the sequence in which the target and decoys are presented to the receiver (or external judge) for judging.

Nevertheless, Hyman and Honorton disagreed over the findings here. Hyman claimed there was a correlation between flaws of randomization and study outcome; Honorton claimed there was not. The sources of this disagreement were in conflicting definitions of flaw categories, in the coding and assignment of flaw ratings to individual studies, and in the subsequent statistical treatment of those ratings.

Unfortunately, there have been no ratings of flaws by independent raters who were unaware of the studies’ outcomes (Morris, 1991). Nevertheless, none of the contributors to the subsequent debate concurred with Hyman’s conclusion, whereas four nonparapsychologists (two statisticians and two psychologists) explicitly concurred with Honorton’s conclusion (Harris & Rosenthal, 1988b; Saunders, 1985; Utts, 1991a). For example, Harris and Rosenthal (one of the pioneers in the use of meta-analysis in psychology) used Hyman’s own flaw ratings and failed to find any significant relationships between flaws and study outcomes in each of two separate analyses:

2 A 1980 survey of parapsychologists uncovered only 19 completed but unreported ganzfeld studies. Seven of these had achieved significantly positive results, a proportion (.37) very similar to the proportion of independently significant studies in the meta-analysis (.43) (Blackmore, 1980).
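The file-drawer estimate cited in the Selective Reporting section can be reproduced approximately from the published aggregate. The sketch below assumes Rosenthal's (1979) formulation with a one-tailed .05 criterion (z = 1.645), which the article does not state explicitly; `fail_safe_n` is an illustrative name, and the inputs are the rounded values quoted in the text, so small rounding differences from the original calculation are possible.

```python
import math

def fail_safe_n(combined_z, k, z_crit=1.645):
    """Number of unreported z = 0 studies needed to pull Stouffer's z down to z_crit.

    Solves (combined_z * sqrt(k)) / sqrt(k + n) = z_crit for n.
    z_crit = 1.645 is an assumed one-tailed .05 criterion.
    """
    z_sum = combined_z * math.sqrt(k)   # recover the sum of the k study z scores
    return (z_sum / z_crit) ** 2 - k

n = fail_safe_n(6.60, 28)               # 28 direct-hit studies, Stouffer z = 6.60
print(round(n), round(n / 28, 1))       # fugitive studies, ratio to reported studies
```

Under these assumptions the estimate rounds to the 423 fugitive studies and the roughly 15:1 ratio reported in the text.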
“Our analysis of the effects of flaws on study outcome lends no support to the hypothesis that Ganzfeld research results are a significant function of the set of flaw variables” (1988b, p. 3; for a more recent exchange regarding Hyman’s analysis, see Hyman, 1991; Utts, 1991a, 1991b).

Effect Size

Some critics of parapsychology have argued that even if current laboratory-produced psi effects turn out to be replicable and nonartifactual, they are too small to be of theoretical interest or practical importance. We do not believe this to be the case for the psi ganzfeld effect.

In psi ganzfeld studies, the hit rate itself provides a straightforward descriptive measure of effect size, but this measure cannot be compared directly across studies because they do not all use a four-stimulus judging set and, hence, do not all have a chance baseline of .25. The next most obvious candidate, the difference in each study between the hit rate observed and the hit rate expected under the null hypothesis, is also intuitively descriptive but is not appropriate for statistical analysis because not all differences between proportions that are equal are equally detectable (e.g., the power to detect the difference between .55 and .25 is different from the power to detect the difference between .50 and .20).

To provide a scale of equal detectability, Cohen (1988) devised the effect size index h, which involves an arcsine transformation on the proportions before calculation of their difference. Cohen’s h is quite general and can assess the difference between any two proportions drawn from independent samples or between a single proportion and any specified hypothetical value. For the 28 studies examined in the meta-analyses, h was .28, with a 95% confidence interval from .11 to .45.

But because values of h do not provide an intuitively descriptive scale, Rosenthal and Rubin (1989; Rosenthal, 1991) have recently suggested a new index, π, which applies specifically to one-sample, multiple-choice data of the kind obtained in ganzfeld experiments. In particular, π expresses all hit rates as the proportion of hits that would have been obtained if there had been only two equally likely alternatives, essentially a coin flip. Thus, π ranges from 0 to 1, with .5 expected under the null hypothesis. The formula is

$$\pi = \frac{P(k - 1)}{P(k - 2) + 1},$$

where P is the raw proportion of hits and k is the number of alternative choices available. Because π has such a straightforward intuitive interpretation, we use it (or its conversion back to an equivalent four-alternative hit rate) throughout this article whenever it is applicable.

For the 28 studies examined in the meta-analyses, the mean value of π was .62, with a 95% confidence interval from .55 to .69. This corresponds to a four-alternative hit rate of 35%, with a 95% confidence interval from 28% to 43%.

Cohen (1988, 1992) has also categorized effect sizes into small, medium, and large, with medium denoting an effect size that should be apparent to the naked eye of a careful observer. For a statistic such as π, which indexes the deviation of a proportion from .5, Cohen considers .65 to be a medium effect size: A statistically unaided observer should be able to detect the bias of a coin that comes up heads on 65% of the trials. Thus, at .62, the psi ganzfeld effect size falls just short of Cohen’s naked-eye criterion. From the phenomenology of the ganzfeld experimenter, the corresponding hit rate of 35% implies that he or she will see a subject obtain a hit approximately every third session rather than every fourth.

It is also instructive to compare the psi ganzfeld effect with the results of a recent medical study that sought to determine whether aspirin can prevent heart attacks (Steering Committee of the Physicians’ Health Study Research Group, 1988). The study was discontinued after 6 years because it was already clear that the aspirin treatment was effective (p < .00001) and it was considered unethical to keep the control group on placebo medication. The study was widely publicized as a major medical breakthrough. But despite its undisputed reality and practical importance, the size of the aspirin effect is quite small: Taking aspirin reduces the probability of suffering a heart attack by only .008. The corresponding effect size (h) is .068, about one third to one fourth the size of the psi ganzfeld effect (Atkinson et al., 1993, p. 236; Utts, 1991b).

In sum, we believe that the psi ganzfeld effect is large enough to be of both theoretical interest and potential practical importance.

Experimental Correlates of the Psi Ganzfeld Effect

We showed earlier that the technique of correlating variables with effect sizes across studies can help to assess whether methodological flaws might have produced artifactual positive outcomes. The same technique can be used more affirmatively to explore whether an effect varies systematically with conceptually relevant variations in experimental procedure. The discovery of such correlates can help to establish an effect as genuine, suggest ways of increasing replication rates and effect sizes, and enhance the chances of moving beyond the simple demonstration of an effect to its explanation. This strategy is only heuristic, however. Any correlates discovered must be considered quite tentative, both because they emerge from post hoc exploration and because they necessarily involve comparisons across heterogeneous studies that differ simultaneously on many interrelated variables, known and unknown. Two such correlates emerged from the meta-analyses of the psi ganzfeld effect.

Single- versus multiple-image targets. Although most of the 28 studies in the meta-analysis used single pictures as targets, 9 (conducted by three different investigators) used View Master stereoscopic slide reels that presented multiple images focused on a central theme. Studies using the View Master reels produced significantly higher hit rates than did studies using the single-image targets (50% vs. 34%), t(26) = 2.22, p = .035, two-tailed.

Sender-receiver pairing. In 17 of the 28 studies, participants were free to bring in friends to serve as senders. In 8 studies, only laboratory-assigned senders were used. (Three studies used no sender.) Unfortunately, there is no record of how many participants in the former studies actually brought in friends. Nevertheless, those 17 studies (conducted by six different investigators) had significantly higher hit rates than did the studies
that used only laboratory-assigned senders (44% vs. 26%), t(23) = 2.39, p = .025, two-tailed.

The Joint Communique

After their published exchange in 1985, Hyman and Honorton agreed to contribute a joint communique to the subsequent discussion that was published in 1986. First, they set forth their areas of agreement and disagreement:

    We agree that there is an overall significant effect in this data base that cannot reasonably be explained by selective reporting or multiple analysis. We continue to differ over the degree to which the effect constitutes evidence for psi, but we agree that the final verdict awaits the outcome of future experiments conducted by a broader range of investigators and according to more stringent standards. (Hyman & Honorton, 1986, p. 351)

They then spelled out in detail the “more stringent standards” they believed should govern future experiments. These standards included strict security precautions against sensory leakage, testing and documentation of randomization methods for selecting targets and sequencing the judging pool, statistical correction for multiple analyses, advance specification of the status of the experiment (e.g., pilot study or confirmatory experiment), and full documentation in the published report of the experimental procedures and the status of statistical tests (e.g., planned or post hoc).

The National Research Council Report

In 1988, the National Research Council (NRC) of the National Academy of Sciences released a widely publicized report commissioned by the U.S. Army that assessed several controversial technologies for enhancing human performance, including accelerated learning, neurolinguistic programming, mental practice, biofeedback, and parapsychology (Druckman & Swets, 1988; summarized in Swets & Bjork, 1990). The report’s conclusion concerning parapsychology was quite negative: “The Committee finds no scientific justification from research conducted over a period of 130 years for the existence of parapsychological phenomena” (Druckman & Swets, 1988, p. 22).
An extended refutation strongly protesting the committee’s treatment of parapsychology has been published elsewhere (Palmer et al., 1989). The pertinent point here is simply that the NRC’s evaluation of the ganzfeld studies does not reflect an additional, independent examination of the ganzfeld database but is based on the same meta-analysis conducted by Hyman that we have discussed in this article.
Hyman chaired the NRC’s Subcommittee on Parapsychology, and, although he had concurred with Honorton 2 years earlier in their joint communique that “there is an overall significant effect in this data base that cannot reasonably be explained by selective reporting or multiple analysis” (p. 351) and that “significant outcomes have been produced by a number of different investigators” (p. 352), neither of these points is acknowledged in the committee’s report.
The NRC also solicited a background report from Harris and Rosenthal (1988a), which provided the committee with a comparative methodological analysis of the five controversial areas just listed. Harris and Rosenthal noted that, of these areas, “only the Ganzfeld ESP studies [the only psi studies they evaluated] regularly meet the basic requirements of sound experimental design” (p. 53), and they concluded that

    it would be implausible to entertain the null given the combined p from these 28 studies. Given the various problems or flaws pointed out by Hyman and Honorton . . . we might estimate the obtained accuracy rate to be about 1/3 . . . when the accuracy rate expected under the null is 1/4. (p. 51)³

³ In a troubling development, the chair of the NRC Committee phoned Rosenthal and asked him to delete the parapsychology section of the paper (R. Rosenthal, personal communication, September 15, 1992). Although Rosenthal refused to do so, that section of the Harris-Rosenthal paper is nowhere cited in the NRC report.

The Autoganzfeld Studies

In 1983, Honorton and his colleagues initiated a new series of ganzfeld studies designed to avoid the methodological problems he and others had identified in earlier studies (Honorton, 1979; Kennedy, 1979). These studies complied with all of the detailed guidelines that he and Hyman were to publish later in their joint communique. The program continued until September 1989, when a loss of funding forced the laboratory to close. The major innovations of the new studies were computer control of the experimental protocol (hence the name autoganzfeld) and the introduction of videotaped film clips as target stimuli.

Method

The basic design of the autoganzfeld studies was the same as that described earlier⁴: A receiver and sender were sequestered in separate, acoustically isolated chambers. After a 14-min period of progressive relaxation, the receiver underwent ganzfeld stimulation while describing his or her thoughts and images aloud for 30 min. Meanwhile, the sender concentrated on a randomly selected target. At the end of the ganzfeld period, the receiver was shown four stimuli and, without knowing which of the four had been the target, rated each stimulus for its similarity to his or her mentation during the ganzfeld.

⁴ Because Honorton and his colleagues have complied with the Hyman-Honorton specification that experimental reports be sufficiently complete to permit others to reconstruct the investigator’s procedures, readers who wish to know more detail than we provide here are likely to find whatever they need in the archival publication of these studies in the Journal of Parapsychology (Honorton et al., 1990).

The targets consisted of 80 still pictures (static targets) and 80 short video segments complete with soundtracks (dynamic targets), all recorded on videocassette. The static targets included art prints, photographs, and magazine advertisements; the dynamic targets included excerpts of approximately 1-min duration from motion pictures, TV shows, and cartoons. The 160 targets were arranged in judging sets of four static or four dynamic targets each, constructed to minimize similarities among targets within a set.
Target selection and presentation. The VCR containing the taped targets was interfaced to the controlling computer, which selected the target and controlled its repeated presentation to the sender during the ganzfeld period, thus eliminating the need for a second experimenter to accompany the sender. After the ganzfeld period, the computer randomly sequenced the four-clip judging set and presented it to the receiver on a TV monitor for judging. The receiver used a computer game paddle to make his or her ratings on a 40-point scale that appeared on
the TV monitor after each clip was shown. The receiver was permitted to see each clip and to change the ratings repeatedly until he or she was satisfied. The computer then wrote these and other data from the session into a file on a floppy disk. At that point, the sender moved to the receiver’s chamber and revealed the identity of the target to both the receiver and the experimenter. Note that the experimenter did not even know the identity of the four-clip judging set until it was displayed to the receiver for judging.
Randomization. The random selection of the target and sequencing of the judging set were controlled by a noise-based random number generator interfaced to the computer. Extensive testing confirmed that the generator was providing a uniform distribution of values throughout the full target range (1-160). Tests on the actual frequencies observed during the experiments confirmed that targets were, on average, selected uniformly from among the 4 clips within each judging set and that the 4 judging sequences used were uniformly distributed across sessions.
Additional control features. The receiver’s and sender’s rooms were sound-isolated, electrically shielded chambers with single-door access that could be continuously monitored by the experimenter. There was two-way intercom communication between the experimenter and the receiver but only one-way communication into the sender’s room; thus, neither the experimenter nor the receiver could monitor events inside the sender’s room. The archival record for each session includes an audiotape containing the receiver’s mentation during the ganzfeld period and all verbal exchanges between the experimenter and the receiver throughout the experiment.
The automated ganzfeld protocol has been examined by several dozen parapsychologists and behavioral researchers from other fields, including well-known critics of parapsychology. Many have participated as subjects or observers. All have expressed satisfaction with the handling of security issues and controls.
Parapsychologists have often been urged to employ magicians as consultants to ensure that the experimental protocols are not vulnerable either to inadvertent sensory leakage or to deliberate cheating. Two “mentalists,” magicians who specialize in the simulation of psi, have examined the autoganzfeld system and protocol. Ford Kross, a professional mentalist and officer of the mentalist’s professional organization, the Psychic Entertainers Association, provided the following written statement: “In my professional capacity as a mentalist, I have reviewed Psychophysical Research Laboratories’ automated ganzfeld system and found it to provide excellent security against deception by subjects” (personal communication, May, 1989).
Daryl J. Bem has also performed as a mentalist for many years and is a member of the Psychic Entertainers Association. As mentioned in the author note, this article had its origins in a 1983 visit he made to Honorton’s laboratory, where he was asked to critically examine the research protocol from the perspective of a mentalist, a research psychologist, and a subject. Needless to say, this article would not exist if he did not concur with Ford Kross’s assessment of the security procedures.

Experimental Studies

Altogether, 100 men and 140 women participated as receivers in 354 sessions during the research program.⁵ The participants ranged in age from 17 to 74 years (M = 37.3, SD = 11.8), with a mean formal education of 15.6 years (SD = 2.0). Eight separate experimenters, including Honorton, conducted the studies.

⁵ A recent review of the original computer files uncovered a duplicate record in the autoganzfeld database. This has now been eliminated, reducing by one the number of subjects and sessions. As a result, some of the numbers presented in this article differ slightly from those in Honorton et al. (1990).

The experimental program included three pilot and eight formal studies. Five of the formal studies used novice (first-time) participants who served as the receiver in one session each. The remaining three formal studies used experienced participants.
Pilot studies. Sample sizes were not preset in the three pilot studies. Study 1 comprised 22 sessions and was conducted during the initial development and testing of the autoganzfeld system. Study 2 comprised 9 sessions testing a procedure in which the experimenter, rather than the receiver, served as the judge at the end of the session. Study 3 comprised 35 sessions and served as practice for participants who had completed the allotted number of sessions in the ongoing formal studies but who wanted additional ganzfeld experience. This study also included several demonstration sessions when TV film crews were present.
Novice studies. Studies 101-104 were each designed to test 50 participants who had had no prior ganzfeld experience; each participant served as the receiver in a single ganzfeld session. Study 104 included 16 of 20 students recruited from the Juilliard School in New York City to test an artistically gifted sample. Study 105 was initiated to accommodate the overflow of participants who had been recruited for Study 104, including the 4 remaining Juilliard students. The sample size for this study was set to 25, but only 6 sessions had been completed when the laboratory closed. For purposes of exposition, we divided the 56 sessions from Studies 104 and 105 into two parts: Study 104/105(a) comprises the 36 non-Juilliard participants, and Study 104/105(b) comprises the 20 Juilliard students.
Study 201. This study was designed to retest the most promising participants from the previous studies. The number of trials was set to 20, but only 7 sessions with 3 participants had been completed when the laboratory closed.
Study 301. This study was designed to compare static and dynamic targets. The sample size was set to 50 sessions. Twenty-five experienced participants each served as the receiver in 2 sessions. Unknown to the participants, the computer control program was modified to ensure that they would each have 1 session with a static target and 1 session with a dynamic target.
Study 302. This study was designed to examine a dynamic target set that had yielded a particularly high hit rate in the previous studies. The study involved experienced participants who had had no prior experience with this particular target set and who were unaware that only one target set was being sampled. Each served as the receiver in a single session. The design called for the study to continue until 15 sessions were completed with each of the targets, but only 25 sessions had been completed when the laboratory closed.
The 11 studies just described comprise all sessions conducted during the 6.5 years of the program. There is no “file drawer” of unreported sessions.

Results

Overall hit rate. As in the earlier meta-analysis, receivers’ ratings were analyzed by tallying the proportion of hits achieved and calculating the exact binomial probability for the observed number of hits compared with the chance expectation of .25. As noted earlier, 240 participants contributed 354 sessions. For reasons discussed later, Study 302 is analyzed separately, reducing the number of sessions in the primary analysis to 329.
As Table 1 shows, there were 106 hits in the 329 sessions, a hit rate of 32% (z = 2.89, p = .002, one-tailed), with a 95% confidence interval from 30% to 35%. This corresponds to an effect size (π) of .59, with a 95% confidence interval from .53 to .64.
Table 1 also shows that when Studies 104 and 105 are combined and re-divided into Studies 104/105(a) and 104/105(b), 9
Table 1
Outcome by Study

Study        Study/subject description   N subjects   N trials   N hits   % hits   Effect size π   z
1            Pilot                       19           22         8        36       .62             0.99
2            Pilot                       4            9          3        33       .60             0.25
3            Pilot                       24           35         10       29       .55             0.32
101          Novice                      50           50         12       24       .47             -0.30
102          Novice                      50           50         18       36       .63             1.60
103          Novice                      50           50         15       30       .55             0.67
104/105(a)   Novice                      36           36         12       33       .60             0.97
104/105(b)   Juilliard sample            20           20         10       50       .75             2.20
201          Experienced                 3            7          3        43       .69             0.69
301          Experienced                 25           50         15       30       .56             0.67
302          Experienced                 25           25         16       54ᵃ      .78ᵃ            3.04ᵃ
Overall (Studies 1-301)                  240          329        106      32       .59             2.89

Note. All z scores are based on the exact binomial probability, with p = .25 and q = .75.
ᵃ Adjusted for response bias; the hit rate actually observed was 64%.
of the 10 studies yield positive effect sizes, with a mean effect size (π) of .61, t(9) = 4.44, p = .0008, one-tailed. This effect size is equivalent to a four-alternative hit rate of 34%. Alternatively, if Studies 104 and 105 are retained as separate studies, 9 of the 10 studies again yield positive effect sizes, with a mean effect size (π) of .62, t(9) = 3.73, p = .002, one-tailed. This effect size is equivalent to a four-alternative hit rate of 35% and is identical to that found across the 28 studies of the earlier meta-analysis.⁶

⁶ As noted above, the laboratory was forced to close before three of the formal studies could be completed. If we assume that the remaining trials in Studies 105 and 201 would have yielded only chance results, this would reduce the overall z for the first 10 autoganzfeld studies from 2.89 to 2.76 (p = .003). Thus, inclusion of the two incomplete studies does not pose an optional stopping problem. The third incomplete study, Study 302, is discussed below.

Considered together, sessions with novice participants (Studies 101-105) yielded a statistically significant hit rate of 32.5% (p = .009), which is not significantly different from the 31.6% hit rate achieved by experienced participants in Studies 201 and 301. And, finally, each of the eight experimenters also achieved a positive effect size, with a mean π of .60, t(7) = 3.44, p = .005, one-tailed.
The Juilliard sample. There are several reports in the literature of a relationship between creativity or artistic ability and psi performance (Schmeidler, 1988). To explore this possibility in the ganzfeld setting, 10 male and 10 female undergraduates were recruited from the Juilliard School. Of these, 8 were music students, 10 were drama students, and 2 were dance students. Each served as the receiver in a single session in Study 104 or 105. As shown in Table 1, these students achieved a hit rate of 50% (p = .014), one of the five highest hit rates ever reported for a single sample in a ganzfeld study. The musicians were particularly successful; 6 of the 8 (75%) successfully identified their targets (p = .004; further details about this sample and their ganzfeld performance were reported in Schlitz & Honorton, 1992).
Study size and effect size. There is a significant negative correlation across the 10 studies listed in Table 1 between the number of sessions included in a study and the study’s effect size (π), r = -.64, t(8) = 2.36, p < .05, two-tailed. This is reminiscent of Hyman’s discovery that the smaller studies in the original ganzfeld database were disproportionately likely to report statistically significant results. He interpreted this finding as evidence for a bias against the reporting of small studies that fail to achieve significant results. A similar interpretation cannot be applied to the autoganzfeld studies, however, because there are no unreported sessions.
One reviewer of this article suggested that the negative correlation might reflect a decline effect in which earlier sessions of a study are more successful than later sessions. If there were such an effect, then studies with fewer sessions would show larger effect sizes because they would end before the decline could set in. To check this possibility, we computed point-biserial correlations between hits (1) or misses (0) and the session number within each of the 10 studies. All of the correlations hovered around zero; six were positive, four were negative, and the overall mean was .01.
An inspection of Table 1 reveals that the negative correlation derives primarily from the two studies with the largest effect sizes: the 20 sessions with the Juilliard students and the 7 sessions of Study 201, the study specifically designed to retest the most promising participants from the previous studies. Accordingly, it seems likely that the larger effect sizes of these two studies, and hence the significant negative correlation between the number of sessions and the effect size, reflect genuine performance differences between these two small, highly selected samples and other autoganzfeld participants.
Study 302. All of the studies except Study 302 randomly sampled from a pool of 160 static and dynamic targets. Study 302 sampled from a single, dynamic target set that had yielded a particularly high hit rate in the previous studies. The four film clips in this set consisted of a scene of a tidal wave from the movie Clash of the Titans, a high-speed sex scene from A Clockwork Orange, a scene of crawling snakes from a TV documentary, and a scene from a Bugs Bunny cartoon.
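As a rough check on the overall row of Table 1, the sketch below recomputes the hit rate, the π effect size, and a one-tailed exact binomial probability for 106 hits in 329 four-choice sessions. The function names (`pi_index`, `hit_rate_from_pi`, `exact_binomial_tail`) are illustrative and do not come from the article. The π formula is the one-sample index given earlier, π = P(k - 1)/[P(k - 2) + 1]. Converting the exact tail probability to a standard normal deviate is an assumption about how the reported z values were obtained.

```python
from math import comb
from statistics import NormalDist


def pi_index(hit_rate: float, k: int) -> float:
    """Rescale a k-choice hit rate to its two-choice equivalent (chance = .5)."""
    return hit_rate * (k - 1) / (1 + hit_rate * (k - 2))


def hit_rate_from_pi(pi: float, k: int) -> float:
    """Invert pi_index: the k-choice hit rate that corresponds to a given pi."""
    return pi / ((k - 1) - pi * (k - 2))


def exact_binomial_tail(hits: int, trials: int, k: int) -> float:
    """One-tailed P(X >= hits) when each trial succeeds with probability 1/k."""
    # Integer arithmetic keeps the sum exact until the final division.
    numerator = sum(
        comb(trials, x) * (k - 1) ** (trials - x) for x in range(hits, trials + 1)
    )
    return numerator / k ** trials


hits, trials, k = 106, 329, 4
rate = hits / trials
p_value = exact_binomial_tail(hits, trials, k)
# Assumed convention: z is the normal deviate with the same upper-tail area.
z = NormalDist().inv_cdf(1 - p_value)

print(f"hit rate = {rate:.3f}")
print(f"pi       = {pi_index(rate, k):.2f}")
print(f"p        = {p_value:.4f}, z = {z:.2f}")
print(f"pi = .62 -> four-choice hit rate {hit_rate_from_pi(0.62, k):.2f}")
```

The inverse function is what allows a π value such as .62 to be restated as a four-alternative hit rate of about 35%, the conversion used throughout the Results.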
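The decline-effect check described above amounts to an ordinary Pearson correlation in which one of the two variables is coded 0 or 1. A minimal sketch follows. The session outcomes are invented for illustration and are not data from the autoganzfeld studies, and `point_biserial` is a hypothetical helper name.

```python
from math import sqrt


def point_biserial(outcomes, positions):
    """Pearson r between a 0/1 outcome and a numeric variable."""
    n = len(outcomes)
    mean_y = sum(outcomes) / n
    mean_x = sum(positions) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(positions, outcomes))
    var_x = sum((x - mean_x) ** 2 for x in positions)
    var_y = sum((y - mean_y) ** 2 for y in outcomes)
    return cov / sqrt(var_x * var_y)


# Hypothetical study of 8 sessions: 1 = hit, 0 = miss, in session order.
hits = [1, 1, 0, 1, 0, 0, 0, 0]
session_numbers = list(range(1, len(hits) + 1))

r = point_biserial(hits, session_numbers)
# A negative r means the hits cluster in the early sessions (a decline).
print(f"r = {r:.2f}")
```

In the analysis reported in the text, one such correlation would be computed per study and the ten values then averaged.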
The experimental design called for this study to continue until each of the clips had served as the target 15 times. Unfortunately, the premature termination of this study at 25 sessions left an imbalance in the frequency with which each clip had served as the target. This means that the high hit rate observed (64%) could well be inflated by response biases.
As an illustration, water imagery is frequently reported by receivers in ganzfeld sessions, whereas sexual imagery is rarely reported. (Some participants probably are reluctant both to report sexual imagery and to give the highest rating to the sex-related clip.) If a video clip containing popular imagery (such as water) happens to appear as a target more frequently than a clip containing unpopular imagery (such as sex), a high hit rate might simply reflect the coincidence of those frequencies of occurrence with participants’ response biases. And, as the second column of Table 2 reveals, the tidal wave clip did in fact appear more frequently as the target than did the sex clip. More generally, the second and third columns of Table 2 show that the frequency with which each film clip was ranked first closely matches the frequency with which each appeared as the target.
One can adjust for this problem by using the observed frequencies in these two columns to compute the hit rate expected if there were no psi effect. In particular, one can multiply each proportion in the second column by the corresponding proportion in the third column (yielding the joint probability that the clip was the target and that it was ranked first) and then sum across the four clips. As shown in the fourth column of Table 2, this computation yields an overall expected hit rate of 34.08%. When the observed hit rate of 64% is compared with this baseline, the effect size (h) is .61. As shown in Table 1, this is equivalent to a four-alternative hit rate of 54%, or a π value of .78, and is statistically significant (z = 3.04, p = .0012).
The psi effect can be seen even more clearly in the remaining columns of Table 2, which control for the differential popularity of the imagery in the clips by displaying how frequently each was ranked first when it was the target and how frequently it was ranked first when it was one of the control clips (decoys). As can be seen, each of the four clips was selected as the target relatively more frequently when it was the target than when it was a decoy, a difference that is significant for three of the four clips. On average, a clip was identified as the target 58% of the time when it was the target and only 14% of the time when it was a decoy.
Dynamic versus static targets. The success of Study 302 raises the question of whether dynamic targets are, in general, more effective than static targets. This possibility was also suggested by the earlier meta-analysis, which revealed that studies

Table 2

Video clip    Relative frequency   Relative frequency of   Expected       Ranked first   Ranked first   Difference   Fisher’s
              as target            first place ranking     hit rate (%)   when target    when decoy                  exact p
Tidal wave    .28 (7/25)           .24 (6/25)              6.72           .57 (4/7)      .11 (2/18)     .46          .032
Snakes        .12 (3/25)           .12 (3/25)              1.44           .67 (2/3)      .05 (1/22)     .62          .029
Sex scene     .16 (4/25)           .08 (2/25)              1.28           .25 (1/4)      .05 (1/21)     .20          .300
Bugs Bunny    .44 (11/25)          .56 (14/25)             24.64          .82 (9/11)     .36 (5/14)     .46          .027
Overall                                                    34.08          .58            .14            .44
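The response-bias adjustment for Study 302 can be retraced step by step. The sketch below assumes the per-clip counts implied by Table 2 (the counts for the fourth clip are inferred from its reported proportions of .44, .56, .82, and .36 in 25 sessions), and it assumes that the effect size h is Cohen's arcsine-difference index. The clip labels and function names are illustrative, and the arithmetic does not depend on which label goes with which row.

```python
from math import asin, comb, sin, sqrt

SESSIONS = 25
# clip: (times it was the target, times ranked first when target,
#        times ranked first when a decoy). Labels are illustrative.
clips = {
    "tidal wave": (7, 4, 2),
    "snakes": (3, 2, 1),
    "sex scene": (4, 1, 1),
    "Bugs Bunny": (11, 9, 5),
}

# Chance hit rate given the observed target and first-rank frequencies:
# sum over clips of P(clip is target) * P(clip is ranked first).
expected = sum(
    (n_target / SESSIONS) * ((hit + false_alarm) / SESSIONS)
    for n_target, hit, false_alarm in clips.values()
)
observed = sum(hit for _, hit, _ in clips.values()) / SESSIONS


def cohens_h(p1: float, p2: float) -> float:
    """Difference between two proportions on the arcsine scale."""
    return 2 * asin(sqrt(p1)) - 2 * asin(sqrt(p2))


h = cohens_h(observed, expected)
# Hit rate that lies the same distance h above the usual 25% baseline.
equivalent = sin((h + 2 * asin(sqrt(0.25))) / 2) ** 2


def fisher_one_tailed(n_target: int, hit: int, false_alarm: int) -> float:
    """P(at least `hit` of the first-place rankings fall on target sessions)."""
    ranked_first = hit + false_alarm
    upper = min(n_target, ranked_first)
    favourable = sum(
        comb(n_target, x) * comb(SESSIONS - n_target, ranked_first - x)
        for x in range(hit, upper + 1)
    )
    return favourable / comb(SESSIONS, ranked_first)


print(f"expected {expected:.4f}, observed {observed:.2f}, h {h:.2f}")
print(f"equivalent four-choice hit rate {equivalent:.2f}")
for name, counts in clips.items():
    print(f"{name}: Fisher exact p = {fisher_one_tailed(*counts):.3f}")
```

Because the expected rate is built from the marginal frequencies alone, it rises whenever a frequently chosen clip also happens to be a frequent target, which is the coincidence the adjustment is meant to remove.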
Discussion Paper
Despite the field’s long history, there is still controversy over whether the results of parapsychology experiments offer evidence for a genuine communication anomaly, psi. For some time, parapsychologists have recognized that the evidence for psi most likely to convince fair-minded but critical scientists would be an experimental procedure that a range of experimenters could carry out that would produce reasonably replicable effects. Unless the experiment’s effects could be replicated across experimenters, there would always remain fraud, error, or sensory leakage as strong alternative explanations to the psi hypothesis.

The writing of this paper and organization of the debate were generously supported by the Fundação Bial and the Society for Psychical Research. I am grateful to Hoyt Edge for moderating the debate, Gertrude Schmeidler for editing the debate material, and Paul Stevens for writing the anonymizing software and acting as systems manager for the discussion. I am indebted to the researchers who were kind enough to observe or participate in the debate and to Bob Morris for comments on an earlier draft of the discussion paper.
For many years, such replicability appeared to be out of reach. This perception appeared to change, however, with the arrival in the 1970s of several research programs involving free-response ESP. In particular, ganzfeld ESP studies seemed especially promising. Not only did a range of experimenters appear to obtain outcomes in ganzfeld studies that were above chance, but they did so under conditions that appeared to be well-controlled and without using specially selected participants. In 1981, Ray Hyman, a psychologist skeptical of the existence of psi, wanted to conduct a critical assessment of a research program that represented parapsychology’s strongest evidence. Because of claims then being made for ganzfeld research, it was an obvious choice for his attention (Hyman, 1985). Hyman (1985) meta-analyzed the 42 studies conducted since publication of the first ganzfeld ESP study in 1974, finding an overall statistically significant outcome; however, he concluded that the methodological problems that he identified in the studies could account for the positive results. In response, Charles Honorton, a proponent of ganzfeld research, conducted his own meta-analysis of the database, restricting his attention to the 28 studies reporting direct hits as an outcome measure (Honorton, 1985). He also obtained a statistically significant overall outcome (see Table 1); but although he conceded that the studies contained potential methodological problems, he did not agree that the problems were sufficient to account for the overall outcome.
Rather than continue to dispute the matter, Hyman and Honorton (1986) instead jointly drew up a set of methodological guidelines for the stringent conduct of future ganzfeld studies, agreeing that the case for psi in the ganzfeld would rely on a broad range of experimenters obtaining positive results under such conditions. Meanwhile, Honorton and his research team at Psychophysical Research Laboratories (PRL) had begun in 1982 a series of partially automated ganzfeld studies (autoganzfeld studies) designed to meet Hyman’s methodological concerns (Bem & Honorton, 1994; Honorton et al., 1990). Before PRL closed in 1989, eleven series were completed, obtaining a statistically significant overall outcome and a mean effect size nearly identical to that obtained in Honorton’s (1985) meta-analysis of the earlier ganzfeld database (see Table 1). Replication under stringent conditions of the early ganzfeld results appeared to suggest that methodological problems were unlikely to have accounted entirely for the effects obtained in the earlier studies; however, Bem and Honorton pointed out that it still remained for their
Table 1
Outcomes of meta-analyses of ESP ganzfeld studies.ᵃ
ᵃ The original table included some errata. The present table has been corrected, and the corrected figures are shown in bold.
The Journal of Parapsychology
Table 2
Mean Methodological Quality of Studies in Parapsychology Meta-analyses Expressed as a Percentage of the Maximum Number of Quality Points Available.

Meta-analysis               Effect examined                    Mean quality (%)
Honorton (1985)             Ganzfeld ESP                       70ᵃ
Hyman (1985)                Ganzfeld ESP                       44ᵇ
Honorton & Ferrari (1989)   Forced-choice precognition         41
Honorton et al. (1998)      ESP-extraversion relationship
                              Forced-choice studies            45
                              Free-response studies            86
Lawrence (1993)             ESP-belief in psi relationship     46
Milton (1997)               Non-ASC free-response ESP
                              GESP studiesᶜ                    61
                              Clairvoyance studiesᶜ            58
                              Precognition studiesᶜ            47
Radin & Ferrari (1991)      Dice PK                            Not reported
Radin & Nelson (1989)       Micro-PK                           Not reported
Stanford & Stein (1994)     ESP-Hypnosis relationship          49
Steinkamp et al. (1998)     Precognition vs clairvoyance
                              Clairvoyance studies             66
                              Precognition studies             63

Note: The meta-analyses used different quality criteria, ranging from 2 to 18 safeguards being examined in each meta-analysis. The mean quality of each meta-analysis is, therefore, not directly comparable with another.
ᵃ In this meta-analysis, Honorton assessed study quality on just two features: the availability of sensory cues from target handling and the adequacy of the target randomization method. He assigned partial credit to studies containing methodological features (the use of single rather than duplicate target sets and randomization using hand shuffling, coin-flipping or die-throwing) that have received no credit in other parapsychological meta-analyses (Honorton & Ferrari, 1989; Lawrence, 1993; Milton, 1997; etc.). This method allowed him to make a distinction between these studies and studies using less stringent or unknown methods; but for the purposes of this table, the method arguably inflates apparent study quality by a considerable amount. For example, all but one study received at least one quality point for preventing sensory cueing regardless of whether a duplicate target set was used. If quality points are assigned in a manner more consistent with the other meta-analyses, with one point for the use of duplicate judging sets and no points for manual methods of randomization, the studies obtained 46% of the maximum available quality points.
ᵇ Based on only 4 of Hyman’s 12 flaw categories. One of the excluded categories involved assigning a flaw to studies in which it was not clear that receivers’ friends were used as senders. This does not seem appropriate because it is absence of appropriate security rather than the relationship between participants that would constitute an inadequate precaution against collusion. The remaining 7 flaws concerned statistical errors and the use of multiple outcome measures without adjustment for multiple analysis. They could not have affected study outcomes in the meta-analysis because Hyman calculated outcomes using appropriate statistics and single measures and are not therefore included here.
ᶜ The original paper reports these percentages in terms of publication type rather than study type.
used suggests that the probability of any one of them being statistically
significant with an alpha of .05 is approximately .15. In a database of 78
free-response studies (Milton, 1997), the observed probability of a study
being statistically significantly above chance was .22, and 96% of studies
did not report whether the choice of outcome measure was preplanned.
Hyman’s study is likely to provide an extreme upper limit for the action
of this particular flaw because it is not probable that post hoc selection of
statistically significant outcome measures happens in every study, as it did
in his simulation. Nevertheless, the potential effect of not prespecifying
outcome measures is clearly not trivial in comparison with the outcomes
of ESP studies. Similarly, recording errors have been estimated empirically
to occur on approximately 1% of trials and to be biased in favor of
the observer’s hypothesis on two-thirds of the trials (Rosenthal, 1978).
The mean effect size in Honorton and Ferrari’s (1989) database of
forced-choice precognition studies is equivalent to raising a study’s outcome
1% above a mean chance expectation of 50%; but the frequency
with which studies reported double-blind, double-checked, or automated
data recording is not reported.
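
The size of these two effects can be illustrated with some rough arithmetic. The sketch below is not Hyman's simulation, whose details are not given here. It assumes, purely for illustration, that a study has k independent outcome measures to choose from (real outcome measures are usually correlated, which would make the inflation smaller), and that each recording error moves exactly one trial either toward or away from the hypothesis. The function names are hypothetical.

```python
def familywise_alpha(k: int, alpha: float = 0.05) -> float:
    """Chance that at least one of k independent tests is significant under the null.

    Independence is an illustrative assumption, not a claim about any real study.
    """
    return 1.0 - (1.0 - alpha) ** k


def net_recording_bias(error_rate: float = 0.01,
                       favorable_share: float = 2.0 / 3.0) -> float:
    """Net fraction of trials shifted toward the hypothesis by recording errors.

    Assumes each error moves one trial, toward the hypothesis with probability
    favorable_share and away from it otherwise.
    """
    return error_rate * (favorable_share - (1.0 - favorable_share))


# Post hoc choice among a few measures raises the false-positive rate.
for k in (1, 2, 3, 5):
    print(k, round(familywise_alpha(k), 3))

# Errors on 1% of trials, two-thirds of them favorable.
print(round(net_recording_bias(), 4))
```

Under these assumptions, three independent measures already bring the false-positive rate close to the .15 quoted above, and the net recording bias is about a third of a percentage point, which is of the same order as the 1% shift above chance mentioned for the forced-choice precognition database.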
In most parapsychological meta-analyses, estimates of overall study
quality do not correlate statistically significantly with effect size. A number
of the researchers who obtained such null correlations have concluded
that methodological problems, therefore, had no meaningful influence
on their databases (e.g., Honorton & Ferrari, 1989; Lawrence,
1993; Radin & Ferrari, 1991; Radin & Nelson, 1989); however, in databases
that do not consist entirely or mostly of clearly well-controlled studies,
such as the parapsychology databases, there are many ways in which a
relationship between methodological flaws and effect size could be obscured.
This is a general problem in meta-analysis and not one restricted
to parapsychology. Because these problems have received little attention
in parapsychology (although see Hyman, 1985; Milton, 1997; Stanford &
Stein, 1994), it is worth listing some of them. A selection, by no means
exhaustive, is as follows:
1. The absence of safeguards for certain procedures (such as randomization
or sensory-shielding procedures) might inflate effect size
more than the absence of safeguards for others (such as lack of
double-blind checking of data records). In an unweighted correlation of
study quality and effect size, the effect of the absence of these more important
safeguards might be drowned out by the other data (Stanford &
Stein, 1994). In some cases, experts have been called upon to rate flaws in
terms of their likely impact so that a weighted correlation can be performed
between the absence of safeguards and effect size (e.g., Milton,
1997; Radin & Ferrari, 1991). Thus far, these weightings have not indicated
any such relationships, but it could be argued that, given the general
lack of direct empirical evidence concerning effect sizes that result
unlikely that criteria could be set up that would anticipate all of the novel
features that experimenters might introduce in their studies that would
lead most researchers to expect them to be unsuccessful. In addition to
having to conform to a basic set of criteria, the procedures planned for
each study would therefore also have to be examined on a case-by-case basis
to determine whether or not the study ought to be included in the replication
test. The existence of such a project would neither affect the
usual conduct of process-oriented research nor force experimenters to
use certain procedures in their studies. It would simply be the case that
studies eligible to be included in the meta-analysis would be included and
others would not. Similarly, the project would not affect anyone’s usual
freedom to conduct a meta-analysis of their own. In particular, there is no
reason anyone should not conduct a process-oriented meta-analysis involving
all studies.
Some researchers may believe that it is already possible to identify
successful ganzfeld studies based on their procedures alone, and that it
would be advisable to begin such a meta-analysis now. Others may think
this premature. Very few variables have been explored repeatedly or systematically
in ganzfeld studies, and even fewer have been examined
meta-analytically across studies to determine whether there is good statistical
evidence that they relate to effect size. Meta-analytic investigation of
some of the variables suggested by Bem and Honorton (1994) as having
been important in the PRL work indicates that other experimenters have
not replicated their effects in the few areas where this has been attempted
(Milton & Wiseman, 1999a). In addition, some variables identified by
Bem and Honorton as having had statistically significant relationships
with effect size in the PRL studies do not in fact appear to have done so
(Milton & Wiseman, 1999a), suggesting that our success so far in identifying
what variables are important in the ganzfeld might be more limited
2 The previous ganzfeld meta-analyses did not report explicit exclusion rules but the
implicit rules appear to have been to include every ganzfeld study (for Hyman’s
meta-analysis) or every single trial (for the PRL meta-analysis) in which a ganzfeld environment
(even a modified one) was used to conduct an ESP test, with one disputed exception.
For the first meta-analysis of ganzfeld studies, Honorton provided Hyman with “a copy of every
ganzfeld study known to him” (Hyman, 1985, p. 4), all of which Hyman included in his
meta-analysis. The studies were procedurally very varied, with some having features that laboratory
lore might predict would not be psi-conducive, such as very short mentation periods
(e.g. Rogo et al., 1976); however, Honorton did exclude two conditions in a study by Raburn
(1975) in which participants were not aware that they were taking part in an ESP test, on the
grounds that these trials were too atypical of other ganzfeld research. Hyman (1985) objected
to their exclusion because other studies contained unique features and yet were included
in the database. Bem and Honorton’s (1994) subsequent meta-analysis of the PRL
work included every single trial done using the autoganzfeld. The PRL studies were also
procedurally varied and the meta-analysis included trials that, again, might arguably not be
expected to be successful, such as demonstration trials carried out in the presence of a TV
crew and trials from Series 302 in which Target 79 was included in the target set on each trial
despite its never having been previously correctly identified when serving as the target.
than has been assumed. Before embarking upon a replication test that
should exploit its findings, it may be that a systematic assessment of
process-oriented ganzfeld research is called for (e.g., see Dalton, 1997b).
Summary and conclusion
The meta-analysis of recent, well-controlled ganzfeld studies (Milton
& Wiseman, 1999a) indicates a failure to replicate the results of the earlier
work, and the evidence for psi from meta-analyses and process-oriented reviews
of parapsychology studies of low or uncertain quality does not appear
compelling. If the search for strong evidence for psi is to continue,
ganzfeld research appears to be its natural arena. A meta-analysis that excludes
studies before they are conducted if they are not expected to replicate
a positive effect appears to be the obvious test of future replication.
Until more research has been done to identify what factors may be
psi-conducive in the ganzfeld, such a meta-analysis may be premature, but it
appears to be an important goal to work towards.
Many researchers may disagree with my assessment of the evidence
for psi accumulated so far, and with my goal of continuing to seek stronger
evidence in general, and with my proposal for a prospective
ganzfeld meta-analysis in particular. Conversely, many may disagree
with the use of meta-analyses of studies of uncertain quality being promoted
as strong evidence for psi, and with ganzfeld research having become
a crucial test case before the factors that affect its replicability have
been well-established. Whatever researchers’ views may be, however, the
momentum of previous events is carrying the field towards another inclusive
meta-analysis of future ganzfeld studies that appears likely to show
the same failure to replicate as did the last one. Should a second failure to
replicate occur despite the warning of a first failure, it will give the appearance
of reasonably strong evidence against claims for psi as a
replicable (and therefore, probably genuine) effect.
If this is not a direction that parapsychologists want events to take,
then now appears to be the time to say so. Although the choice of
whether to carry out a meta-analysis is likely to be an individual one, its results
will affect other researchers. The opportunity for the research community,
rather than a few, key individuals, to discuss the issues and express
their opinions is long overdue. I look forward to hearing the views
of my colleagues on the matters that I have discussed in this paper.
Organization of an electronic mail discussion
The apparent replication problems in ganzfeld research described
in the preceding paper appeared to require discussion among the
ganzfeld research community in order to determine what, if any, course
of action seemed appropriate and could be agreed upon. I, therefore,
invited a group of researchers with expertise in ganzfeld research and
References
Atkinson, R. L., Atkinson, R. C., Smith, E. E., & Bem, D. J. (1990). Introduction to
psychology (10th ed.). Orlando, FL: Harcourt Brace Jovanovich.
Bem, D. J., & Honorton, C. (1994). Does psi exist? Replicable evidence for an
anomalous process of information transfer. Psychological Bulletin, 115, 4-18.
Broughton, R. (1992). Parapsychology: The controversial science. London: Rider.
Carpenter, J. C. (1977). Intrasubject and subject-agent effects in ESP experiments.
In B. B. Wolman (Ed.), Handbook of parapsychology (pp. 202-272). New
York: Van Nostrand Reinhold.
Dalton, K. (1997a). Exploring the links: Creativity and psi in the ganzfeld.
Proceedings of Presented Papers: The Parapsychological Association 40th Annual
Convention, 119-134.
Dalton, K. (1997b). Is there a formula to success in the ganzfeld? Observations
on predictors of psi-ganzfeld performance. European Journal of Parapsychology,
13, 71-82.
Hayes, N. (1998). Foundations of psychology: An introductory text (2nd ed.).
Walton-on-Thames, England: Nelson.
Honorton, C. (1985). Meta-analysis of psi ganzfeld research: A response to
Hyman. Journal of Parapsychology, 49, 51-91.
Honorton, C., Berger, R. E., Varvoglis, M. P., Quant, M., Derr, P.,
Schechter, E. I., & Ferrari, D. C. (1990). Psi communication in the
ganzfeld: Experiments with an automated testing system and a comparison
with a meta-analysis of earlier studies. Journal of Parapsychology, 54, 99-139.
Honorton, C., & Ferrari, D. C. (1989). Meta-analysis of forced-choice precognition
experiments. Journal of Parapsychology, 53, 281-308.
Honorton, C., Ferrari, D. C., & Bem, D. J. (1998). Extraversion and ESP performance:
A meta-analysis and a new confirmation. Journal of Parapsychology, 62,
255-276.
Hyman, R. (1985). The ganzfeld psi experiment: A critical appraisal. Journal of
Parapsychology, 49, 3-49.
Hyman, R., & Honorton, C. (1986). A joint communiqué: The psi ganzfeld
controversy. Journal of Parapsychology, 50, 350-364.
Krippner, S., Braud, W., Child, I. L., Palmer, J., Rao, K. R., Schlitz, M., White,
R. A., & Utts, J. (1993). Demonstration research and meta-analysis in
parapsychology. Journal of Parapsychology, 57, 275-286.
Lawrence, T. (1993). Gathering in the sheep and goats . . . A meta-analysis of
forced-choice sheep-goat ESP studies, 1947-1993. Proceedings of Presented
Papers: The Parapsychological Association 36th Annual Convention, 75-86.
Milton, J. (1997). Meta-analysis of free-response studies without altered states of
consciousness. Journal of Parapsychology, 61, 279-319.
Milton, J., & Wiseman, R. (1997a). Ganzfeld at the crossroads: A meta-analysis of
the new generation of studies. Proceedings of Presented Papers: The
Parapsychological Association 40th Annual Convention, 267-282.
Milton, J., & Wiseman, R. (1997b). Guidelines for extrasensory perception research.
Hatfield, England: University of Hertfordshire Press.
Milton, J., & Wiseman, R. (1999a). Does psi exist? Lack of replication of an
anomalous process of information transfer. Psychological Bulletin, 125, 387-391.
Milton, J., & Wiseman, R. (1999b). A meta-analysis of mass-media tests of
extrasensory perception. British Journal of Psychology, 90, 235-240.
Morris, R. L., Cunningham, S., McAlpine, S., & Taylor, R. (1993). Toward
replication and extension of autoganzfeld results. Proceedings of Presented Papers:
The Parapsychological Association 36th Annual Convention, 177-191.
Palmer, J. (1978). Extrasensory perception: Research findings. In S. Krippner
(Ed.), Advances in parapsychological research 2: Extrasensory perception (pp.
59-243). New York: Plenum Press.
Parker, A., & Westerlund, J. (1998). Current research in giving the ganzfeld an
old and a new twist. Proceedings of Presented Papers: The Parapsychological
Association 41st Annual Convention, 135-142.
Pratt, J. G. (1966). New ESP tests with Mrs. Gloria Stewart. Journal of the American
Society for Psychical Research, 60, 321-339.
Raburn, L. (1975). Expectation and transmission factors in psychic functioning.
Unpublished honors thesis, Tulane University, New Orleans, LA.
Radin, D. I. (1997). The conscious universe: The scientific truth of psychic phenomena.
New York, NY: HarperCollins.
Radin, D. I., & Ferrari, D. C. (1991). Effects of consciousness on the fall of dice:
A meta-analysis. Journal of Scientific Exploration, 5, 61-83.
Radin, D. I., & Nelson, R. D. (1989). Evidence for consciousness-related anomalies
in random physical systems. Foundations of Physics, 19, 1499-1514.
328 The Journal of Parapsychology
Department of Psychology
University of Edinburgh
7 George Square
Edinburgh EH8 9JZ
Scotland, UK
Appendix A
Table A1
Ganzfeld Studies Published to Date (March 1999) since
Completion of Milton & Wiseman (1999a) Meta-analysis
(February 1997)

Study (N = 12)                          Number of trials      z      $z/N^{1/2}$
Dalton (1997a)                                128            5.26       .46
Parker & Westerlund (1998) Study IV            30            2.40       .44
Parker & Westerlund (1998) Study V             30            1.25       .23
Parker & Westerlund                            30            a          a
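
As a quick check on the last column, the sketch below recomputes z divided by the square root of the number of trials for the three rows of Table A1 that report numerical values. The variable and function names are illustrative only; the figures are copied from the table.

```python
import math

# (study label, number of trials N, z) for the rows of Table A1 with reported values
rows = [
    ("Dalton (1997a)", 128, 5.26),
    ("Parker & Westerlund (1998) Study IV", 30, 2.40),
    ("Parker & Westerlund (1998) Study V", 30, 1.25),
]


def effect_size(z: float, n_trials: int) -> float:
    """Effect size defined as z divided by the square root of the number of trials."""
    return z / math.sqrt(n_trials)


for label, n_trials, z in rows:
    print(f"{label}: {effect_size(z, n_trials):.2f}")
```

Rounded to two decimals, the recomputed values agree with the tabled .46, .44 and .23.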
Table A2
Post Hoc Comparisons between Mean Effect Sizes in
Meta-analyses of Recent and Earlier Ganzfeld Studies
Appendix B
Members of the Discussion Group
Members of the discussion group, in alphabetical order, were as follows
(those who posted messages are marked with an asterisk): Cheryl Alexander,
Daryl Bem, Dick Bierman, Douwe Bosga, William Braud,
Kathy Dalton, Deborah Delanoy, Norman Don, Ricardo Eppinger, Hans
Gerding, Gerd Hovelmann, Anjum Khilji, Diana Kornbrot, Tony Lawrence,
Bruce McDonough, Stuart Menzies, Julie Milton, Bob Morris,
Roger Nelson, John Palmer,* Adrian Parker,* Dean Radin,* Chris Roe,
Ephraim Schechter,* Marilyn Schlitz, Fabio da Silva, Matthew Smith, Rex
Stanford, Fiona Steinkamp, Charles Symmons, James Terry, Jessica Utts,
Mario Varvoglis, Charles Warren, Caroline Watt,* Joakim Westerlund,
Rens Wezelman, Nils Wiklund, Carl Williams, Melvyn Willin,* Richard
Wiseman.
Appendix C
Questionnaire Data
As noted earlier, all members of the mailbase group were sent an optional
pre- and postdiscussion questionnaire concerning the main issues,
and a postdiscussion questionnaire asking about their satisfaction with
the organizational features of the debate. To minimize response bias,
discussants were asked to send their responses for compilation to the
moderator, who would keep their individual replies permanently confidential
from me.
Pre- and Postdiscussion Opinions on the Main Issues
The results of the questionnaires are summarized in Table 1. Just under
half of the mailbase members answered the pre- and postdiscussion
questionnaires, and so it is not clear that the results proportionately reflect
the views of the whole group. Respondents were not asked to give their
identities, to maximize response rates. It is, therefore, not clear whether
any change in opinion reflects a change in the opinion of broadly the
same group of people, or a change in the identities of those responding
to the questionnaire. The data can only be interpreted as reflecting the
views of those who chose to express an opinion at the time.
Bearing these limitations in mind, it can be seen that respondents appeared
to maintain their position of tending to favor (but with some uncertainty)
the view that the experimental evidence for psi as a genuine
anomaly is strong enough to convince a neutral scientist. Respondents
tended to agree before the discussion that ganzfeld research should continue
as an important focus for psi as a genuine effect, replicable across
experimenters under certain conditions; and they agreed more strongly
with this view after the discussion. There was little change in respondents’
view that meta-analyses of stringently conducted studies are important as
part of the case for psi as a replicable, genuine anomaly, nor in their view
that it is necessary to plan exclusions in advance rather than post hoc in
the next ganzfeld meta-analysis. Before the debate, respondents had a
slight tendency to believe on balance that it is already possible to identify
successful ganzfeld studies reasonably reliably in advance on the basis of
their procedures; but afterwards the majority did not think this possible.
Table 3
Opinions on the Main Discussion Issues Before and After Debate

                                                        Percent Agreement
Question and response                                Entry Poll     Exit Poll
                                                     (N = 16)a      (N = 18)b

1. Do you think that the experimental evidence for psi is strong enough that a
neutral scientist should be convinced that a genuine anomaly has been
demonstrated, that is, that there is a phenomenon not explicable in terms of
error, selective reporting, fraud, ordinary sensorimotor effects and so on?
    Yes, certainly                                       13              6
    Yes, on balance                                      50             44
    Uncertain                                            31             39
    No, on balance                                        0              0
    No, certainly not                                     6             11

2. Do you think that ganzfeld research should remain an important focus for
testing the hypothesis that, at least under certain conditions, psi is a
genuinely anomalous effect that can be replicated across experimenters?
    I do not believe that further testing of this
    hypothesis is necessary, it has already been
    sufficiently confirmed                               13              6
    No, certainly not                                    13              0
    No, on balance                                       13             11
    Uncertain                                            19             11
    Yes, on balance                                      19             61
    Yes, certainly                                       25             11

3a. How important do you think meta-analyses of stringently conducted
parapsychology studies are in making at least part of the case for psi as a
genuine and replicable anomaly?
    Crucial                                              13             12
    Important                                            63             59
    Uncertain                                            13              6
    Not important                                         6              6
    Irrelevant                                            6             18
ABSTRACT: The subjects in this research were tested for their psychokinetic
ability by means of an electronic apparatus made up of a random number
generator (RNG) connected with a display panel. The RNG produced random
sequences of two numbers which were determined by a simple quantum process
(the decay of radioactive strontium-90 nuclei). The essential aspect of the display
panel was a circle of nine lamps which lighted one at a time in the clockwise
(+1) direction or the counterclockwise (-1) direction depending on which of the
two numbers the RNG produced. The subject’s task was to choose either the
clockwise or counterclockwise motion and try by PK to make the light proceed
in that direction.
One run was made up of 128 “jumps” of the light, and there were four runs
per session. In a preliminary series of 216 runs, the 18 subjects had a negative
deviation of 129 hits. Accordingly, the main series was expected to give negative
scores, and a negative attitude was encouraged among the subjects. Fifteen
subjects carried out 256 runs, with a significant negative deviation of 302 hits
(P = .001).
The RNG was checked for randomness throughout the experiment and was
found to be adequate. (Ed.)
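
The deviations quoted in the abstract can be put on a common scale with a normal approximation to the binomial. The sketch below is a reader's check under stated assumptions, not the author's own analysis: it assumes a hit probability of exactly 0.5 on every jump, and the abstract does not say whether the reported P value is one-tailed or two-tailed. The function names are hypothetical.

```python
import math

JUMPS_PER_RUN = 128  # run length given in the abstract


def binomial_z(deviation_hits: float, runs: int, p: float = 0.5) -> float:
    """z score of a hit deviation, using the normal approximation to the binomial."""
    n_trials = runs * JUMPS_PER_RUN
    standard_deviation = math.sqrt(n_trials * p * (1.0 - p))
    return deviation_hits / standard_deviation


def two_tailed_p(z: float) -> float:
    """Two-tailed probability of a standard normal deviate at least as extreme as z."""
    return math.erfc(abs(z) / math.sqrt(2.0))


z_pre = binomial_z(-129, 216)    # preliminary series
z_main = binomial_z(-302, 256)   # main series

print(round(z_pre, 2), round(two_tailed_p(z_pre), 3))
print(round(z_main, 2), round(two_tailed_p(z_main), 4))
```

On these assumptions the main series corresponds to a z of roughly -3.3, whose two-tailed probability is of the same order as the P = .001 reported above, while the preliminary series on its own is not significant at the .05 level.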
In previous work (4, 5) the author was able to get significant evidence
of precognition in which the testing apparatus was an electronic
device based on a simple quantum process. The present
experiment was an attempt to get significant evidence of psychokinesis
by the use of a similar apparatus.
The basic part of the apparatus was a binary random number
generator which produced the numbers “+1” and “-1” in random
sequence, and the general objective was to have the subjects try to
mentally influence the generator to produce one of the two numbers
more frequently than the other.
The most easily available random generators, which have been
used in many PK experiments, are a rolled die and a flipped coin.
In comparison with these, an electronic random generator, the operation
of which most of the subjects cannot understand, may at first
thought seem psychologically unfavorable. Results of experiments
with complex targets (3, p. 142), however, suggest that PK is goal-oriented
in the sense that results can be obtained by concentrating
on the goal only, no matter how complicated the intermediate steps
may seem to the rationalizing mind. A definite advantage of an electronic
apparatus is that it permits a psychologically challenging formulation
of the goal. In the present experiment the random number
generator (RNG) was connected with a display panel showing a
circle of nine lamps. One lamp was lit at a time, and each generated
“+1” or “-1” caused the light to jump one step in the clockwise
or counterclockwise direction, respectively. The subjects were not
asked to try to force the generator to produce more +1’s than -1’s
but, rather, to force the light on the panel to make more jumps in
one direction or the other. Both tasks are certainly equivalent, but
the latter seems psychologically much more appealing to most subjects.
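
The display logic described above can be sketched in a few lines. This is only an illustration: a seeded pseudorandom generator stands in for the radioactive source, and the names N_LAMPS, JUMPS_PER_RUN and simulate_run are hypothetical. The run length of 128 jumps is taken from the abstract.

```python
import random

N_LAMPS = 9          # lamps arranged in a circle
JUMPS_PER_RUN = 128  # run length given in the abstract


def simulate_run(rng: random.Random, start: int = 0) -> tuple[int, int]:
    """Simulate one run and return (final lamp index, number of +1 jumps).

    A pseudorandom generator replaces the quantum source here, so this only
    illustrates the bookkeeping, not the physical randomness of the device.
    """
    position = start
    plus_count = 0
    for _ in range(JUMPS_PER_RUN):
        step = rng.choice((+1, -1))             # +1 clockwise, -1 counterclockwise
        if step == +1:
            plus_count += 1
        position = (position + step) % N_LAMPS  # wrap around the circle
    return position, plus_count


rng = random.Random(0)  # fixed seed so the sketch is repeatable
final_position, plus_jumps = simulate_run(rng)
print(final_position, plus_jumps, JUMPS_PER_RUN - plus_jumps)
```

The sketch makes the equivalence of the two tasks concrete: the final lamp depends only on the excess of +1’s over -1’s, so asking for more clockwise jumps is the same as asking the generator for more +1’s.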
A further obvious advantage of electronic test equipment is that
the detailed results can be automatically recorded and evaluated and
that one can work, if desired, at high speeds.
The particular type of random generator used here was chosen
partly for practical and partly for theoretical reasons. The sequence
in which the random numbers are produced is determined by simple
quantum processes, the decays of radioactive strontium-90 nuclei.
The electrons emitted in this decay trigger a Geiger counter, and
the random times at which electrons are registered at the Geiger
counter decide the generated numbers. Practically, the generator is
easy to build, and the randomness of the generated numbers has
been found to be very good. Furthermore, the simplicity of the generator
allows a complete theoretical discussion (6) of its randomness
properties; and in addition, one can say fairly well at which
point the random element in the number generation comes in. The
generator is essentially deterministic except for the random decay
times of the nuclei.
The use of simple quantum jumps to provide randomness is, for
the theorist, a rather natural choice, since these processes are assumed
by physicists to be nature’s most elementary source of
randomness, and some psi tests utilizing quantum processes have
already been reported (1, 2). Certainly, the outcome of a die throw
is also largely determined by microscopic quantum processes. The
thermal vibrations of the surface and the air fluctuations at an atomic
level co-determine the generated die face. The process in this case
is much more complicated, however, since many more factors contribute
to the end result.
Apparatus
The result of the experiment shows that the binary random number
generator had no bias for generation of +1’s or -1’s as long as it
was left unattended (in the randomness tests) but that it displayed a
significant bias when the test subjects concentrated on the display
panel, wishing for an increased generation rate of one number.
The experiment has been discussed in terms of PK, but in principle
the result could certainly also be ascribed to precognition on
the part of the experimenter or the subject. Since the sequence of
generated numbers depended critically on the time when the test
run began, and since the experimenter, in consensus with the subject,
decided when to flip the start switch, precognition might have
prompted experimenter and subject to start the run at a time which
favored scoring in a certain direction.
If the PK interpretation is appropriate, the results imply the
action of PK at some distance, since the generator was separated
from the subject by a wall and only the display panel was close to
the subject.
References
1. Beloff, J., and Evans, L. A radioactivity test of psychokinesis. J.
Speculations about the role of consciousness in physical systems are frequently
observed in the literature concerned with the interpretation of quantum mechanics.
While only three experimental investigations can be found on this topic in physics
journals, more than 800 relevant experiments have been reported in the literature
of parapsychology. A well-defined body of empirical evidence from this domain
was reviewed using meta-analytic techniques to assess methodological quality and
overall effect size. Results showed effects conforming to chance expectation in
control conditions and unequivocal non-chance effects in experimental conditions.
This quantitative literature review agrees with the findings of two earlier reviews,
suggesting the existence of some form of consciousness-related anomaly in random
physical systems.
1. INTRODUCTION
The nature of the relationship between human consciousness and the
physical world has intrigued philosophers for millennia. In this century,
speculations about mind-body interactions persist, often contributed by
physicists in discussions of the measurement problem in quantum mechanics.
Virtually all of the founders of quantum theory (Planck, de Broglie,
Heisenberg, Schrödinger, Einstein) considered this subject in depth,(1) and
contemporary physicists continue this tradition.(2-7)
2. THE EXPERIMENTS
The experiments involved some form of microelectronic random
number generator (RNG), a human observer, and a set of instructions for
the observer to attempt to “influence” the RNG to generate particular
numbers, or changes in a distribution, solely by intention. RNGs are
usually based upon a source of truly random events such as electronic
noise, radioactive decay, or randomly seeded pseudorandom sequences.(19)
Feedback about the distribution of random events is often provided in the
form of a digital display, but audio feedback, computer graphics, and a
variety of other mechanisms have also been used. Some of the RNGs
described in the literature are technically sophisticated, the best devices
employing electromagnetic shielding, environmental failsafe mechanisms
triggered by deviant voltages, currents, or temperature, automatic
computer-based data recording on magnetic media, redundant hard copy
output, periodic randomness calibrations, and so on.(18-20)
RNGs are typically designed to produce a sequence of random bits at
the press of a button. After generating a sequence of, say, 100 random bits
(0’s or 1’s), the number of 1’s in the sequence may be provided as feedback.
In an experimental protocol using a binary RNG, a run might consist of
an observer being asked to cause the RNG to produce, in three successive
button presses, a high number (sum of 1’s greater than chance expectation
of 50), a low number (less than 50), and a control condition with no directional
intention. An experiment might consist of a group of individuals
each contributing a hundred such runs, or one individual contributing
several thousand runs. Results are usually analyzed by comparing high
aim and low aim means against a control mean or theoretical chance
expectation.
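
The protocol just described can be mimicked with a short simulation. The sketch below is illustrative only: Python's pseudorandom generator stands in for a hardware RNG, no observer effect is modelled (so all three conditions behave as chance), and the names press_button, run and mean_z are hypothetical. It compares each condition's mean against the theoretical chance expectation of 50 with a standard deviation of 5 per button press.

```python
import random

BITS_PER_PRESS = 100
CHANCE_MEAN = BITS_PER_PRESS * 0.5           # 50 ones expected per press
CHANCE_SD = (BITS_PER_PRESS * 0.25) ** 0.5   # binomial standard deviation, 5


def press_button(rng: random.Random) -> int:
    """Return the number of 1's in one sequence of 100 pseudorandom bits."""
    return sum(rng.getrandbits(1) for _ in range(BITS_PER_PRESS))


def run(rng: random.Random) -> dict[str, int]:
    """One run: three successive presses under high, low and control instructions."""
    return {condition: press_button(rng) for condition in ("high", "low", "control")}


def mean_z(sums: list[int]) -> float:
    """z score of the mean of several presses against theoretical chance."""
    n = len(sums)
    mean = sum(sums) / n
    return (mean - CHANCE_MEAN) / (CHANCE_SD / n ** 0.5)


rng = random.Random(42)  # fixed seed so the sketch is repeatable
runs = [run(rng) for _ in range(100)]
for condition in ("high", "low", "control"):
    print(condition, round(mean_z([r[condition] for r in runs]), 2))
```

Because nothing distinguishes the conditions in this simulation, each z score should behave like a standard normal deviate; in a real experiment the question is whether the high-aim and low-aim means depart from the control mean or from chance in the instructed directions.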
1502 Radin and Nelson
3. META-ANALYTIC PROCEDURES
The quantitative literature review, also called meta-analysis, has
become a valuable tool in the behavioral and social sciences.(21)
Meta-analysis is analogous to well-established procedures used in the
physical sciences to determine parameters and constants. The technique
assesses replication of an effect within a body of studies by examining the
distribution of effect sizes.(22-24) In the present context, the null hypothesis
(no mental influence on the RNG output) specifies an expected mean effect
size of zero. A homogeneous distribution of effect sizes with nonzero mean
indicates replication of an effect, and the size of the deviation of the mean
from its expected value estimates the magnitude of the effect.
Meta-analyses assume that effects being compared are similar across
different experiments, that is, that all studies seek to estimate the same
population parameters. Thus the scope of a quantitative review must be strictly
delimited to ensure appropriate commonality across the different studies
that are combined.(21,25) This can present a nontrivial problem in
meta-analytic reviews because replication studies typically investigate a number
of variables in addition to those studied in the original experiments. In the
present case, because different subjects, experimental protocols, and RNGs
were employed within the reviewed literature, some heterogeneity
attributable to these factors was expected in the obtained distribution of
effect sizes. However, the circumscription for the review required that every
study in the database have the same primary goal or hypothesis, and hence
estimate the same underlying effect.
Experiments selected for review examined the following hypothesis:
The statistical output of an electronic RNG is correlated with observer
intention in accordance with prespecified instructions, as indicated by
the directional shift of distribution parameters (usually the mean) from
expected values.
Because this “directional shift” is most often reported as a standard
normal deviate (i.e., Z score) in the reviewed experiments, we determined
effect size as a Z score normalized by the square root of the sample size
(N), $e = Z/\sqrt{N}$, where N was the total number of individual random events
(with probability of a hit at p = 0.5, p = 0.25, etc.). This effect size measure
is equivalent to a Pearson product moment correlation.(21)
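
To see what this effect size means in terms of hit rates, the following short derivation may help. It is a reader's sketch under the assumption that Z is the usual normal approximation to a binomial count; the symbol h (number of hits in N events) is introduced here for illustration and does not appear in the original.

```latex
% Assumption: h hits in N binary events, each with hit probability p,
% and Z taken as the normal approximation to the binomial count.
\begin{align*}
Z &= \frac{h - Np}{\sqrt{N p (1 - p)}}, \\
e &= \frac{Z}{\sqrt{N}} = \frac{h/N - p}{\sqrt{p (1 - p)}}, \\
e &= 2\,\frac{h}{N} - 1 \qquad \text{for } p = 0.5 .
\end{align*}
```

On this reading, the effect size is the excess hit rate expressed in units of the single-event standard deviation, so for p = 0.5 an observed hit rate of 51% corresponds to e = 0.02, independently of how many events the study contains.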
3.1. Unit of Analysis
To avoid redundant inclusion of data in a meta-analysis, “units of
analysis” are often specified. We employed the following method: If
an author distinguished among several experiments reported in a single
Consciousness in Physical Systems 1503
statistics, the data, and the RNG device, and they cover virtually all
methodological criticisms raised to date. They are (1) control tests noted,
(2) local controls conducted, (3) global controls conducted, (4) controls
established through the experimental protocol, (5) randomness calibrations
conducted, (6) failsafe equipment employed, (7) data automatically recorded,
(8) redundant data recording employed, (9) data double checked,
(10) data permanently archived, (11) targets alternated on successive trials,
(12) data selection prevented by protocol or equipment, (13) fixed run
lengths specified, (14) formal experiment declared, (15) tamper-resistant
RNG employed, and (16) use of unselected subjects.
Each criterion was coded as being present or absent in the report of
an experiment, specifically excluding consideration of previously published
descriptions of RNG devices or control tests. This strategy was employed
to reflect lower confidence in such experiments since, for example, randomness
tests conducted once on an RNG do not guarantee acceptable performance
in the same RNG in all future experiments. As a result, assessed
quality was conservative, that is, lower than the "true" quality for some
experiments, especially those reported only as abstracts or conference
proceedings. Using unit weights (which have been shown to be robust in
such applications(36)) on each of the sixteen descriptors, the quality rating
for an individual experiment was simply the sum of the descriptors. Thus,
while a quality score near zero indicated a low quality or poorly reported
experiment, a score near sixteen reflected a highly credible experiment.
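The unit-weight scoring just described amounts to counting how many of the sixteen criteria a report mentions. A minimal sketch follows; the short criterion labels and the function name `quality_score` are illustrative assumptions, not the authors' coding scheme.

```python
# Short labels for the sixteen criteria listed in the text (labels are illustrative).
CRITERIA = [
    "control tests noted", "local controls", "global controls",
    "protocol controls", "randomness calibrations", "failsafe equipment",
    "automatic recording", "redundant recording", "data double checked",
    "data archived", "targets alternated", "data selection prevented",
    "fixed run lengths", "formal experiment declared",
    "tamper-resistant RNG", "unselected subjects",
]


def quality_score(reported: set) -> int:
    """Unit-weight quality rating: one point per criterion present in the report."""
    unknown = reported - set(CRITERIA)
    if unknown:
        raise ValueError(f"unknown criteria: {sorted(unknown)}")
    return sum(1 for criterion in CRITERIA if criterion in reported)


# Hypothetical report that mentions three of the sixteen criteria.
print(quality_score({"local controls", "data archived", "fixed run lengths"}))
```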
Fig. 1. Distribution of Z scores reported in 235 control studies. Thirty-three of these studies
were reported only as “nonsignificant” and were assigned Z scores of zero. To replace the
spurious spike at Z = 0, those 33 studies were recast as normally distributed Z scores,
bounded by ±1.64, averaging Z = 0.
1506 Radin and Nelson
These results, expressed as overall mean effect sizes, show that control
studies conform well to chance expectation (Fig. 3a), and that experimental
effects, whether calculated for studies or investigators, deviate significantly
from chance expectation (Fig. 3b, 3c). To obtain a homogeneous distribution
of effect sizes, it was necessary to delete 17% of individual outlier
studies (Fig. 3d) and 13% of mean effect sizes across investigators (Fig. 3e).
This may be compared with exemplary physical and social science reviews,
where it is sometimes necessary to discard as many as 45% of the studies
to achieve a homogeneous effect size distribution.(19) Of individual studies
deleted, 77% deviated from the overall mean in the positive direction, and
of investigator means deleted, all were positive (i.e., supportive of the
experimental hypothesis).
4.1. Effect of Quality
Some critics have postulated that as experimental quality increases in
these studies, effect size would decrease, ultimately regressing to the “true”
value of zero, i.e., chance results.(12,13,15,32,33,38) We tested this conjecture
with two linear regressions of mean effect size vs. mean quality assessed per
investigator, one weighted with cof as defined above and the other weighted
with the number of studies per investigator. The calculated slope for the
former is $-2.5 \times 10^{-5} \pm 3.2 \times 10^{-5}$, and for the latter,
$-7.6 \times 10^{-4} \pm 3.9 \times 10^{-4}$. These nonsignificant relationships between quality and effect
size are typical of meta-analytic findings in other fields,(39,40) suggesting
that the present database is not compromised by poor experimental
methodology. Another assessment of the effect of quality was obtained by
comparing unweighted and quality-weighted effect sizes per experiment
(Fig. 3b vs. 3f). These are nearly identical, and the same is true after
deleting outliers to obtain a homogeneous quality-weighted distribution
(Fig. 3d vs. 3g), confirming that differences in methodological quality are
not significant predictors of effect size.
It might be argued that the quality assessment procedure employed
here was nonoptimal because some quality criteria are more important
than others, so that if appropriate weights were assigned, the
quality-weighted effect size might turn out to be quite different. This was
tested by Monte Carlo simulation, using sets of 16 weights, one per
criterion, randomly selected over the range 0 to 6. A quality-weighted effect
size was calculated for the 597 experiments as before, now using the
random weights instead of unit weights, and this process was repeated one
thousand times, yielding a distribution of possible quality ratings. The
average effect size from the simulation was $3.18 \times 10^{-4} \pm 0.15 \times 10^{-4}$,
indicating that in this particular database coded by these sixteen criteria,
the probable range of the quality-weighted mean effect size clearly excludes
chance expectation of zero.
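The random-weight check described above can be sketched as follows. The text does not spell out how the quality-weighted effect size was formed, so this sketch assumes a mean of study effect sizes weighted by each study's quality score; the data are synthetic placeholders, and every name here is hypothetical.

```python
import random


def quality_weighted_mean(effects, coded, weights):
    """Mean effect size weighted by each study's quality score (assumed form)."""
    scores = [sum(w * c for w, c in zip(weights, row)) for row in coded]
    total = sum(scores)
    if total == 0:
        raise ValueError("all quality scores are zero")
    return sum(q * e for q, e in zip(scores, effects)) / total


def simulate(effects, coded, n_draws=1000, max_weight=6.0, seed=0):
    """Repeat the weighted mean with criterion weights drawn uniformly from 0 to max_weight."""
    rng = random.Random(seed)
    n_criteria = len(coded[0])
    draws = []
    for _ in range(n_draws):
        weights = [rng.uniform(0.0, max_weight) for _ in range(n_criteria)]
        draws.append(quality_weighted_mean(effects, coded, weights))
    return draws


# Synthetic stand-in data: 597 studies, 16 present/absent criteria each.
data_rng = random.Random(1)
effects = [data_rng.gauss(3e-4, 1e-3) for _ in range(597)]
coded = [[data_rng.randint(0, 1) for _ in range(16)] for _ in range(597)]

draws = simulate(effects, coded)
mean = sum(draws) / len(draws)
spread = (sum((d - mean) ** 2 for d in draws) / (len(draws) - 1)) ** 0.5
print(f"mean = {mean:.3e}, sd across weightings = {spread:.3e}")
```

A narrow spread across weightings would indicate that the conclusion does not hinge on the choice of unit weights, which is the point the simulation in the text is making.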
4.2. The “Filedrawer” Problem
Although accounting for differences in assessed quality does not nullify
the effect, it is well known in the behavioral and social sciences that nonsignificant
studies are published less often than significant studies (this is
called the “filedrawer” problem(21,41-43)). If the number of nonsignificant
studies in the filedrawer is large, this reporting bias may seriously inflate
the effect size estimated in a meta-analysis. We explored several procedures
for estimating the magnitude of this problem and for assessing the possibility
that the filedrawer problem can sufficiently explain the observed results.
The filedrawer hypothesis implicitly maintains that all or nearly all
significant positive results are reported. If positive studies are not balanced
by reports of studies having chance and negative outcomes, the empirical
Z score distribution should show more than the expected proportion of
scores in the positive tail beyond Z = 1.645. While no argument can be
made that all negative effects are reported, it is interesting to note that the
database contains 37 Z scores in the negative tail, where only 30 would be
expected by chance. On the other hand, there are 152 scores in the positive
tail, about five times as many as expected. The question is whether this
excess represents a genuine deviation from the null hypothesis or a defect
in reporting or editorial practices.
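The tail counts quoted above can be checked against chance expectation with a short calculation. The sketch assumes the 597 experiments mentioned elsewhere in the paper are the relevant total, which is consistent with the stated expectation of about 30 per tail.

```python
import math


def upper_tail_probability(z: float) -> float:
    """P(Z > z) for a standard normal deviate."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


n_studies = 597          # assumed total, taken from the quality analysis
observed_positive = 152  # Z scores beyond +1.645
observed_negative = 37   # Z scores beyond -1.645

expected_per_tail = n_studies * upper_tail_probability(1.645)
ratio = observed_positive / expected_per_tail
print(f"expected per tail: {expected_per_tail:.1f}")
print(f"positive tail excess: {ratio:.1f} times expectation")
```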
This question may be addressed by modeling based on the assumption
that all significant positive results are reported. A four-parameter fit minimizing
the chi-square goodness-of-fit statistic was applied to all observed
data with Z ≥ 1.645, using the exponential
( 1)
Table I. Four-Parameter Fit (E:N, N, Mean, sd) Minimizing Chi-Square (10 df)
Goodness-of-Fit Statistic to the Positive Tail of the Observed Z Score Distribution,
for Several Exponential:Normal Ratios^a

Assumption               E:N ratio   N         Mean    sd     Chi-square   p
Normal distribution      0           585,000   0       1      57,867.84    0
(null hypothesis)        1           5,300     0       1      220.97       0
                         2           4,800     0       1      167.84       0
                         3           4,600     0       1      148.45       0
                         10          4,400     0       1      119.69       0
Empirical distribution   0           700       0.145   2.10   23.94        0.008
                         1           747       0.345   1.90   16.32        0.091
                         2           757       0.445   1.80   14.21        0.164
                         3           111       0.445   1.80   11.08        0.226
                         10          807       0.445   1.80   11.08        0.351

^a The null hypothesis is tested by clamping the mean at 0 and the standard deviation at 1,
allowing N and E:N to vary. The empirical database is addressed by allowing all four
parameters to vary.
account for the 152 Z scores in the positive tail and 37 Z scores in the
negative tail. This mean-shift model, which ignores the shape of the
observed distribution, results in an N = 1,580 and a mean Z score = 0.34.
These modeling efforts suggest that the number of unreported or
unretrieved RNG studies falls in the range of 200 to 1,000. A remaining
question is, how many filedrawer studies with an average null result would
be required to reduce the effect to nonsignificance (i.e., p > 0.05)? This
“failsafe” quantity is 54,000—approximately 90 times the number of studies
actually reported. Rosenthal suggests that an effect can be considered
robust if the failsafe number is more than five times the observed number
of studies.(21)
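A failsafe number of this kind is commonly computed with Rosenthal's formula, in which the summed Z scores of the retrieved studies are diluted by hypothetical studies averaging Z = 0 until the combined result just reaches the one-tailed criterion. The sketch below assumes that formula; the paper does not give the summed Z, so the Z scores in the usage example are hypothetical, and only the figures 54,000 and 597 come from the text.

```python
def failsafe_n(z_scores, z_crit=1.645):
    """Rosenthal's failsafe N: (sum of Z)^2 / z_crit^2 - k, for k retrieved studies."""
    k = len(z_scores)
    if k == 0:
        raise ValueError("at least one study is required")
    total = sum(z_scores)
    return total * total / (z_crit * z_crit) - k


# Hypothetical example: four studies, each with Z = 2.0.
print(f"failsafe N = {failsafe_n([2.0, 2.0, 2.0, 2.0]):.1f}")

# Ratio of the reported failsafe quantity to the number of studies in the database.
reported_ratio = 54_000 / 597
print(f"reported failsafe N is about {reported_ratio:.0f} times the number of studies")
```

The second print confirms the "approximately 90 times" comparison in the text, which is well above the fivefold robustness guideline attributed to Rosenthal.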
5. DISCUSSION
Repeatable experiments are the keystone of experimental science. In
practice, repeatability depends upon a host of controllable and uncontrollable
ingredients, including factors such as stochastic variation, changes
in environmental conditions, difficulties in communicating tacit knowledge
employed by successful experimenters,(44) and so on. Difficulties in
achieving systematic replication are therefore ubiquitous, from experimental
psychology(21,45) to particle physics.(23,24) Of course, this is not to say that
systematic replication is impossible in these or other fields, but it may
appear to be extraordinarily difficult when experiments are considered
individually rather than cumulatively. In the case of the present database,
the authors of a recent report issued by the US National Research Council
stated that the overall results of the RNG experiments could not be
explained by chance,(46) but they questioned the quality and replicability of
the research. This meta-analysis shows that effects are not a function of
experimental quality, and that the replication rate is as good as that found
in exemplary experiments in psychology and physics.
Besides the issue of replicability, five other objections are often raised
about the present experiments. These are (a) the effect is inconsistent with
prevailing scientific models, (b) the experimental methodology is technically
naive, thus the results are not trustworthy, (c) the experiments are
vulnerable to fraud by subjects or by experimenters, (d) skeptics cannot
obtain positive results, and (e) there are no adequate theoretical explanations
or predictions for the anomalous effect.
These criticisms may be addressed as follows: (a) “Inconsistency with
the scientific world-view” is essentially a philosophical argument that
carries little weight in the face of repeatable experimental evidence, as
suggested by the present and two corroborating meta-analyses.(17,18)
6. CONCLUSION
In this paper, we have summarized results of all known experiments
testing possible interactions between consciousness and the statistical
behavior of random-number generators. The overall effect size obtained in
experimental conditions cannot be adequately explained by methodological
flaws or selective reporting practices. Therefore, after considering all of the
ACKNOWLEDGMENTS
This study was supported by major grants from the James S.
McDonnell Foundation, Inc. and the John E. Fetzer Foundation, Inc. The
authors express their gratitude to Dr. York Dobyns of the Princeton
University Engineering Anomalies Laboratory for his assistance with the
filedrawer models.
REFERENCES
1. R. G. Jahn and B. J. Dunne, Margins of Reality (Harcourt Brace Jovanovich, Orlando,
Florida, 1987).
2. B. d'Espagnat, “The quantum theory and reality,” Sci. Am., pp. 158-181 (November,
1979).
3. O. Costa de Beauregard, “S-matrix, Feynman zigzag and Einstein correlation,” Phys. Lett.
67A, 171-173 (1978).
4. N. D. Mermin, “Is the moon there when nobody looks? Reality and the quantum theory,”
Phys. Today, pp. 38-47 (April, 1985).
5. A. Shimony, “Role of the observer in quantum theory,” Am. J. Phys. 31, 755 (1963).
6. E. P. Wigner, “The problem of measurement,” Am. J. Phys. 31, 6 (1963).
7. U. Ziemelis, “Quantum-mechanical reality, consciousness and creativity,” Can. Res. 19,
62-68 (September, 1986).
8. E. J. Squires, “Many views of one world—an interpretation of quantum theory,” Eur. J.
Phys. 8, 173 (1987).
9. J. Hall, C. Kim, B. McElroy, and A. Shimony, “Wave-packet reduction as a medium of
communication,” Found. Phys. 7, 759-767 (1977); p. 761.
10. R. Smith, unpublished manuscript, MIT, 1968. (Cited in Ref. 9, p. 767.)
11. R. G. Jahn and B. J. Dunne, “On the quantum mechanics of consciousness, with application
to anomalous phenomena,” Found. Phys. 16, 721-772 (1986).
12. J. E. Alcock, Parapsychology: Science or Magic? (Pergamon Press, Elmsford, New York,
1981), pp. 124-125.
13. M. Gardner, Science: Good, Bad, and Bogus (Prometheus Books, Buffalo, New York,
1981).
14. R. Hyman, “Parapsychological research: A tutorial review and critical appraisal,” Proc.
IEEE 74, 823-849 (1986).
15. P. Kurtz, “Is parapsychology a science?” In Paranormal Borderlands of Science,
K. Frazier, ed. (Prometheus Books, Buffalo, New York, 1981).
35. J. B. Rhine, “Comments: ‘A new case of experimenter unreliability,’” J. Parapsychol. 38,
215-255 (1974).
36. R. M. Dawes, “The robust beauty of improper linear models in decision making,” Am.
Psychol. 34, 571-582 (1979).
37. L. V. Hedges, “How hard is hard science, how soft is soft science?” Am. Psychol. 42,
443-455 (1987).
38. C. E. M. Hansel, ESP: A Scientific Evaluation (Charles Scribner’s Sons, New York, 1966),
p. 234.
39. R. Rosenthal and D. B. Rubin, “Interpersonal expectancy effects: The first 345 studies,”
Behav. Brain Sci. 3, 377-415 (1978).
40. G. V. Glass, B. McGaw, and M. L. Smith, Meta-analysis in Social Research (Sage Publications,
Beverly Hills, California, 1981).
41. Q. McNemar, “At random: Sense and nonsense,” Am. Psychol. 15, 295-300 (1960).
42. S. Iyengar and J. B. Greenhouse, “Selection models and the file-drawer problem,”
Technical Report 394, Department of Statistics, Carnegie-Mellon University (July, 1987).
43. L. V. Hedges, “Estimation of effect size under nonrandom sampling: The effects of
censoring studies yielding statistically insignificant mean differences,” J. Educ. Stat. 9,
61-86 (1984).
44. H. H. Collins, Changing Order: Replication and Induction in Scientific Practice (Sage
Publications, Beverly Hills, California, 1985).
45. S. Epstein, “The stability of behavior, II: Implications for psychological research,” Am.
Psychol. 35, 790-806 (1980).
46. D. Druckman and J. A. Swets, eds. Enhancing Human Performance: Issues, Theories, and
Techniques (National Academy Press, Washington, D.C., 1988), p. 207.
47. A. Neher, The Psychology of Transcendence (Prentice-Hall, Englewood Cliffs, New Jersey,
1980), p. 147.
[21]
FURTHER STUDIES OF AUTONOMIC
DETECTION OF REMOTE STARING:
REPLICATION, NEW CONTROL
PROCEDURES, AND PERSONALITY
CORRELATES
By William Braud, Donna Shafer, and Sperry Andrews
Method
Subjects
Thirty volunteer participants (22 females and 8 males) served as
“starees” for Replication 1, and 16 volunteers (5 males and 11 females)
participated as starees for Replication 2. In Replication 1, half of the
starees were persons already known by the starers (relatives, friends, or
familiar undergraduate classmates), whereas half were unknown at the
time of the laboratory session (i.e., they were unfamiliar undergraduates);
only one of the starees had participated previously in laboratory
psi experiments. (Later results did not differ for the known versus unknown
starees.) It had been decided in advance that each starer was to
work with 10 starees and that results for all 30 starees were to be pooled
for purposes of analysis. In Replication 2, 13 of the starees were previously
unknown undergraduate students from a local college, and 3
were friends or relatives of the starer; only 2 of the starees had participated
previously in laboratory psi studies. Participants were selected on
the basis of availability during planned laboratory session times and on
the basis of interest in participating in a study exploring the “feeling of
being stared at.” Across both replications, staree age ranged from 17
years to 40 years.
The starers of Replication 1 were three undergraduate psychology
students (two females and one male) from a local college who were
participating in independent studies internships at the Mind Science
Foundation. None of these starers had prior laboratory psi research
experience. The starers were trained for the experiment by the second
author (D.S.), who had served as starer in our original (1993) staring
detection experiments. D.S. served as starer for Replication 2. She herself
had participated previously in extensive “connectedness” training
that had been provided by the third author, S.A. This training (which is
described in Braud, Shafer, & Andrews, 1993) took the form of approximately
20 hours of intellectual and experiential exercises designed to
help individuals become more adept at and comfortable with experiencing
interconnections with others, and to become more aware of, and to
394 The Journal of Parapsychology
For Replication 2, the SAD scale was used along with the Myers-Briggs
Type Indicator (MBTI, Form F: see Briggs & Myers, 1957). For this study,
we were especially interested in the MBTI extraversion-introversion scale
because of its possible relationships with SAD scoring and with remote-staring
detection effects in this psi-mediated social (staring) context.
For Replication 1, the psychological assessments were completed by
the starees after their experimental sessions. For Replication 2, the psychological
assessments were completed by starees during their experimental
sessions.
Experimental Hypotheses
Our experimental hypotheses were that, in Replications 1 and 2, the
starees would discriminate the true staring from the nonstaring periods
autonomically (electrodermally); that is, their levels of spontaneous electrodermal
activity during the staring periods would differ from those
during the nonstaring periods. Therefore, two-tailed tests were used in
the analyses, with alpha set at < .05. We also predicted that, in Replication 2,
no such discrimination would occur in the empirical (sham)
control segments of the sessions.
Exploratory analyses examined the correlations among the magnitude
of the autonomic remote staring detection effect, SAD scoring, and
MBTI extraversion-introversion scoring. Since these analyses were exploratory,
two-tailed tests were used in their evaluation, with alpha set at
< .05.
Results
Primary Analyses
For each volunteer participant (staree), electrodermal activity was
measured during 10 staring and 10 nonstaring periods (for Replication
1) or during 8 staring and 8 nonstaring periods (for Replication 2).
Rather than compare these multiple scores within a given participant,
we reduced the activities for an entire session to a single score for each
participant and performed statistical tests using participants, instead of
multiple period scores, as the units of analysis. We used the more conservative
session score (a kind of single majority-vote score) in order to
bypass criticisms based on possible nonindependence of multiple
electrodermal measures taken within a given session. Although it would
Further Studies of Remote Staring 399
the basis of chance. Here the scoring rates were 49.16% (for the
pseudostaring periods) and 50.84% (for the nonstaring periods). This
scoring rate yielded a single mean t (15) = 0.30; p = .76, two-tailed; and
an effect size (r) = .08. Expanded summary statistics for Replications 1
and 2 and for the Sham Control series are presented in Table 1. For
comparative purposes, the results for our previous two series with untrained
and trained starees (see Braud, Shafer, & Andrews, 1993) are also
included in this table. Electrodermal activity rates during the staring and
nonstaring periods of all four experiments, as well as for the sham control
sessions, are presented graphically in Figure 1.
Table 1
Statistical Summary of Autonomic Staring
Detection Results for Four Experiments
and for the Sham Control Series

                Scoring   Scoring   Single                          Effect    95%
                rate      rate      mean                            size^c    confidence
Series          X         SD        t        df   p^a   z^b     r         interval
Untrained Ss    59.38%    14.11     -2.66    15   .02   -2.37   -.57      51.86-66.90
Trained Ss      45.45%    8.46      2.15     15   .05   1.98    .48       40.94-49.95
Replication 1   45.15%    13.85     1.92     29   .06   1.85    .34       39.97-50.32
Replication 2   45.66%    8.37      2.08     15   .05   1.91    .47       41.19-50.12
Sham control    49.16%    11.34     0.30     15   .76   0.31    .08       43.11-55.20

^a All ps are two-tailed. ^b zs are given for Stouffer z computations. ^c The effect size is
derived from r =
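The formula in the table footnote did not survive extraction. One standard conversion from a single-mean t to an effect size r is r = sqrt(t^2 / (t^2 + df)); the sketch below assumes that conversion, which reproduces the tabled r values to within about 0.01. The function name `r_from_t` is hypothetical.

```python
import math


def r_from_t(t: float, df: int) -> float:
    """Effect size r = sqrt(t^2 / (t^2 + df)), carrying the sign of t (assumed formula)."""
    return math.copysign(math.sqrt(t * t / (t * t + df)), t)


# (t, df, tabled r) for each series, copied from Table 1.
table_rows = {
    "Untrained Ss": (-2.66, 15, -0.57),
    "Trained Ss": (2.15, 15, 0.48),
    "Replication 1": (1.92, 29, 0.34),
    "Replication 2": (2.08, 15, 0.47),
    "Sham control": (0.30, 15, 0.08),
}

for series, (t, df, tabled_r) in table_rows.items():
    print(f"{series:14s} computed r = {r_from_t(t, df):+.3f}  tabled r = {tabled_r:+.2f}")
```

Small discrepancies in the second decimal place are expected because the tabled t values are themselves rounded.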
Secondary Analyses
Linear correlation coefficients (Pearson rs) were calculated in order
to determine the interrelationships among the magnitude of the remote-staring
detection effect, SAD scoring, and MBTI extraversion-introversion
(E/I) scoring. To study the relationship between
remote-staring detection and SAD, Pearson rs were computed for the
percent electrodermal activity occurring during the staring periods (as
in the primary analyses) versus the SAD scores (expressed as a percentage
of the highest possible SAD score) for Replication 1, for Replication
2, and for the sham control sessions. Summary statistics are provided in
Table 2. For Replication 1, the magnitude of the remote-staring detection
Series          r      df   p^a
Replication 1   .36    28   .05
Replication 2   .43    14   .09
Sham control    -.12   14   .66

^a All ps are two-tailed.
scale, for Replication 2 and for the sham control sessions. (The MBTI
was not administered for Replication 1.) Summary statistics appear in
Table 3. For comparative purposes, similar analyses are presented for
our previous two series with untrained and trained starees (in which the
MBTI, but not the SAD, had been administered). For Replication 2,
there was a strong, positive, and highly significant correlation between
the magnitude of the remote staring detection effect and the staree’s
degree of MBTI introversion. No such correlation occurred for the sham
control segment of the experiment.
Table 3
Linear Correlations Between Staring Period EDA (Percent)
and MBTI Extraversion/Introversion (E/I) Score

Series          r     df   p^a
Replication 2   .68   14   .0037
Sham control    .16   14   .55
Untrained       .12   14   .66
Trained         .07   14   .80

^a All ps are two-tailed.
References
Braud, W., & Schlitz, M. (1983). Psychokinetic influence on electrodermal
activity. Journal of Parapsychology, 47, 95-119.
Braud, W., & Schlitz, M. (1989). A methodology for the objective study of
transpersonal imagery. Journal of Scientific Exploration, 3, 43-63.
Braud, W., Shafer, D., & Andrews, S. (1993). Reactions to an unseen gaze
(remote attention): A review, with new data on autonomic staring detection.
Journal of Parapsychology, 57, 373-390.
Briggs, K. C., & Myers, I. B. (1957). Myers-Briggs Type Indicator Form F. Palo Alto,
CA: Consulting Psychologists Press.
Coles, M. G., Gale, A., & Kline, P. (1971). Personality and habituation of the
orienting reaction: Tonic and response measures of electrodermal activity.
Psychophysiology, 8, 54-63.
Eysenck, H. J. (1967). The biological basis of personality. Springfield, IL: Charles C.
Thomas.
Geen, R. G. (1984). Preferred stimulation levels in introverts and extraverts:
Effects on arousal and performance. Journal of Personality and Social Psychology,
46, 1303-1312.
LeShan, L. (1966). The medium, the mystic, and the physicist. New York: Viking
Press.
Paivio, A. (1965). Personality and audience influence. In B. Maher (Ed.),
Progress in experimental personality research, Vol. 2. New York: Academic Press.
ABSTRACT: Each of the two authors recently attempted to replicate studies in which the
“receivers” were asked to psychically detect the gaze directed at them by unseen “senders.”
R. W.’s studies failed to find any significant effects; M. S.’s study gave positive results. The
authors then agreed to carry out the joint study described in this paper, in the hope of
determining why they had originally obtained such different results. The experimental
design was based on each author carrying out separate experiments, but running them in
the same location, using the same equipment/procedures, and drawing participants from
the same subject pool. The 32 experimental sessions were divided into two sets of randomly
ordered trials. Half were “stare” trials during which the experimenter directed
his/her attention toward the receiver; half were “non-stare” (control) trials during which
the experimenter directed his/her attention away from the receiver. The receivers’ electrodermal
activity (EDA) was continuously recorded throughout each session. The EDA of
R. W.’s receivers was not significantly different during stare and non-stare trials. By
contrast, the EDA of M. S.’s receivers was significantly higher in stare than non-stare trials.
The paper discusses the likelihood of different interpretations of this effect and urges
other psi proponents and skeptics to run similar joint studies.
... the experimenter effect is the most important challenge facing modern experimental
parapsychology. It may be that we will not be able to make too much progress in other
areas of the field until the puzzle of the experimenter effect is solved. (Palmer, 1986,
pp. 220-221.)
The apparent detection of an unseen gaze (i.e., the feeling of being
stared at, only to turn around and discover somebody looking directly at
you) is a common type of ostensible paranormal experience, with between
68% and 94% of the population reporting having experienced the phenomenon
at least once (Braud, Shafer, & Andrews, 1993a; Coover, 1913).
Some parapsychologists have attempted to assess whether this experience
is based, at least in part, on genuine psi ability. Such studies use two
The authors would like to thank the following organizations for supporting the research
described in this paper: The Perrott-Warrick Fund, Cambridge University, the Institute
for Noetic Sciences, UltraMind, Ltd., the Hodgson Fund, Department of Psychology,
Harvard University, and the University of Hertfordshire. We are also grateful to Matthew
Smith and Emma Greening for their help in running this experiment and analyzing the
data, John Palmer, Dorothy Pope, and the blind reviewers for their helpful comments and
suggestions.
participants: a “sender” and a “receiver.” These individuals are isolated
from one another, but in such a way that the sender can see the receiver.
Early experiments had the sender sitting behind the receiver (Coover,
1913; Poortman, 1959; Titchener, 1898); some later studies have used
one-way mirrors (Peterson, 1978) or a closed-circuit television system
(Braud, Shafer, & Andrews, 1993a, 1993b; Williams, 1983). The experimental
session in this type of study is divided into two sets of randomly
ordered “stare” and “non-stare” trials. During stare trials the sender
directs his/her attention toward the receiver; during non-stare trials the
sender directs his/her attention away from the receiver. Either during or
after each trial a response is made by the receiver. In early studies, the
receivers made verbal guesses as to whether they believed they had been
stared at; later studies have measured receivers’ electrodermal activity
(EDA) throughout each trial. A number of studies have obtained statistically
significant differences between responses to stare and non-stare
trials and in a recent review of this work, Braud, Shafer, and Andrews
(1993b) concluded:
We hope other investigators will attempt to replicate these studies. We recommend
the design as one that is straightforward, has already yielded consistent
positive results, and addresses a very familiar psi manifestation in a
manner that is readily communicable and understandable to the experimental
participants and to the public at large. (p. 408)
Both authors of the present paper previously attempted to replicate
this staring effect. The first author (R. W.) is a skeptic regarding the
claims of parapsychology who wished to discover whether he could replicate
the effect in his own laboratory. The second author (M. S.) is a psi
proponent who has previously carried out many parapsychological studies,
frequently obtaining positive findings. The staring experiments carried
out by R. W. showed no evidence of psychic functioning (Wiseman
& Smith, 1994; Wiseman, Smith, Freedman, Wasserman, & Hurst, 1995).
M. S.’s study, on the other hand, yielded significant results (Schlitz &
LaBerge, 1997).
Such “experimenter effects” are common within parapsychology and
are open to several competing interpretations (see Palmer, 1989a,
1989b). For example, M. S.’s study may have contained an experimental
artifact absent from R. W.’s procedure. Alternatively, M. S. may have
worked with more psychically gifted participants than R. W. had, or may
have been more skilled at eliciting participants’ psi ability. It is also
possible that M. S. and R. W. created desired results via their own psi
abilities, or fraud. Little previous research has attempted to evaluate
these competing hypotheses. This is unfortunate, because it is clearly
Experimenter Effects and the Remote Detection 199
important to establish why experimenter effects occur, both in terms of
assessing past psi research and attempting to replicate studies in the
future. For these reasons, the authors agreed to carry out a joint study in
the hope of learning why our original studies obtained such dramatically
different results.
Method
Design
Our joint study required M. S. and R. W. to act as separate experimenters
for two different sets of trials. The two sets of trials were carried
out at the same time (early October, 1995) and in the same location (R.
W.’s laboratory at the University of Hertfordshire in the U.K.). In addition,
the experimenters used the same equipment, drew subjects from
the same subject pool, and employed exactly the same methodological
procedures. The only real difference between the trials was that one set
was carried out by M. S. and the other set was run by R. W. We were
curious to discover if, under these conditions, we would continue to
obtain significantly different results. Each study had one independent
variable with two levels—stare and non-stare. The dependent variables
were the receivers’ EDA during the experimental session and their responses
to a “belief-in-psi” questionnaire.
Participants
Thirty-two subjects (10 males and 22 females; mean age of 25.72, age
range 18 - 49) acted as receivers. Thirty of these were undergraduate
psychology students studying at the University of Hertfordshire. The
remaining two were the authors’ colleagues. M. S. and R. W. acted in a
dual capacity as both experimenter and sender.
Apparatus and Materials
Layout of room. It was clearly important to minimize the possibility of
any sensory leakage between sender and receiver during the experimental
sessions. For this reason the receiver was located in the University’s
Social Observation Laboratory while the sender was located in a small
room approximately 20 meters away from the laboratory (see Figure 1).
Video equipment. A Panasonic AG-450 video camera was positioned in
front of the receiver and relayed an image (via a long cable connecting
the two rooms) to a 14-inch JVC color TV monitor in the sender’s room.
This one-way closed circuit television system allowed the experimenter
to see the subject, but not vice versa.
EDA measurement. The receivers’ EDA (electrodermal activity) was
recorded by the RelaxPlus system (a commercially available hardware
and software package produced by UltraMind, Ltd.). This system measures
skin resistance level by placing a constant current across two stainless
steel electrodes and then recording the resistance encountered by
that current at a rate of 10 samples per second. The system filters for
possible artifacts (caused, for example, by movement) and records data
to the computer’s hard disk. The equipment (i.e., electrodes, input device,
computer, computer monitor) was located next to the receiver
throughout the experiment. The part of the program involved in storing
the details of subjects and their physiological data could be accessed
only via a password known only to M. S. and R. W. Data from the RelaxPlus
system were then fed into a spreadsheet (Microsoft’s Excel) in order
to calculate the mean EDA for each 30-second trial. All statistical analyses
were carried out using the Statview software package.
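The reduction from raw samples to per-trial means can be illustrated with a short sketch. At 10 samples per second, each 30-second trial spans 300 samples. The function name `trial_means` and the synthetic readings are hypothetical; the authors performed this step in a spreadsheet, not in code.

```python
def trial_means(samples, rate_hz=10, trial_seconds=30):
    """Average consecutive, equal-length blocks of samples, one block per trial."""
    per_trial = rate_hz * trial_seconds
    if per_trial <= 0 or len(samples) % per_trial != 0:
        raise ValueError("sample count must be a whole number of trials")
    return [
        sum(samples[start:start + per_trial]) / per_trial
        for start in range(0, len(samples), per_trial)
    ]


# Synthetic recording: two trials with constant readings of 1.0 and 3.0.
recording = [1.0] * 300 + [3.0] * 300
print(trial_means(recording))
```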
Belief-in-psi questionnaire. The receivers were asked three questions
concerning their attitudes toward psi (see Appendix). They indicated
their responses on a seven-point scale ranging from -3 to +3. A general
“belief-in-psi” score was obtained by summing the receiver’s responses
over all three questions. Low scores on this questionnaire were taken to
indicate strong belief in psi.
Trial randomization. The receivers’ EDA may decline during a session
for several reasons (e.g., the apparatus measuring EDA may warm up or
the participants may habituate to their surroundings). This decline
could lead to artifactual evidence for psi if stare trials tend to precede
non-stare trials. The following randomization procedure was devised to
minimize this possible artifact.
Prior to the experiment, an individual not involved in running the
experiment (Matthew D. Smith) prepared a set of 32 sheets, each of
which contained the order of the 32 stare or non-stare trials for one
session. For 16 of these sheets the trial orders were generated in the
following way: M. D. S. first opened the random number table (Robson,
1983, Appendix Three), chose a number as an entry point into the table,
and then threw a die twice. The numbers that came up determined how
he moved from this entry point to an actual starting point. The eight
consecutive numbers located in the row to the right of this starting point
determined the order of the stare and non-stare trials. An even number
translated into an ABBA (stare, non-stare, non-stare, stare) order while
an odd number translated into a BAAB (non-stare, stare, stare,
non-stare) order. The trial order for the remaining 16 sheets was
determined by counterbalancing the orders of the randomized sheets just
described.
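
The sheet-generation procedure described above can be sketched in a few lines of Python. This is an illustration only, not the authors' procedure: a seeded pseudorandom generator stands in for the printed random number table and the die throws, and "counterbalancing" is assumed here to mean swapping stare and non-stare in each randomized sheet, which the text does not spell out. The function names are hypothetical.

```python
import random

ABBA = ["stare", "non-stare", "non-stare", "stare"]
BAAB = ["non-stare", "stare", "stare", "non-stare"]


def order_from_digits(digits):
    """Map eight digits to a 32-trial order (even -> ABBA, odd -> BAAB)."""
    if len(digits) != 8:
        raise ValueError("expected eight digits, one per block of four trials")
    trials = []
    for d in digits:
        trials.extend(ABBA if d % 2 == 0 else BAAB)
    return trials


def counterbalance(trials):
    """Assumed meaning of counterbalancing: swap stare and non-stare."""
    return ["non-stare" if t == "stare" else "stare" for t in trials]


def make_sheets(n_random=16, seed=None):
    """Return n_random randomized sheets followed by their mirrored partners."""
    rng = random.Random(seed)  # stand-in for the random number table
    randomized = [
        order_from_digits([rng.randrange(10) for _ in range(8)])
        for _ in range(n_random)
    ]
    return randomized + [counterbalance(sheet) for sheet in randomized]
```

Because every block of four contains two stare and two non-stare trials, each sheet is balanced within blocks, which is what limits the effect of a steady decline in EDA over the session.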
426 Parapsychology
202 The Journal of Parapsychology
Results¹
Primary Analyses
All analyses were preplanned. A Wilcoxon signed rank test was used
to compare receivers’ total EDA for the 16 stare trials with their total
EDA during the 16 non-stare trials.² Receivers run by R. W. did not differ
from chance expectation (Wilcoxon z = -.44, df = 15, p = .64, two-tailed).
In contrast, receivers run by M. S. showed a significant effect (Wilcoxon
z = -2.02, df = 15, p = .04, two-tailed).
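
As a rough illustration of the primary analysis, the sketch below computes the normal-approximation z for a Wilcoxon signed rank test on paired totals. It is an assumption-laden sketch rather than a reproduction of the Statview calculation: zero differences are dropped, tied absolute differences receive average ranks, and no tie or continuity correction is applied to the variance. The function name and any data passed to it are hypothetical.

```python
import math


def wilcoxon_z(stare, non_stare):
    """Normal-approximation z for the Wilcoxon signed rank test (paired data)."""
    # Paired differences, dropping pairs with no difference.
    diffs = [s - n for s, n in zip(stare, non_stare) if s != n]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    # Sum of ranks for positive differences, compared with its null expectation.
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    expected = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (w_plus - expected) / sd
```

With this sign convention, a positive z means that stare totals tend to exceed non-stare totals; the sign reported by a statistics package depends on which rank sum it uses.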
A “detect score” was then calculated for each subject by subtracting
the total EDA during the stare trials from the total EDA for the non-stare
trials. An unpaired t test revealed that the detect scores of M. S.’s subjects
were not significantly different from those of R. W.’s (df = 30, t = 1.39,
p = .17, two-tailed).
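
The detect score and the between-experimenter comparison can be sketched as follows. The detect score follows the definition in the text; the t statistic is the standard pooled-variance (Student) form, which is assumed here because the reported df of 30 matches two groups of 16. Function names are hypothetical and no data from the study are reproduced.

```python
import math
from statistics import mean, variance


def detect_score(stare_total, non_stare_total):
    """Detect score as defined in the text: non-stare total minus stare total."""
    return non_stare_total - stare_total


def unpaired_t(group_a, group_b):
    """Pooled-variance t statistic and degrees of freedom for two groups."""
    na, nb = len(group_a), len(group_b)
    df = na + nb - 2
    pooled_var = (
        (na - 1) * variance(group_a) + (nb - 1) * variance(group_b)
    ) / df
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(group_a) - mean(group_b)) / se, df
```

Applied to two lists of 16 detect scores each, `unpaired_t` returns df = 30, as in the reported analysis.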
Secondary Analyses
Table 1 contains the correlation coefficients between participants’
belief-in-psi questionnaire scores and their detect scores. Spearman rank
correlation coefficients revealed that none of these correlations were
significant. Table 1 also contains the means (and standard deviations) of
the questionnaire scores for R. W.’s group, M. S.’s group, and all
participants.

¹This experiment was first reported at the 1996 Convention of the
Parapsychological Association (Wiseman & Schlitz, 1996). While preparing
the paper for journal publication, the authors reviewed the data and
discovered an error in the way one subject’s data had been transferred into
the statistical package used for the analyses. For this reason the results
reported here are slightly different from those reported in Wiseman and
Schlitz (1996).
²Previous studies (e.g., Braud et al., 1993a, 1993b) have assessed their
results by creating a “psi score” (the sum of EDA during stare trials divided
by the sum of the total EDA) for each participant and then using a
one-sample t test to determine the degree to which these scores deviate
from chance expectation. This procedure obscures the question of whether
an overall result is caused by a very small number of participants
performing extremely well. The Wilcoxon signed rank test is more
conservative than the one-sample t test because it is less influenced by the
size of the deviation between participants’ scores.
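
For comparison, the “psi score” approach described in footnote 2 can be sketched as below. The chance value of 0.5 is an assumption that holds when stare and non-stare trials are equal in number, as they are here (16 of each); the function names are hypothetical.

```python
import math
from statistics import mean, stdev


def psi_score(stare_eda, non_stare_eda):
    """Sum of EDA during stare trials divided by the sum of all EDA."""
    total = sum(stare_eda) + sum(non_stare_eda)
    if total == 0:
        raise ValueError("total EDA is zero")
    return sum(stare_eda) / total


def one_sample_t(scores, chance=0.5):
    """One-sample t statistic of the scores against the chance value."""
    n = len(scores)
    return (mean(scores) - chance) / (stdev(scores) / math.sqrt(n))
```

Because the t statistic depends on the size of each deviation, a single participant with a very large psi score can dominate the result, which is the concern the footnote raises.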
Table 1
Means and Standard Deviations for the Belief in Psi Questionnaire
and Correlation Coefficients and p Values Between Subjects’
Questionnaire Scores and Detect Scores

                          R. W.’s        M. S.’s        All
                          participants   participants   participants
Mean                       1.94           -.81            .56
Standard deviation (SD)    4.22           4.12           4.33
Correlation (r)            -.15            .32            .15
(Corrected for ties)
z score                    -.58           1.23            .84
p value, two-tailed         .56            .22            .39
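
The z scores in Table 1 are consistent with the usual large-sample approximation for a Spearman rank correlation, z = r × √(n − 1). The paper does not state which formula the software used, so the sketch below is an assumption checked only against the tabled values, with n = 16 per experimenter and n = 32 overall.

```python
import math


def spearman_z(r_s, n):
    """Large-sample z approximation for a Spearman rank correlation."""
    return r_s * math.sqrt(n - 1)


# Values taken from Table 1: (label, r, n)
for label, r_s, n in [("R. W.", -0.15, 16), ("M. S.", 0.32, 16), ("All", 0.15, 32)]:
    print(label, round(spearman_z(r_s, n), 2))
```

The first and third rows match the table exactly; the second gives 1.24 against the tabled 1.23, a gap that could arise if the table's z was computed from an unrounded r.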
Discussion
Subjects run by R. W. did not respond differently to stare and non-stare
trials. In contrast, participants run by M. S. were significantly more
activated in stare than non-stare trials. These findings can be interpreted
in several ways.
First, one might argue that M. S.’s significant results were caused by
some type of experimental artifact. Several steps were taken to guard
against this possibility. For example, neither the receivers nor the
experimenters knew the order of the stare and non-stare trials before the
start of the experiment; the location of the rooms minimized the possibility
of any sender-to-receiver sensory leakage; and the randomization
procedure ensured that the results were unlikely to be caused by
progressive errors. This, coupled with the fact that one would expect any
artifact to influence the results of both studies, suggests that M. S.’s
significant results are unlikely to have been caused by a methodological
error.
Second, one could argue that either R. W.’s or M. S.’s results were
caused by receivers’ cheating. For example, subjects could have
discovered the order of stare and non-stare trials before the experimental
session and altered their EDA accordingly. Alternatively, participants
could have altered their data files so that they coincided with the order
of stare and non-stare trials. Several factors militate against these
possibilities. First, such cheating would have been far from
straightforward. For example, the selection of trial order was carried out a
few moments before the start of the experimental session and it could only
have been accessed by a participant who had installed some kind of
covert monitoring equipment in the sender’s room. Likewise, the
computer could only be accessed if a participant had discovered a password
which was known only to the experimenters. Also, neither R. W.’s nor
M. S.’s significant results are due to one exceptional participant, and one
would therefore have to hypothesize that several participants
successfully cheated.
Third, the results could have been caused by experimenter fraud.
Although the experiment was not designed to make such fraud
impossible, its design does mean that certain types of cheating would have
been extremely unlikely. For example, neither experimenter could have
decided to include data only from certain subjects because the full list of
all subjects was known to both experimenters. However, more
sophisticated forms of cheating were theoretically possible. For example, one
experimenter could have substituted false sets of EDA values for subjects’
actual values before the data were analyzed. Although possible, this would
have been far from straightforward because subjects were frequently
scheduled back-to-back (thus cutting to a minimum the time available
for recording a false replacement session), and each experimenter
made a back-up disk of all of the day’s sessions at the end of each day
(thus minimizing the possibility of an experimenter’s substituting data
after the day they had been recorded). In addition, no evidence of any
cheating was uncovered during the running of the experiment or
analysis of the data.
Fourth, one could argue that M. S. was working with a more
“psychically gifted” population than R. W. was. This also seems unlikely
because the receivers were assigned to the two experimenters in an
opportunistic fashion.
Fifth, it is possible that M. S. was more skilled at eliciting subjects’ psi
ability than R. W. was. Interestingly, M. S.’s subjects scored higher on the
“belief-in-psi” questionnaire than R. W.’s subjects did (although this
difference just failed to reach significance: unpaired t value = 1.86, df = 30,
p = .072, 2-tailed). Given that participants were opportunistically
assigned to experimenters, this difference might be a reflection of the
different ways in which R. W. and M. S. oriented receivers at the start of
the experiment. It seems quite possible that the experimenters’ own
level of belief/disbelief in the existence of psi caused receivers to express
different levels of belief/disbelief in psi and to have different
expectations about the success of the forthcoming experimental session.
References
Braud, W., Shafer, D., & Andrews, S. (1993a). Reactions to an unseen gaze
(remote attention): A review, with new data on autonomic staring detection.
Journal of Parapsychology, 57, 373-390.
Braud, W., Shafer, D., & Andrews, S. (1993b). Further studies of autonomic
detection of remote staring: Replications, new control procedures, and
personality correlates. Journal of Parapsychology, 57, 391-409.
Coover, J. E. (1913). The feeling of being stared at. American Journal of
Psychology, 24, 570-575.
Palmer, J. (1986). ESP research findings: The process approach. In H. L. Edge,
R. L. Morris, J. Palmer, & J. H. Rush (Eds.), Foundations of parapsychology
(pp. 184-222). London: Routledge & Kegan Paul.
Palmer, J. (1989a). Confronting the experimenter effect. Parapsychology Review,
20, 1-4.
Palmer, J. (1989b). Confronting the experimenter effect. Part 2. Parapsychology
Review, 20(5), 1-5.
Peterson, D. M. (1978). Through the looking glass: An investigation of
extrasensory detection of being stared at. M.A. Thesis, University of Edinburgh.
Poortman, J. J. (1959). The feeling of being stared at. Journal of the Society for
Psychical Research, 40, 4-12.
Robson, C. (1983). Experiment, design and statistics in psychology. London:
Penguin Books.
Schlitz, M. J., & LaBerge, S. (1997). Covert observation increases skin
conductance in subjects unaware of when they are being observed: A replication.
Journal of Parapsychology, 61, 185-196.
Titchener, E. B. (1898). The feeling of being stared at. Science, 8, 895-897.
Williams, L. (1983). Minimal cue perception of the regard of others: The
feeling of being stared at. Paper presented at the 10th Annual Conference of
the Southeastern Regional Parapsychological Association, West Georgia
College, Carrollton, GA. See Journal of Parapsychology, 47, 59-60.
Wiseman, R., & Schlitz, M. (1996). Experimenter effects and the remote
detection of staring. Proceedings of the Parapsychological Association 39th Annual
Convention, 149-157.
Wiseman, R., & Smith, M. D. (1994). A further look at the detection of unseen
gaze. Proceedings of the Parapsychological Association 37th Annual Convention,
465-478.
Wiseman, R., Smith, M. D., Freedman, D., Wasserman, T., & Hurst, C. (1995).
Two further experiments concerning the remote detection of an unseen
gaze. Proceedings of the Parapsychological Association 38th Annual Convention,
480-492.
Dept. of Psychology
University of Hertfordshire
College Lane
Hatfield, Hertfordshire
England AL10 9AB
UK
Institute of Noetic Sciences
473 Gate Five Road
Suite 300
Sausalito, CA 94963
[23]
The Efficacy of “Distant Healing”: A Systematic Review of
Randomized Trials
John A. Astin, PhD; Elaine Harkness, BSc; and Edzard Ernst, MD, PhD
6 June 2000 • Annals of Internal Medicine • Volume 132 • Number 11

In addition, we contacted leading researchers in the fields of distant and
spiritual healing to further identify studies. We also searched our own
files and the reference sections of articles on distant healing that we
identified. Numerous studies have been carried out in these areas; for
example, in a review of spiritual healing, Benor (12) identified 130
controlled investigations, and Rosa and colleagues (15) identified 74
“quantitative studies” of Therapeutic Touch. However, we included only
studies that met the following criteria: 1) random assignment of study
participants; 2) placebo, sham, or otherwise “patient-blindable” or
adequate control interventions; 3) publication in peer-reviewed journals (ex

Using our search methods, we found more than 100 clinical trials of distant
healing. The principal reasons for excluding trials from our review were
lack of randomization, no adequate placebo condition, use of nonhuman
experimental subjects or nonclinical populations, and not being published
in peer-reviewed journals. Twenty-three studies met our inclusion criteria
(13, 20-41). These trials included 2774 patients, of whom 1295 received the
experimental interventions being tested. Methodologic details and results
of these trials are summarized in Tables 1 to 3.
The studies are categorized as three types: prayer, Therapeutic Touch, and
other distant healing. However, these classifications are not mutually
exclusive. For example, the study of distant healing by Sicher and
colleagues (13) included 40 healers, some of whom would describe what they
did as prayer, and the study by Miller (22) described the intervention as
both prayer and remote mental healing.

Prayer
Of studies that met our inclusion criteria, five specifically examined
prayer as the distant healing intervention (Table 1). In all five studies,
the intervention involved some version of intercessory prayer, in which a
group of persons was instructed to pray for the patients (there was no way
to control for whether patients prayed for themselves during the study).
Qualifications for being an intercessor varied from study to study. For
example, in the trial by Byrd (23), intercessors were required to have an
“active Christian life, daily devotional prayer, and active Christian
fellowship with a local church.” In the study by Harris and colleagues
(39), those praying were not required to have any particular
denominational affiliation, but they needed to agree with the statement “I
believe in God. I believe that He is personal and is concerned with
individual lives. I further believe that He is responsive to prayers for
healing made on behalf of the sick.”
In each of these studies, the intercessors did not have any physical or
face-to-face contact with the persons for whom they were praying.
Instructions on how the intercessors should pray were fairly open-ended in
most instances. For example, in the trial by Harris and colleagues (39),
intercessors were asked to pray for a “speedy recovery with no
complications and anything else that seemed appropriate to them” (39).
Two trials showed a significant treatment effect on at least one outcome in
patients being prayed for (23, 39), and three showed no effect (20, 21, 24)
(Table 1). The average effect size, computed for four of these studies, was
0.25 (P = 0.009).

Table 1. Randomized, Placebo-Controlled Trials of Prayer

Joyce and Welldon, 1965 (20)
Design: Double-blind; 2 parallel groups
Sample size: 48 patients with psychological or rheumatic disease
Experimental intervention: Prayer in Christian or Quaker tradition;
patients received 15 hours of daily prayer for 6 months
Control intervention*: Usual care
Result: No significant differences in clinical or attitude state
Comments: Inclusion and exclusion criteria not stated; heterogeneous
patient groups; results of only 16 pairs available
Jadad score: 5

Collipp, 1969 (21)
Design: Triple-blind; 2 parallel groups
Sample size: 18 children with leukemia
Experimental intervention: Daily prayer for 15 months
Control intervention*: Usual care
Result: Higher death rate in control group, but difference was not
significant (P = 0.1)
Comments: Heterogeneity of groups makes findings inconclusive, inclusion
criteria not stated
Jadad score: 4

Byrd, 1988 (23)
Design: Double-blind; 2 parallel groups
Sample size: 393 coronary care patients
Experimental intervention: Prayer in Christian tradition; 3 to 7
intercessors per patient until patient was released from hospital
Control intervention*: Usual care
Result: Treatment group required less ventilatory support and treatment
with antibiotics or diuretics
Comments: Outcomes combined into "severity score" to handle multiple
comparisons; score was lower in treatment group
Jadad score: 5

Walker et al., 1997 (24)
Design: Double-blind; 2 parallel groups
Sample size: 40 patients receiving alcohol abuse treatment
Experimental intervention: Prayer for 6 months
Control intervention*: Usual care
Result: No treatment effect on alcohol consumption
Comments: Insufficiently powered
Jadad score: 4

Harris et al., 1999 (39)
Design: Double-blind; 2 parallel groups
Sample size: 990 coronary care patients
Experimental intervention: Remote intercessory prayer in Christian
tradition for 28 days
Control intervention*: Usual care
Result: Significant treatment effects for summed and weighted coronary
care unit score; no differences in length of hospital stay
Comments: No differences were observed when the summed scoring system
developed in Byrd's study (23) was used; unclear whether baseline
differences were adequately controlled for
Jadad score: 5

* A placebo was unnecessary because patients were unaware of whether
prayers were made on their behalf.

Therapeutic Touch
Eleven trials examined the healing technique known as “noncontact
Therapeutic Touch” (Table 2). A criterion for inclusion in our review was
that the Therapeutic Touch intervention be compared to an adequate
placebo, consisting of a mock or mimic Therapeutic Touch condition or a
design in which patients could not physically observe whether a
Therapeutic Touch practitioner was working on them. Of the 11 trials, 7
showed a positive treatment effect on at least one outcome (25, 27, 28, 30,
33, 34, 41), 3 showed no effect (26, 29, 31), and 1 showed a negative
treatment effect (the controls healed significantly faster) (32) (Table 2).
The average effect size, computed for 10 of the studies, was 0.63
(P = 0.003).
Table 2. Randomized, Placebo-Controlled Trials of Therapeutic Touch

Quinn, 1984 (25)
Design: Double-blind
Sample size: 60 patients in cardiovascular unit
Experimental intervention: Noncontact Therapeutic Touch for 5 minutes
Control intervention: Simulated or mock Therapeutic Touch
Result: 17% decrease in posttest anxiety scores in treatment group
Jadad score: 2

Keller and Bzdek, 1985 (27)
Design: Single-blind; 2 parallel groups
Sample size: 60 patients with tension headache
Experimental intervention: Noncontact Therapeutic Touch for 5 minutes
Control intervention: Mock Therapeutic Touch
Result: Treated group showed pain reduction after trial
Comments: Treatment effects were no longer present at 4 hours of
follow-up; however, when participants who used intervening therapy were
removed from analysis, 4-hour changes became significant
Jadad score: 3

Quinn, 1988 (26)
Design: Single-blind; 3 parallel groups
Sample size: 153 patients awaiting open-heart surgery
Experimental intervention: Noncontact Therapeutic Touch for 5 minutes
Control intervention: Mock Therapeutic Touch; no treatment
Result: No significant treatment effects
Comments: Negative findings suggest importance of eye and face contact
Jadad score: 2

Meehan, 1992 (28)
Design: Single-blind; 3 parallel groups
Sample size: 108 postoperative patients
Experimental intervention: Noncontact Therapeutic Touch for 5 minutes
Control intervention: Mock Therapeutic Touch, usual care (analgesic drugs)
Result: Nonsignificant reductions in postoperative pain (P < 0.06),
treatment group showed reduced need for analgesic medication
Comments: Used conservative "intention-to-treat" analyses
Jadad score: 3

Simington and Laing, 1993 (29)
Design: Double-blind; 3 parallel groups
Sample size: 105 institutionalized elderly patients
Experimental intervention: Noncontact Therapeutic Touch with back rub for
3 minutes
Control intervention: Mock therapeutic touch with back rub; back rub alone
Result: Lower levels of posttest anxiety observed in treatment group
compared with back rub only
Comments: No differences between therapeutic touch and mock therapy; no
pretest given
Jadad score: 2

Wirth et al., 1993 (30)
Design: Double-blind
Sample size: 24 participants with experimentally induced puncture wounds
Experimental intervention: Noncontact Therapeutic Touch (healer behind
one-way mirror), 5 min/d for 10 days
Control intervention: No treatment (placebo not necessary)
Result: More rapid healing in treatment group
Jadad score: 4

Wirth et al., 1996 (32)
Design: Double-blind; 2 parallel groups
Sample size: 38 participants with experimentally induced puncture wounds
Experimental intervention: Noncontact Therapeutic Touch (healer behind
one-way mirror), 5 min/d for 10 days
Control intervention: No treatment (placebo not necessary)
Result: No treatment effect in terms of healing of dermal wounds
Comments: Control group healed significantly faster than treatment group
Jadad score: 3

Gordon et al., 1998 (33)
Design: Single-blind
Sample size: 31 patients with osteoarthritis of knee
Experimental intervention: Noncontact Therapeutic Touch, 1 session/wk for
6 weeks
Control intervention: Mock Therapeutic Touch, usual care
Result: Treatment group showed improvements in pain, health status, and
function
Comments: No change in functional disability
Jadad score: 3

Turner et al., 1998 (34)
Design: Single-blind; 2 parallel groups
Sample size: 99 burn patients
Experimental intervention: Noncontact Therapeutic Touch for 5 days; time
varied from 5 to 20 minutes
Control intervention: Mock Therapeutic Touch
Result: Treatment group showed reductions in pain and anxiety and had
lower CD8+ counts
Jadad score: 3

Wirth et al., 1994 (31)
Design: Double-blind crossover study
Sample size: 25 participants with experimentally induced puncture wounds
Experimental intervention: Noncontact Therapeutic Touch with visualization
and relaxation
Control intervention: Visualization and relaxation without Therapeutic
Touch
Result: No treatment effect
Comments: Authors note that the number of healed wounds was insufficient
to compare for analyses
Jadad score: 4

Wirth, 1990 (41)
Design: Double-blind
Sample size: 44 men with experimentally induced puncture wounds
Experimental intervention: Noncontact Therapeutic Touch (healer not
visible to participants), 5 min/d for 10 days
Control intervention: Mock Therapeutic Touch
Result: Treatment group showed accelerated wound healing at days 8 and 16
Jadad score: 4
Other Distant Healing
Seven studies examined some other form of distant healing (Table 3).
Descriptions of these interventions included “distance or distant healing”
(13, 37, 38, 40), “paranormal healing” (36), “psychokinetic influence”
(35), and “remote mental healing” (22). Positive treatment effects were
observed in four of the trials (13, 22, 35, 37), and three showed no
significant effect of the healing intervention (36, 38, 40). Effect sizes
were computed for five of the studies, resulting in an average effect size
of 0.38 (P = 0.073).

Overall Effect Size
An overall effect size was calculated for all trials in which both patient
and evaluator were blinded. Along with the four studies that were
previously excluded because effect sizes could not be calculated, three
additional trials were excluded because it was unclear whether the
evaluator was blinded to the treatment condition. For the 16 remaining
trials, the average effect size was 0.40 (P < 0.001) across the three
categories of distant healing (2139 patients). A chi-square test for
homogeneity was significant (P = 0.001), suggesting that the effect sizes
were not homogeneous. Subgroup analysis revealed that effect sizes were
homogeneous within the categories of prayer and other distant healing but
not within the category of Therapeutic Touch studies.
In this analysis, the “fail-safe N” was 63; this value represents the
number of studies with zero effect that there would have to be to make the
effect size results nonsignificant. It suggests that the significant
findings are less likely to be the result of a “file-drawer effect” (that
is, the selective reporting and publishing of only positive results).

Methodologic Issues
Owing in part to our stringent inclusion criteria, the methodologic quality
of trials was fairly high; the mean Jadad score across all studies was 3.6.
No clear relation emerged between the methodologic quality of the studies
and whether the results were for or against the treatment. There was a
trend toward studies with higher quality scores being less likely to show a
treatment effect, but this correlation was weak and not statistically
significant (R = -0.15; P > 0.2).
Despite the fairly high average quality of the trials, the methodologic
limitations of several studies (such as inadequate power, failure to
control for baseline measures, and heterogeneity of patient groups) make it
difficult to draw definitive conclusions. For example, the findings
reported by Collipp (21) may have resulted from a randomization problem
that produced heterogeneous patient groups (two of the eight controls had
myelogenous leukemia, but no patient in the experimental group had this
condition). In the study by Miller (22), the positive finding of decreased
systolic blood pressure in the remote mental healing group is difficult to
interpret owing to the failure to control for baseline use of blood
pressure medication.
The Therapeutic Touch studies carried out by Quinn (25), Keller and Bzdek
(27), Turner and
Braud and Schlitz, 1983 (35)
Design: Single-blind within and between participants
Sample size: 32 participants with high levels of autonomic arousal
Experimental intervention: Distant mental influence (intention to decrease
arousal with ten 30-second sessions)
Control intervention: No-influence control conditions
Result: 10% reduction in galvanic skin response between control and
influence sessions
Comments: No effect in participants with initially low galvanic skin
response levels
Jadad score: 3

Beutler et al., 1988 (36)
Design: Double-blind; 3 parallel groups
Sample size: 120 patients with hypertension
Experimental intervention: Laying on of hands by 12 healers, 20 min/wk for
15 weeks
Control intervention: Healing at a distance; usual care
Result: No treatment effect
Comments: Unclear what precisely the healers did, acute increase in
diastolic blood pressure after laying on of hands
Jadad score: 4

Wirth et al., 1993 (37)
Design: Double-blind crossover study
Sample size: 21 patients with bilateral asymptomatic impacted third molar
who were undergoing surgery
Experimental intervention: Distance healing (Reiki, LeShan) for 15-20
minutes 3 hours after surgery
Control intervention: No treatment (placebo not necessary)
Result: Treatment group showed decrease in pain intensity and greater pain
relief after surgery
Jadad score: 4

Greyson, 1996 (38)
Design: Double-blind
Sample size: 40 patients with depression
Experimental intervention: Distance healing (LeShan technique)
Control intervention: Usual care
Result: No treatment effect
Comments: May have been underpowered
Jadad score: 5

Sicher et al., 1998 (13)
Design: Double-blind; 2 parallel groups
Sample size: 40 patients with AIDS
Experimental intervention: Distance healing (40 healers from different
spiritual traditions; each patient treated by 10 healers)
Control intervention: Usual care (no placebo necessary)
Result: Healing group had fewer new AIDS-defining illnesses, less illness
severity, fewer physician visits and hospitalizations, and improved mood
Comments: Mood changes may have been due to baseline differences; no
apparent statistical adjustment for multiple comparisons
Jadad score: 5

Miller, 1982 (22)
Design: Double-blind; 2 parallel groups
Sample size: 96 patients with hypertension
Experimental intervention: “Remote mental healing” in Church of Religious
Science tradition
Control intervention: No treatment (no placebo necessary)
Result: Decrease in systolic blood pressure in treatment group
Comments: Unclear how many participants were lost to follow-up; results
given for only 4 of 8 healers; use of medication not controlled for
Jadad score: 1

Harkness et al., (40)
Design: Double-blind
Sample size: 84 patients with warts
Experimental intervention: 6 weeks of distant healing ("channeling of
energy") by 10 healers
Control intervention: No treatment (no placebo necessary)
Result: No significant treatment effect on size or number of warts
Comments: Seems that baseline values were not controlled for in analysis
Jadad score: 5
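
The “fail-safe N” reported in the Overall Effect Size section can be illustrated with Rosenthal's classic formula, N = (ΣZ)² / z_crit² − k, where k is the number of studies and z_crit = 1.645 is the one-tailed .05 criterion. The review does not say which variant the authors used, so this is a sketch under that assumption; the function name is hypothetical and no per-study z scores from the review are reproduced.

```python
def fail_safe_n(z_scores, z_crit=1.645):
    """Rosenthal's fail-safe N: zero-effect studies needed to lose significance."""
    k = len(z_scores)
    return (sum(z_scores) ** 2) / (z_crit ** 2) - k


# Hypothetical example: four studies that each just reach one-tailed p = .05.
print(round(fail_safe_n([1.645, 1.645, 1.645, 1.645])))
```

In the hypothetical example, 12 unpublished null studies would be needed to bring the combined result down to the significance threshold. A larger fail-safe N means a file-drawer explanation requires more missing studies.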
Despite the methodologic limitations that we have noted, given that
approximately 57% (13 of 23) of the randomized, placebo-controlled trials
of distant healing that we reviewed showed a positive [...]
[...]clusion of the Cochrane Collaboration’s review of [...]
further study (46). We believe that additional stud[...]
issues outlined above are now called for to help [...]
[...]ature and shed further light on the potential efficacy of these
approaches.

From University of Maryland School of Medicine, Baltimore, Maryland; and
University of Exeter, Exeter, United Kingdom.

Grant Support: By the National Center for Complementary and Alternative
Medicine, National Institutes of Health (1P50AT0008401), The Wellcome
Trust (050836/Z/97), and a charitable donation from the Maurice Laing
Foundation.

Requests for Single Reprints: John A. Astin, PhD, Complementary Medicine
Program, Kernan Hospital Mansion, 2200 Kernan Drive, Baltimore, MD
21207-6697; e-mail, jastin@compmed.ummc.umaryland.edu.

Requests To Purchase Bulk Reprints (minimum, 100 copies): Barbara Hudson,
Reprints Coordinator; phone, 215-351-2657; e-mail,
bhudson@mail.acponline.org.

16. O'Mathuna D. Therapeutic touch: what could be the harm? Scientific
Review of Alternative Medicine. 1998;2:56-62.
17. Jadad AR, Moore RA, Carroll D, Jenkinson C, Reynolds D, Gavaghan DJ,
et al. Assessing the quality of reports of randomized clinical trials: is
blinding necessary? Control Clin Trials. 1996;17:1-12.
18. Cohen J. Statistical Power Analysis for the Behavioral Sciences.
Hillsdale, NJ: [...]
[...]ments. Psychol Bull. 1982;92:490-9.
21. Collipp PJ. The efficacy of prayer: a triple-blind study. Med Times.
1969;97 [...]
[...]potheses. 1982;8:481-90.
24. Walker SR, Tonigan JS, Miller WR, Comer S, Kahlich L. Intercessory
prayer in the treatment of alcohol abuse and dependence: a pilot
investigation. Altern Ther Health Med. 1997;3:79-86.
25. Quinn JF. Therapeutic touch as energy exchange: testing the theory. ANS
Adv Nurs Sci. 1984;6:42-9.
26. Quinn JF. Therapeutic touch as energy exchange: replication and
extension. Nurs Sci Q. 1989;2:79-87.
27. Keller E, Bzdek VM. Effects of therapeutic touch on tension headache
pain. Nurs Res. 1986;35:101-6.
28. Meehan TC. Therapeutic touch and postoperative pain: a Rogerian
research study. Nurs Sci Q. 1993;6:69-78.
29. Simington JA, Laing GP. Effects of therapeutic touch on anxiety in the
institutionalized elderly. Clin Nurs Res. 1993;2:438-50.
30. Wirth DP, Richardson JT, Eidelman WS, O'Malley AC. Full thickness
dermal wounds treated with non-contact therapeutic touch: a replication and
extension. Complement Ther Med. 1993;1:127-32.
31. Wirth DP, Barrett MJ, Eidelman WS. Non-contact therapeutic touch and
wound re-epithelialization: an extension of previous research. Complement
Ther Med. 1994;2:187-92.
32. Wirth DP, Richardson JT, Martinez RD, Eidelman WS, Lopez ME.
Non-contact therapeutic touch intervention and full-thickness cutaneous
wounds: a replication. Complement Ther Med. 1996;4:237-40.
33. Gordon A, Merenstein JH, D'Amico F, Hudgens D. The effects of
therapeutic touch on patients with osteoarthritis of the knee. J Fam Pract.
1998; [...]
2. Definition of constructs
Quite apart from differences of viewpoint in what constitutes the range of appro
priate subject matter, a much more important definitional problem arises in terms
of defining and measuring specific psi phenomena. The problem arises primarily
because psi phenomena are defined, not in terms of what they are, but only in
terms of what they are not. Telepathy is the simultaneous sharing or transfer of
information between two brains in the absence of any ‘normal’ mechanism that
could account for it; precognition involves seeing future events in a manner that
cannot be accounted for by any means understood by contemporary science, and
so on. Telepathy is not telepathy if sender and receiver communicate by ‘silent’
dog whistles that one of them is able to hear, or if they have some sort of secret
code that allows them to communicate without the knowledge of the researcher.
Psychokinesis is not psychokinesis if the psychic causes an object to move by
hidden, although normal, means. Indeed, parapsychology is the only realm of
objective inquiry in which the phenomena are all negatively defined, defined in
terms of ruling out normal explanations. Of course, ruling out all normal expla
nations is not an easy task. We may not be aware of all possible normal explana
tions, or we may be deceived by our subjects, or we may deceive ourselves.
If all normal explanations actually could be ruled out, just what is it that is at
play? What is psi? Unfortunately, it is just a label. It has no substantive definition
that goes beyond saying that all normal explanations have apparently been elimi
nated. Of course, parapsychologists generally presume that it has something to
do with some ability of the mind to transcend the laws of nature as we know
them, but all that is so vague as to be unhelpful in any scientific exploration.
Some parapsychologists, recognizing the problem of trying to provide a positive
rather than a negative definition of psi, choose to sidestep the issue and instead
focus on ‘anomalies’. Psi effects are thus thought of as anomalous findings that
apparently should not occur if the current scientific worldview is accurate. These
are not just any such anomalies, of course. They are anomalies that involve, in
one way or another, the mind.
Anomalistic observations that do not fit with accepted theory are vital to scien
tific progress, for they force us to modify our theories and to gather additional
data until they can be understood and accommodated into a revised theory. For
example, to AIDS researchers it is quite anomalous that some Nairobi prostitutes
show an inherent resistance to HIV infection, but only as long as they continue to
have exposure to multiple partners. This is an important anomaly — it does not
make immediate sense in terms of what is known about this illness, but coming to
understand it will undoubtedly lead to a much better understanding of HIV in
general. Elsewhere in science, anomalies sometimes lead to such fundamental
changes in theory that philosophers of science speak in terms of a paradigm shift.
The precession of Mercury in its orbit behind the sun was anomalous, for it did
not fit with Newton’s theory of gravity and the derivative understanding of the
movement of planets. Scientists a century ago went so far as to speculate that
Mercury’s orbit behind the sun was actually disrupted by the gravitational field
of an unseen planet (they called it Vulcan) on the far side of the sun. However,
Einstein’s general theory of relativity was able to account for the perihelion shift
of Mercury, resolving the anomaly and thereby helping to usher in a new
scientific worldview.
Yet, when parapsychologists seek to establish their subject matter in terms of
anomalies, there is something quite different going on compared to either of the
examples above. In mainstream science, one does not deliberately seek anomalies;
they present themselves. They are unexpected and unpredicted by current
theory; that is why, after all, they are called anomalies. However, no psi anomaly
has ever presented itself in the course of research in mainstream science.
Consider the particularly delicate experiments in subatomic physics, which might be
ideal for the manifestation of putative psi forces, given that they involve very
tiny amounts of matter and energy, highly precise measurements and very highly
motivated researchers with, at least at times, varying expectations. We do not
read research reports that suggest that the outcomes of such experiments seem to
depend on who was operating the linear accelerator at the time, and that a
particular effect is found only when certain researchers are present and not otherwise,
reflecting perhaps a researcher’s ‘psychic’ influence. In the course of doing
normal science, anomalies suggestive of psi just do not pop up. Rather,
parapsychologists, in their work, deliberately try to generate them; they are the goal of much
parapsychological research and are only labelled as anomalous by the rather
circular route of deeming them to be impossible if current science is accurate and
complete.
Parapsychologists need to be able to provide a positive definition of psi, to tell
us how to identify psi ‘anomalies’ in ways other than exclusion, and to tell us
how to rule out psi, how to know when it is absent. This problem is as great now
as it has ever been, and no progress has been made in overcoming it across more
than a century of empirical parapsychological research. Because of its negative
definition, we are left with no idea as to when psi might occur, and more
importantly to the scientist, as to when it will not occur. There is no way, we are told,
that psi can be blocked or attenuated by the researcher, and thus we cannot
compare conditions where psi could not occur to those where, were it to exist, it could
be observed. Moreover, because it is claimed that psi influences can occur
without any attenuation as a function of distance, and can occur backwards and
forwards in time, it becomes impossible ever to truly ‘control’ the conditions of an
experiment.
3. Failure to achieve replication
If parapsychologists cannot provide a positive definition of psi, then at least one
would hope that they could provide a reliable, replicable, demonstration of the
subject of their study, be it an ‘anomaly’ or whatever. Mainstream science
accords a high value to replicability, for it is perhaps the best safeguard against
being taken in by results produced by error, self-delusion or fraud. Yet replicability
itself is a somewhat complex concept. Simply repeating an experiment and
getting the same results is not by itself enough, for whatever errors or self-delusions
may have occurred in the first instance might also be part of subsequent
repetitions of the experiment (Hyman, 1977). That was precisely the case when, at the
beginning of the twentieth century, the French physicist, Professor Blondlot,
‘discovered’ N-rays, an apparently new form of energy. He replicated his
experiments many times, and indeed, a score or more of other scientists reported that
they had confirmed the existence of N-rays in their own laboratories. Yet sceptical
scientists were unable to replicate these results, and ultimately Blondlot’s
findings were shown to be a product of self-delusion (Alcock, 1981). The concept of
replicability, to be useful, implies that researchers in general, provided that they
have the expertise and equipment, should be able to reproduce the reported
results, and not just those who are believers and enthusiasts.
Because parapsychologists have never been able to produce a successful
experiment that neutral scientists, with the appropriate skill, knowledge and
equipment, can replicate, some parapsychologists have gone so far as to argue
that the criterion of replicability should not be applied to psi research because the
phenomena are so different from the usual subject matter of science (Pratt,
1974). Yet, what a risky adventure it would be to yield to special pleading and
relax the very rules of scientific methodology that help to weed out error,
self-delusion and fraud in order to admit claims that violate the basic tenets of
science as we know it!
Several of the papers in this Special Issue address the problem of replicability
in psi research:
(1) My good and respected friend Adrian Parker acknowledges the highly
problematic inconsistencies in parapsychology that reflect both failures to
replicate and situations where some experimenters, but not others, can replicate a set
of findings. Yet he does not take this to suggest that the Psi hypothesis might be
wrong and the Null hypothesis correct, but instead views these irregularities as
reflecting possible properties of the ostensible phenomenon, such as the
psi-experimenter effect (discussed below). This is begging the question. When there
has been a failure to replicate, it is not appropriate to engage in the circularity of
assigning to this failure a label (psi-experimenter effect), and then implicitly
suggesting the label as its explanation. Since there is no other way of defining or
identifying the psi-experimenter effect, it has no explanatory value. Using it as a
possible explanation only leads to a tautology: by substituting the definition of
the psi-experimenter effect, one gets ‘The failure to replicate may be a
manifestation of “one researcher failing to replicate a finding that another researcher had
made”.’ This circular reasoning excludes from the debate a possibly fruitful
aspect of research, in terms of coming to understand the reasons, other than psi,
that might account for the fact that different experimenters have obtained
different results.
(2) With regard to ESP in the ganzfeld, Palmer concludes that, while he finds
statistically significant departures from the Null hypothesis across the aggregate
data bases that he has examined, ‘the marked heterogeneity of results across
experiments leaves doubt about the future replicability of the phenomenon
outside parapsychology’.
(3) In their article, Sherwood and Roe examine attempts to replicate the
well-known Maimonides dream studies that began in the 1960s. They provide a
good review of these studies of dream telepathy and clairvoyance, but if one
thing emerges for me from their review, it is the extreme messiness of the data
adduced. Lack of replication is rampant. While one would normally expect that
continuing scientific scrutiny of a phenomenon should lead to stronger effect
sizes as one learns more about the subject matter and refines the methodology,
this is apparently not the case with this research. They conclude: ‘Overall, the
Maimonides studies were more successful than the post-Maimonides studies but
this may be due to procedural differences.’ Indeed, this leads the authors to
indicate that ‘more recent work has concentrated on the question of whether
consensus methods are superior to individual performance. With consensus judgement
procedures, the responses from a number of individuals are combined to give a
single judgement.’ To the sceptic, this is a strange turn of events. The
phenomenon of interest is the alleged ability of some individuals to paranormally receive
information while they are asleep. Because research cannot demonstrate this
clearly, the researchers choose to complicate the situation immensely by
combining information from a number of subjects.
(4) Jeffers’ article also bears directly on the question of replicability. Jeffers
stands in lonely company as one of the very few neutral scientists who have
empirically investigated the existence of psi phenomena. My first interaction
with Jeffers is memorable to me. Jeffers, a physics professor at my university,
was inspired by the work of Robert Jahn (e.g. Jahn, 1982), which purported to
demonstrate the influence of the human mind on the output of a random event
generator, and he decided to carry out his own psi experiments. His methodology was
different from Jahn’s (or indeed from other psi experiments) in that it
investigated the possible effect of psi on the interference of light. He reasoned, and Jahn
had agreed, that if Jahn’s results were due to subjects’ mental influence on
quantum processes, then that same influence might be expected to affect the
interference patterns produced when two beams of light are sent through narrow slits. In
Jahn’s work, a series of numbers appeared on a computer screen, the ultimate
result of a quantum process, and subjects strove to affect the magnitude of those
numbers. In Jeffers’ work, a bar appeared on a computer screen, its length
determined by a quantum process (fringe contrast in the interference pattern) and
subjects attempted to influence the height of the bar. Thus, Jahn and Jeffers were
both attempting to measure subjects’ ability to influence quantum processes by
mentation alone and, given that different methodology was used, were Jeffers’
research to have produced significant results this would have added even more
weight to Jahn’s conclusions than would a straight replication. This is because
Jeffers studied the same construct, or concept, from a slightly different angle,
thereby making his research capable of producing convergent evidence, whereas
a straight replication using exactly the same methodology might also reproduce
any undetected errors and biases in the original.
Back to our initial meeting: Jeffers came to me at least a tad defiantly,
requesting that I review his experimental design and offer any suggestions and
criticisms before he began his research. He stressed that I should not after the
fact, were he to obtain data supporting the parapsychological interpretation, then
argue that the experiment was not to be taken seriously because it had fallen
methodologically short in some fashion. Thus began our relationship, which was
to grow into the very positive one that it is today. I reviewed his experimental
design, and I raised some reservations — the same reservations that I had written
about (Alcock, 1990) with regard to Jahn’s work. While so far as I am aware,
Jahn’s group never paid any heed to my comments, Jeffers incorporated changes
that satisfied all my concerns. As Jeffers reports in his paper, his research
findings give no support to the Psi hypothesis.
Jeffers’ research makes a very important contribution to the study of putative
psi phenomena, in my opinion, for the following reasons:
1. It was carried out by a neutral scientist who approached the subject with
great interest and motivated by the possibility that Jahn may really have
discovered something very important: the influence of human mentation on
random physical processes. This should be an ideal condition for producing
the desired results: Jeffers was very much open to the possibility of psi and
was motivated to find it.
2. The research began with the full approbation of both proponent and sceptic.
Jeffers had the full-fledged support of Jahn himself and, as noted above, I
fully supported the appropriateness of the revised methodology that he
employed. Had he produced positive results, Jahn no doubt would have
viewed this as a significant conceptual replication of his own work by a
neutral scientist, and I in turn would have had to admit that the research was
done carefully and correctly, and that I had no basis for rejecting it on
methodological grounds.
However, when Jeffers’ research did not produce results supportive of the Psi
hypothesis, other researchers in the area dismissed it, and now it receives
virtually no attention from parapsychology at all. (To be precise, his article discusses
two kinds of experiments, one single-slit and one double-slit. The results of the
single-slit experiment, carried out at York University, were null. There were two
sets of double-slit experiments, one conducted at York University and one
carried out in Jahn’s laboratory at Princeton. The York experiment produced a null
outcome, while that at Princeton produced ‘marginal’ significance (p = 0.05),
which Jeffers views, as do I, as unconvincing.) This neglect of Jeffers’ research is
most unfortunate. Although his data, as reviewed in his current paper, is in line
with the Null hypothesis, the fact that it is now ignored within parapsychology is
another instance of not giving the Null hypothesis a fair chance.
Incidentally, Jahn’s laboratory more recently collaborated with researchers
at two German universities to attempt a carefully controlled replication of the
basic claims of Jahn’s research group. The result? Neither the researchers at
Jahn’s lab nor those in the two German universities found anything of
significance with regard to the hypotheses under test (Jahn et al., 2000). They did,
however, on a post-hoc basis — as is so often the case in parapsychology —
find some ‘anomalies’ in the patterning of the data which they argue call for
more sophisticated experiments and theoretical models in order to
understand ‘the basic phenomena involved’. Again, failure to confirm predictions
does not, in their view, give strength to the Null hypothesis. By post-hoc data
snooping, a success of sorts can always be wrestled away from the jaws of the
Null.
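The arithmetic behind this point can be sketched briefly. What follows is a minimal illustration and not an analysis drawn from any of the studies cited: it assumes that each post-hoc look at the data is an independent test at the conventional 0.05 level, and the function name is hypothetical.

```python
def chance_of_spurious_finding(alpha, n_looks):
    """Probability that at least one of n_looks independent tests comes out
    'significant' at level alpha when the Null hypothesis is true."""
    return 1 - (1 - alpha) ** n_looks

# Assumed, purely illustrative numbers of post-hoc comparisons.
for n_looks in (1, 5, 20, 50):
    prob = chance_of_spurious_finding(0.05, n_looks)
    print(f"{n_looks:>2} post-hoc looks: P(at least one 'anomaly') = {prob:.2f}")
```

Real post-hoc comparisons are rarely independent, so the exact figures would differ, but the direction is the same: the more patterns one is prepared to look for after the fact, the more likely it is that chance alone will supply one.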
In sum, parapsychologists have never been able to produce a demonstration
that can be reliably replicated by researchers in general, and failures to replicate
are either ignored, explained away or interpreted as evidence for the existence of
arbitrary properties of psi, as is discussed below.
4. Multiplication of entities
Despite William of Ockham’s exhortation that one should not increase the
number of entities required to explain a phenomenon beyond what is necessary
(‘Ockham’s Razor’), parapsychology has unabashedly invented a number of
such entities by way of explaining away failures to produce consistent and
replicable data. For example:
1. As touched on earlier, if only some researchers can obtain an effect (and
then only some of the time) while other researchers using identical
methods cannot, this is taken, not as lending support to the Null hypothesis, but as
a manifestation of a property of psi: the psi-experimenter effect. This
‘effect’ supposedly occurs because some experimenters, perhaps because of
their own psi abilities, are conducive to the production of psi in experiments,
while others are not.
Smith’s article in this Special Issue provides a good overview of the
enduring problem of the experimenter effect in parapsychology, but his
analysis also indirectly serves to demonstrate the problem that I am address
ing. While acknowledging the issue of replication in parapsychology, Smith
argues that ‘replication difficulties in parapsychology may be due, at least in
part, to psi-related experimenter influences’. He recognizes that this view is
difficult from the point of view of science because it suggests that ‘it is only
those researchers who believe that psi exists that are likely to be able to
replicate positive results’. Nonetheless, as he reflects upon this problem, Smith’s
optimism is not diminished and he argues: ‘the scientific approach adopted
by psi research has so far achieved some limited success in identifying
factors associated with obtaining positive results in psi experiments, and it is
my view that it is such an approach that is likely to reveal more of these
factors in future research. Only when we have a much more detailed recipe for
success can more consistent levels of replication be expected.’ Thus, while
aware of the problem, he sidesteps it.
Parker also addresses this subject, and states that ‘experimenter effects
and psi-conduciveness are every bit as integral part of the phenomena being
studied as, say, placebo effects are in psychological treatment’. The problem
is that the ‘experimenter effect’ is really only a lack of consistency, a lack of
general replicability, which itself is more in line with the Null hypothesis
than anything else. There is no reason, no justification, to engage in further
multiplication of explanatory entities, to use Ockham’s language. What we
have here is a failure to replicate. Period. The psi-experimenter effect
provides the ultimate Catch-22: if you find the psi effect you are looking for,
well and good. If you do not find it, this might be because of the
experimenter effect, and so this too could be a manifestation of psi!
2. The sheep-goat effect refers to the observation that believers in psi are more
likely than non-believers to demonstrate evidence of psi in an experiment.
3. If subjects fail to obtain the above-chance scores predicted in a psi
experiment, that is not taken as lending weight to the Null hypothesis. Instead, so
long as they fail miserably enough that their data deviate statistically
significantly in the non-predicted direction, this is taken as support for the Psi
hypothesis, and another ‘effect’, the psi-missing effect, is invoked,
allowing the interpretation that the miserable failure was indeed a success.
4. If a ‘gifted’ subject scores well in early trials but then, as is so often the case,
scores only at a chance level later, this is not taken as support for the Null
hypothesis. Instead, it is taken as evidence for another ‘property’ of psi —
the decline effect. Thus, failure is often interpreted as a kind of success, as an
indication of the weird properties that this elusive psi possesses.
I note that one such ‘effect’, at one time well-known within parapsychology,
appears to have quietly disappeared. I am referring to the quartile-decline effect,
much discussed by the pre-eminent parapsychologist Joseph Banks Rhine, and
so-named because it was noted that when subjects’ scores were recorded in two
columns to a page, there was often a significant decline in subjects’ success if
one compared the scores in the upper left-hand quadrant of the page to those in
the lower right-hand quadrant. While such an ‘effect’ always struck sceptical
observers as somewhat convenient and arbitrary, it was touted as again
suggesting some strange property of psi.
Indeed, the very fact that it has proven so difficult to produce a reliable
demonstration of a psi phenomenon has led some researchers to think of this general
elusiveness not as something in line with the Null hypothesis, but rather as
another property of psi. Parker’s paper speaks to this: ‘For whatever reason the
phenomena appear to have an elusiveness as a defining characteristic that makes
them intrinsically difficult to capture in the laboratory in a stable, predictable and
controllable fashion.’
Note that none of these so-called effects are anything other than arbitrary,
post-hoc labels attached to unexpected negative outcomes. The employment of
arbitrary post hoc constructs to explain away failures and inconsistencies in the
data is a serious problem when one considers the scientific status of
parapsychology. The Null hypothesis is not given a fair chance when data that are consistent
with it are explained away in this manner.
5. Unfalsifiability
Obviously, the use of such ‘effects’ as those just discussed serves to make claims
about psi essentially unfalsifiable, for any failure to produce the predicted effect,
or any inconsistency in the data, can be explained away in terms of one or another
of them. Failure to produce data consistent with psi has never been taken as
providing weight to the Null hypothesis.
Falsifiability is an important concept in science, especially when highly
unusual claims are made. Science did not ignore Roentgen’s rays just because
they did not fit in with what was known at the time. On the other hand, science
did not ignore Blondlot’s rays (N-rays) either. The former turned out to be a
highly replicable phenomenon that demanded changes in physical theory to
account for it. The latter, despite numerous independent ‘replications’ initially,
turned out to be a figment of the imagination. This is why falsifiability is so
important.
6. Unpredictability
This problem is also related to the replication difficulty. Parapsychologists
cannot in general make predictions before running experiments and then confirm
them. Yet, as discussed earlier, even if predictions are not confirmed, researchers
often point to some apparent irregularity in the data that suggests, post-hoc, that
some other psi event occurred.
Yet, if psi is real, one might expect that psi manifestations would be
predictable, at least to some extent. With the vast amounts of data that
parapsychologists typically collect, it would be straightforward enough to calculate the
number of data points needed to obtain an effect size of an arbitrary magnitude,
and then rerun the study with that number of data points, and find the predicted
effect if it is there. It never works out that way. This has led Palmer to admit to
‘what appears to be an intractable problem in parapsychology. Until we can
predict such outcomes ahead of time, the establishment of lawful relationships still
evades us.’ This unpredictability, I must point out, is what one would expect to
find if the Null hypothesis, rather than the Psi hypothesis, obtains. If the Null
hypothesis is true, if there is no such thing as psi, then ‘significant results’ occur
from time to time because of a concatenation of chance factors, flaws in the
experimental design, and so on. In such a case, one would not expect any
lawfulness in the data, and one would not be able to predict what should occur in the
next experiment based on what has happened in the last.
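The calculation of the number of data points needed, mentioned above, can be sketched as follows. This is only an illustration under stated assumptions: a one-sided test of a hit rate against a known chance rate, the usual normal approximation, and hypothetical hit rates chosen for the example; the function name is likewise an assumption.

```python
import math
from statistics import NormalDist

def required_trials(chance_rate, hit_rate, alpha=0.05, power=0.80):
    """Approximate number of trials needed for a one-sided test of a
    proportion (normal approximation) to detect hit_rate against
    chance_rate with the given significance level and power."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)   # critical value under the Null
    z_power = NormalDist().inv_cdf(power)       # quantile for the desired power
    sd_null = math.sqrt(chance_rate * (1 - chance_rate))
    sd_alt = math.sqrt(hit_rate * (1 - hit_rate))
    n = ((z_alpha * sd_null + z_power * sd_alt) / (hit_rate - chance_rate)) ** 2
    return math.ceil(n)

# Hypothetical true hit rates against a 50% chance rate.
for hit_rate in (0.51, 0.55, 0.60):
    print(f"hit rate {hit_rate:.2f}: about {required_trials(0.50, hit_rate)} trials")
```

If a real effect of the assumed size existed, a study of this length should find it most of the time; the smaller the assumed effect, the more trials are required.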
7. Lack of progress
Not only is there a problem of general inconsistency in the data, as discussed
above, there has not been any real improvement in this situation over time.
Despite the use of modern random event generators and sophisticated statistical
analyses, parapsychologists are no closer to making a convincing scientific case
for psi than was Joseph Banks Rhine back in the 1930s. There has been no growth
in understanding. Psychic phenomena, if they exist, remain as mysterious as
ever. No consistent patterns have emerged. Effect sizes do not grow over time as
a result of refinements in methodology. No well-articulated theory supported by
data has been developed. Indeed, rather than producing a gradual accumulation
of knowledge and an evolution of better and better methodology, every decade
seems to spawn some new methodology or paradigm or research programme that
offers promise of the long-awaited breakthrough, but that gradually loses its
glitter. The famous Rhine experiments (e.g. see Rhine et al., 1966/1940) are no
longer held up as strong evidence for the Psi hypothesis. Soal’s research (e.g. Soal
and Bateman, 1954), once trumpeted, is now forgotten, and for good reason.
Targ and Puthoff’s remote-viewing experiments (e.g. Targ and Puthoff, 1974),
which showed early promise, now are virtually ignored, again for good reason.
The Maimonides research has been difficult to replicate, as Sherwood and Roe
point out. Jahn’s research group at Princeton continues its efforts (e.g. Jahn et al.,
2000), but its impact is minimal within modern parapsychology, partly due to
methodological problems identified by other parapsychologists and critics alike.
There has been no real growth in understanding or in the ability to isolate the
putative phenomena over time. New research strategies seem to ‘fret and strut
their hour upon the stage’ and then are heard little more.
8. Methodological weaknesses
Given that psi is defined negatively, and can only be assumed to have been present if
all possible normal explanations can be ruled out, critics of parapsychology are
naturally inclined to look for flaws in the experimental design and execution of
research that would account for whatever positive effects parapsychologists have
adduced. Of course, this quest is hampered by the fact that experimental reports
will only rarely capture sources of error of which the experimenter was
oblivious, and so it is not always possible in the first instance to find normal sources of
putative psi effects based on the write-ups alone. The nub of the debate between
sceptic and proponent is most typically the adequacy of the methodology. I think
it fair to say (and I suspect that both Parker and Palmer would agree with me on
this, for they have been strong methodological critics of much parapsychological
research themselves) that methodological weaknesses have, in a large number of
studies, vitiated the claim to have demonstrated something paranormal.
However, some parapsychologists have argued that even when errors and weaknesses
are found, the onus is on the critic to show that the error could have produced the
observed effects. That argument is not persuasive, however, for the onus is always
on the researcher to demonstrate that he or she has done the experiment well, and
flaws in design or procedure show that it was not done well, and that perhaps
other less obvious methodological problems have also been a factor. The answer
is simply to run the experiment again, doing it right this time. That is what is
expected in mainstream science. The problem for parapsychology, however, is
that the difficulty in replication means that it may not be possible to get the same
results a second time, whether the methodology is cleaned up or not.
naked eye, could have occurred by chance alone. In recent years, such analysis
has been employed to do much more than simply provide guidance about the
likelihood that particular data may well have arrived by chance alone. Powerful
statistical techniques now exist for finding patterns in data that elude the naked
eye, and this provides an important tool for researchers in many domains. Thus,
statistical analysis originally helped cool our ardour about what appeared to be
meaningful effects in the data, whereas now, those statistical tools are used to
find significant effects that we would not otherwise detect. Now, in modern
parapsychology (and, alas, in mainstream psychology as well to some extent),
statistical analyses are being used to define and defend the importance of differences
so small that they would have carried no interest to researchers of a century ago.
If subjects score at a rate of 51% when the chance rate is 50%, it is unlikely that
anyone would have taken any notice a century ago. Now, provided the sample
size is large, such a small difference may well be ‘statistically significant’.
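The dependence on sample size is easy to illustrate. The following sketch applies a two-sided z-test for a proportion, using the normal approximation, to the 51% versus 50% example above; the trial counts are assumed for illustration only and the function name is hypothetical.

```python
import math

def z_and_p(hit_rate, chance_rate, n_trials):
    """Two-sided z-test (normal approximation) of an observed hit rate
    against the chance rate. Returns the z score and the p-value."""
    standard_error = math.sqrt(chance_rate * (1 - chance_rate) / n_trials)
    z = (hit_rate - chance_rate) / standard_error
    p = math.erfc(abs(z) / math.sqrt(2))  # P(|Z| >= |z|) for a standard normal
    return z, p

# The same 51% hit rate, evaluated at three assumed sample sizes.
for n_trials in (100, 10_000, 1_000_000):
    z, p = z_and_p(0.51, 0.50, n_trials)
    print(f"n = {n_trials:>9,}: z = {z:6.2f}, p = {p:.3g}")
```

With a hundred trials the one-point difference is nowhere near significance; with ten thousand it crosses the conventional 0.05 threshold; with a million it is overwhelming, although the size of the difference itself has not changed.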
There is no reason in principle that such analysis should not also be used in
parapsychology, but there is an important difference in the way that it is used in
that field. In regular science, statistics are used either to look for covariation
amongst well-defined variables, or to evaluate whether a given measurement is
affected by the presence or absence of an ‘independent’ variable. However, in
parapsychology, there are no well-defined variables, and there is no way of
controlling whether psi (if it exists) is present or absent, and so the statistical process
is used, not to evaluate the effect of one or more variables on other measurable
variables, but as a basis for inferring the presence of psi itself. One begins
with the assumption that a particular mathematical distribution describes the
probability distribution of outcomes of a randomly generated event. A subject in
some way tries mentally to influence the distribution of outcomes (even if he or
she knows nothing about the nature of that distribution, or about the generator
that produces it, or even where the generator is physically located). If the
outcomes depart from the theoretical distribution to a significant extent, this is taken
as evidence that a psi influence caused the departure.
Any such statistically significant departure is viewed as an ‘anomaly’ relating
to psi, and thus is viewed as support for the Psi hypothesis. However, statistical
significance tells us nothing about causality. If a person tries to guess or ‘intuit’
what number will come next in a randomly generated sequence, and succeeds
better than one would expect by chance, that tells us absolutely nothing at all
with regard to why such results were obtained. The departure from chance
expectation could be due to any number of influences: a non-random ‘random
generator’, various methodological flaws, or . . . Zeus. (I could posit that Zeus exists
and likes to torment parapsychologists, and thereby gives them significant
outcomes from time to time, but does not allow replication outside parapsychology.
The significant outcome would provide as much support for my hypothesis that
Zeus exists as it does for the Psi hypothesis that the human subject’s volition
caused the results.)
Joseph Banks Rhine, whose psi research was motivated in part by the desire to
find scientific evidence for post-mortem survival, passionately believed in the
to understanding such micro-PK, and inherent in this notion is the idea of
statistical balancing in the long run, so that macro-PK will not be observed. While I
admire Pallikari’s efforts, they are premature, for the problem remains that to
date there are no substantive empirical data to justify such theorizing. Of course,
her conclusion that psychokinesis does not show up except at the micro level is at
variance with what many other parapsychologists have claimed to have observed
at the macro level.
11. Failure to jibe with other areas of science
A major criticism of parapsychology is that it fails to jibe with other areas of
science. The late neuropsychologist Donald Hebb (1978) once commented that if
parapsychology is right, then physics and biology and neuroscience are horribly
wrong in some fundamental respects. He went on to say that science has been
wrong before, but that parapsychology would need very strong evidence if it was
going to be able to challenge successfully the current state of knowledge in
mainstream science. For example, psi influences, unlike any known energy, are
invariant over distance. Time produces no barrier either, apparently, for such
influences are said to be able to operate backwards and forwards in time. If the
‘out-of-body experience’ is a psi effect, then it would apparently demonstrate
that the complex mechanisms of the brain, while extremely vulnerable to
disruption or total destruction as a result of disease or injury, are apparently
unnecessary for perception or cognition in the out-of-body individual. To be fair, some
parapsychologists have argued that their data tends to support the idea that the
brain does indeed process incoming psi. Yet such processing is not a simple
matter, as Beyerstein (1987) noted in pointing to the profound implications that
psi would have for the neurosciences. He pointed out that perception, memory
and emotion involve extremely complex neurochemical configurations that are
the result of the spatiotemporal integration of activity in millions of widely-
distributed neurons and their internal components. Extrasensory perception
would by definition bypass the activity of peripheral receptors and nerves that
normally determine these central electrochemical configurations. To experience
the emotion or the percept, then, any hypothetical ‘psi signal’ would have to produce
the corresponding central electrochemical configurations directly, which
would involve influencing the internal chemical processes of millions of neurons
in the correct sequences and in the appropriate anatomical pathways. This, in the
view of neuroscientists in general, is highly unlikely. Yet while there are attempts
to interpret physical theory in such a way as to accommodate psi (e.g. Pallikari
in this Issue), parapsychologists appear disinterested in the contradictions
between parapsychology and neuroscience (Kirkland, 2000).
On the other hand, failure to jibe with other areas of science is in a very real
sense the sine qua non of parapsychology. As discussed earlier, something is
only considered paranormal if it defies current scientific models of reality.
phenomena. This dissonance can be resolved either by assuming that the exclusion
from the halls of mainstream science is unfair and unjustified, or that there is
some reason other than lack of persuasive data that underlies the rejection. As an
instance of the latter, the prominent parapsychologist Charles Tart once wrote
that sceptical scientists may be unconsciously so afraid of their own psychic abilities
that they have to attack any evidence that might provoke knowledge of their
own ability (Tart, 1982; 1984). Parker, in this Issue, argues that perhaps sceptical
psychologists do not really want to resolve the issue about the reality of psi for
fear of the ‘unwanted implications’ for psychology if it were shown that psi
really does exist. He may be correct, but I doubt it. In my many years in the field
of psychology, I have never detected anything other than simple disinterest in
parapsychology from the vast majority of psychologists. They simply assume
that psi phenomena have never been shown to exist. On the other hand, I am certain
that were there suddenly to be produced compelling evidence for the reality
of psi, parapsychologists would be knocked over in the stampede by experimental
psychologists to explore an exciting new area of research.
Can the psi question be resolved? Parker argues that the technology now exists
that would allow a resolution of the question of whether psi exists, and that it
would be relatively straightforward to resolve the question, were it not for a lack
of funding from mainstream science. He also states that parapsychology might
turn out to present genuine phenomena — or, it could turn out to be based on a
mixture of fraud, artefact and subjective validation.
I would certainly applaud any effort and investment directed at resolving the
psi issue, but I do not think that it is really possible to resolve it, unless of course
compelling and replicable demonstrations of the existence of psi are forthcoming.
I do not believe that parapsychologists give the Null hypothesis a proper
chance, and I cannot conceive of any research that could serve to persuade
parapsychologists that psi does not exist. It would be far easier, were good and
reliable data available, to persuade sceptics of the reality of psi than to dissuade
parapsychologists. What evidence can one produce with regard to ‘disproving’
the psi hypothesis? Certainly not carefully executed studies that fail to replicate,
that fail to produce any evidence of a psi anomaly. Those are too easily explained
away in terms of the ‘experimenter effect’ or simply ignored, as is the case with
Jeffers’ research. Finding prosaic explanations for a given data set may persuade
parapsychologists that, in that particular instance, there was no evidence for psi,
but what about all the other data sets yet to come? Parapsychologists can neither
tell us under what circumstances psi, if it is real, does not occur, nor can they tell
us how it would be possible to disprove its existence.
While some parapsychologists, as noted earlier, ascribe hidden motivations to
the continued resistance of mainstream scientists to bring parapsychology into
the scientific fold, I judge it unlikely that parapsychologists would under any
circumstances abandon their belief in and pursuit of the paranormal. In fact, while
Brugger and Taylor propose the joint collaboration of traditional parapsychology
and neuroscience in the hope that findings from prospective research conducted
by representatives of two apparently conflicting views will most likely be taken
GIVE THE NULL HYPOTHESIS A CHANCE 49
seriously by both sides, they also foresee what many parapsychologists would
consider to be an unacceptable downside: ‘We thus anticipate that, although psi
would vanish from the scene as a process of information transfer, it would live on
as a phenomenon of subjective probability worthy of scientific investigation.’
Finally, even if one were to produce a set of circumstances that would lead
some parapsychologists to abandon the psi hypothesis, parapsychology as a
whole would carry on much as it always has, and the conclusions of those who
left the field would be downplayed or ignored, just as were Blackmore’s conclusions
when she pronounced that she had become sceptical with regard to psi and
was leaving the field, or Wiseman’s as he had become more and more identified
with the sceptical position (Wiseman, 1997). Of course, for those who appropriate
for themselves the label ‘parapsychologist’, but do not really subscribe to the
appropriateness of a scientific examination of psi in any case, any agreement by
science-oriented parapsychologists that resolves the psi question in a negative
direction would carry no weight at all.
Thus, the search for psi will go on for a long time to come, for I can think of
nothing that would ever persuade those who pursue it that the Null hypothesis is
probably true. Yet, as this search goes on, those of us who are sceptics should
applaud and support the approach taken by parapsychologists who have contributed
to this Special Issue, not because we agree with their conclusions (for we
shall continue to scrutinize and, when appropriate, find fault with their methodology
and challenge their interpretations), but because they share our belief in
the power of the scientific method to reveal truth in nature. I do marvel at their
tenacity, however, for they labour in search of psi despite a lack of the evidentiary
and other rewards that are earned by mainstream scientists in their research.
Yet, that being said, and as I have stated before (Alcock, 1985; 1987), I continue
to believe that parapsychology is, at bottom, motivated by belief in search of
data, rather than data in search of explanation. It is the belief in a larger view of
human personality and existence than is accorded to human beings by modern
science that keeps parapsychologists engaged in their search. Because of this
belief, parapsychologists never really give the Null hypothesis a chance.
Acknowledgements
I wish to thank Jean Burns and Anthony Freeman for their very helpful comments
with regard to the draft version of this manuscript.
References
Alcock, J.E. (1981), Parapsychology: Science or Magic? (London: Pergamon).
Alcock, J.E. (1985), ‘Parapsychology as a “spiritual science”’, in A Skeptic’s Handbook of
Parapsychology, ed. P. Kurtz (Buffalo, NY: Prometheus Books), pp. 537-69.
Alcock, J.E. (1987), ‘Parapsychology: science of the anomalous or search for the soul?’, Behavioral
and Brain Sciences, 10 (4), pp. 553-65.
Alcock, J.E. (1990), Science and Supernature: A Critical Appraisal of Parapsychology (Buffalo,
NY: Prometheus Books).
Beyerstein, B.L. (1987), ‘Neuroscience and psi-ence’, Behavioral and Brain Sciences, 10 (4),
pp. 571-2.
Beyerstein, B.L. (1987-8), ‘The brain and consciousness: implications for psi phenomena’, The
Skeptical Inquirer, 12 (2), pp. 163-73.
Beyerstein, B.L. (1988), ‘The neuropathology of spiritual possession’, The Skeptical Inquirer, 12
(3), pp. 248-62.
Blackmore, S.J. (1982), Beyond the Body: An Investigation of Out-of-Body Experiences (London:
Heinemann).
Gardner, M. (1957), Fads and Fallacies in the Name of Science (New York: Dover).
Hebb, D.O. (1978), Personal Communication, cited in Alcock, J.E. (1981), Parapsychology:
Science or Magic? (New York: Pergamon).
Hyman, R. (1977), ‘The case against parapsychology’, The Humanist, 37, pp. 37-49.
Hyman, R. (1985), ‘A critical overview of parapsychology’, in A Skeptic’s Handbook of
Parapsychology, ed. P. Kurtz (Buffalo, NY: Prometheus Books), pp. 3-96.
Hyman, R. (1989), The Elusive Quarry: A Scientific Appraisal of Psychical Research (Buffalo,
NY: Prometheus Books).
Jahn, R.G. (1982), ‘The persistent paradox of psychic phenomena: an engineering perspective’,
Proceedings of the IEEE, 70, pp. 136-70.
Jahn, R., Dunne, B., Bradish, G., Dobyns, Y., Lettieri, A., Nelson, R., Mischo, J., Boller, E.,
Bösch, H., Vaitl, D., Houtkooper, J. and Walter, B. (2000), ‘Mind/Machine Interaction
Consortium: PortREG Replication Experiments’, Journal of Scientific Exploration, 14 (4),
pp. 499-555.
Keen, M., Ellison, A. and Fontana, D. (1999), ‘The Scole Report’, Proceedings of the SPR, 58,
pp. 150-392.
Kirkland, K. (2000), ‘Paraneuroscience?’, The Skeptical Inquirer, 24 (3), pp. 40-3.
Marks, D. (2000), The Psychology of the Psychic (Buffalo, NY: Prometheus Books).
Mattuck, R.D. (1982), ‘Some possible thermal quantum fluctuation models for psychokinetic
influence on light’, Psychoenergetics, 4, pp. 211-25.
May, E.C., Utts, J.M. and Spottiswoode, S.J.P. (1995), ‘Decision Augmentation Theory: towards a
model of anomalous phenomena’, Journal of Parapsychology, 59 (3), pp. 195-220.
McConnell, R.A. (1977), ‘The resolution of conflicting beliefs about the ESP evidence’, Journal of
Parapsychology, 41, pp. 198-214.
Moore, L. (1977), In Search of White Crows (Oxford: Oxford University Press).
Neher, A. (1990), The Psychology of Transcendence (New York: Dover).
Pratt, J.G. (1974), ‘In search of a consistent scorer’, in New Directions in Parapsychology, ed.
J. Beloff (London: Elek Science).
Rhine, J.B., Pratt, J.G., Stuart, C.E., Smith, B.M. and Greenwood, J.A. (1966), Extra-Sensory
Perception after Sixty Years (Boston: Bruce Humphries). (Original work published 1940.)
Schmidt, H. (1975), ‘Towards a mathematical theory of psi’, Journal of the American Society for
Psychical Research, 69 (4), pp. 301-20.
Soal, S.G. and Bateman, F. (1954), Modern Experiments in Telepathy (New Haven, CT: Yale
University Press).
Stanford, R.G. (1990), ‘An experimentally testable model for spontaneous psi events’, in Advances
in Parapsychological Research, 6, ed. S. Krippner (Jefferson, NC: McFarland and Co), pp. 54-167.
Targ, R. and Puthoff, H.E. (1974), ‘Information transfer under conditions of sensory shielding’,
Nature, 252, p. 602.
Tart, C. (1982), ‘The controversy about psi: two psychological theories’, Journal of Parapsychology,
46, pp. 313-20.
Tart, C. (1984), ‘Acknowledging and dealing with the fear of psi’, Journal of the American Society
for Psychical Research, 78, pp. 133-43.
Walker, E.H. (1984), ‘A review of criticisms of the quantum mechanical theory of psi phenomena’,
Journal of Parapsychology, 48, pp. 277-332.
Wiseman, R. (1997), Deception and Self-Deception: Investigating Psychics (Buffalo, NY:
Prometheus Books).
Part V
Experimenters’
Personal Perspectives
[25]
The Elusive Open Mind: Ten Years of
Negative Research in Parapsychology
What does a psychologist who’s had an
extraordinary experience do? Sets up a
research program to test for psi. The
lessons are surprising.
Susan Blackmore
EVERYONE THINKS they are open-minded. Scientists in particular
like to think they have open minds, but we know from psychology
that this is just one of those attributes that people like to apply to
themselves. We shouldn’t perhaps have to worry about it at all, except that
parapsychology forces one to ask, “Do I believe in this, do I disbelieve in
this, or do I have an open mind?”
The research I have done during the past ten or twelve years serves as well
as any other research to show up some of parapsychology’s peculiar problems
and even, perhaps, some possible solutions.
I became hooked on the subject when I first went up to Oxford to read
physiology and psychology. I began running the Oxford University Society
for Psychical Research (OUSPR), finding witches, druids, psychics, clairvoyants,
and even a few real live psychical researchers to come to talk to us.
We had Ouija board sessions, went exploring in graveyards, and did some
experiments on ESP and psychokinesis (PK).
Susan Blackmore is with the Brain and Perception Laboratory, University of Bristol,
Bristol, England. This article is based on her presentation at the 1986 CSICOP
conference at the University of Colorado in Boulder. Her book Adventures of a
Parapsychologist has recently been published by Prometheus Books.
This article was the Presidential Address delivered at the 43rd Annual Convention of
the Parapsychological Association in Freiburg, Germany, August 17-20, 2000.
336 The Journal of Parapsychology
a fundamental role in speaking to this global situation and to this historical
moment. It is a time of opportunity for us.
When I think about the human genome project, for example, I think
about the remarkable success of the materialist paradigm, of the
physicalist paradigm, that allows us to create a book that maps the human
genome. The implications and possibilities from that mapping leave
many, many things ahead for us in terms of our ability to diagnose and
treat disease, as well as to make changes in future life forms.
But there are many questions left unanswered through an exclusively
physicalist model of reality. As we have cultivated this remarkable set of
knowledge-based and reason-based skills, there are questions we are not
answering. What does it mean to be human? What does it mean to have
emotion? What does it mean to have motivation, intention, and attention?
All of these first-person experiences that we think of as uniquely our
own are not addressed within the strictly physical dimensions of reality. In
parapsychology, although we often do not think about ourselves as metaphysicians,
we are in a position to arbitrate between these two dimensions
of experience: the physical dimensions of reality and the metaphysical dimensions
of experience.
Learning our ABCs
As I frame my talk tonight, I recall the delightful Presidential Address
by Dean Radin last year. Dean used “Green Eggs and Ham” as his metaphor
for describing the state of the field (Radin, 2000). Following his
lead, I suggest that we go back to thinking about our place in terms of the
basics—our ABCs.
A stands for Action. I believe it is time for us to own our social responsibility
as participants in this evolving story of human complexity. I believe
we have something to do in transforming the world, such that it becomes
a more holistic, integral, and life-affirming scenario for future
generations.
B stands for Boldness. Here I would ask all of us to think about the
ways in which we have been beaten down by our interest in parapsychology.
Have you ever felt you needed to apologize for your interest, particularly
when surrounded by mainstream scientists? It is time for us to acknowledge
that we are addressing some of the fundamental issues of our
time or anytime in human history.
C stands for Context. And that is “storying.” It behooves us to see our
data as relevant to larger social and political issues. We must answer the
“so what” question by touching people at the level of what is important to
them and their lives. This leads me to the context for my own participation
in psi research.
Coming of Age in Parapsychology
Conclusion
By way of conclusion, I do want to remind us of where we began: the
ABCs stand for Action, Boldness, and Context.
Action
It is important to speak to socially relevant issues. It is important for us
to recognize that what we are doing is fundamentally vital. As I talked
about converging worldviews and the fact that we are in a time of global
transition, I think it is important for us to recognize that we are between
stories. The old story is not working anymore, and we can see that. Modern
medicine, for example, is in crisis. And at the same time, the new story has
not been born yet. We are in this transitional, liminal phase, of waiting to
see what are the appropriate questions to be asking about human possibility
and about the human condition. And we have a role to play in formulating
those questions, if not answering them ultimately, in terms of our ability
to make links and to actually resolve some of the questions before us.
Boldness
When I talk about paradigm and I talk about Thomas Kuhn, it all may
sound a little old or a little like Pollyanna. It is not. The questions we are
asking are really the most important things: What is life? What is
consciousness? What is our capacity as human beings to become something
more? Is there a new story that is not about a strictly physical,
reductionist, separate, objective world out there, but one in which we are
fundamental actors in the evolutionary process? If we can be bold
enough to own our responsibility, we may be bold enough to recognize
that we are conscious participants in an evolving universe.
Context
We need to story our work in a way that is more relevant. We can do a
remarkable job of putting anyone to sleep over our findings. We need to
wake up to the fact that it is really interesting. If we create the context for
understanding psi research, both in terms of politics and in terms of the
social implications, I think the whole field will move forward. Whether we
call it parapsychology or distant intentionality or consciousness studies or
alternative medicine or applied epistemology (whatever we choose to
call it), we are making a difference in the world.
Finally, I want to remind us of where we started, which is in the mystery.
We should embrace it and our dance with the ineffable. Our goal is
to begin to understand and to play with mystery in a more active way. As I
was preparing this talk, my son got sick. I was going through a baby book,
and I came across this quote in The Well Baby Book (Samuels & Samuels,
1991): “Young children often see and point out to their parents objects or
persons that the parents can’t see. Such images may be imaginary or real.
The existence of real psychic phenomena has been demonstrated in
parapsychological experiments over the past 30 years” (p. 224).
The paradigm is shifting, whether we want to go with it or not. I am
reminded of Max Planck’s infamous quote, “A new scientific truth does
not triumph by convincing its opponents and making them see the light,
but rather because its opponents eventually die and a new generation
grows up that is familiar with it” (Planck & Laue, 1949, pp. 33-34). The
paradigm is changing, and it is our job to act as bridge makers to something
bigger and something more inclusive of the fullness and the richness
of the human condition.
References
Braud, W., & Schlitz, M. (1983). Psychokinetic influence on electrodermal
activity. Journal of Parapsychology, 47, 95-119.
Braud, W., Schlitz, M., Collins, J., & Klitch, H. (1984). Further studies
of the bio-PK effect: Feedback, blocking; specificity/generality
[Abstract]. In R. A. White & J. Solfvin (Eds.), Research in parapsychology
1984 (pp. 45-48). Metuchen, NJ: Scarecrow Press.
Byrd, R. C. (1988). Positive therapeutic effects of intercessory prayer in a
coronary care unit population. Southern Medical Journal, 81, 826-829.
Cardeña, E., Lynn, S. J., & Krippner, S. (Eds.). (2000). Varieties of anomalous
experience: Examining the scientific evidence. Washington, DC:
American Psychological Association.
Carr, B. (2000). Research activities in the SPR New initiatives. Retrieved November
17, 2000 from http://www.spr.ac.uk/research_initiatives.html.
Einstein, A. (1954). Ideas and opinions. New York: Crown.
Freeman, W. J. (1995). Societies of brains: A study in the neuroscience of love
and hate. Hillsdale, NJ: Erlbaum.
Boundless Mind: Coming of Age in Parapsychology 349