Gazzaniga. The Cognitive Neurosciences
GAZZANIGA, EDITOR-IN-CHIEF
FOURTH EDITION
A BRADFORD BOOK
THE MIT PRESS
CAMBRIDGE, MASSACHUSETTS
LONDON, ENGLAND
© 2009 Massachusetts Institute of Technology
For Charlotte Smylie Gazzaniga with deep appreciation and gratitude
CONTENTS
Preface
II PLASTICITY
7 Synaptic Plasticity and Spatial Representations in the Hippocampus
Jonathan R. Whitlock and Edvard I. Moser
III ATTENTION
IV SENSATION AND PERCEPTION
21 Grandmother Cells, Symmetry, and Invariance: How the Term Arose and
What the Facts Suggest Horace Barlow
V MOTOR SYSTEMS
VI MEMORY
46 Medial Temporal Lobe Function and Human Memory Yael Shrager and
Larry R. Squire
50 Individual Differences in the Engagement of the Cortex during an Episodic
Memory Task Michael B. Miller
VII LANGUAGE
59 The Biology and Evolution of Language: “Deep Homology” and the Evolution
of Innovation W. Tecumseh Fitch
64 Neurogenetic Studies of Variability in Human Emotion
Ahmad R. Hariri
66 The Neural Basis of Emotion Regulation: Making Emotion Work for You and
Not Against You Jennifer S. Beer
72 Semantic Cognition: Its Nature, Its Development, and Its Neural Basis
James L. McClelland, Timothy T. Rogers, Karalyn Patterson,
Katia Dilkina, and Matthew Lambon Ralph
X CONSCIOUSNESS
79 The Neurobiology of Consciousness Christof Koch
XI PERSPECTIVES
Contributors
Index
PREFACE
It has been 20 years since we first met in Squaw Valley to assess the state of cognitive
neuroscience. We have held this meeting three times before and each meeting had its own
signature. When the first meeting concluded, we knew we had a vibrant, young field on
our hands. As the years passed, our knowledge deepened and new ideas gradually emerged.
With the fourth meeting, cognitive neuroscience is busting out all over.
Fundamental stances are changing and new ideas are emerging. Everything from the view
that individual neurons change their functional role through time to claims that our moral
decisions can be tracked in the brain indicates the range and excitement of cognitive
neuroscience. Fresh air sweeps in and reinvigorates our view that we will some day
figure out how the brain works its magic and produces the human mind.
It is always in the first two sessions that the contrast in approaches to studying
the brain is most marked. The development and evolution section talks about a
dynamic growth pattern that becomes specific and fixed. At the same time, those interested
in plasticity see the neuronal systems as always changing and the dynamics seen in
development as continuing for the life of the brain. In our most recent meeting the reports
on brain plasticity were bolder than ever before.
The attention session featured a new emphasis on the interactions between reafferent
top-down and feed-forward, bottom-up attentional processes. Benefiting from
ever-impressive technological advances, the elucidation of attentional mechanisms is
proceeding at a dizzying pace. It is refreshing to note that in addition to providing a more
comprehensive picture of attentional processes, this exciting new empirical evidence has
also verified many central tenets of some of the most longstanding and influential theories
in the cognitive neuroscience of attention.
In the motor session, the boundaries of the motor system continued to be pushed
further into the realm of cognition. Some have demonstrated the existence of motor-related
areas in the parietal lobe that are involved in representing goals of oneself and of others,
providing a link for how we may intuitively translate the actions of others into a model
of their mental processes. Separate research indicates that areas once thought to have
only motor roles actually contain circuits related to executive and limbic function, further
complicating the distinction between cognition and motor processes. Perhaps above
all else, the work of the motor section suggests that we might be wise to relieve ourselves
of the need to make stark distinctions between these two phenomena, at least in higher
primates.
Memory research is, paradoxically, providing great insight into how humans imagine
future events. Moreover, new models of reconsolidation and retrieval are emerging,
and exciting evidence of individual differences in cortical activation patterns during
episodic retrieval is forcing a careful reevaluation of central tenets of functional imaging
analysis.
The perception session demonstrated the vast potential of Bayesian modeling to provide
beneficial descriptions of how the brain performs various functions. But Bayesian modeling
does not hold a monopoly; other theoretical and methodological pioneers are dramatically
enhancing our understanding of vision, quite literally from the level of the retina to
large-scale networks that connect distributed regions of the cortex. Keeping pace with
advances in vision science are exciting findings across the modalities of audition,
olfaction, and vestibular function.
Next, we turned our attention to language (and secretly hoped that a few lectures would
pass without mention of the word “Bayesian”). The understanding of specific components
of language processing is expanding, while exciting parallel studies are examining the genes
that may wire our brain in a way that enables language acquisition. But even as we move
toward an understanding of how genes and experience sculpt the human brain into a
speaking device, the question of who exactly is doing the speaking arises. Fittingly, we
transitioned into the session on executive function, where we came across a surprising
answer to this question. There does not appear to be a need for a “top” in top-down
control; instead, various regions for self-regulation and cognitive control have been
identified and their interactions have been modeled in ways that leave the mythical
homunculus homeless. As if this were not profound enough, we also learned about an exciting
new characterization of resting brain activity, a remarkable advance that has too many
implications to list.
Over a week removed from our introduction to theory of mind in the motor session,
the emotion and social neuroscience section further demystified the rapidly expanding
science of the social brain. New ideas on how our emotions and sense of self inform the
ways in which we think about and reflexively understand others continue to evolve, while
evidence for the genetic basis of individual variation in affect and, astoundingly, for
differences in BOLD activity related to this genetic variation has further shaped the current
models of how the emotional brain develops and operates.
As it always does, the conference ended with a bang, featuring two days’ worth of lively
discussion on the topic of consciousness. An exciting novel mechanism for how the brain
generates the baseline activity necessary to sustain conscious experience was
complemented by a bold theoretical attempt to make the problem of qualia a bit more tractable.
Between these extremes, others reported suggestive new evidence about the neural basis
of visual conscious experience. This session also served as a reminder of how far we had
come during those three weeks in Squaw Valley, as topics such as action, emotion,
language, and executive function reemerged in the context of examining how such varied
processing contributes to the content of conscious experience.
After three weeks of such intense stimulation, it is a testament to the amazing progress
unveiled that one somehow leaves Tahoe reinvigorated and enthusiastic to get back into
the lab. The past 20 years have seen advances that we could never have anticipated, and,
incredibly, the next five or ten hold the potential to continue this exponential progress.
The Summer Institute at Tahoe reveals simultaneously the exciting new ideas in the field
and the bright minds that are vigorously attacking the persisting mysteries of cognitive
neuroscience. It also exposes a talented group of graduate and postdoctoral students to the
wonderful breadth and depth of the field. Scanning the room of eager young minds
hanging onto the words of the various leaders of the field is truly a sight to behold. It is
exhilarating to witness the handing of the baton from one generation to the next. One can
only pause, take it all in, smile, and then get back to paying attention to the lecture because
the next big idea presented might knock you right out of your seat!
Needless to say, complex events and publications like this work well only if there are
dedicated people involved. First, the MIT Press continues to be exceptionally supportive
in carrying out high-quality production in a timely manner. Once again my daughter,
Marin Gazzaniga, managed the ebb and flow of the manuscripts, playing both good cop
and bad cop as the manuscripts moved between authors, section editors, and ultimately
the publisher. Marin is a brilliant playwright, actress, and writer in her own right, and all
of those skills are required in herding academics to a common goal.
The actual event at Lake Tahoe was managed from the beginning by my assistant, Jayne
Rosenblatt. She is always good humored and incredibly dedicated and runs complex events
seemingly effortlessly. Finally, these books don’t just happen. Peggy Gordon brings it all
together into print with a steady hand and professionalism.
Warm thanks and congratulations to all. We will see you all again in five years.
Michael S. Gazzaniga
The Sage Center for the Study of Mind
University of California, Santa Barbara
I DEVELOPMENT AND EVOLUTION
Chapter 1 Rakic, Arellano, and Breunig
3 Preuss
Pasko Rakic, Jon I. Arellano, and Joshua Breunig, Department of Neurobiology, Yale
University School of Medicine, New Haven, Connecticut

Abstract The cerebral cortex is the crowning achievement of evolution and the biological
substrate of human cognitive abilities. Although the basic principles of cortical
development in all mammals are similar, the modifications of developmental events during
evolution produce not only quantitative but also qualitative changes. Human cerebral cortex,
as in the other species, is organized as a map in which specific cell classes are positioned
into a radial, laminar, and areal array that depends on the sequential production and
phenotypic specification of those cells and the directed migration from their place of origin
to their distant final destination. The long and curvilinear migration pathways in the fetal
human cerebrum depend critically on the stable radial glial scaffolding. After neurons assume
their proper areal and laminar position, they attach locally and develop numerous proximal
and long-distance connections that involve specific adhesion molecules, neurotransmitters,
and receptors. However, the final pattern of synaptic connections is selected through
functional validation and selective elimination of the initially overproduced neurons, axons,
and synapses. In this review, the development of the cerebral cortex is described in the
context of the radial unit hypothesis, the postulate of an embryonic protomap, and the
concept of competitive neural interactions that ultimately create a substrate for the highest
cognitive functions.

There is probably no disagreement among biologists that the cerebral cortex is the part of
the brain that most distinctively sets us apart from any other species and that the principles
governing its development may hold the key to explaining our cognitive capacity,
intelligence, and creativity (e.g., Gazzaniga, 2008). Perhaps the most prominent feature of
the cerebral cortex in all species, and particularly in primates, is its parcellation into
distinct laminar, radial, and areal domains (Eccles, 1984; Mountcastle, 1997; Goldman-Rakic,
1987; Rakic, 1988; Szentagothai, 1978). Although the surface of the neocortex has expanded
a thousandfold during phylogeny, its thickness and its basic cytoarchitectonic organization
appear to be changed comparatively less. However, this morphological similarity of cortical
architecture in histological sections in all mammals may be misleading, since it might invite
us to think in terms of a canonical cortex that only varies in size in accordance with the
variations of body mass of the different mammalian species. Conversely, our current
knowledge indicates that there are significant qualitative and quantitative changes in the
structure of the neocortex between species, in the size of cells, in the proportion of neurons
to glia, in the ratio of excitatory projection cells to inhibitory interneurons, in the
appearance of new types of neurons, in the specific organization of connections, and, most
importantly, in the addition of novel highly specialized cortical areas associated with
correspondingly new axonal pathways and patterns of synaptic connectivity that can certainly
have a profound effect on the functional capacity of the cerebral cortex. In spite of these
differences, the small size, high fertility rate, and low cost of maintenance have converted
mice into an unexcelled model for experimental research on neuroscience and particularly
on basic cortical organization (Rakic, 2000).

The study of the cortical development of mammals suggests that even small differences in
the timing and duration of the genesis of neural cells and changes in the composition and
the ratios of cell proliferation and programmed cell death in the transient embryonic zones
of the developing forebrain can be responsible for the evolutionary expansion of the
neocortex and the concomitant appearance of novel functions in cognitive processing of
information (Bystron, Blakemore, & Rakic, 2008). Although genetically humans are
surprisingly similar to other mammals, the uniqueness of the human cognitive potential,
which is an output of cortical function, must have some genetic basis. Since the molecular
structure and function of neurotransmitters, receptors, and ion channels do not change
substantially over the phylogenetic scale, the secret to the success of Homo sapiens is
probably mainly due to an increased number of neurons, more elaborated connections,
functional specialization, and introduction of new cortical areas. Even the so-called
essential genes, which are considered responsible for survival of the individual, give
different phenotypes in different species, and about 20% of the mouse orthologs of
human-essential genes are nonessential in mice (Liao & Zhang, 2008). It is therefore
apparent that neither the genetic, the cellular, nor most importantly, the circuitry basis of
human cortical uniqueness can be deciphered by studying exclusively rodents, much as one
cannot expect to understand the origin
elongated shafts of radial glial cells (figure 1.3 and Rakic, 1972). Radial glial cells are
particularly prominent in primates including human, whose fibers span all the length of the
convoluted cerebral hemispheres at the late stages of corticogenesis (Rakic, 1976b;
deAzevedo 2003). While moving along the glial surface, migrating neurons remain
preferentially attached to curvilinear glial fibers, a finding which suggested a “gliophilic”
mode of migration (Rakic, 1985, 1990) that may be mediated by heterotypic adhesion
molecules (Rakic, Cameron, & Komuro, 1994). As many as 30 migrating GFAP-negative
neurons have been observed migrating along a single GFAP-positive radial glial fascicle in
the human forebrain during midgestation (Rakic, 2003). However, some postmitotic cells do
not obey glial constraints and move along tangentially oriented axonal fascicles (e.g., black
horizontally oriented cells aligned with thalamic radiation, TR, in figure 1.3). We suggested
the term “neurophilic” to characterize the mode of migration of this cell class (Rakic, 1985,
1990). Although lateral dispersion of postmitotic neurons was initially observed in
Golgi-stained preparations (e.g., figure 1A of the report of the Boulder Committee, 1970), it
attracted renewed attention after the characterization of the migration through the rostral
migratory stream from the postnatal ventricular zone to the olfactory bulb (Menezes &
Luskin, 1994; Lois & Alvarez-Buylla, 1994) and from the ganglionic eminence to the dorsal
neocortex (De Carlos, Lopez-Mascaraque, & Valverde, 1996; Tamamaki, Fujimori, &
Takauji, 1997). Studies in rodents also suggested a more widespread dispersion of clonally
related cortical cells (reviewed in Rakic, 1995a; Tan et al., 1998; Reid, Liang, & Walsh,
1995). Although in rodents most of tangentially migrating cells in the dorsal neocortex are
inhibitory GABAergic interneurons (reviewed in Marin & Rubenstein, 2001) there are also a
number of migrating oligodendrocytes (He, Ingraham, Rising, Goderie, & Temple, 2001).

It should be underscored, however, that clonal analysis in the convoluted primate cortex
revealed that the majority of migrating cells, both projection cells and interneurons, obey
the radial constraints imposed by the radial glial scaffolding (Kornack & Rakic, 1995; see
also Rakic, 2007, and the following section on the radial unit hypothesis). Also,
Ivica Kostović and Miloš Judaš, Croatian Institute for Brain Research, School of Medicine,
University of Zagreb, Zagreb, Croatia

Abstract The early development of cortical circuitry provides the biological substrate for
human cognitive and psychological maturation. Neuronal circuitry of the human frontal
cortex appears around the eighth postconceptional week (8 PCW) with two synaptic strata,
engagement of fetal neurons, and spontaneous activity. Midfetal and late-fetal cortex show
transient lamination, differentiation of subplate, deep synaptogenesis, and growth of
thalamocortical afferents. Cortical interaction with thalamic afferents occurs around 24 PCW.
Late preterm and neonatal periods are characterized by growth of cortico-cortical fibers and
resolution of transient circuitry. During early infancy (2–6 months) rapid synaptogenesis
coincides with reorganization of cortico-cortical pathways. Initial environmentally driven
“cognitive” circuitry consists of increased number of synapses, differentiated layer V
pyramids, “dormant” layer III pyramids and appearance of inhibitory neurons. Full maturity
of layer III pyramids, local circuitry neurons and maximal number of synapses is not
achieved until 12–24 months when circuitry is layer III “centered” and socially driven. In
summary, endogenous and sensory-driven circuitry develops during prenatal life, initial
“cognitive” circuitry appears in late infancy, and maximal number of synapses and full
maturity of layer III develop during early childhood.

In revealing the neurobiological basis of cognitive and psychological development in
humans, neuroscientists mostly depend on close correspondence between structural,
functional, and behavioral features during specific phases of development (Casey, Giedd, &
Thomas, 2000; Casey, Tottenham, Liston, & Durston, 2005; Hammock & Levitt, 2006;
Kagan & Baird, 2004; Levitt, 2003). Because the neural circuitry that underlies cognitive
development is expected to be simpler in the developing than in the adult brain, the
developmental approach should provide an easier analysis of principal elements of the
neuronal circuitry: synapses, presynaptic axons, and postsynaptic neurons.

It is difficult to correlate structure and function in the developing neuronal circuitry of
the frontal lobe because of its complexity (Goldman-Rakic, 1987), life-long maturation
(Chugani, Phelps, & Mazziotta, 1987; Fuster, 2002; Giedd et al., 1999; Goldman-Rakic,
1987; Huttenlocher & Dabholkar, 1997; Kostović, Petanjek, Delalle, & Judaš, 1992;
Petanjek, Judaš, Kostović, & Uylings, 2008; Sowell et al., 2004) and the existence of
human-specific functions (Preuss, 2004; Werker & Vouloumanos, 2001).

Which component of the prefrontal cortical circuitry develops already in utero, that is,
before birth? And, knowing that typical prefrontal executive functions develop postnatally,
what would be its functional roles during fetal life? In other words, what functions may the
prefrontal cortex subserve before the onset of cognition, during the so-called precognition
period of life? One would also like to pinpoint qualitative and quantitative features of the
maturational status of prefrontal circuitry at the end of the first postnatal year: what level
of maturation of prefrontal cortex is required for the onset of cognition and language?

In this review, we summarize the data available on early development of human frontal
lobe circuitry, describe new data on neurogenetic events obtained with neuroimaging and
fine neurohistological studies, and propose structural criteria for delineation of phases in
prefrontal circuitry development. We cover the period from the appearance of the cortical
plate and establishment of first cortical synaptic contacts (at 8 PCW) to the third year of
life when pyramidal neurons of layer III attain size and complexity greater than those of
layer V pyramidal neurons, and language and cognition are already well established (see
table 2.1). The following frontal cortical regions will be described: dorsolateral,
dorsomedial, orbitomedial, precingulate (area 32), and anterior cingulate (area 24).

Development of neuronal circuitry of the human prefrontal cortex during the early fetal
period

Structural Organization The first cortical cells appear in the neocortical preplate or
primordial plexiform layer (Bystron, Rakic, Molnar, & Blakemore, 2006; Bystron,
Blakemore, & Rakic, 2008; Zecevic & Milosevic, 1997) (figure 2.1A). Soon thereafter, the
formation of the cortical plate (CP) at 8 PCW (figure 2.1B) marks the transition from
embryonic to fetal period and prefrontal neocortical anlage is now composed of three
architectonic zones: marginal zone (MZ), CP, and the presubplate, PSP (Bystron et al., 2008;
Kostović & Rakic, 1990). These zones contain two main classes of neurons, radially oriented
postmigratory neurons within the CP and randomly oriented and early maturing neurons
located above and below the CP, that is, in the MZ and PSP, respectively (Bystron et al.,
2008; Kostović, 1990; Marin-Padilla, 1983; Meyer, Schaaps, Moreau, & Goffinet, 2000;
Verney, Lebrand, & Gaspar, 2002; Zecevic & Milosevic, 1997).

Between 13 and 14 PCW (the end of the early fetal period, figure 2.1C), the deep part of
the hitherto densely packed CP transforms into a wider lamina that appears pale in
Nissl-stained preparations and merges with PSP, thus forming a new, prominent, and
transient zone, the subplate zone (SP) (Kostović & Rakic, 1990; Kostović & Judaš, 2007).
This marks the onset of the typical midfetal lamination pattern.

Importantly, the early fetal cerebral wall (telencephalic pallium) shows early regional
differences in the thickness and appearance of MZ, CP, and SP across mediolateral,
rostrocaudal, and dorsoventral extent. Dorsal fetal pallium, as a forerunner of neocortex,
displays thicker and condensed CP, narrow MZ, and wider PSP. The CP becomes thinner
toward the interhemispheric fissure where dorsal pallium continues into the medial pallium.
The pallium of the cortical limbus (hem) is thin and convoluted with wide MZ, thin and
convoluted CP, and almost invisible PSP. The CP is initially not developed in the
hippocampal anlage. This early regional specification probably reflects the activity of
intrinsic patterning mechanisms (Rakic, 1988, 2006) whereby patterning centers generate,
across the dorsal telencephalon, graded expression of transcription factors acting on cortical
progenitor cells (Grove & Fukuchi-Shimogori, 2003; O’Leary, Chou, & Sahara, 2007). Thus
the frontal telencephalon becomes specified by specific transcription factors, such as the
basic fibroblast growth factor (bFGF), expressed in the ventricular zone of the rostral
(anterior) telencephalon. Although this early patterning of the cortical protomap (Rakic,
1988) occurs in humans probably during the second
Figure 2.1 Development of cytoarchitectonic layers in the prefrontal cortex from the
embryonic phase (before the appearance of the cortical plate) to the third year, that is, at
7.5 postconceptional weeks (PCW) (A), 8.5 PCW (B), 12.5 PCW (C), 15 PCW (D), 28 PCW
(E), 33 PCW (F), 3 months (G), 9 months (H), and 3 years (I). All layers in prenatal phases
are transient, and their laminar changes reflect neurogenetic events: proliferation,
migration, differentiation, ingrowth of afferent pathways, and areal differentiation.
Abbreviations for this and subsequent figures: caud, caudate nucleus; CC, corpus callosum;
CP, cortical plate; GE, ganglionic eminence; IZ, intermediate zone; limb, limbic
(hippocampal pallium); MZ, marginal zone; put, putamen; PSP, presubplate; SP, subplate
zone; SPF, the subplate in formation (the “second” cortical plate); SVZ, subventricular zone;
SVZf, subventricular fibrillar zone; th, thalamus; VZ, ventricular zone; WM, white matter.
Roman numerals (I–VI) correspond to permanent cortical layers; double arrows point to the
external capsule.
month of embryonic life, that is, even before the formation of the cortical plate, the
specification of cortical areas continues during the fetal period, and thalamic input seems
to have a significant role in the final differentiation of cortical areas.

Neurogenetic Events Major neurogenetic events are production of young postmitotic
neurons in the ventricular zone and their migration through the intermediate zone
(figure 2.6). Comparison with data in monkey (Rakic, 1988, 2006) shows that in the early
fetal period, neurons destined for superficial (associative) layer III are not born yet.
Molecular specification of early cortical neurons was proven using different markers for
GABA (Bystron et al., 2006, 2008; Zecevic & Milosevic, 1997; Rakic & Zecevic, 2003) and
reelin (Meyer & Goffinet, 1998; Rakic & Zecevic, 2003). For studying early cortical
development in a clinical setting it is very important that early transient proliferative,
migratory, and synaptic zones were successfully visualized by in vivo, in utero imaging
around 13 PCW (Judaš et al., 2005; Kostović, Judaš, Škrablin-Kučić, Štern-Padovan, &
Radoš, 2006).
Figure 2.2 Laminar shifts, regional and areal differences in acetylcholinesterase (AChE)
histochemical staining in the human prefrontal cortex at different developmental phases. At
18 PCW (A), AChE-reactive fibers originating in the basal forebrain, external capsule system
(EXT), and thalamocortical/internal capsule system gradually invade the subplate. At 23
PCW (B), AChE-reactive fibers accumulate transiently in the superficial part of the SP. After
24 PCW, AChE-reactive fibers gradually penetrate into the cortical plate, as illustrated in
26 PCW specimen (C). There is a parallel decrease in staining of the subplate zone. At 28
PCW (D), strong AChE reactivity is present in the cortical plate (arrowheads). Gradual
decrease in the overall AChE-staining is observed at 32 PCW (E). Double arrows point to the
external capsule, that is, the deep border of the subplate zone.
early preterm prefrontal cortex is also visible in the MZ, which has several sublayers and
contains well-developed but transient subpial granular layer (Kostović et al., 2004/2005).

Transient lamination of the prefrontal cortex in human preterms was well documented by
acetylcholinesterase (AChE) histochemistry (Kostović, 1990). There is a transient columnar
arrangement (figure 2.2D) of the strong AChE-staining in the prefrontal cortex, with
regionally characteristic distribution (Kostović, 1990). The differences between orbital,
lateral, dorsolateral, and orbitomedial cortex became more obvious and roughly correspond
to the incipient gyral landmarks (Kostović, Petanjek, et al., 1992). In summary, structural
data on laminar, regional, and radial organization show transient patterns of organization
of prefrontal cortex in preterm infant.

Neurogenetic Events The proliferative zones (ventricular and subventricular) in preterm
infants gradually cease to produce neurons. However, the bulging of the ventricular zone,
the so-called ganglionic eminence, remains thick and voluminous. According to the evidence
obtained in primates (Bystron et al., 2008; Rakic, 2006) after 24 PCW this zone produces
predominantly glial precursors. In our Golgi studies (Mrzljak et al., 1988, 1990) we have
seen many migratory neurons in early preterm infants.
Figure 2.3 The subplate (SP) is characterized by an abundant extracellular matrix, as
demonstrated by PAS-Alcian Blue histochemical staining (A). The subplate extracellular
matrix also contains a high amount of axonal guidance molecules, as shown by
immunostaining for fibronectin (B). Note gradients of extracellular matrix and fibronectin
concentration within the “waiting” compartment of the subplate zone.
Therefore, it is possible that in the prefrontal pallium neurogenesis and migration last
several weeks longer than in primary cortical areas.

While there is cessation of proliferative processes, both pyramidal and nonpyramidal
neurons continue to differentiate. SP neurons continue to grow (Mrzljak et al., 1992) and
express different neuropeptides such as NPY (Delalle et al., 1997) and somatostatin
(Kostović, Štfulj-Fučić, et al., 1991), as well as GABA. The distribution of synapses changes
significantly after 24 PCW: synapses begin to appear within the deep portion of the CP and
bilaminar pattern of synaptic distribution gradually disappears. The number of synapses in
the deep cortex (SP plus deep CP) is higher than in the superficial cortex (superficial CP
plus MZ).

The predominance of deep synapses in early preterms and deep-to-superficial
synaptogenesis after 28 PCW are structural factors important for generation of cortical
dipole (from surface to intermediate zone) and changing surface positive potentials
(Molliver, 1967). According to our studies (Kostović & Goldman-Rakic, 1983; Kostović &
Judaš, 2002; Kostović & Rakic, 1984, 1990) the most intense neurogenetic process in the
preterm cerebrum is growth of projection and callosal pathways.

Functional Organization The main feature distinguishing preterm cortex from midfetal
and late fetal cortex (figures 2.1B, 2.1C, 2.1D) is the presence of strong thalamic input
from mediodorsal nucleus (Kostović & Goldman-Rakic, 1983). In the primary sensory areas,
thalamic input is anatomical basis for evoked potentials (for a review of literature, see
Kostović & Jovanov-Milošević, 2006). After 24 PCW, thalamic afferents establish synaptic
contacts with CP neurons (Molliver et al., 1973). In the somatosensory cortex of preterm
infants, pain stimuli from the skin may provoke cortical response as detected by infrared
monitoring (Fitzgerald, 2005). It is not known whether information from primary cortex in
early preterms is conveyed further to prefrontal cortical areas. This seems to be unlikely
because of the immaturity of cortico-cortical pathways. Thus the prefrontal cortex in early
preterms receives predominantly nonsensory information via mediodorsal-prefrontal
projection. Thalamic axons in early preterm infants make synapses with cells in both SP and
CP (Kostović & Jovanov-Milošević, 2006; Kostović & Judaš, 2006, 2007).

The contact of thalamocortical axons with subplate neurons is a powerful activator of
transient endogenous cortical circuitry. The initial contact with cortical plate cells is a
forerunner of extrinsic circuitry. Prolonged coexistence of these two types of circuitry, the
extrinsic and transient intrinsic, is the salient feature of cortical development in humans
(Kostović & Judaš, 2006, 2007). The gradual changes in organization of transient and
“permanent” circuitry may form the basic framework for changing EEG (Vanhatalo & Kaila,
2006). Slow activity transients (SAT) are generated as early as 24 PCW (Tolonen, Palva,
Andersson, & Vanhatalo, 2007; Vanhatalo & Kaila, 2006), transform at 30 PCW, and
disappear after birth (Vanhatalo & Kaila, 2006). Excitatory extrinsic cholinergic input to the
SP also originates in the basal forebrain. Subplate neurons are glutamatergic (Antonini &
Shatz, 1990) and GABAergic (Meinecke & Rakic, 1992) or GABA-peptidergic neurons
(Allendoerfer & Shatz, 1994). There is significant increase in number of peptidergic neurons
in preterm infants (Delalle et al., 1997).

While functional relationship between thalamic axons, subplate neurons, and cortical
plate neurons in primary cortical areas was clearly demonstrated in experimental studies in
carnivores (Allendoerfer & Shatz, 1994) and rodents (Hanganu, Kilb, & Luhmann, 2002),
the prefrontal
Figure 2.4 Transient subplate neurons are present in the postnatal human cerebral cortex, as revealed by MAP2-immunohistochemistry of the middle frontal gyrus. MAP2-positive neurons are seen extending from deep cortical layer VI through the SP into the core of the gyral white matter (WM).
Figure 2.5 Development of dendritic arborizations of layers III and V pyramidal neurons, as revealed by computerized Neurolucida
reconstructions. (After Petanjek et al., 2008, with permission of Oxford University Press.)
Figure 2.6 Growth of corpus callosum fibers through complex guidance zones and decision points. Numbers denote eight sequential decision points in the ipsilateral hemisphere, at the hemispheric midline, and in the opposite hemisphere. Asterisks indicate areas of intermingling with thalamocortical fibers. Abbreviations: cp, cortical plate; cpn, cortical plate neuron; l.III, developing layer III; m, migrating neuron; SP, subplate; spn, subplate neuron.
circuitry. Although the pruning of exuberant callosal connections in the human brain occurs during the first six months of postnatal life, a moderate reduction in size of the corpus callosum was already noted in preterm infants (Innocenti & Price, 2005).
We proposed that the subplate exists as a cytoarchitectonic entity during growth of long cortico-cortical pathways and that some subplate circuitry elements exist also during the late transitional phase of growth of short cortico-cortical connectivity pathways (Kostović & Rakic, 1990; Kostović, 1990; Kostović, Petanjek, et al., 1992). This late phase of transient circuitry is resolved between 7 and 10 months of postnatal life, when cortico-cortical connectivity is established and when the prefrontal cortex exhibits its first executive functions (Diamond & Goldman-Rakic, 1989; Fuster, 2002; Goldman-Rakic, 1987). This finding suggests that the disappearance of transient circuitry (which existed in parallel with permanent circuitry elements since 24 PCW) should precede the onset of goal-directed behavior and the ability to retrieve schemas from past events that are no longer in the perceptual field. We describe this new cognitive phase, which develops after the 7th postnatal month, as environmentally driven and as marking the onset of characteristic executive function with resolution of spatiotemporal relationships.
As stated earlier, the maturation of layer III neurons displays a “dormant” period between the 3rd and 15th postnatal months (Petanjek et al., 2008). According to these data, the differentiation of associative neurons of layer IIIC speeds up after the 15th month, during the period of intensified social interaction.
The development of subplate peptidergic neurons follows a similar time schedule. In the newborn, NPY and somatostatin neurons are important constituents of the subplate (Delalle et al., 1997; Kostović, Štfulj-Fučić, et al., 1991). However, by 6 months, their number decreases, and this low number is maintained during childhood (Delalle et al., 1997).
Modulatory pathways from basal forebrain and brain stem change their transient fetal pattern during early infancy (Kostović, Škavić, & Strinović, 1988; Brown, Crane, & Goldman, 1979). However, important developmental shifts in their density and distribution occur during childhood and adolescence (Brown et al., 1979).
There is a correlation between synaptogenesis (Huttenlocher & Dabholkar, 1997), formation of dendritic spines
Overview of the developmental phases of prefrontal circuitry during the first 2 years of life
The data presented systematically in previous paragraphs clearly suggest that development of prefrontal circuitry begins during early fetal life, in parallel with development of regions destined for motor and sensory functions (Kostović & Rakic, 1990).
These early circuits are oscillatory, endogenous, and very likely based on nonsynaptic contacts because we did not see synapses before the formation of the cortical plate. The first synaptic circuitry involves postsynaptic elements situated in a characteristic bilaminar distribution above and below the cortical plate (figure 2.8A). This phase is endogenous and nonsensory driven. The crucial event in the development of fetal circuitry occurs after 13 PCW, when the cortical plate loosens and a new prominent subplate zone develops. This typical fetal circuitry consists of (1) postsynaptic elements: well-differentiated GABAergic and glutamatergic subplate neurons, Cajal-Retzius cells of the marginal zone, and basal and apical arborizations of quite immature pyramidal neurons; (2) presynaptic elements: short and long axons of
Figure 2.8 The reconstruction of neuronal connections in the human fetal frontal cortex. At 10 PCW (A), early modulatory afferents are present in transient fetal subplate (SP) and marginal zone (MZ). In the midfetal cortex at 24 PCW (B), thalamocortical afferents and SP neurons are dominant elements of deep transient cortical circuitry. The relocation of afferents from SP into the CP with subsequent synaptogenesis is illustrated for early preterm infant cortex at 34 PCW (C). (After Kostović & Judaš, 2007, with permission of Elsevier Science Publishers.) Abbreviations: BF, basal forebrain; TH, thalamus; tegm (ma), tegmental monoaminergic afferents. Black squares depict GABAergic subplate neurons, and white circles depict GABAergic neurons of the cortical plate. Glutamatergic neurons are depicted as diamond shapes.
Todd M. Preuss, Division of Neuroscience and Center for Behavioral Neuroscience, Yerkes Primate Research Center, Emory University, Atlanta, Georgia
Abstract Until recently, neuroscientists have lacked powerful means for studying the human brain, and so have relied on studies of nonhuman species for understanding human brain organization. Moreover, the Darwin-Huxley claim that the human mind and brain, while highly developed, are qualitatively similar to those of other species encouraged the concentration of research in a very few “model” nonhuman species. Several recent developments challenge the traditional model-animal research paradigm and provide the foundations of a new neuroscience. First, evolutionary biologists now understand that living species cannot be arrayed along a single, unbroken sequence of phylogenetic development: species can differ qualitatively. Second, neuroscientists are documenting remarkable variations in the organization of cerebral cortex and other brain regions across mammals. Third, new noninvasive methods from histology, neuroimaging, and genomics are making the human brain accessible for direct, detailed study as never before. Finally, these same methods are being used to directly compare humans to other species (including chimpanzees, the species most closely related to humans), providing the foundations of a new and richly detailed account of how the human brain both resembles and differs from that of other species.
Evolution isn’t what it used to be
What distinguishes the human brain from that of other animals? This is one of the most profound questions that neuroscience confronts, yet surprisingly it has not attracted a great deal of empirical investigation, nor have the answers offered heretofore been particularly revealing. Humans have big brains relative to body size (that much is agreed upon), but what is different about the contents of those big brains?
We should perhaps start by asking why, despite the great advances made in neurosciences over the past few decades, we have so little solid information about how the human brain differs from and resembles that of other animals. One reason, clearly, is that neuroscientists have had relatively limited means for studying humans, compared to the means available for studying nonhuman species. The most powerful methods at our disposal for studying brains (genetic manipulations, tracer injections, microelectrode recording and stimulation) require the use of invasive, and often terminal, experimental procedures. Given this fact, neuroscientific research has tended to focus on nonhuman species. To be sure, neuroscience has maintained a tradition of human research, beginning with clinical neurology and strengthened today by the availability of a suite of remarkable noninvasive imaging tools. Nonetheless, there remains an important gap between human and nonhuman studies: the techniques available for studying animals permit much more detailed investigations involving finer levels of organization, and the use of model animals facilitates better experimental design. The result is that many of our ideas about human brain organization are actually inferences drawn from studies of nonhuman species.
What is the scientific basis for making inferences about human brain organization from the study of nonhuman species? The answer, seemingly, is the principle of evolution, which asserts that there was continuity between the human species and other animals through Earth history. We rightly honor Charles Darwin for having the profound insight that all animals are descended from one (or a few) progenitor species and providing the evidence necessary to convince scientists of the truth of that insight (Darwin, 1859). But Darwin did more than that. Darwin viewed the forces of evolution (mainly natural selection) as means of improvement, so that those species that have been subjected to more selection are better species. In this way, Darwin provided a naturalistic explanation of the Great Chain of Being (scala naturae), the idea that life forms can be arranged along a linear scale from the simple and base to the complex and refined. The concept of a scale of being long predated Darwin, having its origins (in the Western tradition, at least) in Aristotelian philosophy (Lovejoy, 1964). (Some pre-Darwinian versions of the scale of being go beyond the merely human, extending through the grades of angels up to the Almighty.) Darwin took the metaphysical Great Chain of Being, which seems very strange to most of us today, and turned it into something familiar: the phylogenetic scale (Richards, 1987, 1992). Accordingly, depictions of evolutionary history from the 1860s through the 1960s commonly represented evolution as a process of ascent, with Man at the top (figures 3.1A, 3.2A, 3.2B). Although some historians have credited Darwin with the idea that evolution has no direction or orientation, it is clear from even a casual reading of Darwin (see especially Darwin, 1871) that he firmly believed in the phylogenetic scale, and that human beings are at the top: that we are the most advanced form of animal life.
Darwin’s work marks the beginning of evolutionary biology, but like other branches of science, evolutionary biology underwent revolutionary changes during the 20th century, resulting in a worldview that Darwin would probably have found at once very familiar and very strange. Detailed studies of evolutionary mechanisms made it clear that adaptation is a local process, fitting populations to their particular circumstances. Adaptation does not necessarily involve increased complexity; in fact, it often results in simplification or loss of structures and functional capacities, as in the case of many parasitic organisms. Yet parasites are arguably just as well adapted to their circumstances as the organisms they infest. Furthermore, evolutionary history has proven to be anything but an unblemished chronicle of progressive improvement: cosmic collisions and tectonic upheavals repeatedly shuffled the deck.
There is another important transformation in the way evolutionary biologists came to view the history of life, and like the study of adaptation, this transformation was ultimately rooted in another of Darwin’s profound insights. The sole figure in the Origin of Species (Darwin, 1859) depicts evolution as a branching process, with ancestral lineages splitting to form multiple daughter species. Early evolutionists did not recognize any contradiction between the phylogenetic scale and the phylogenetic tree; as a result, early trees typically had a narrow, vertical orientation (figures 3.1A; 3.2A, 3.2B). The contradictions became clear, however, when the lack of a unitary direction to evolutionary history was appreciated. If every lineage is shaped by selection
accurate reconstructions can be obtained (e.g., Behrens, Berg, Jbabdi, Rushworth, & Woolrich, 2007; Conturo et al., 1999; Dauguet et al., 2007; Parker et al., 2002; Schmahmann et al., 2007).
The advantage of DTI, of course, is that it can be used to study connectivity noninvasively, so it can be used with humans, chimpanzees, and other primates. It can even be used with fixed brains, which makes a wide variety of species accessible for connectivity studies for the first time (e.g., Kaufman, Ahrens, Laidlaw, Zhang, & Allman, 2005).
Some of the first comparative studies undertaken with DTI have compared humans and macaque monkeys, examining thalamocortical (Croxson et al., 2005), corticopontine (Ramnani et al., 2006), and prefrontal cortex (Croxson et al.) connectivity. Recently, Rilling and colleagues published the first comparative study of humans, chimpanzees, and macaques, focusing on the arcuate fasciculus (AF), a white matter tract that in humans interconnects Broca’s and Wernicke’s areas (Geschwind, 1970; Glasser & Rilling, 2008). Rilling and colleagues (2008) tracked pathways between the posterior superior temporal lobe (Wernicke’s area) and inferior frontal cortex (Broca’s area) in all three species, but only chimpanzees and humans were found to have a distinct AF. Moreover, only in humans did the AF consistently include fibers tracking to the middle temporal gyrus, in addition to the fibers between Broca’s and Wernicke’s areas (figure 3.4). Functional imaging studies in humans (reviewed by Glasser & Rilling, 2008) indicate that the cortex of the middle temporal gyrus is involved in representing word meaning. Noting evidence that inferotemporal visual cortex appears to be situated more posteriorly and inferiorly in humans than in macaques, Rilling and colleagues suggest that the cortex of the middle temporal gyrus was enlarged and modified in human evolution, and may constitute an evolutionary novelty.
In addition to comparative structural neuroimaging, it is possible to do comparative functional imaging, using functional MRI (fMRI) or positron-emission tomography (PET). Functional MRI is now routine in humans, but the need to prevent head movement in the scanner has meant that its application in nonhuman primates has largely been restricted to macaque monkeys, which can be physically restrained. Although many macaque studies have employed these animals in the role of human models, some workers have made a point of documenting differences between humans and macaques. These studies have identified macaque-human differences in the responsiveness of contrast- and motion-related regions of the dorsal extrastriate visual cortex and posterior parietal cortex (Denys et al., 2004; Orban, Claeys, et al., 2006; Orban et al., 2003; Orban, Van Essen, & Vanduffel, 2004; Tootell et al., 1997; Vanduffel et al., 2002), the differences being sufficient to complicate the interpretation of human-macaque homologies (Sereno & Tootell, 2005) and to suggest that humans possess areas macaques lack (Orban, Claeys, et al.).
It is unlikely that we will soon have fMRI studies of awake, behaving chimpanzees, as imaging equipment is no match for these enormously powerful animals. Yet it is possible to do functional scanning in chimpanzees, using a modification of the 18F-fluorodeoxyglucose (FDG)-PET technique. In this approach, originally developed for macaques, the experimenter provides the animal with the FDG at the start of the testing session, and then, after the level of FDG plateaus (45–60 minutes), the animal is anesthetized and PET scanned. This technique was recently used to compare awake, resting-state brain activity in humans and chimpanzees (Rilling et al., 2007). The results demonstrated commonalities between species: both exhibited activation in medial frontal and posteromedial cortices, components of the “default-mode” network thought to be involved in emotion-laden recollection and mental self-projection (e.g., Buckner & Carroll, 2007; Buckner & Vincent, 2007; Gusnard & Raichle, 2001; Raichle & Snyder, 2007). Humans, but not chimpanzees, however, also showed activation of lateral cortical regions in the left hemisphere associated with language and conceptual representation.
Figure 3.4 Evolution of the human arcuate fasciculus (AF), which interconnects frontal and temporal language areas, based on the comparative diffusion-tensor imaging (DTI) results of Rilling and colleagues (2008). (A) Average tractography results from the left hemispheres from 10 humans, three chimpanzees, and two macaque monkeys. (B) Schematic representation of the results shown in A, representing the cortical endpoints of the tracts in terms of Brodmann’s areas. Both humans and chimpanzees have a distinct AF, although in humans the AF includes strong connections with the temporal cortex below the superior temporal sulcus (STS), including area 21, a region where word meaning is represented. Chimpanzees have very few fibers in the AF that extend to the cortex inferior to the STS. Macaques do not have a definite AF: fibers traveling between the posterior inferior frontal cortex and posterior temporal lobe take a more ventral route, passing deep to the insula, and include few if any fibers with endpoints inferior to the STS. (See color plate 1.)
Although fMRI scanning of awake chimpanzees currently seems impractical, it may be valuable to do fMRI scanning of anesthetized chimpanzees. Although cortical activity is reduced with light anesthesia, it is not eliminated. Patterns of regional coactivation that reflect patterns of anatomical connectivity can be ascertained using so-called functional connectivity MRI (fcMRI), even under light anesthesia (Vincent et al., 2007), and possibly the same can be done with chimpanzees. This technique would provide a valuable addition to DTI as a source of information for exploring the evolution of human brain networks.
Comparative Genomics In recent years, few branches of science have so firmly captured the public imagination
Figure 3.5 Increased expression of the thrombospondin 4 (THBS4) gene and protein in human brain evolution, according to Cáceres, Suwyn, Maddox, Thomas, and Preuss (2007). (A) Microarray analysis of THBS4 mRNA levels in different brain regions of humans (Hs, Homo sapiens) and chimpanzees (Pt, Pan troglodytes). Expression levels were significantly higher in human frontal cortex (FCx), temporal cortex (TCx), anterior cingulate cortex (ACCx), and caudate (Cd). (B) Western blots showing higher THBS4 protein levels in frontal cortex from three individual humans (Hs), compared to three chimpanzees (Pt) and three rhesus macaques (Macaca mulatta, Mm). Tubulin (TUBB) served as a loading control. (C) Immunocytochemistry for THBS4 in sections from frontopolar cortex yielded much denser labeling of humans than of chimpanzees or macaques. The difference was especially strong in the neuropil space surrounding neuronal somas. Scales are 50 microns in the upper panel and 250 microns in the lower panel.
Leo M. Chalupa, Department of Ophthalmology and Vision Science, School of Medicine and Department of Neurobiology, Physiology and Behavior, College of Biological Sciences, University of California, Davis, California
Andrew D. Huberman, Department of Neurobiology, Stanford University School of Medicine, Stanford, California
Abstract Since the pioneering studies of Wiesel and Hubel on the development and plasticity of ocular dominance columns in the visual cortex, it has been widely thought that correlated discharges of neighboring retinal ganglion cells play an instructive role in the formation of segregated eye-specific domains in the mammalian visual system. Here we review the relevant evidence and conclude that while correlated retinal discharges are required for the formation of segregated eye-specific projections in the visual cortex, there is reason to doubt that this is also the case at the level of the dorsal lateral geniculate. More likely, molecular cues play a key role in the stereotypic pattern of segregated retinogeniculate projections that characterize different species. As yet, the role of activity and the identity of the molecular cues involved in this process remain to be firmly established.
Since the latter half of the last century, the formation of eye-specific projections has served as a model system for exploring the development and plasticity of neural circuits. Early in development, retinal ganglion cell (RGC) projections to the dorsal lateral geniculate (dLGN) (Rakic, 1976; Linden, Guillery, & Cucchiaro, 1981; Shatz, 1983; Godement, Salaun, & Imbert, 1984) and dLGN projections to V1 (Hubel, Wiesel, & LeVay, 1977; LeVay, Stryker, & Shatz, 1978; LeVay, Wiesel, & Hubel, 1980; Rakic, 1976) are intermingled. Subsequently, they segregate into nonoverlapping eye-specific territories, and this process is believed to require neuronal activity. Indeed, the precise pattern of neural activity, as opposed to the mere presence of action potentials, has been hypothesized to “instruct” the segregation process by engaging well-established synaptic plasticity mechanisms (Crair, 1999; Feller, 1999; Stellwagen & Shatz, 2002; Torborg, Hansen, & Feller, 2005).
Here, we provide a brief historical account of the experimental evidence that gave rise to the idea that patterned neural activity plays an instructive role in the formation of eye-specific visual projections. We then review recent studies that tested directly whether patterns of neural activity in fact provide the instructive cues required for the segregation of eye-specific projections to the dLGN. This chapter is an update of a chapter on this topic that we authored for the third edition of The Cognitive Neurosciences (Chalupa & Huberman, 2004). We have also offered recent reviews of the role of activity in the formation of eye-specific projections in other publications (Chalupa, 2007; Huberman, Feller, & Chapman, 2008).
Formation of ocular dominance columns
The idea that neuronal activity has an influence on the development of visual system connections stems from the pioneering studies of Wiesel and Hubel. Their work showed that closure of one eye during a critical period in postnatal life rendered that eye permanently incapable of driving cortical cells (Wiesel & Hubel, 1965a, 1965b). Subsequently it was shown that this physiological effect was accompanied by a marked reduction in the amount of cortical territory innervated by geniculocortical axons representing the deprived eye and a dramatic expansion of the geniculocortical axons representing the nondeprived eye (Hubel et al., 1977; Shatz & Stryker, 1978). These deprivation studies demonstrated that activity-mediated competition between axons representing the two eyes allocates postsynaptic space in V1.
Transneuronal tracing of retinal-dLGN-V1 connections has been used to assess the formation of ocular dominance columns (ODCs) during early development. Monocular injections of transneuronal tracers indicated that axons representing the two eyes start out overlapped (Hubel et al., 1977; LeVay et al., 1978, 1980; Rakic, 1976) before gradually segregating into ODCs. This finding supported a role for retinal activity in ODC segregation in that abolishing action potentials in both eyes with intraocular injections of tetrodotoxin (TTX) prevented the emergence of ODCs in the visual cortex (Stryker & Harris, 1986). Thus
Silvia A. Bunge, Helen Wills Neuroscience Institute and Department of Psychology, University of California at Berkeley, Berkeley, California
Allyson P. Mackey and Kirstie J. Whitaker, Helen Wills Neuroscience Institute, University of California at Berkeley, Berkeley, California
Abstract What precisely is changing over time in a child’s brain leading to improved control over his or her thoughts and behavior? This chapter investigates neural mechanisms that develop through childhood and adolescence and underlie changes in working memory, cognitive control, and reasoning. The effects of age and experience on specific cognitive functions are discussed with respect to functional brain imaging studies, highlighting the importance of interactions between prefrontal and parietal cortices in cognitive control and high-level cognition.
What precisely is changing over time in a child’s brain, leading to improved control over his or her thoughts and behavior? Throughout childhood and adolescence, we improve at organizing our thoughts, working toward long-term goals, ignoring irrelevant information that could distract us from these goals, and controlling our impulses; in other words, we exhibit improvements in executive function or cognitive control (Diamond, 2002; Zelazo, Craik, & Booth, 2004; Casey, Tottenham, Liston, & Durston, 2005). By the same token, we exhibit increased facility over this age range in tackling novel problems and reasoning about the world, a capacity referred to as fluid reasoning (Cattell & Bernard, 1971). Both the capacity to consciously control our thoughts and actions and the capacity to reason effectively rely on working memory, or the ability to keep relevant information in mind as needed to carry out an immediate goal.
Neuroscientific research is being conducted to better understand the changes in brain structure and function that underlie improved cognitive control and fluid reasoning during child and adolescent development. More specifically, researchers seek to determine how the neural mechanisms underlying specific cognitive functions change with age, how they differ among individuals, and how they are affected by experience.
We begin this chapter with a brief summary of changes in brain structure, focusing primarily on prefrontal and parietal cortices, the brain regions that have been most closely associated with goal-directed behavior. We then provide an overview of functional brain imaging studies focusing on age-related changes in working memory, cognitive control, and fluid reasoning over childhood and adolescence. Because working memory and cognitive control development have been discussed extensively elsewhere (Munakata, Casey, & Diamond, 2004; Rubia & Smith, 2004; Casey et al., 2005; Bunge & Wright, 2007), a relatively greater emphasis is placed on recent studies focusing on the development of fluid reasoning.
Structural brain development
The brain undergoes major structural and functional changes over childhood and adolescence that may, in part, explain changes in behavior and cognition. As explained in chapter 2, by Kostović and Judaš, rapid changes occur at the neuronal level in the prefrontal cortex (PFC) in the first few years of life, followed by slower, protracted changes through adolescence (Petanjek, Judaš, Kostović, & Uylings, 2008). While brain changes at the cellular level can be examined only in postmortem brain tissue, advances in neuroimaging techniques have made it possible to study gross anatomical development in vivo. Structural magnetic resonance imaging (MRI) methods make it possible to quantify age-related changes in cortical thickness (Sowell et al., 2007), in the volume of specific brain structures (Gogtay et al., 2006), and in the thickness and coherence of white matter tracts connecting distant brain regions to one another (Giedd et al., 1999; Klingberg, Vaidya, Gabrieli, Moseley, & Hedehus, 1999).
Cortical thickness follows an inverted U-shaped pattern over development. Up to middle childhood (ages 8 to 12), increased thickness of the gray matter at the surface of the brain reflects increased density of neurons and dendrites. Thereafter, decreased gray matter thickness reflects the pruning of excess dendrites and neurons, as well as increased
Figure 5.1 Development of nonspatial working memory and working memory manipulation. (A) Subjects were asked to remember three nameable objects, presented for 750 ms each and separated by a 250-ms fixation cross. After the last object the instruction “forward” or “backward” directed the participant to either mentally rehearse or reorder these objects during the 6,000-ms delay. Finally a probe object was presented and participants indicated with a button press whether it was the first, second, or third object in the memorized sequence. Forward trials required pure maintenance, whereas backward trials required manipulation in addition to maintenance. (B) Group-averaged time courses for activation in the right DLPFC during the delay period show that adults and adolescents recruited this region more strongly during the harder manipulation trials, whereas children showed the same activation in DLPFC for both “forward” and “backward” tasks. (Reprinted with permission from Crone, Wendelken, Donohue, van Leijenhorst, & Bunge, 2006, copyright © 2006, National Academy of Sciences, USA.)
evidence suggests that it is indeed sensitive to cultural and environmental influences (Flynn, 2007).
The development of reasoning ability is central to understanding cognitive development as a whole, because it serves as scaffolding for many other cognitive functions (Cattell, 1987; Blair, 2006). Fluid reasoning has been identified as a leading indicator of changes in crystallized abilities (McArdle, 2001). It strongly predicts changes in quantitative ability (Ferrer & McArdle, 2004) and reading (Ferrer et al., 2007) among children aged 5 to 10. Fluid reasoning ability even predicts performance through college and in cognitively demanding occupations (Gottfredson, 1997).
One form of fluid reasoning is relational reasoning: the ability to consider relationships between multiple distinct mental representations (Gentner, 1983; Hummel & Holyoak, 1997). Analogical reasoning, more specifically, involves abstracting a relationship between familiar items and applying it to novel representations (Gentner, 1988; Goswami, 1989). In other words, forming analogies allows us to determine general principles from specific examples and to establish connections between previously unrelated pieces of information. Analogical thought is an important means by which cognition develops (Goswami, 1989; R. Brown & Marsden, 1990). For example, children use analogies to learn new words and concepts by association with previously learned information (Gentner, 1983).
When Does Reasoning Ability Develop? Historically, theories of reasoning development focused on children’s limitations. Piaget claimed that, before the stage of formal operations around age 11, children are not capable of mentally representing the relations necessary to solve analogies (Inhelder & Piaget, 1958). When Piaget and his colleagues showed children pictorial problems of the form “A is to B as C is to . . . ?” and asked them to find the D term among a set of pictures, he found that children often chose items that were perceptually or semantically related to the C item (Piaget, Montangero, & Billeter, 1977). Sternberg and colleagues found similar limitations in young children’s analogical reasoning, observing an overreliance on lower-order relations during analogical problem solving (Sternberg & Nigro, 1980; Sternberg & Downing, 1982).
It has been argued that children as young as 3 years old can solve simple analogies as long as they are familiar with the objects involved and understand the relevant relations (Goswami, 1989), but improvements in analogical reasoning are observed throughout childhood and adolescence (Sternberg & Rifkin, 1979; Richland, Morrison, & Holyoak, 2006).
Fluid reasoning ability seems to be a distinct cognitive function, rising and falling at its own rate across the life span (Cattell, 1987). It follows a different developmental trajectory than crystallized abilities, supporting the idea that these are separable cognitive functions (Horn, 1991; Schaie, 1996; McGrew, 1997). Fluid reasoning capacity increases very rapidly until late adolescence and early adulthood, peaking at around age 22 and declining thereafter (McArdle, Ferrer-Caja, Hamagami, & Woodcock, 2002).
Assessing Reasoning Ability One of the most commonly used measures of fluid reasoning ability is the Raven’s Progressive Matrices test (RPM), a classic visuospatial task that can be administered to both children and adults (Raven, 1941). This test is considered an excellent measure of fluid reasoning ability (Kline, 1993) and of intellectual ability overall (Wechsler & Stone, 1945).
REL-1 problems. Together with the response time data, this finding suggests that the children were more likely to treat the REL-2 problems similarly to REL-1 problems, considering only a single dimension of change. Activation of RLPFC associated with the REL-2 problems increases with age, indicating that development of the RLPFC integration mechanism occurs, at least in part, over the age range (8–12 years) that was studied. Unlike RLPFC, the inferior parietal lobule was sensitive to the number of relations in adults and showed an immature pattern of activation in children. In summary, this study provides evidence that the development of reasoning is associated with functional changes in RLPFC in response to relational integration.
In summary, fluid reasoning ability comes online early in childhood but continues to develop through adolescence and even into adulthood. Intelligence in adults is related to connectivity between PFC and parietal cortex (Shaw et al., 2006). Structural neuroimaging studies (Giedd, 2004) have shown that development of these regions, PFC and parietal cortex, follows a prolonged developmental time course that matches behavioral data on reasoning in childhood (Richland et al., 2006). Initial functional neuroimaging studies have shown that children recruit brain regions similar to those that adults use to solve analogy problems, but with different patterns of activation suggesting functional immaturity (Wright et al., 2008; Crone et al., 2009).

Conclusions

A growing literature indicates that the increased recruitment of task-related regions in prefrontal and parietal cortex contributes to improvements in goal-directed behavior over middle childhood and adolescence. The pattern of developmental changes in brain activation has been generally characterized as a shift from diffuse to focal activation (Durston, Davidson, et al., 2006) and from posterior to anterior activation (Rubia et al., 2007; T. Brown et al., 2005). Differences can be either quantitative, with one age group engaging a region more strongly or extensively than another, or qualitative, with a shift in reliance on one set of brain regions to another, or both (T. Brown et al., 2005; T. Brown, Petersen, & Schlagger, 2006; Rubia et al., 2007; Scherf et al., 2006; Badre & Wagner, 2007). Importantly, the precise pattern of change observed depends on the task, the ages being examined, and the brain region in question. By further characterizing neurodevelopmental changes in cognitive control processes within subjects and across a range of tasks, we hope to better understand the development of the human mind.
90 plasticity
6 Patterning and Plasticity of Maps in
the Mammalian Visual Pathway
sam horng and mriganka sur
abstract Maps at successive stages of the visual system, and in particular visual cortex, organize salient stimulus features into complex cortical networks. Retinotopic maps and ocular dominance domains arise during development using a molecular program that specifies the rough topographic order of projections. Genetic mutations in mice have identified guidance and patterning cues that mediate this organization of maps and may lead to the creation of new maps. Spontaneous activity produced in the retina refines the precision of the maps before eye opening, and patterned activity after eye opening drives further refinement and maintenance. For ocular dominance, the cortex has a critical period for synaptic plasticity during which it is especially responsive to changes in input. During this time, changes in eye-specific drive lead to Hebbian and homeostatic changes in the cortical network. This potential for plasticity represents a functional reorganization in response to changing demands from the outside world and allows the organism to adapt to its environment.

sam horng and mriganka sur Department of Brain and Cognitive Sciences, The Picower Institute for Learning and Memory, MIT, Cambridge, Massachusetts

A critical function of the brain is to provide an orderly and efficient neural representation of salient sensory stimuli from the outside world. In the mammalian visual pathway, representations of light reflectance in visual space are relayed from the retina as a topographic map to the thalamus and superior colliculus. Along this pathway, projections from the two eyes are kept in parallel. Retinotopic and eye-specific information from the thalamus is transferred to the primary visual cortex, where additional stimulus features are extracted. The mechanisms by which visual stimulus feature maps are established and modified in response to experience are an active area of research, as these mechanisms are central to specifying the organizational details of the visual pathway and the functional characteristics of vision.
In this chapter, we will review the processes of retinotopic mapping and cortical plasticity in the mammalian brain. Molecular mechanisms of these phenomena have been studied most extensively in the mouse, a model for which genetic manipulations are available. What we currently know of these mechanisms illustrates how circuits are shaped by genetic programs, electrical activity, and experience-dependent modulation of stimulus input.
During development, the formation of a retinotopic map requires that axons responsive to neighboring positions in visual space maintain their relative positions as they innervate their target. This process involves graded patterns of guidance receptors expressed across the population of axons and matched to a complementary gradient of ligands on the target cells. The genetic patterning of guidance cues confers a rough order and spatial efficiency to the retinotopic map. However, further retinotopic precision and ocular dominance segregation depend upon patterns of spontaneous activity in the retina and experience-driven input. Changes in the level or pattern of activity can alter the structure and function of the retinotopic map in early development and of ocular dominance regions in later development and adulthood. Ocular dominance plasticity occurs in response to changes in competitive input between the eyes, and a variety of molecular pathways, many of which reflect the maturational state of the circuit, have been implicated in this process. Thus the developmental context shapes the extent to which changes induced by competitive input between the eyes occur. Here, genetic programs of development interact with activity- and experience-dependent input to mediate map refinement and plasticity.

The formation of the visual pathway during early development

Regionalization of Visual Pathway Centers Functional pathways of the brain arise out of genetic programs of early development, which establish structural regions and wire them together (Rakic, 1988; O’Leary, 1989; Job & Tan, 2003; Sur & Rubenstein, 2005). During embryogenesis, sources of diffusible molecules, called signaling centers, induce regional and graded patterns of gene expression in the anterior neural tube. These patterns translate into structurally parcellated and functionally differentiated brain regions, including those devoted to processing incoming visual stimuli (Figdor & Stern, 1993; Rubenstein, Martinez, Shimamura, & Puelles, 1994; Rubenstein, Shimamura, Martinez, & Puelles, 1998;
Ragsdale & Grove, 2001; Nakagawa & O’Leary, 2002; Grove & Fukuchi-Shimogori, 2003; Shimogori, Banuchi, Ng, Strauss, & Grove, 2004).
In the mouse, centers of the visual pathway are established in this way: neuromeres P2 and P3 of the diencephalon differentiate into the dorsal and ventral thalamus, respectively, and around E13.5, lateral nuclei cluster to form the dorsal and ventral subdivisions of the lateral geniculate nucleus (LGN) (Jones, 1985; Tuttle, Braisted, Richards, & O’Leary, 1998). From E11 to E19, area 17 of posterior cortex differentiates in response to cortical gradients of FGF8, Wnts, BMP, and Shh to form the primary visual cortex (V1) (Dehay & Kennedy, 2007). How continuous gradients of gene expression throughout the neural tube are translated into boundary-delimited regions of functionally specific identities is not yet known; moreover, region-specific gene expression has not yet been reported (Nakagawa & O’Leary, 2001; Jones & Rubenstein, 2004).
Targeting and Retinotopic Wiring The visual pathway is wired (figure 6.1A) when roughly one-third of the ganglion cell axons from the retina project to the dorsal and ventral subdivisions (LGNd, LGNv) of the LGN while the remaining two-thirds target the superior colliculus (SC) in the brain stem (Jones, 1985; Tuttle et al., 1998). Axonal pathfinding to fugal (i.e., thalamic) and collicular targets begins around E15–16 and peaks at E19 (Colello & Guillery, 1990; Figdor & Stern, 1993; Tuttle et al., 1998; Inoue et al., 2000; Gurung & Fritzsch, 2004; Guido, 2008). In the mouse, axons from the ventrotemporal retina project ipsilaterally while the rest of the axons project contralaterally, with contralateral innervation to the thalamus occurring earlier (E15–16) than ipsilateral targeting (P0–2) (Dräger & Olsen, 1980; Godement, Salaun, & Imbert, 1984). Connections between the LGN (in this review, LGN is used to denote LGNd) and V1, both in the feedforward geniculocortical direction and in the feedback corticogeniculate pathway, emerge around E14 (Zhou et al., 2003).
Elucidating the mechanisms of retinotopic targeting and mapping has become a comprehensive field of study (figure 6.1B). Molecular mechanisms of retinal ganglion cell (RGC) guidance have been extensively studied in Xenopus, zebrafish, chick, and mouse models. Much of this work has focused on guidance to the optic disk, decussation at the optic chiasm, and topographic map formation at the optic tectum, or SC (Inatani, 2005; Mann, Harris, & Holt, 2004). In the retina, laminin and netrin repulse DCC receptor-expressing RGC axons out of the optic head and into the optic nerve (Hopker, Shewan, Tessier-Lavigne, Poo, & Holt, 1999). Along the optic nerve, a repulsive semaphorin 5a sheath maintains the integrity of an interior axon pathway (Shewan, Dwivedy, Anderson, & Holt, 2002; Oster, Bodecker, He, & Sretavan, 2003). Slit1- and slit2-expressing cells guide repulsed robo-expressing axons to the proper decussation site for optic chiasm formation (Erskine et al., 2000; Ringstedt et al., 2000; Plump et al., 2002), and ephrin-B2 expression at the optic chiasm steers EphB1-receptor-expressing ventrotemporal axons ipsilaterally (Williams et al., 2003; Lee, Petros, & Mason, 2008). Matrix metalloproteinases (MMP) have been implicated in optic chiasm crossing and tectal targeting (Hehr, Hocking, & McFarlane, 2005). Additionally, traditional morphogens influence retinal ganglion cell pathfinding (Charron & Tessier-Lavigne, 2005): FGF-2 repels RGC growth cones along the optic tract (Webber, Hyakutake, & McFarlane, 2003), BMP7 promotes axonal outgrowth at the optic disk (Carri, Bengtsson, Charette, & Ebendal, 1998; Bovolenta, 2005), and Shh exhibits concentration-dependent attractive or repulsive effects in the retina and optic chiasm, respectively (Trousse, Marti, Gruss, Torres, & Bovolenta, 2001; Kolpak, Zhang, & Bao, 2005). Less is known about the specific cues mediating ganglion cell ingrowth to the LGN and geniculocortical targeting to area 17, or V1.
However, molecules that contribute to the topographic ordering of projections have been investigated in the LGN and V1, as well as the SC. The spatial position of visual stimuli is inverted through the lens and encoded on a sheet of retinal ganglion cells. This topographic map gets projected into the LGN and V1, as well as the SC. Because axon guidance cues must not only flag targets but also confer information about the relative topography of neighboring axons, positional cues are needed to maintain the retinotopic order of the projecting pathway. To avoid employing an infinitely large number of distinct positional cues, a gradient of one molecule along the sheet of axons may be matched to a complementary gradient of its binding partner in the target (Sperry, 1963). This “chemoaffinity” model has been confirmed with the discovery of a number of different receptor-ligand gradients expressed in projecting axons and target cells along the visual pathway.
The most comprehensively studied of these graded mapping molecules are the ephrin ligands and Eph family of tyrosine kinase receptors (figure 6.1B). The contribution of ephrinA-EphA receptor interactions to topographic mapping was first described in the optic tectum, where low-to-high ephrin-A2/A5 expression along the anterior-posterior axis was found to interact with a complementary high-to-low EphA3 receptor gradient in terminals of the temporal-nasal axis of retina (Nakamoto et al., 1996; Feldheim et al., 1998, 2000; Hansen, Dallal, & Flanagan, 2004; Bolz et al., 2004). Interactions between ephrin-A and EphA receptors were initially thought to be repulsive, though subsequent studies revealed a concentration-dependent transition from attraction to repulsion, with low ephrin-A concentrations causing axonal attraction and high levels causing repulsion (Hansen et al., 2004). The ability of one ligand-receptor system to both attract and repel allows for
Figure 6.1 (A) Representation of the rodent visual pathway. Retinal ganglion cells project to the LGN, which in turn projects to the primary visual cortex (V1). A central region of the visual field is represented by both eyes along the pathway (ipsilateral, red; contralateral, blue). Contralateral and ipsilateral retinal ganglion cell terminals representing this binocular region are segregated in the LGN (red, ipsilateral zone; blue, contralateral zone). Geniculocortical fibers representing this region converge onto a binocular zone located in the lateral half of V1 (red, binocular zone; blue, monocular zone). (B) Schematic representation illustrating retinotopic map organization at each stage of the visual pathway and known guidance cues contributing to patterning. The visual field can be divided into two Cartesian axes, azimuth and elevation. For clarity, the azimuthal map on the left is diagrammed onto the visual pathway of the right hemisphere. The elevation map on the right is diagrammed onto the pathway of the left hemisphere. In reality, both axes of visual space are represented concurrently in both hemispheres. The ganglion cell sheet of the retina is divided into a contralaterally projecting region and an ipsilaterally projecting region. The ipsilateral retina originates from the ventrotemporal quadrant and is characterized in late embryogenesis by Zic2 and EphB1 expression. Conversely, the contralateral retina is characterized by Isl2 expression. Retinal ganglion cells express DCC and are repulsed out of the optic head by laminin and netrin. Factors such as semaphorin-5a keep retinal axons on course in the optic tract, where ipsilateral axons are repulsed by ephrin-B2 while contralateral axons decussate. High temporal to low nasal gradients of EphA receptor and ten_m3 expression in retinal axons likely influence terminal zones onto gradients of ephrin-A in the LGN. Ipsilateral axons terminate in a dorsomedial core of the LGN, segregated from surrounding contralateral axons. Activity-dependent refinement is necessary for proper eye-specific segregation. While ephrin-A gradients shape retinotopic termination zones, ten_m3 specifically influences ipsilateral targeting. Geniculocortical axons innervate V1. Ipsilateral inputs and corresponding contralateral fibers converge in the lateral binocular zone, while contralateral inputs representing regions not detected by the ipsilateral eye terminate in the medial monocular zone. Loss of ephrin-As leads to the disorganization of cortical maps only on the azimuthal axis, suggesting that other, unidentified factors contribute to the mapping of elevation. (See color plate 2.)
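The gradient matching summarized in figure 6.1B can be illustrated with a toy calculation. The Python sketch below is not taken from the chapter: the exponential gradient shapes, the fixed set-point rule, and the names `epha_level`, `ephrin_level`, and `termination_site` are assumptions chosen only to show how one receptor gradient and one complementary ligand gradient could, in principle, assign each axon an ordered termination site without a distinct cue per position.

```python
import math

def epha_level(retinal_pos):
    """Hypothetical EphA receptor level: high at temporal retina (0.0), low at nasal (1.0)."""
    return math.exp(-2.0 * retinal_pos)

def ephrin_level(target_pos):
    """Hypothetical ephrin-A ligand level: low at one end of the target (0.0), high at the other (1.0)."""
    return math.exp(2.0 * (target_pos - 1.0))

def termination_site(retinal_pos, n_sites=101):
    """Pick the target site where receptor x ligand signal is closest to a set point.

    The set-point rule is an illustrative assumption, not a measured mechanism:
    axons with more receptor reach the set point at lower ligand levels.
    """
    set_point = math.exp(-2.0)
    receptor = epha_level(retinal_pos)
    sites = [i / (n_sites - 1) for i in range(n_sites)]
    return min(sites, key=lambda t: abs(receptor * ephrin_level(t) - set_point))

# Five axons spaced from temporal (0.0) to nasal (1.0) retina.
retina = [0.0, 0.25, 0.5, 0.75, 1.0]
mapping = [termination_site(x) for x in retina]
print(list(zip(retina, mapping)))
```

Under these assumptions the termination sites come out in the same order as the retinal positions, so neighboring axons remain neighbors in the target, with high-receptor temporal axons settling at the low-ligand end.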
the target to be filled more parsimoniously than with separate attractant and repulsant molecules.
High lateral-to-medial gradients of ephrin-A2/A5 are also present in the mouse and ferret LGN and direct topography of high levels of EphA5/A6 expression from the contralateral nasal projections and low levels in the ipsilateral temporal projections (Huberman, Murray, Warland, Feldheim, & Chapman, 2005). Loss and ectopic gain of ephrin-A2, 3, and 5 lead to disruptions of the topographic map in both the LGN and V1: loss produces a medial shift in V1, in addition to internal disorganization, while lateral overexpression leads to a compression of V1, suggesting that EphA-expressing geniculocortical axons respond to a high medial to low lateral gradient of ephrinA (Cang et al., 2005). High dorsal EphB receptor expression in retina responds to low ventral ephrin-B expression in the tectum, and EphB-ephrinB gradients are speculated to similarly organize distinct axes in the LGN and V1 (Hindges, McLaughlin, Genoud, Henkemeyer, & O’Leary, 2002; McLaughlin, Hindges, Yates, & O’Leary, 2003). The role of potential cis- and trans-mediated interactions among ephrin and Eph receptors from countergradients expressed on axons of the same area has yet to be explored (Luo & Flanagan, 2007). Finally, additional graded positional cues have been identified in the retinotectal map. Repulsive guidance molecule (RGM), a novel membrane-associated glycoprotein expressed in the posterior tectum, repulses temporal axons in vitro (Monnier et al., 2002), while engrailed-2 (En-2), a homeodomain transcription factor, is secreted by the posterior tectum, is endocytosed into axons, and attracts nasal axons while repelling temporal axons (Brunet et al., 2005). A high-to-low gradient of Wnt3 in the medial-lateral axis of the optic tectum mediates patterning via ventral-dorsal differences in Ryk receptor expression (Schmitt et al., 2006), and a Wnt signaling inhibitor, SFRP1, interacts with RGC receptor Fz2 to steer axons along the optic tract en route to the tectum (Rodriguez et al., 2005).
Experiments in which half of retinal ganglion cells are ablated or a disordered set of cells gain EphA expression reveal that retinotectal axons persistently fill their target (Brown et al., 2000; Feldheim et al., 2000). Thus it is the relative level of positional information rather than absolute signaling that determines the topography of retinal axons. Some limiting factor, whether from the axon-axon interaction or target-derived cues, may ensure that target filling occurs (Luo & Flanagan, 2007). Loss of L1CAM leads to incomplete filling of the tectum, and this molecule may have such a role (Demyanenko & Maness, 2003).
Eye-Specific Domains A second fundamental organizational feature of the visual pathway is its segregation into eye-specific domains. Maintaining parallel channels for eye-specific input allows for stereoscopic vision, or depth perception. In mice, ipsilateral projections form a dorsal core in the LGN (LGNd) and are flanked laterally by contralateral terminals representing matched areas of visual space. These axons intermix when projecting to layer IV cells of the binocular zone, a V1 subregion bounded medially by a monocular zone of contralateral input (figure 6.1A,B). In mammals with more complex visual systems, such as the ferret, cat, primate, and human, eye-specific domains form a map of ocular dominance stripes in V1 (figure 6.3). Whether eye-specific domains are influenced by positional cues in addition to activity-dependent processes of terminal segregation has only recently begun to be explored. Developmental time course studies in the mouse show that early (P0–P5) ipsilateral axons are diffusely targeted to the dorsal-medial portion of the LGN and progressively become more strictly confined to a central core by P28 (Jaubert-Miazza et al., 2005). Although activity-dependent processes to be discussed later contribute to the refinement of ocular domains in the LGN (Shatz, 1983; Shatz & Stryker, 1988; Pfeiffenberger et al., 2005), the initial ingrowth of ipsilateral axons shows a bias toward the binocular region in the central part of the dorsal half of the LGN, and eye-specific guidance cues likely instruct this initial positioning (Godement et al., 1984). The presence of functional markers, Isl2 and Zic2, during late embryogenesis (E13–E17), for contralaterally and ipsilaterally projecting retinal ganglion cells, respectively, suggests that the two populations have distinct differentiation programs and potentially respond to unique cues in their target (Herrera et al., 2003; Pak, Hindges, Lim, Pfaff, & O’Leary, 2004). Loss of ten_m3, a homophilic binding protein expressed strongly on ipsilaterally projecting axons, leads to the selective ventral expansion of ipsilateral axons and no disruption in contralateral axons in the LGN (Leamey, Glendining, et al., 2007; Leamey, Merlin, et al., 2007). Therefore, ten_m3 and potentially other unknown cues may contribute to the formation of eye-specific domains. Mechanisms of how corresponding ipsilateral and contralateral axons are coordinated and aligned to form binocular maps are poorly understood.
Other Feature Maps and the Formation of New Maps In mice and other mammals, additional stimulus features are encoded in the visual pathway at the cortical level. Cells in V1 are selective for orientation, spatial frequency, and the direction of visual stimuli. In carnivores and primates, these cells are organized into selectivity maps of their own. For example, multiple stripes converging around a pinwheel center on the cortical surface represent graded regions of different orientation selectivity. Within these orientation-selective regions, directionally selective subregions are present. Using a layout that maximizes map continuity and cortical coverage (Swindale, Shoham, Grinvald, Bonhoeffer, & Hübener, 2000), multiple feature
maps are superimposed and organized in systematic fashion, with regions of high gradients from different maps spatially segregated from one another (Yu, Farley, Jin, & Sur, 2005). That is, while individual, adjacent neurons respond best to different values of the same feature, the way in which these features are mapped varies systematically. The critical parameter is the rate of change of each feature across the same set of neurons: at locations where one feature changes rapidly, other features change little.
Mechanisms of map formation for these additional stimulus features are not well understood, although the role of intrinsic genetic programs of patterning and activity-dependent input may differ depending on the specific feature map (White & Fitzpatrick, 2007). Whereas the retinotopic and eye-specific maps are patterned roughly before birth and eye opening, the orientation map is detectable only by the time of eye opening in the ferret (Chapman, Stryker, & Bonhoeffer, 1996; White, Coppola, & Fitzpatrick, 2001; Coppola & White, 2004), and the direction-selective map appears 1–2 weeks later (Li, Fitzpatrick, & White, 2006). Therefore, the formation of these maps likely depends critically on developmental processes coincident with patterned input into the cortex.
The formation of orientation maps coincides with a period during which axonal connections in V1, especially long-range horizontal inhibitory projections in layer 2/3, proliferate (Bosking et al., 2002). Orientation tuning has been hypothesized to arise from feedforward patterns of thalamocortical connectivity (Ferster & Miller, 2000) and to be shaped by intracortical connections (Somers, Nelson, & Sur, 1995) and balanced inhibition (Marino et al., 2005). The maturation of this supragranular inhibitory network may contribute to the appearance of orientation tuning and organization of tuned cells into selective domains. Mice deficient in Arc, an activity-dependent cytoskeletal-associated protein implicated in the synapse-specific modulation of AMPA receptor number, show weaknesses in orientation tuning in V1 (Wang et al., 2006). Dark-reared animals exhibit a delay in the formation of the orientation map, while binocularly lid-sutured animals have a near complete degradation of the map (White et al., 2001), suggesting that low levels of nonpatterned activity have a greater disruptive effect than the absence of input. Therefore, unknown intrinsic properties of the cortex instruct the formation of the orientation map in the weeks after eye opening and induce the map even in the absence of vision. However, the orientation map is susceptible to disruption in response to disorganized activity.
In contrast to orientation maps, a 2-week period following eye opening is both necessary and sufficient for the formation of direction-selective maps (Li et al., 2006). Thus the direction-selective map is induced by changes in either the cortex or LGN that are driven by activity. Sharpening of retinotopic tuning and decreases in the response latency of LGN cells may play a role (Tavazoie & Reid, 2000). Different feature maps in V1 appear to be guided by independent mechanisms. Loss of the direction-selective map leaves the orientation map intact, and monocular enucleation to eliminate the ocular dominance map does not interfere with the formation of the remaining V1 feature maps (Farley, Yu, Jin, & Sur, 2007). However, the relative positioning of different maps in V1 is responsive to alterations in a given map, as monocular enucleation leads to the coordinated reorganization of the remaining map dimensions (Farley et al., 2007). Therefore, while the formation of stimulus-specific maps or networks likely relies on unique developmental mechanisms, whether they be genetically determined or instructed by activity, the detailed organization of each map and its structural and spatial coordination with other maps is a key feature of activity-dependent cortical organization.
Because of the independent origin of individual maps (and response features), the appearance of new maps in evolution may have depended on unique events and developmental processes for a given map. However, there may be general properties in neural circuits that allow for the introduction of a novel map. Novel maps may arise potentially through the duplication and subsequent functional divergence of an existing map, or the addition of a novel input into an existing region and subsequent reorganization of cortical circuitry into a new map. An example of the former is the induction of duplicate barrel cortices by ectopic posterior cortical FGF8 expression (Fukuchi-Shimogori & Grove, 2001). Examples of novel input leading to the introduction of a new map include the implantation of a third eye leading to triple ocular dominance stripes in the tectum of the frog (Constantine-Paton & Law, 1978), rewired retinal input to the MGN driving retinotopic maps to form in primary auditory cortex (A1) (Sur, Garraghty, & Roe, 1988), and ten_m3 mutation in mouse leading to a medial expansion of ipsilateral input to V1 and the de novo formation of cortical ocular dominance stripes (C. Leamey, personal communication).

Rewiring vision into the auditory pathway

After neonatal surgical ablation of the inferior colliculus (IC), retinal ganglion cells are rerouted to target the auditory thalamus and subsequently induce the auditory pathway to process visual information (figure 6.2). This experimental paradigm allows us to investigate the role of novel input in producing retinotopic and feature maps, and to screen for unknown guidance cues involved in wiring together sensory pathways. The normal auditory pathway comprises cochlear afferents projecting to the inferior colliculus (IC), which sends fibers along the brachium of the IC (BIC) to the medial geniculate nucleus (MGN) in the thalamus, which then innervates the primary auditory cortex (A1; figure 6.2A). Using hamsters, Schneider discovered that retinal afferents form
Figure 6.2 Primary visual and auditory pathways in normal and rewired mice: anatomical and physiological consequences of rewiring. (A) The visual pathway in ferrets and mice begins with retinal projections to the lateral geniculate nucleus (LGN) and superior colliculus (SC). The LGN projects in turn to the primary visual cortex (V1). The auditory pathway traces from the cochlea to the cochlear nucleus (CN) and then to the inferior colliculus (IC). From IC, connections are made with the medial geniculate nucleus (MGN), which projects to the primary auditory cortex (A1). (B) Ablation of the IC in neonatal animals induces retinal afferents to innervate the MGN and drive the auditory cortex to process visual information. (C) Retinogeniculate axons of normal ferrets project to eye-specific regions of the LGN (horizontal plane), while IC afferents project to the ventral subdivision (MGv) of the MGN (coronal plane) and innervate lamellae parallel to the lateral-medial axis. Rewired auditory fibers innervate the MGv along adjacent, nonoverlapping eye-specific terminals within MGv lamellae. (Adapted from Sur & Leamey, 2001.) (D) Orientation maps are present in normal V1 and rewired A1 of ferrets using optical imaging of intrinsic signals. The animal is stimulated with gratings of different orientations, while hemodynamic changes in red wavelength light reflectance caused by increases in oxygen consumption are detected from the cortex with a digital camera. The orientation preference map is calculated by computing a vector average of the response signal at each pixel. Color bar: color coding representing different orientations. Scale bar: 0.5 mm. (E) Retrograde tracers reveal the pattern of horizontal connections in superficial layers of normal V1, normal A1, and rewired A1 of ferrets. Distribution of horizontal connections in rewired A1 more closely resembles that of normal V1 than normal A1 and potentially contributes to the refinement of orientation mapping in rewired A1. Scale bars: 500 μm. (Adapted from Sharma, Angelucci, & Sur, 2000.) (See color plate 3.)
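The vector averaging mentioned in the caption for panel D can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function names `preferred_orientation` and `orientation_map`, the nested-list data layout, and the angle-doubling convention (used here because orientation repeats every 180 degrees) are all assumptions that the caption does not spell out.

```python
import cmath
import math


def preferred_orientation(responses, orientations_deg):
    """Vector-average one pixel's responses to oriented gratings.

    Assumption: each orientation angle is doubled before averaging,
    because orientation is periodic over 180 degrees, and the angle of
    the averaged vector is halved afterwards.
    Returns (preferred orientation in degrees within [0, 180),
             magnitude of the averaged vector).
    """
    if len(responses) != len(orientations_deg):
        raise ValueError("need exactly one response per orientation")
    vec = sum(r * cmath.exp(2j * math.radians(theta))
              for r, theta in zip(responses, orientations_deg))
    vec /= len(responses)
    angle = (math.degrees(cmath.phase(vec)) / 2.0) % 180.0
    return angle, abs(vec)


def orientation_map(stack, orientations_deg):
    """Build a preference map from stack[k][y][x].

    stack[k] is the (hypothetical) response image recorded while
    grating orientation number k was shown.
    """
    height, width = len(stack[0]), len(stack[0][0])
    return [[preferred_orientation([frame[y][x] for frame in stack],
                                   orientations_deg)[0]
             for x in range(width)]
            for y in range(height)]
```

The magnitude returned alongside the angle is small where a pixel responds about equally to all orientations, so under these assumptions it can serve as a rough selectivity measure next to the preference angle.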
novel connections to the ventral MGN (MGv) when the IC is ablated after birth (figure 6.2B; Schneider, 1973; Kalil & Schneider, 1975; Frost, 1982; Frost & Metin, 1985). This “rewiring” paradigm has subsequently been demonstrated and studied in the ferret and mouse models (Sur et al., 1988; Roe, Pallas, Hahm, & Sur, 1990; Roe, Pallas, Kwon, & Sur, 1992; Lyckman et al., 2001; Newton, Ellsworth, Miyakawa, Tonegawa, & Sur, 2004; Ellsworth, Lyckman, Feldheim, Flanagan, & Sur, 2005).
On receiving retinal ganglion cell input, the MGN adopts some of the anatomic and physiologic features of the normal LGN (figure 6.2C). Rewired MGN neurons of the ferret exhibit center-surround visual receptive fields (Roe, Garraghty, Esguerra, & Sur, 1993), topographic ordering (Roe, Hahm, & Sur, 1991), and eye-specific segregation (Angelucci, Clasca, Bricolo, Cramer, & Sur, 1997). The potential to form ordered retinotopic and ocular dominance regions in MGN indicates that common patterning cues exist between the LGN and MGN. Experiments in ephrin A2/A5 double knockout mice reveal that surgically induced rewiring is enhanced (Lyckman et al., 2001), with ipsilateral projections especially increased, as they originate from the temporal retina and express the highest levels of EphA receptor (Ellsworth et al., 2005). Loss of innervation to the MGN somehow makes this nucleus permissive to retinal axon ingrowth, and a gene-screening process between the normal and rewired MGN may facilitate the discovery of tropic or repulsive agents regulating retinal axon affinity for different sensory nuclei of the thalamus.
Nonetheless, certain morphological aspects of rewired MGN are resistant to change (figure 6.2C). In ferrets, retinal axon terminations are elongated along the typical isofrequency axis, or lamellae, of the MGN as opposed to more focal, isotropic distributions in the LGN (Pallas, Hahm, & Sur, 1994). In addition, eye-specific clusters are smaller and cruder than the eye-specific layers of LGN (Angelucci et al., 1997).
In the cortex of rewired ferrets, cells in A1 respond to visual field stimulation and form a functional retinotopic map of visual space (Roe et al., 1990). However, the thalamocortical axons transmitting this information retain their pattern of elongated projections along the anteroposterior axis of A1, which typically correspond to isofrequency bands (Pallas, Roe, & Sur, 1990). In order to create the functional map of focal retinotopic representations, either a refinement of these elongated inputs by a reorganized intracortical inhibitory network or a difference in drive along the projection itself is required (Sur, Pallas, & Roe, 1990). Consistent with the first possibility, calbindin-immunoreactive GABAergic neurons of rewired A1 have more elongated axonal arbors (Gao, Wormington, Newman, & Pallas, 2000). Thus, despite persistent structural features of A1 and thalamocortical input, functional retinotopy can be driven by novel patterns of activity.
In the ferret, rewired A1 acquires novel maps of orientation selectivity with pinwheels and orientation domains (figure 6.2D), similar in general to maps in normal V1 (Sharma et al., 2000; Rao, Toth, & Sur, 1997). In rewired A1, orientation maps are less organized, although intrinsic horizontal connections of superficial layer pyramidal neurons are clustered and bridge distantly located domains of the same orientation preference, as in V1 (figure 6.2E; Sharma et al., 2000). This pattern of intracortical connectivity is in contrast to horizontal connections in normal A1, where horizontal connections are limited to isofrequency domains of the tonotopic map and stretch along these bands. Such reorganization of horizontal connections driven by visual activity is likely related to changes in the inhibitory circuits of rewired A1, and it suggests that coordinated activity-dependent changes in inhibitory and excitatory networks of at least the superficial cortical layers are a prominent feature of cortical map organization and plasticity.
Finally, the rewired auditory pathway is sufficient to instruct visually mediated behavior. After training to
distinguish a left visual hemifield stimulus from an auditory stimulus, ferrets with a unilaterally rewired left hemisphere are able to accurately perceive a right visual hemifield stimulus as visual even after left LGN ablation (von Melchner, Pallas, & Sur, 2000). After left LGN ablation, the ferrets also possess diminished yet intact spatial acuity in the right hemifield. Subsequent ablation of the rewired A1 abolishes the animals’ ability to distinguish a right hemifield stimulus presented as visual. Thus rewired A1 is sufficient and necessary in the absence of ipsilateral visual pathway input to detect a visual percept in trained ferrets. In mice, direct subcortical projections from the MGN to the amygdala are involved in rapid fear conditioning to an auditory cue (Rogan & LeDoux, 1995; Doran & LeDoux, 1999; Newton et al., 2004). Because of an indirect pathway from the LGN through V1 and the perirhinal cortex to the amygdala, fear conditioning to a visual cue requires many more training sessions (Heldt, Sudin, Willott, & Falls, 2000). In rewired mice, the acquisition of fear conditioning to a visual cue is accelerated and resembles that of a normal mouse in response to an auditory cue (Newton et al., 2004).

Activity-dependent refinement of visual maps

Although topography of the retinotopic map and eye-specific domains are roughly established by programmed guidance and patterning cues, activity plays a critical role in the refinement and maturation of these maps. Single cells in the mouse LGN receive weak input from one to two dozen retinal ganglion cells, which occupy 30% of the cell surface, in the first postnatal week, and then begin to prune these connections down to one to three strong monocular inputs that occupy 1–5% of the cell surface (Chen & Regehr, 2000; Jaubert-Miazza et al., 2005; Guido, 2008). Ipsilateral projections to the LGN are also diffuse and widespread during this first week, occupying nearly 60% of the nucleus area. By the time of eye opening (P12–P14), the ipsilateral zone occupies only 10% of the LGN (Jaubert-Miazza et al., 2005; Guido, 2008).
Both the retinotopic and eye-specific pruning of synapses are affected by altering spontaneous activity caused by cholinergic waves that sweep across the retina (Meister, Wong, Baylor, & Shatz, 1991; Wong, Meister, & Shatz, 1993). Blockade of retinal electrical activity with TTX (Harris, 1980) or loss of retinal waves by genetic loss of the β2 nAChR (Rossi et al., 2001; McLaughlin, Torborg, Feller, & O’Leary, 2003; Grubb, Rossi, Changeux, & Thompson, 2003; Chandrasekaran, Plas, Gonzalez, & Crair, 2005) causes terminals to remain desegregated and diffuse. Combined ephrinA and β2 nAChR mutants lead to additive defects in retinotopic organization in the LGN and V1 along the elevation axis in visual space, demonstrating that activity-dependent refinements form a strong contribution to the integrity of the retinotopic and eye-specific maps (Pfeiffenberger et al., 2005; Pfeiffenberger, Yamada, & Feldheim, 2006; Cang et al., 2008). The nob mutant mouse, which acquires an abnormal onset of high-frequency waves after eye opening, develops normal eye-specific segregation before eye opening, because early spontaneous waves are intact. After the onset of abnormal high-frequency waves, eye-specific inputs desegregate because of potentially synchronized firing between the eyes (Demas et al., 2006). Similarly, fish exposed to the synchronized stimuli of strobe illumination lose eye-specific segregation (Schmidt & Eisele, 1985). The most straightforward mechanism for these data involves the strengthening of correlated inputs and weakening and subsequent pruning of noncorrelated inputs (Hebb, 1949; Zhang & Poo, 2001). In retinogeniculate synapses, bidirectional changes in synaptic strength depend on the relative timing between optic tract stimulation and LGN depolarization (Butts, Kanold, & Shatz, 2007).
Specifically how decorrelated axon terminals are eliminated and persisting synapses strengthened is not well understood, though canonical immunologic signaling molecules may be involved in synaptic pruning. Loss of Class I MHC proteins, neuronal pentraxins, and the C1qb component of the complement cascade leads to persistently enlarged and desegregated ipsilateral zones in the LGN (Huh et al., 2000; Bjartmar et al., 2006; Stevens et al., 2007). These molecules are hypothesized to tag weak synapses for pruning during activity-dependent refinement. Notably, these manipulations do not affect the basic topographic organization of the retinotopic map and eye-specific domains. Conversely, ephrin-A mutants alone contain topographically disorganized, yet tightly refined, retinotectal terminals (Frisen et al., 1998; Feldheim et al., 2000).
In the mouse, retinotopic maps in V1 require patterned input for normal maturation. During the first 10 days after eye opening (P13–P23), normal activity brings eye-specific maps to adult levels of responsiveness and precision in receptive field organization (Smith & Trachtenberg, 2007). The contralateral eye develops more precociously in map precision and magnitude, while the ipsilateral eye lags behind by roughly 5 days. When the contralateral eye is deprived, both contralateral and ipsilateral maps are delayed in retinotopic precision; when the contralateral eye is removed or silenced, precision of the ipsilateral maps is accelerated; when the contralateral eye is removed and ipsilateral eye deprived, the ipsilateral map precision is delayed. These data suggest that competing patterned inputs from both eyes are necessary for normal map refinement. Isolated patterned input accelerates map refinement, perhaps because of a lack of noise from the contralateral eye. The effects of binocular deprivation, or ipsilateral
removal plus contralateral deprivation, were not examined in this study.
In addition to Hebbian pruning and strengthening of feedforward inputs, changes due to activity that contribute to map refinement potentially involve additional developmental processes, including the remodeling of excitatory connections, the maturation of inhibitory circuits, and the timed expression of L-type Ca²⁺ channels. Excitatory synapses in the LGN initially contain NMDA receptors but increase their proportion of AMPA receptors as synaptic elimination proceeds (Chen & Regehr, 2000; X. Liu & Chen, 2008). Networks of GABAergic interneurons in the LGN also appear at P5 and mature by P14 (Ziburkus, Lo, & Guido, 2003; Jaubert-Miazza et al., 2005). L-type Ca²⁺ channels are expressed in excitatory LGN synapses before eye opening and are necessary for eye-specific segregation and CRE-mediated gene transcription (Cork, Namkung, Shin, & Mize, 2001; Pham, Rubenstein, Silva, Storm, & Stryker, 2001; Jaubert-Miazza et al., 2005).
In sum, mechanisms of map refinement in response to activity likely involve a number of different processes that contribute to the functional maturation of the circuit, including the selection and elimination of synapses, the modulation of synaptic strength, and the structural formation of inhibitory networks. Activity may also in turn influence the actions of guidance cues; activity blockade prevents ephrinA-mediated repulsion because of disruptions in cAMP signaling (Nicol et al., 2007).
ocular dominance map has become a paradigmatic model of activity-driven reorganization in network structure and function (figure 6.3A).

Structural and Functional Changes in Response to Lid Suture
Within the binocular zone of V1 in mammals, neurons particularly in the superficial and deep
layers of cortex are driven by both eyes, though neurons in layer 4 of carnivores and primates are primarily driven by one eye (Hubel & Wiesel, 1963; Stryker & Harris, 1986). When an imbalance of input occurs after lid suturing one eye for several days (monocular deprivation, or MD), a series of structural and functional changes leads to the weakening of the deprived eye input and the strengthening of nondeprived eye input (figure 6.3B). The mechanisms underlying these changes (figure 6.3C) shed light on core principles of plasticity in the developing brain in response to experience. Before functional shifts are apparent, spine motility increases (Majewska & Sur, 2003; Oray, Majewska, & Sur, 2004), followed by transient pruning of spines (Mataga, Mizuguchi, & Hensch, 2004; Oray et al., 2004). Electrophysiological and optical imaging techniques reveal that deprived-eye connections are weakened first, while supragranular horizontal connections are remodeled (Trachtenberg, Trepel, & Stryker, 2000; Trachtenberg & Stryker, 2001; W. Lee et al., 2006). Strengthening of nondeprived eye connections follows (Frenkel & Bear, 2004), and finally, layer IV geniculocortical axons representing the nondeprived eye grow and expand their terminals at the expense of shrinking deprived-eye terminals (Antonini & Stryker, 1996; Antonini, Fagiolini, & Stryker, 1999). The chronology of these events has been best characterized for the developmental “critical period” in mouse, though differences in structural and physiological response may exist for MD during adulthood or under different paradigms of development, such as dark rearing (Jiang, Treviño, & Kirkwood, 2007). The developmental context under which MD is applied can make a qualitative and quantitative difference in the ocular dominance plasticity observed and likely involves different cellular and network mechanisms.

Critical Periods and the Developmental Context of Plasticity
The ability to induce and reverse ocular dominance plasticity was initially thought to exist only during a “critical period” in development, a time approximately 10 days after eye opening during which short MD (a few days in the mouse) leads to a robust shift in ocular dominance toward the nondeprived eye (Hubel & Wiesel, 1970; Gordon & Stryker, 1996). This “critical period” is delayed by roughly three weeks in dark-reared animals (Cynader, Berman, & Hein, 1976; Fagiolini, Pizzorusso, Berardi, Domenici, & Maffei, 1994; G. Mower, 1991), suggesting that the cortex must reach a maturational state that is facilitated by a period of patterned vision. This maturational state has been shown to involve the development of an inhibitory network that depends on BDNF produced in response to neural drive (Hensch, 2005). Overexpression of BDNF leads to a precocious start of the critical period and premature development of inhibitory cells in the cortex (Huang et al., 1999; Hanover, Huang, Tonegawa, & Stryker, 1999), while administration of BDNF to dark-reared animals leads to the induction of a critical period (Gianfranceschi et al., 2003). Mice lacking polysialic acid also experience premature maturation of inhibitory networks and a precocious critical period (Di Cristo et al., 2007). GAD65 knockout mice, which lack axonal GABA synthesis and subsequent inhibitory transmission, do not experience a critical period unless induced with benzodiazepine drug infusion at any age (Hensch et al., 1998; Fagiolini & Hensch, 2000). Therefore, tonic GABA release is sufficient to mature an intracortical inhibitory network and induce critical period plasticity. The process by which the critical period closes and why critical period induction is a one-time event are not understood.
Although the critical period occurs once during development, longer periods of MD (7 to 10 days in the mouse) are able to trigger ocular dominance plasticity in adulthood (Sawtell et al., 2003; Hofer, Mrsic-Flogel, Bonhoeffer, & Hübener, 2006; Fischer, Aleem, Zhou, & Pham, 2007). This form of plasticity is thought to differ mechanistically from that experienced during the critical period, as nondeprived-eye connections are strengthened more rapidly and deprived-eye connections remain stable (Kaneko, Stellwagen, Malenka, & Stryker, 2008). Previous experiences with MD, either in the critical period or in adulthood, facilitate plasticity in response to short MD later in life (Hofer et al., 2006; Frenkel & Bear, 2004), suggesting that a functionally suppressed anatomical trace has been laid. Ocular dominance plasticity may also be induced in adulthood after a 10-day period of visual deprivation, and this process mimics the time course of plasticity present during the critical period (He, Hodos, & Quinlan, 2006; Frenkel & Bear, 2004). Therefore, even after its apparent closure, critical period plasticity may be reactivated by a brief loss of visual input.

Hebbian and Homeostatic Mechanisms of Plasticity
Spike-timing-dependent activity has been demonstrated to lead to strengthening or weakening of geniculocortical and intracortical synapses in V1 (Frégnac & Shulz, 1999; Meliza & Dan, 2006). Long-term depression (LTD) of deprived-eye inputs occurs in vivo after MD and has been proposed to precipitate the eventual reduction in deprived-eye synapses (Heynen et al., 2003; Frenkel & Bear, 2004). Decreases in the threshold for LTD after dark rearing (as a result of decreases in NR2A/NR2B ratio of subunit composition in NMDA receptors) are posited to mediate the reactivation of plasticity (He et al., 2006). In hippocampal neurons, AMPA receptors are added to synapses during long-term potentiation (LTP) and removed during LTD (Malinow & Malenka, 2002), a mechanism that may act as the substrate for altering synaptic strength in visual cortex due to altered visual experience. Group 1 metabotropic glutamate receptors have been identified as inducers of LTD, and loss of mGluR5 blocks ocular dominance plasticity
(Dölen et al., 2007). Gene transcription and protein synthesis downstream of synaptic events are necessary for ocular dominance plasticity. Blocking cortical protein synthesis while preserving LTD effectively prevents ocular dominance shifts (Taha & Stryker, 2002). Loss of the cAMP response element-binding protein, CREB, which promotes CRE-mediated gene transcription, prevents ocular dominance shifts (A. Mower, Liao, Nestler, Neve, & Ramoa, 2002). Upstream Ca²⁺-sensitive signaling kinases, including ERK, PKA, and CaMKIIα, are also necessary for OD plasticity, and these likely activate a number of functional cascades that lead to gene transcription and structural modifications to the synapse (Di Cristo et al., 2001; Taha & Stryker, 2002; Berardi, Pizzorusso, Ratto, & Maffei, 2003; Suzuki, al-Noori, Butt, & Pham, 2004; Gomez, Alam, Smith, Horne, & Dell’Acqua, 2002; Chierzi, Ratto, Verma, & Fawcett, 2005; Taha & Stryker, 2005). The extent to which LTD and LTP are necessary for ocular dominance shifts is uncertain, however. In mice lacking the protein phosphatase, calcineurin, LTD is blocked, but ocular dominance plasticity remains intact (Yang et al., 2005).
Although Hebbian mechanisms are likely to contribute to ocular dominance plasticity in which poorly driven synapses from the deprived eye are pruned and synapses from the nondeprived eye are strengthened (Katz & Shatz, 1996), additional cellular and network mechanisms likely affect the response of the cortex to MD. Homeostatic processes that work to preserve a certain level of cortical drive are known to operate in neuronal development (Turrigiano & Nelson, 2004) and may contribute to the ability of binocular neurons to undergo ocular dominance plasticity after deprivation (Desai, Cudmore, Nelson, & Turrigiano, 2002; Mrsic-Flogel et al., 2007). Nondeprived inputs strengthen only after deprived inputs are weakened (Frenkel & Bear, 2004), and pathways of synaptic scaling, the global (or cellwide) modulation of synapses, may be operating. TNFα, a glial secreted cytokine which acts as a positive scaling factor by increasing synaptic GluR1 and mEPSC amplitudes, is necessary for scaling up synaptic strength in vitro (Stellwagen, Beattie, Seo, & Malenka, 2005; Stellwagen & Malenka, 2006) and for the increase in amplitude of nondeprived inputs after MD (Kaneko et al., 2008). Arc, a negative scaling factor that increases AMPAR endocytosis (Chowdhury et al., 2006; Rial Verde, Lee-Osbourne, Worley, Malinow, & Cline, 2006), may also influence ocular dominance plasticity (McCurry, Tropea, Wang, & Sur, 2007).
Inhibitory networks may provide an additional circuit mechanism for modulating input strength during MD. Somatic inhibition on excitatory pyramidal cells would allow for instructive gating of precisely correlated inputs by preventing the backpropagation, as well as subsequent strengthening, of imprecisely timed inputs (Bi & Poo, 2001; Song, Miller, & Abbott, 2000; Pouille & Scanziani, 2001; Hensch, 2005). Gap junctions between parvalbumin-expressing inhibitory cells would also allow for tightly coupled inputs to drive networks of inhibitory cells more strongly and facilitate discriminative responsiveness of pyramidal cells (Galarreta & Hestrin, 2001; Hensch, 2005). Endocannabinoid signaling on presynaptic terminals of layer 2/3 is necessary for plasticity, and these synapses may modulate the drive from the supragranular inhibitory network (C. Liu, Heynen, Shuler, & Bear, 2008).

Structural Plasticity and Permissive Changes in Extracellular Matrix
Increasing evidence supports the role of proteases and perineuronal nets (PNN) of extracellular matrix in regulating the ability of cortex to respond to MD. Degradation of chondroitin-sulfate proteoglycans (CSPGs) leads to the reactivation of ocular dominance plasticity in adult cortex (Pizzorusso et al., 2002, 2006). The protease tissue plasminogen activator (tPA), which cleaves extracellular matrix and other molecules, is expressed during juvenile MD and is necessary for functional plasticity in the adult (Mataga, Nagai, & Hensch, 2002; Müller & Griesinger, 1998). Application of tPA enhances spine motility, and loss of tPA prevents the loss of superficial spines after 4 days of MD (Oray et al., 2004; Mataga et al., 2004). Extracellular matrix could have a restrictive effect on ocular dominance plasticity by constraining spine motility and axonal growth or by imposing structurally mature functional elements onto intracortical inhibitory cells. Parvalbumin-expressing GABAergic cells become ensheathed in PNNs as the cortex matures (Härtig et al., 2001); degradation of PNNs may reduce the efficacy of inhibitory input by altering the ionic or chemical milieu and allow for plasticity. Mice lacking myelination factors Nogo-66 receptor and Nogo-A/B exhibit ocular dominance plasticity after brief MD in adulthood as well as a prolonged critical period (McGee, Yang, Fischer, Daw, & Strittmatter, 2005), suggesting that extracellular factors strongly constrain plasticity.

Gene Screens for Novel Plasticity Factors
The use of gene microarrays to screen for differences in cortical gene expression under different conditions has facilitated the discovery of novel pathways and functional molecules involved in ocular dominance plasticity. A screen comparing the expression of normal and MD cortex at different ages revealed common and age-specific pathways modulated by MD (Majdan & Shatz, 2006). A comparison of V1 at different ages and with MD cortex showed an upregulation of actin-stabilizing genes, including the calcium sensor, cardiac troponin C, and myelinating factors, which were reversed with MD (Lyckman et al., 2008). Comparisons of dark-reared with normal V1 found a reduction in genes with a role in functional inhibition, reflecting a maturational delay,
while MD and normal V1 comparisons identified a number of growth factor and immunomodulatory factors that were upregulated in response to MD (Tropea et al., 2006).

Summary and conclusion

Retinotopic and feature selective maps constitute key organizational principles of the visual pathway. Intrinsic genetic programs and activity-dependent processes both play a role in setting up the structure and function of these maps. In addition, patterns of activity interact with programs of gene expression as they modulate signaling pathways within the cell. Understanding specific mechanisms of how visual stimulus feature maps are assembled and modified in response to experience is central to identifying fundamental processes of neural circuit development and plasticity.

REFERENCES

Angelucci, A., Clasca, F., Bricolo, E., Cramer, K. S., & Sur, M. (1997). Experimentally induced retinal projections to the ferret auditory thalamus: Development of clustered eye-specific patterns in a novel target. J. Neurosci., 17(6), 2040–2055.
Antonini, A., Fagiolini, M., & Stryker, M. P. (1999). Anatomical correlates of functional plasticity in mouse visual cortex. J. Neurosci., 19(11), 4388–4406.
Antonini, A., & Stryker, M. P. (1996). Plasticity of geniculocortical afferents following brief or prolonged monocular occlusion in the cat. J. Comp. Neurol., 369(1), 64–82.
Berardi, N., Pizzorusso, T., Ratto, G. M., & Maffei, L. (2003). Molecular basis of plasticity in the visual cortex. Trends Neurosci., 26, 369–378.
Bi, G., & Poo, M. (2001). Synaptic modification by correlated activity: Hebb’s postulate revisited. Annu. Rev. Neurosci., 24, 139–166.
Bjartmar, L., Huberman, A. D., Ullian, E. M., Rentería, R. C., Liu, X., Xu, W., et al. (2006). Neuronal pentraxins mediate synaptic refinement in the developing visual system. J. Neurosci., 26(23), 6269–6281.
Bolz, J., Uziel, D., Muhlfriedel, S., Gullmar, A., Peuckert, C., Zarbalis, K., et al. (2004). Multiple roles of ephrins during the formation of thalamocortical projections: Maps and more. J. Neurobiol., 59(1), 82–94.
Bosking, W. H., Crowley, J. C., & Fitzpatrick, D. (2002). Spatial coding of position and orientation in primary visual cortex. Nat. Neurosci., 5, 874–882.
Bovolenta, P. (2005). Morphogen signaling at the vertebrate growth cone: A few cases or a general strategy? J. Neurobiol., 64(4), 405–416.
Brown, A., Yates, P. A., Burrola, P., Ortuno, D., Vaidya, A., Jessell, T. M., et al. (2000). Topographic mapping from the retina to the midbrain is controlled by relative but not absolute levels of EphA receptor signaling. Cell, 102, 77–88.
Brunet, I., Weinl, C., Piper, M., Trembleau, A., Volovitch, M., Harris, W., et al. (2005). The transcription factor Engrailed-2 guides retinal axons. Nature, 438(7064), 94–98.
Butts, D. A., Kanold, P. O., & Shatz, C. J. (2007). A burst-based “Hebbian” learning rule at retinogeniculate synapses links retinal waves to activity-dependent refinement. PLoS Biol., 5(3), 361.
Cang, J., Kaneko, M., Yamada, J., Woods, G., Stryker, M. P., & Feldheim, D. A. (2005). Ephrin-As guide the formation of functional maps in the visual cortex. Neuron, 48, 577–589.
Cang, J., Niell, C. M., Liu, X., Pfeiffenberger, C., Feldheim, D. A., & Stryker, M. P. (2008). Selective disruption of one Cartesian axis of cortical maps and receptive fields by deficiency in ephrin-As and structured activity. Neuron, 57(4), 511–523.
Carri, N. G., Bengtsson, H., Charette, M. F., & Ebendal, T. (1998). BMPR-II expression and OP-1 effects in developing chicken retinal explants. NeuroReport, 9(6), 1097–1101.
Chandrasekaran, A. R., Plas, D. T., Gonzalez, E., & Crair, M. C. (2005). Evidence for an instructive role of retinal activity in retinotopic map refinement in the superior colliculus of the mouse. J. Neurosci., 25(29), 6929–6938.
Chapman, B., Stryker, M. P., & Bonhoeffer, T. (1996). Development of orientation-preference maps in ferret primary visual cortex. J. Neurosci., 16, 6443–6453.
Charron, F., & Tessier-Lavigne, M. (2005). Novel brain wiring functions for classical morphogens: A role as graded positional cues in axon guidance. Development, 132(10), 2251–2262.
Chen, C., & Regehr, W. G. (2000). Developmental remodeling of the retinogeniculate synapse. Neuron, 28(3), 955–966.
Chierzi, S., Ratto, G. M., Verma, P., & Fawcett, J. W. (2005). The ability of axons to regenerate their growth cones depends on axonal type and age, and is regulated by calcium, cAMP and ERK. Eur. J. Neurosci., 21, 2051–2062.
Chowdhury, S., Shepherd, J. D., Okuno, H., Lyford, G., Petralia, R. S., Plath, N., et al. (2006). Arc/Arg3.1 interacts with the endocytic machinery to regulate AMPA receptor trafficking. Neuron, 52, 445–459.
Colello, R. J., & Guillery, R. W. (1990). The early development of retinal ganglion cells with uncrossed axons in the mouse: Retinal position and axonal course. Development, 108, 515–523.
Constantine-Paton, M., & Law, M. I. (1978). Eye-specific termination bands in tecta of three-eyed frogs. Science, 202(4368), 639–641.
Coppola, D. M., & White, L. E. (2004). Visual experience promotes the isotropic representation of orientation preference. Visual Neurosci., 21, 39–51.
Cork, R. J., Namkung, Y., Shin, H. S., & Mize, R. R. (2001). Development of the visual pathway disrupted in mice with a targeted disruption of the calcium channel beta(3)-subunit gene. J. Comp. Neurol., 440(2), 177–191.
Cynader, M., Berman, N., & Hein, A. (1976). Recovery of function in cat visual cortex following prolonged deprivation. Exp. Brain Res., 25(2), 139–156.
Dehay, C., & Kennedy, H. (2007). Cell-cycle control and cortical development. Nat. Rev. Neurosci., 8(6), 438–450.
Demas, J., Sagdullaev, B. T., Green, E., Jaubert-Miazza, L., McCall, M. A., Gregg, R. G., et al. (2006). Failure to maintain eye-specific segregation in nob, a mutant with abnormally patterned retinal activity. Neuron, 50(2), 247–259.
Demyanenko, G. P., & Maness, P. F. (2003). The L1 cell adhesion molecule is essential for topographic mapping of retinal axons. J. Neurosci., 23(2), 530–538.
Desai, N. S., Cudmore, R. H., Nelson, S. B., & Turrigiano, G. G. (2002). Critical periods for experience-dependent synaptic scaling in visual cortex. Nat. Neurosci., 5(8), 783–789.
Di Cristo, G., Berardi, N., Cancedda, L., Pizzorusso, T., RAGS) is essential for proper retinal axon guidance and topo-
Putignano, E., Ratto, G. M., et al. (2001). Requirement of graphic mapping in the mammalian visual system. Neuron, 20,
ERK activation for visual cortical plasticity. Science, 292(5525), 235–243.
2337–2340. Frost, D. O. (1982). Anomalous visual connections to somatosen-
Di Cristo, G., Chattopadhyaya, B., Kuhlman, S. J., Fu, Y., sory and auditory systems following brain lesions in early life.
Bélanger, M. C., Wu, C. Z., et al. (2007). Activity-dependent Brain Res., 255, 627–635.
PSA expression regulates inhibitory maturation and onset of Frost, D. O., & Metin, C. (1985). Induction of functional
critical period plasticity. Nat. Neurosci., 10(12), 1569–1577. retinal projections to the somatosensory system. Nature, 317,
Dölen, G., Osterweil, E., Rao, B. S., Smith, G. B., Auerbach, 162–164.
B. D., Chattarji, S., et al. (2007). Correction of fragile X Fukuchi-Shimogori, T., & Grove, E. A. (2001). Neocortex
syndrome in mice. Neuron, 56(6), 955–962. patterning by the secreted signaling molecule FGF8. Science,
Doran, N. N., & LeDoux, J. E. (1999). Organization of projections 294(5544), 1071–1074.
to the lateral amygdala from auditory and visual areas of the Galarreta, M., & Hestrin, S. (2001). Spike transmission and
thalamus in the rat. J. Comp. Neurol., 430, 235–249. synchrony detection in networks of GABAergic interneurons.
Dräger, U. C., & Olsen, J. F. (1980). Origins of crossed and Science, 292(5525), 2295–2299.
uncrossed retinal projections in pigmented and albino mice. Gao, W. J., Wormington, A. B., Newman, D. E., & Pallas,
J. Comp. Neurol., 191(3), 383–412. S. L. (2000). Development of inhibitory circuitry in visual and
Ellsworth, C. A., Lyckman, A. W., Feldheim, D. A., Flanagan, auditory cortex of postnatal ferrets: Immunocytochemical
J. G., & Sur, M. (2005). Ephrin-A2 and -A5 influence patterning localization of calbindin- and parvalbumin-containing neurons.
of normal and novel retinal projections to the thalamus: Con- J. Comp. Neurol., 422(1), 140–157.
served mapping mechanisms in visual and auditory thalamic Gianfranceschi, L., Siciliano, R., Walls, J., Morales, B.,
targets. J. Comp. Neurol., 488, 140–151. Kirkwood, A., Huang, Z. J., et al. (2003). Visual cortex is
Erskine, L., Williams, S. E., Brose, K., Kidd, T., Rachel, rescued from the effects of dark rearing by overexpression of
R. A., Goodman, C. S., et al. (2000). Retinal ganglion cell axon BDNF. Proc. Natl. Acad. Sci. USA, 100(21), 12486–12491.
guidance in the mouse optic chiasm: Expression and function of Godement, P., Salaun, J., & Imbert, M. (1984). Prenatal and
robos and slits. J. Neurosci., 20(13), 4975–4982. postnatal development of retinogeniculate and retinocollicular
Fagiolini, M., & Hensch, T. K. (2000). Inhibitory threshold projections in the mouse. J. Comp. Neurol., 230, 552–575.
for critical-period activation in primary visual cortex. Nature, Gomez, L. L., Alam, S., Smith, K. E., Horne, E., & Dell’Acqua,
404(6774), 183–186. M. L. (2002). Regulation of A-kinase anchoring protein 79/
Fagiolini, M., Pizzorusso, T., Berardi, N., Domenici, L., & 150-cAMP-dependent protein kinase postsynaptic targeting by
Maffei, L. (1994). Functional postnatal development of the NMDA receptor activation of calcineurin and remodeling of
rat primary visual cortex and the role of visual experience: dendritic actin. J. Neurosci., 22, 7027–7044.
Dark rearing and monocular deprivation. Vis. Res., 34(6), Gordon, J. A., & Stryker, M. P. (1996). Experience-dependent
709–720. plasticity of binocular responses in the primary visual cortex of
Farley, B. J., Yu, H., Jin, D. Z., & Sur, M. (2007). Alteration of the mouse. J. Neurosci., 16(10), 3274–3286.
visual input results in a coordinated reorganization of multiple Grove, E. A., & Fukuchi-Shimogori, T. (2003). Generating
visual cortex maps. J. Neurosci., 27(38), 10299–10310. the cerebral cortical area map. Annu. Rev. Neurosci., 26, 355–
Feldheim, D. A., Kim, Y. I., Bergemann, A. D., Frisen, J., 380.
Barbacid, M., & Flanagan, J. G. (2000). Genetic analysis of Grubb, M. S., Rossi, F. M., Changeux, J. P., & Thompson,
ephrin-A2 and ephrin-A5 shows their requirement in multiple I. D. (2003). Abnormal functional organization in the dorsal
aspects of retinocollicular mapping. Neuron, 25, 563–574. lateral geniculate nucleus of mice lacking the beta 2 subunit
Feldheim, D. A., Vanderhaeghen, P., Hansen, M. J., Frisen, J., of the nicotinic acetylcholine receptor. Neuron, 40(6), 1161–
Lu, Q., Barbacid, M., et al. (1998). Topographic guidance 1172.
labels in a sensory projection to the forebrain. Neuron, 21, Guido, W. (2008). Refinement of the retinogeniculate pathway.
1303–1313. J. Physiol, 586, 4357–4362.
Ferster, D., & Miller, K. D. (2000). Neural mechanisms of ori- Gurung, B., & Fritzsch, B. (2004). Time course of embryonic
entation selectivity in the visual cortex. Annu. Rev. Neurosci., 23, midbrain and thalamic auditory connection development in
441–471. mice as revealed by carbocyanine dye tracing. J. Comp. Neurol.,
Figdor, M. C., & Stern, C. D. (1993). Segmental organization of 479(3), 309–327.
embryonic diencephalon. Nature, 363, 630–634. Hanover, J. L., Huang, Z. J., Tonegawa, S., & Stryker,
Fischer, Q. S., Aleem, S., Zhou, H., & Pham, T. A. (2007). Adult M. P. (1999). Brain-derived neurotrophic factor overexpression
visual experience promotes recovery of primary visual cortex induces precocious critical period in mouse visual cortex.
from long-term monocular deprivation. Learn. Memory, 14(9), J. Neurosci., 19(22), RC40.
573–580. Hansen, M. J., Dallal, G. E., & Flanagan, J. G. (2004). Retinal
Frégnac, Y., & Shulz, D. E. (1999). Activity-dependent regulation axon response to ephrin-As shows a graded, concentration-
of receptive field properties of cat area 17 by supervised Hebbian dependent transition from growth promotion to inhibition.
learning. J. Neurobiol., 41(1), 69–82. Neuron, 42(5), 717–730.
Frenkel, M. Y., & Bear, M. F. (2004). How monocular depriva- Harris, W. A. (1980). The effects of eliminating impulse activity
tion shifts ocular dominance in visual cortex of young mice. on the development of the retinotectal projection in salaman-
Neuron, 44(6), 917–923. ders. J. Comp. Neurol., 194, 303–317.
Frisen, J., Yates, P. A., McLaughlin, T., Friedman, G. C., Härtig, W., Singer, A., Grosche, J., Brauer, K., Ottersen,
O’Leary, D. D., & Barbacid, M. (1998). Ephrin-A5 (AL-1/ O. P., & Brückner, G. (2001). Perineuronal nets in the rat
horng and sur: patterning and plasticity of maps in mammalian visual pathway 103
medial nucleus of the trapezoid body surround neurons immunoreactive for various amino acids, calcium-binding proteins and the potassium channel subunit Kv3.1b. Brain Res., 899(1–2), 123–133.
He, H. Y., Hodos, W., & Quinlan, E. M. (2006). Visual deprivation reactivates rapid ocular dominance plasticity in adult visual cortex. J. Neurosci., 26(11), 2951–2955.
Hebb, D. O. (1949). The organization of behavior. New York: John Wiley and Sons.
Hehr, C. L., Hocking, J. C., & McFarlane, S. (2005). Matrix metalloproteinases are required for retinal ganglion cell axon guidance at select decision points. Development, 132(15), 3371–3379.
Heldt, S., Sudin, V., Willott, J. F., & Falls, W. A. (2000). Posttraining lesions of the amygdala interfere with fear-potentiated startle to both visual and auditory conditioned stimuli in C57BL/6J mice. Behav. Neurosci., 114, 749–759.
Hensch, T. K. (2005). Critical period plasticity in local cortical circuits. Nat. Rev. Neurosci., 6(11), 877–888.
Hensch, T. K., Fagiolini, M., Mataga, N., Stryker, M. P., Baekkeskov, S., & Kash, S. F. (1998). Local GABA circuit control of experience-dependent plasticity in developing visual cortex. Science, 282(5393), 1504–1508.
Herrera, E., Brown, L., Aruga, J., Rachel, R. A., Dölen, G., Mikoshiba, K., et al. (2003). Zic2 patterns binocular vision by specifying the uncrossed retinal projection. Cell, 114(5), 545–557.
Heynen, A. J., Yoon, B. J., Liu, C. H., Chung, H. J., Huganir, R. L., & Bear, M. F. (2003). Molecular mechanism for loss of visual cortical responsiveness following brief monocular deprivation. Nat. Neurosci., 6(8), 854–862.
Hindges, R., McLaughlin, T., Genoud, N., Henkemeyer, M., & O’Leary, D. D. (2002). EphB forward signaling controls directional branch extension and arborization required for dorsal-ventral retinotopic mapping. Neuron, 35(3), 475–487.
Hofer, S. B., Mrsic-Flogel, T. D., Bonhoeffer, T., & Hübener, M. (2006). Prior experience enhances plasticity in adult visual cortex. Nat. Neurosci., 9(1), 127–132.
Hopker, V. H., Shewan, D., Tessier-Lavigne, M., Poo, M., & Holt, C. (1999). Growth-cone attraction to netrin-1 is converted to repulsion by laminin-1. Nature, 401(6748), 69–73.
Huang, Z. J., Kirkwood, A., Pizzorusso, T., Porciatti, V., Morales, B., Bear, M. F., et al. (1999). BDNF regulates the maturation of inhibition and the critical period of plasticity in mouse visual cortex. Cell, 98(6), 739–755.
Hubel, D. H., & Wiesel, T. N. (1963). Shape and arrangement of columns in cat’s striate cortex. J. Physiol., 165, 559–568.
Hubel, D. H., & Wiesel, T. N. (1970). The period of susceptibility to the physiological effects of unilateral eye closure in kittens. J. Physiol., 206, 419–436.
Huberman, A. D., Murray, K. D., Warland, D. K., Feldheim, D. A., & Chapman, B. (2005). Ephrin-As mediate targeting of eye-specific projections to the lateral geniculate nucleus. Nat. Neurosci., 8(8), 1013–1021.
Huh, G. S., Boulanger, L. M., Du, H., Riquelme, P. A., Brotz, T. M., & Shatz, C. J. (2000). Functional requirement for class I MHC in CNS development and plasticity. Science, 290(5499), 2155–2159.
Inatani, M. (2005). Molecular mechanisms of optic axon guidance. Naturwissenschaften, 92(12), 549–561.
Inoue, T., Nakamura, S., & Osumi, N. (2000). Fate mapping of the mouse prosencephalic neural plate. Dev. Biol., 219(2), 373–383.
Jaubert-Miazza, L., Green, E., Lo, F. S., Bui, K., Mills, J., & Guido, W. (2005). Structural and functional composition of the developing retinogeniculate pathway in the mouse. Visual Neurosci., 22(5), 661–676.
Jiang, B., Treviño, M., & Kirkwood, A. (2007). Sequential development of long-term potentiation and depression in different layers of the mouse visual cortex. J. Neurosci., 27(36), 9648–9652.
Job, C., & Tan, S. (2003). Constructing the mammalian neocortex: The role of intrinsic factors. Dev. Biol., 257, 221–232.
Jones, E. G. (1985). The thalamus. New York: Plenum.
Jones, E. G., & Rubenstein, J. L. R. (2004). Expression of regulatory genes during differentiation of thalamic nuclei in mouse and monkey. J. Comp. Neurol., 47, 55–80.
Kalil, R. E., & Schneider, G. E. (1975). Abnormal synaptic connections of the optic tract in the thalamus after midbrain lesions in newborn hamsters. Brain Res., 100, 690–698.
Kaneko, M., Stellwagen, D., Malenka, R. C., & Stryker, M. P. (2008). Tumor necrosis factor–alpha mediates one component of competitive, experience-dependent plasticity in developing visual cortex. Neuron, 58(5), 673–680.
Katz, L. C., & Shatz, C. J. (1996). Synaptic activity and the construction of cortical circuits. Science, 274, 1133–1138.
Kolpak, A., Zhang, J., & Bao, Z. Z. (2005). Sonic hedgehog has a dual effect on the growth of retinal ganglion axons depending on its concentration. J. Neurosci., 25(13), 3432–3441.
Leamey, C. A., Glendining, K. A., Kreiman, G., Kang, N. D., Wang, K. H., Fassler, R., et al. (2008). Differential gene expression between sensory neocortical areas: Potential roles for ten_m3 and Bcl6 in patterning visual and somatosensory pathways. Cereb. Cortex, 18(1), 53–66.
Leamey, C. A., Merlin, S., Lattouf, P., Sawatari, A., Zhou, X., Demel, N., et al. (2007). Ten_m3 regulates eye-specific patterning in the mammalian visual pathway and is required for binocular vision. PLoS Biol., 5, e241.
Lee, R., Petros, T. J., & Mason, C. A. (2008). Zic2 regulates retinal ganglion cell axon avoidance of ephrinB2 through inducing expression of the guidance receptor EphB1. J. Neurosci., 28(23), 5910–5919.
Lee, W. C., Huang, H., Feng, G., Sanes, J. R., Brown, E. N., So, P. T., et al. (2006). Dynamic remodeling of dendritic arbors in GABAergic interneurons of adult visual cortex. PLoS Biol., 4(2), e29.
Li, Y., Fitzpatrick, D., & White, L. E. (2006). The development of direction selectivity in ferret visual cortex requires early visual experience. Nat. Neurosci., 9, 676–681.
Liu, C. H., Heynen, A. J., Shuler, M. G., & Bear, M. F. (2008). Cannabinoid receptor blockade reveals parallel plasticity mechanisms in different layers of mouse visual cortex. Neuron, 58, 340–345.
Liu, X., & Chen, C. (2008). Different roles for AMPA and NMDA receptors in transmission at the immature retinogeniculate synapse. J. Neurophysiol., 99(2), 629–643.
Luo, L., & Flanagan, J. G. (2007). Development of continuous and discrete neural maps. Neuron, 56(2), 284–300.
Lyckman, A. W., Horng, S., Leamey, C. A., Tropea, D., Watakabe, A., Van Wart, A., et al. (2008). Gene expression patterns in visual cortex during the critical period: Synaptic stabilization and reversal by visual deprivation. Proc. Natl. Acad. Sci. USA, 105(27), 9409–9414.
Lyckman, A. W., Jhaveri, S., Feldheim, D. A., Vanderhaeghen, P., Flanagan, J. G., & Sur, M. (2001). Enhanced plasticity of retinothalamic projections in an ephrin-A2/A5 double mutant. J. Neurosci., 21, 7684–7690.
Majdan, M., & Shatz, C. J. (2006). Effects of visual experience on activity-dependent gene regulation in cortex. Nat. Neurosci., 9, 650–659.
Majewska, A., & Sur, M. (2003). Motility of dendritic spines in visual cortex in vivo: Changes during the critical period and effects of visual deprivation. Proc. Natl. Acad. Sci. USA, 100(26), 16024–16029.
Malinow, R., & Malenka, R. C. (2002). AMPA receptor trafficking and synaptic plasticity. Annu. Rev. Neurosci., 25, 103–126.
Mann, F., Harris, W. A., & Holt, C. E. (2004). New views on retinal axon development: A navigation guide. Int. J. Dev. Biol., 48(8–9), 957–964.
Marino, J., Schummers, J., Lyon, D. C., Schwabe, L., Beck, O., Wiesing, P., et al. (2005). Invariant computations in local cortical networks with balanced excitation and inhibition. Nat. Neurosci., 8, 194–201.
Mataga, N., Mizuguchi, Y., & Hensch, T. K. (2004). Experience-dependent pruning of dendritic spines in visual cortex by tissue plasminogen activator. Neuron, 44(6), 1031–1041.
Mataga, N., Nagai, N., & Hensch, T. K. (2002). Permissive proteolytic activity for visual cortical plasticity. Proc. Natl. Acad. Sci. USA, 99(11), 7717–7721.
McCurry, C., Tropea, D., Wang, K. H., & Sur, M. (2007). A role for Arc in constraining ocular dominance plasticity in adult visual cortex. Program No. 1304/B22 Neuroscience Meeting Planner. San Diego: Society for Neuroscience.
McGee, A. W., Yang, Y., Fischer, Q. S., Daw, N. W., & Strittmatter, S. M. (2005). Experience-driven plasticity of visual cortex limited by myelin and Nogo receptor. Science, 309(5744), 2222–2226.
McLaughlin, T., Hindges, R., Yates, P. A., & O’Leary, D. D. (2003). Bifunctional action of ephrin-B1 as a repellent and attractant to control bidirectional branch extension in dorsal-ventral retinotopic mapping. Development, 130(11), 2407–2418.
McLaughlin, T., Torborg, C. L., Feller, M. B., & O’Leary, D. D. (2003). Retinotopic map refinement requires spontaneous retinal waves during a brief critical period of development. Neuron, 40(6), 1147–1160.
Meister, M., Wong, R. O., Baylor, D. A., & Shatz, C. J. (1991). Synchronous bursts of action potentials in ganglion cells of the developing mammalian retina. Science, 252(5008), 939–943.
Meliza, C. D., & Dan, Y. (2006). Receptive-field modification in rat visual cortex induced by paired visual stimulation and single-cell spiking. Neuron, 49(2), 183–189.
Monnier, P. P., Sierra, A., Macchi, P., Deitinghoff, L., Andersen, J. S., Mann, M., et al. (2002). RGM is a repulsive guidance molecule for retinal axons. Nature, 419(6905), 392–395.
Mower, A. F., Liao, D. S., Nestler, E. J., Neve, R. L., & Ramoa, A. S. (2002). cAMP/Ca2+ response element-binding protein function is essential for ocular dominance plasticity. J. Neurosci., 22(6), 2237–2245.
Mower, G. D. (1991). The effect of dark rearing on the time course of the critical period in cat visual cortex. Brain Res. Dev. Brain Res., 58(2), 151–158.
Mrsic-Flogel, T. D., Hofer, S. B., Ohki, K., Reid, R. C., Bonhoeffer, T., & Hübener, M. (2007). Homeostatic regulation of eye-specific responses in visual cortex during ocular dominance plasticity. Neuron, 54(6), 961–972.
Müller, C. M., & Griesinger, C. B. (1998). Tissue plasminogen activator mediates reverse occlusion plasticity in visual cortex. Nat. Neurosci., 1(1), 47–53.
Nakagawa, Y., & O’Leary, D. D. (2001). Combinatorial expression patterns of LIM-homeodomain and other regulatory genes parcellate developing thalamus. J. Neurosci., 21(8), 2711–2725.
Nakagawa, Y., & O’Leary, D. D. (2002). Patterning centers, regulatory genes and extrinsic mechanisms controlling arealization of the neocortex. Curr. Opin. Neurobiol., 12(1), 14–25.
Nakamoto, M., Cheng, H. J., Friedman, G. C., McLaughlin, T., Hansen, M. J., Yoon, C. H., et al. (1996). Topographically specific effects of ELF-1 on retinal axon guidance in vitro and retinal axon mapping in vivo. Cell, 86(5), 755–766.
Newton, J. R., Ellsworth, C., Miyakawa, T., Tonegawa, S., & Sur, M. (2004). Acceleration of visually cued conditioned fear through the auditory pathway. Nat. Neurosci., 7(9), 968–973.
Nicol, X., Voyatzis, S., Muzerelle, A., Narboux-Nême, N., Südhof, T. C., Miles, R., et al. (2007). cAMP oscillations and retinal activity are permissive for ephrin signaling during the establishment of the retinotopic map. Nat. Neurosci., 10(3), 340–347.
O’Leary, D. D. (1989). Do cortical areas emerge from a protocortex? Trends Neurosci., 12(10), 400–406.
Oray, S., Majewska, A., & Sur, M. (2004). Dendritic spine dynamics are regulated by monocular deprivation and extracellular matrix degradation. Neuron, 44(6), 1021–1030.
Oster, S. F., Bodeker, M. O., He, F., & Sretavan, D. W. (2003). Invariant Sema5A inhibition serves an ensheathing function during optic nerve development. Development, 130(4), 775–784.
Pak, W., Hindges, R., Lim, Y. S., Pfaff, S. L., & O’Leary, D. D. (2004). Magnitude of binocular vision controlled by islet-2 repression of a genetic program that specifies laterality of retinal axon pathfinding. Cell, 119, 567–578.
Pallas, S. L., Hahm, J., & Sur, M. (1994). Morphology of retinal axons induced to arborize in a novel target, the medial geniculate nucleus. I. Comparison with arbors in normal targets. J. Comp. Neurol., 349(3), 343–362.
Pallas, S. L., Roe, A. W., & Sur, M. (1990). Visual projections induced into the auditory pathway of ferrets. I. Novel inputs to primary auditory cortex (AI) from the LP/pulvinar complex and the topography of the MGN-AI projection. J. Comp. Neurol., 298(1), 50–68.
Pfeiffenberger, C., Cutforth, T., Woods, G., Yamada, J., Renteria, R. C., & Copenhagen, D. R. (2005). Ephrin-As and neural activity are required for eye-specific patterning during retinogeniculate mapping. Nat. Neurosci., 8, 1022–1027.
Pfeiffenberger, C., Yamada, J., & Feldheim, D. A. (2006). Ephrin-As and patterned retinal activity act together in the development of topographic maps in the primary visual system. J. Neurosci., 26, 12873–12884.
Pham, T. A., Rubenstein, J. L., Silva, A. J., Storm, D. R., & Stryker, M. P. (2001). The CRE/CREB pathway is transiently expressed in thalamic circuit development and contributes to refinement of retinogeniculate axons. Neuron, 31(3), 409–420.
Pizzorusso, T., Medini, P., Berardi, N., Chierzi, S., Fawcett, J. W., & Maffei, L. (2002). Reactivation of ocular dominance plasticity in the adult visual cortex. Science, 298, 1248–1251.
Pizzorusso, T., Medini, P., Landi, S., Baldini, S., Berardi, N., & Maffei, L. (2006). Structural and functional recovery from early monocular deprivation in adult rats. Proc. Natl. Acad. Sci. USA, 103(22), 8517–8522.
Plump, A. S., Erskine, L., Sabatier, C., Brose, K., Epstein, C. J., Goodman, C. S., et al. (2002). Slit1 and Slit2 cooperate to prevent premature midline crossing of retinal axons in the mouse visual system. Neuron, 33(2), 219–232.
Pouille, F., & Scanziani, M. (2001). Enforcement of temporal fidelity in pyramidal cells by somatic feed-forward inhibition. Science, 293(5532), 1159–1163.
Ragsdale, C. W., & Grove, E. A. (2001). Patterning the mammalian cerebral cortex. Curr. Opin. Neurobiol., 11(1), 50–58.
Rakic, P. (1988). Specification of cerebral cortical areas. Science, 242, 170–176.
Rao, S. C., Toth, L. J., & Sur, M. (1997). Optically imaged maps of orientation preference in primary visual cortex of cats and ferrets. J. Comp. Neurol., 387(3), 358–370.
Rial Verde, E. M., Lee-Osbourne, J., Worley, P. F., Malinow, R., & Cline, H. T. (2006). Increased expression of the immediate-early gene arc/arg3.1 reduces AMPA receptor-mediated synaptic transmission. Neuron, 52, 461–474.
Ringstedt, T., Braisted, J. E., Brose, K., Kidd, T., Goodman, C., Tessier-Lavigne, M., et al. (2000). Slit inhibition of retinal axon growth and its role in retinal axon pathfinding and innervation patterns in the diencephalon. J. Neurosci., 20(13), 4983–4991.
Rodriguez, J., Esteve, P., Weinl, C., Ruiz, J. M., Fermin, Y., & Trousse, F. (2005). SFRP1 regulates the growth of retinal ganglion cell axons through the Fz2 receptor. Nat. Neurosci., 8(10), 1301–1309.
Roe, A. W., Garraghty, P. E., Esguerra, M., & Sur, M. (1993). Experimentally induced visual projections to the auditory thalamus in ferrets: Evidence for a W cell pathway. J. Comp. Neurol., 334(2), 263–280.
Roe, A. W., Hahm, J. O., & Sur, M. (1991). Experimentally induced establishment of visual topography in auditory thalamus. Soc. Neurosci. Abstracts, 17, 898.
Roe, A. W., Pallas, S. L., Hahm, J. O., & Sur, M. (1990). A map of visual space induced in primary auditory cortex. Science, 250(4982), 818–820.
Roe, A. W., Pallas, S. L., Kwon, Y. H., & Sur, M. (1992). Visual projections routed to the auditory pathway in ferrets: Receptive fields of visual neurons in primary auditory cortex. J. Neurosci., 12(9), 3651–3664.
Rogan, M. T., & LeDoux, J. E. (1995). LTP is accompanied by commensurate enhancement of auditory-evoked responses in a fear conditioning circuit. Neuron, 15, 127–136.
Rossi, F. M., Pizzorusso, T., Porciatti, V., Marubio, L. M., Maffei, L., & Changeux, J. P. (2001). Requirement of the nicotinic acetylcholine receptor beta 2 subunit for the anatomical and functional development of the visual system. Proc. Natl. Acad. Sci. USA, 98(11), 6453–6458.
Rubenstein, J. L., Martinez, S., Shimamura, K., & Puelles, L. (1994). The embryonic vertebrate forebrain: The prosomeric model. Science, 266(5185), 578–580.
Rubenstein, J. L., Shimamura, K., Martinez, S., & Puelles, L. (1998). Regionalization of the prosencephalic neural plate. Annu. Rev. Neurosci., 21, 445–477.
Sawtell, N. B., Frenkel, M. Y., Philpot, B. D., Nakazawa, K., Tonegawa, S., & Bear, M. F. (2003). NMDA receptor-dependent ocular dominance plasticity in adult visual cortex. Neuron, 38(6), 977–985.
Schmidt, J. T., & Eisele, L. E. (1985). Stroboscopic illumination and dark rearing block the sharpening of the regenerated retinotectal map in goldfish. Neuroscience, 14(2), 535–546.
Schmitt, A. M., Shi, J., Wolf, A. M., Lu, C. C., King, L. A., & Zou, Y. (2006). Wnt-Ryk signalling mediates medial-lateral retinotectal topographic mapping. Nature, 439(7072), 31–37.
Schneider, G. E. (1973). Early lesions of superior colliculus: Factors affecting the formation of abnormal retinal projections. Brain Behav. Evol., 8, 73–109.
Sharma, J., Angelucci, A., & Sur, M. (2000). Induction of visual orientation modules in auditory cortex. Nature, 404, 841–847.
Shatz, C. J. (1983). The prenatal development of the cat’s retinogeniculate pathway. J. Neurosci., 3, 482–499.
Shatz, C. J., & Stryker, M. P. (1988). Prenatal tetrodotoxin infusion blocks segregation of retinogeniculate afferents. Science, 242, 87–89.
Shewan, D., Dwivedy, A., Anderson, R., & Holt, C. E. (2002). Age-related changes underlie switch in netrin-1 responsiveness as growth cones advance along visual pathway. Nat. Neurosci., 5(10), 955–962.
Shimogori, T., Banuchi, V., Ng, H. Y., Strauss, J. B., & Grove, E. A. (2004). Embryonic signaling centers expressing BMP, WNT and FGF proteins interact to pattern the cerebral cortex. Development, 131(22), 5639–5647.
Smith, S. L., & Trachtenberg, J. T. (2007). Experience-dependent binocular competition in the visual cortex begins at eye opening. Nat. Neurosci., 10(3), 370–375.
Somers, D. C., Nelson, S. B., & Sur, M. (1995). An emergent model of orientation selectivity in cat visual cortical simple cells. J. Neurosci., 15, 5448–5465.
Song, S., Miller, K. D., & Abbott, L. F. (2000). Competitive Hebbian learning through spike-timing-dependent synaptic plasticity. Nat. Neurosci., 3(9), 919–926.
Sperry, R. W. (1963). Chemoaffinity in the orderly growth of nerve fiber patterns and connections. Proc. Natl. Acad. Sci. USA, 50, 703–710.
Stellwagen, D., Beattie, E. C., Seo, J. Y., & Malenka, R. C. (2005). Differential regulation of AMPA receptor and GABA receptor trafficking by tumor necrosis factor-alpha. J. Neurosci., 25(12), 3219–3228.
Stellwagen, D., & Malenka, R. C. (2006). Synaptic scaling mediated by glial TNF-alpha. Nature, 440(7087), 1054–1059.
Stevens, B., Allen, N. J., Vazquez, L. E., Howell, G. R., Christopherson, K. S., Nouri, N., et al. (2007). The classical complement cascade mediates CNS synapse elimination. Cell, 131(6), 1164–1178.
Stryker, M. P., & Harris, W. A. (1986). Binocular impulse blockade prevents the formation of ocular dominance columns in cat visual cortex. J. Neurosci., 6(8), 2117–2133.
Sur, M., Garraghty, P. E., & Roe, A. W. (1988). Experimentally induced visual projections into auditory thalamus and cortex. Science, 242, 1437–1441.
Sur, M., & Leamey, C. A. (2001). Development and plasticity of cortical areas and networks. Nat. Rev. Neurosci., 2(4), 251–262.
Sur, M., Pallas, S. L., & Roe, A. W. (1990). Cross-modal plasticity in cortical development: Differentiation and specification of sensory neocortex. Trends Neurosci., 13(6), 227–233.
Sur, M., & Rubenstein, J. L. R. (2005). Patterning and plasticity of the cerebral cortex. Science, 310, 805–810.
Suzuki, S., al-Noori, S., Butt, S. A., & Pham, T. A. (2004). Regulation of the CREB signaling cascade in the visual cortex by visual experience and neuronal activity. J. Comp. Neurol., 479, 70–83.
Swindale, N. V., Shoham, D., Grinvald, A., Bonhoeffer, T., & Hübener, M. (2000). Visual cortex maps are optimized for uniform coverage. Nat. Neurosci., 3(8), 822–826.
Taha, S., & Stryker, M. P. (2002). Rapid ocular dominance plasticity requires cortical but not geniculate protein synthesis. Neuron, 34(3), 425–436.
Taha, S. A., & Stryker, M. P. (2005). Ocular dominance plasticity is stably maintained in the absence of alpha calcium calmodulin kinase II (alphaCaMKII) autophosphorylation. Proc. Natl. Acad. Sci. USA, 102(45), 16438–16442.
Tavazoie, S. F., & Reid, R. C. (2000). Diverse receptive fields in the lateral geniculate nucleus during thalamocortical development. Nat. Neurosci., 3(6), 608–616.
Trachtenberg, J. T., & Stryker, M. P. (2001). Rapid anatomical plasticity of horizontal connections in the developing visual cortex. J. Neurosci., 21(10), 3476–3482.
Trachtenberg, J. T., Trepel, C., & Stryker, M. P. (2000). Rapid extragranular plasticity in the absence of thalamocortical plasticity in the developing primary visual cortex. Science, 287(5460), 2029–2032.
Tropea, D., Kreiman, G., Lyckman, A., Mukherjee, S., Yu, H., Horng, S., et al. (2006). Gene expression changes and molecular pathways mediating activity-dependent plasticity in visual cortex. Nat. Neurosci., 9, 660–668.
Trousse, F., Marti, E., Gruss, P., Torres, M., & Bovolenta, P. (2001). Control of retinal ganglion cell axon growth: A new role for sonic hedgehog. Development, 128(20), 3927–3936.
Turrigiano, G. G., & Nelson, S. B. (2004). Homeostatic plasticity in the developing nervous system. Nat. Rev. Neurosci., 5, 97–107.
Tuttle, R., Braisted, J. E., Richards, L. J., & O’Leary, D. D. (1998). Retinal axon guidance by region-specific cues in diencephalon. Development, 125(5), 791–801.
von Melchner, L., Pallas, S. L., & Sur, M. (2000). Visual behaviour mediated by retinal projections directed to the auditory pathway. Nature, 404(6780), 871–876.
Wang, K. H., Majewska, A., Schummers, J., Farley, B., Hu, C., Sur, M., et al. (2006). In vivo two-photon imaging reveals a role of arc in enhancing orientation specificity in visual cortex. Cell, 126(2), 389–402.
Webber, C. A., Hyakutake, M. T., & McFarlane, S. (2003). Fibroblast growth factors redirect retinal axons in vitro and in vivo. Dev. Biol., 263(1), 24–34.
White, L. E., Coppola, D. M., & Fitzpatrick, D. (2001). The contribution of sensory experience to the maturation of orientation selectivity in ferret visual cortex. Nature, 411, 1049–1052.
White, L. E., & Fitzpatrick, D. (2007). Vision and cortical map development. Neuron, 56(2), 327–338.
Williams, S. E., Mann, F., Erskine, L., Sakurai, T., Wei, S., Rossi, D. J., et al. (2003). Ephrin-B2 and EphB1 mediate retinal axon divergence at the optic chiasm. Neuron, 39(6), 919–935.
Wong, R. O., Meister, M., & Shatz, C. J. (1993). Transient period of correlated bursting activity during development of the mammalian retina. Neuron, 11(5), 923–938.
Yang, Y., Fischer, Q. S., Zhang, Y., Baumgärtel, K., Mansuy, I. M., & Daw, N. W. (2005). Reversible blockade of experience-dependent plasticity by calcineurin in mouse visual cortex. Nat. Neurosci., 8(6), 791–796.
Yu, H., Farley, B. J., Jin, D. Z., & Sur, M. (2005). The coordinated mapping of visual space and response features in visual cortex. Neuron, 47(2), 267–280.
Zhang, L. I., & Poo, M. M. (2001). Electrical activity and development of neural circuits. Nat. Neurosci., 4, Suppl., 1207–1214.
Zhou, X. H., Brandau, O., Feng, K., Oohashi, T., Ninomiya, Y., Rauch, U., et al. (2003). The murine Ten-m/Odz genes show distinct but overlapping expression patterns during development and in adult brain. Gene Expr. Patterns, 3, 397–405.
Ziburkus, J., Lo, F. S., & Guido, W. (2003). Nature of inhibitory postsynaptic activity in developing relay cells of the lateral geniculate nucleus. J. Neurophysiol., 90(2), 1063–1070.
7 Synaptic Plasticity and Spatial Representations in the Hippocampus

Jonathan R. Whitlock and Edvard I. Moser

Jonathan R. Whitlock and Edvard I. Moser: Kavli Institute for Systems Neuroscience and Centre for the Biology of Memory, Norwegian University of Science and Technology, Trondheim

Abstract: How does the brain acquire and remember new experiences? It is believed that synaptic plasticity, the process by which synaptic connections are strengthened or weakened, is a key mechanism for information storage in the central nervous system. Long-term potentiation (LTP), the long-lasting enhancement of excitatory synaptic transmission, and long-term depression (LTD), the persistent depression of synaptic responsiveness, are experimental models of synaptic plasticity thought to reveal how synapses are modified during learning. In this chapter we focus on the properties of LTP and LTD that make them attractive functional models for memory and review key findings from studies that demonstrate a link between LTP and behavior. We then discuss how synaptic modifications can affect spatial representations expressed by hippocampal place cells, which have been used as tools for understanding how synaptic changes are implemented in neural networks. We conclude the chapter by discussing how attractor states in neural networks can aid in the storage and recall of many representations involving more than just space, and how LTP may help fine-tune shifts between attractor states during behavior.

Synaptic modifications as a means for memory: The realization of an idea

The idea that memory traces are stored as changes in synaptic efficacy is anything but new. Since the late 19th century, when the Spanish neuroanatomist Santiago Ramón y Cajal first observed spinelike structures lining the dendrites of cortical pyramidal cells, the idea has existed that the connections between nerve cells provide an anatomical substrate for memory. This idea was formalized by Donald Hebb in his 1949 work The Organization of Behavior, in which he formulated his famous postulate that is still one of the most quoted phrases in neuroscience: “When an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A’s efficiency, as one of the cells firing B, is increased” (Hebb, 1949). Hebb recognized that there must be a mechanism by which a postsynaptic neuron (cell “B”) can stabilize its connections with a nearby presynaptic neuron (cell “A”) when it activates the postsynaptic cell strongly enough.

It was not until approximately 20 years after Hebb’s seminal work that the first efforts were made that would successfully demonstrate the physical reality of long-lasting, activity-dependent modifications at synapses. Prior to the discovery of long-term potentiation (LTP), researchers had sought and failed to elicit long-term synaptic modifications in spinal pathways, where the observed enhancements were very short-lived (Eccles & McIntyre, 1953), and in the neocortex, whose extensively intricate neuroanatomy proved too complex to isolate responses from single synapses. It was in the hippocampus, whose straightforward laminar architecture made easy the study of monosynaptic responses (figure 7.1A), where LTP was discovered by Terje Lømo and Tim Bliss (Bliss & Lømo, 1973). The initial characterization was made in the dentate gyrus, the first of the three major subfields in the trisynaptic circuit of the hippocampus (figure 7.1A). Bliss and Lømo used a stimulating electrode to deliver brief pulses of minute electrical current to the perforant path (PP, figure 7.1A), the largest direct input from the neocortex to the hippocampus. A recording electrode was placed in the dentate gyrus (DG), the subregion of the hippocampus that receives the largest perforant path input, to record field excitatory postsynaptic potentials (fEPSPs) evoked in response to electrical stimulation. A fEPSP is a transient voltage deflection recorded at the tip of an extracellular recording electrode when ions flow into or out of the dendrites of large populations of cells (see traces at top of figure 7.1B; the responses are negative-going in this case because positive current is flowing away from the electrode). It was found that the amplitude of dentate fEPSPs, taken as a measure of synaptic strength, showed substantial increases lasting for several hours in response to brief (10-second) episodes of tetanic (15 Hz) stimulation applied to the perforant path, and that the enhancements were only expressed in the pathways that received the tetanus (figure 7.1B). The fact that a brief stimulus could induce changes that were (1) long lasting and (2) input specific in a structure that was known to be involved
Norway in memory formation (Scoville & Milner, 1957) immediately
Figure 7.2 Postsynaptic calcium entry is the key for inducing LTP and LTD. (A) The NMDA receptor is activated by coincident pre- and postsynaptic activity. (left) During synaptic transmission, glutamate is released into the synaptic cleft and acts on AMPA and NMDA receptors, though NMDA receptors are blocked by Mg2+ ions at negative (resting) membrane potentials; (middle) if glutamate release coincides with sufficient postsynaptic depolarization, the Mg2+ block is removed and Ca2+ enters the postsynaptic neuron through NMDA receptors; (right) postsynaptic kinases initiate synaptic potentiation by (1) phosphorylating AMPA receptors already at the synapse, and (2) driving additional AMPA receptors to synapses. Glutamate release into the synaptic cleft (3) is also enhanced. (B) A large, brief increase in postsynaptic calcium, induced here by high-frequency stimulation (HFS), favors the activation of protein kinases and results in LTP, while small, sustained Ca2+ elevations during low-frequency stimulation (LFS) favor the activation of protein phosphatases, resulting in the dephosphorylation of synaptic proteins and LTD. (C) Long-term changes in synaptic strength can be explained as a function of the amount of calcium flowing into the postsynaptic neuron via NMDA receptors. (Modified with permission from M. Bear, B. Connors, & M. Paradiso, Neuroscience: Exploring the brain, 3rd edition, © 2007, Lippincott Williams & Wilkins.)
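The scheme summarized in figure 7.2 can be illustrated with a toy sketch. It assumes, purely for illustration, that calcium entry scales with postsynaptic depolarization whenever glutamate is bound, and that two fixed calcium thresholds separate "no change," LTD, and LTP. The function names, thresholds, and arbitrary units are hypothetical and are not taken from the figure or from the cited studies.

```python
# Toy sketch of calcium-dependent plasticity; all numbers are arbitrary,
# illustrative assumptions rather than measured values.

LTD_THRESHOLD = 0.2  # hypothetical Ca2+ level above which phosphatases dominate
LTP_THRESHOLD = 0.6  # hypothetical Ca2+ level above which kinases dominate


def nmda_calcium(glutamate_bound: bool, depolarization: float) -> float:
    """Return Ca2+ entry through NMDA receptors in arbitrary units (0 to 1).

    Calcium enters only when glutamate is bound AND the postsynaptic cell is
    depolarized enough to relieve the Mg2+ block (coincidence detection).
    `depolarization` runs from 0 (resting) to 1 (strongly depolarized).
    """
    if not glutamate_bound:
        return 0.0
    return max(0.0, min(1.0, depolarization))


def plasticity_outcome(calcium: float) -> str:
    """Map a postsynaptic Ca2+ level onto the direction of synaptic change."""
    if calcium >= LTP_THRESHOLD:
        return "LTP"  # large Ca2+ rise: kinase activation
    if calcium >= LTD_THRESHOLD:
        return "LTD"  # modest Ca2+ rise: phosphatase activation
    return "no change"


if __name__ == "__main__":
    conditions = [
        ("glutamate release at rest", True, 0.05),
        ("low-frequency stimulation", True, 0.3),
        ("high-frequency stimulation", True, 0.9),
        ("depolarization without glutamate", False, 0.9),
    ]
    for label, glutamate, depolarization in conditions:
        calcium = nmda_calcium(glutamate, depolarization)
        print(f"{label}: {plasticity_outcome(calcium)}")
```

The two-threshold form is only one way to express the idea that the amount of calcium determines the sign of the change; a graded curve would serve equally well for illustration.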
into the synaptic cleft (Bliss, Errington, & Lynch, 1990; Bliss, Errington, Lynch, & Williams, 1990; Dolphin, Errington, & Bliss, 1982). Structural changes, such as the growth of new spines and the enlargement or splitting of synapses in two, have also been observed following LTP induction (Abraham & Williams, 2003; Chen, Rex, Casale, Gall, & Lynch, 2007; Nagerl, Eberhorn, Cambridge, & Bonhoeffer, 2004; see Yuste & Bonhoeffer, 2001, for review). LTP lasting several hours or days (sometimes referred to as L-LTP) requires the synthesis of new proteins, and will gradually decay back to baseline if protein synthesis inhibitors are applied within the first couple of hours following conditioning stimulation (Frey, Krug, Reymann, & Matthies, 1988; Krug, Lossner, & Ott, 1984; Stanton & Sarvey, 1984; see Kelleher, Govindarajan, & Tonegawa, 2004, for review).

In the case of LTD, small increases in Ca2+ arising from weak synaptic stimulation favor the activation of protein phosphatases that dephosphorylate synaptic proteins including glutamate receptors (figure 7.2B) (Mulkey, Endo, Shenolikar, & Malenka, 1994; Mulkey, Herron, & Malenka,
1993). In contrast to LTP, LTD results in the removal and eventual degradation of AMPA receptors, NMDA receptors, and structural synaptic proteins (Ehlers, 2000; Heynen et al., 2000; Colledge et al., 2003), as well as the retraction of existing spines (Nagerl et al., 2004; Zhou, Homma, & Poo, 2004). Despite the fact that LTD involves the destruction of some preexisting proteins, long-lasting LTD (i.e., L-LTD), like L-LTP, also depends on the synthesis of new proteins. In fact, recent experiments have shown that proteins synthesized in response to L-LTP induction at one set of synapses can also be used to sustain L-LTD at nearby synapses on the same cell (Sajikumar & Frey, 2004). For a lengthier description of the mechanisms of LTP and LTD we recommend reviews by Bliss and Collingridge (1993), Malenka and Bear (2004), and Malinow and Malenka (2002).

Properties of LTP and LTD that are relevant for memory formation

Many of the physiological properties of LTP and LTD are homologous to the characteristics of behavioral memory expressed at the level of the whole animal, such as rapid induction (enabling fast learning) and longevity (allowing some memories to last a lifetime). While LTP and LTD are generally accepted as the leading cellular mechanisms for learning and memory, it should be noted that certain aspects of memory do not necessarily translate directly from changes at synapses, but more likely emerge at the level of the neural network in which the synaptic modifications are embedded (Hebb, 1949; Marr, 1971). The functional features exhibited by LTP and LTD are exactly the type that would be useful in enabling neural networks to rapidly acquire and store large amounts of information during behavior.

First, induction is rapid and long lasting. Changes in synaptic strength can be induced following very brief trains of high-frequency stimulation (Douglas & Goddard, 1975), and the resulting potentiation can last anywhere from several minutes to perhaps the entire lifetime of an animal (Abraham, Logan, Greenwood, & Dragunow, 2002; Barnes, 1979). These properties allow information learned from very brief episodes to be remembered for a lifetime, such as not to stick one’s finger in a light socket.

Second, LTP provides a cellular mechanism for association of inputs to different synapses of a cell. This is indirectly apparent from the fact that the probability of inducing LTP increases with the number of stimulated afferents, a phenomenon referred to as cooperativity (McNaughton, Douglas, & Goddard, 1978). Weak stimulation will only affect a small proportion of synapses and is less likely to induce a postsynaptic change, whereas a strong stimulus will affect more synapses and increase the likelihood of inducing a long-lasting change in the postsynaptic response (figure 7.3, left). Transiently increasing the stimulation intensity during the delivery of a tetanus will lead to the recruitment of additional afferents and cause potentiation of synapses that would not have been coactivated during a weaker stimulation. In this sense, “cooperativity” is a form of associative synaptic potentiation (discussed in the next paragraph).

A more direct illustration of associativity involves the observation that when both weak and strong inputs are stimulated together, the weak input will show LTP, whereas if the weak input is stimulated alone, no LTP will be seen (figure 7.3, middle) (Barrionuevo & Brown, 1983; Levy & Steward, 1979). Associativity is relevant to learning and memory because it allows neurons to associate arbitrary patterns of activity from distinct neural pathways that may relay information regarding distinct but related events. This
Figure 7.3 LTP exhibits physiological properties that make it a tenable cellular substrate for memory. Cooperativity (left) describes the property whereby a weak tetanus that activates relatively few afferents will not induce a change in the synaptic response, whereas coactivating many inputs with sufficiently strong stimulation will induce a change. Associativity (middle) describes the property whereby the concurrent stimulation of weak and strong convergent pathways results in the long-term strengthening of the weak pathway. (right) LTP is input specific because only those synapses active at the time of the tetanus express potentiation; inactive inputs do not share in the potentiation. Cooperativity, associativity, and input specificity apply similarly to LTD. (Modified with permission from R. Nicoll, J. Kauer, & R. Malenka, The current excitement in long-term potentiation, Neuron, 1, 97–103, © 1988, Cell Press.)
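The three properties named in figure 7.3 can be captured in a deliberately minimal sketch. It assumes that the drives of all pathways active during a tetanus simply add up to a postsynaptic depolarization, and that an active pathway is potentiated only if that sum reaches a fixed induction threshold. The function name, the pathway labels, and the numerical values are hypothetical and chosen only for illustration.

```python
# Toy threshold model of cooperativity, associativity, and input specificity.
# Drives and the threshold are arbitrary, illustrative numbers.

def tetanus(drives: dict, threshold: float = 1.0) -> dict:
    """Return, for each pathway, whether it expresses LTP after a tetanus.

    `drives` maps a pathway name to the strength of its activation during the
    tetanus (0.0 means the pathway was silent). A pathway is potentiated only
    if it was active AND the summed drive reached the induction threshold.
    """
    total_drive = sum(drives.values())
    return {
        name: (drive > 0.0 and total_drive >= threshold)
        for name, drive in drives.items()
    }


if __name__ == "__main__":
    # Cooperativity: few afferents fail, many afferents succeed.
    print(tetanus({"weak": 0.3}))
    print(tetanus({"strong": 1.2}))
    # Associativity: the weak pathway is potentiated when paired with the strong one.
    print(tetanus({"weak": 0.3, "strong": 1.2}))
    # Input specificity: a silent pathway does not share in the potentiation.
    print(tetanus({"strong": 1.2, "silent": 0.0}))
```

Summation against a single threshold is a simplification; it is used here only because it makes the logical relationship among the three properties easy to see.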
Figure 7.4 The Morris water maze is one of the most common behavioral tests of spatial learning and memory. (A) In the task, an animal is placed in a pool filled with opaque water and must locate a submerged escape platform. At the start of training the animal’s swim path is typically long and circuitous. After several training trials the rat learns the platform location and will swim straight to it during a test trial. (Modified with permission from M. Bear, B. Connors, & M. Paradiso, Neuroscience: Exploring the brain, 3rd edition, © 2007, Lippincott Williams & Wilkins.) (B) Shown at top are the swim trajectories of animals tested in the Morris water maze after having been given different drug treatments prior to training. Animals injected with saline (left) spent the greatest amount of time in the target quadrant (which earlier had the escape platform), whereas animals treated with the NMDA receptor antagonist D,L-APV (middle) showed no preference for the target quadrant. Animals injected with the inactive L-isomer of APV (right) showed normal spatial memory similar to the saline-injected group. (Modified with permission from Morris, Anderson, Lynch, & Baudry, Selective impairment of learning and blockade of long-term potentiation by an N-methyl-D-aspartate receptor antagonist, AP5, Nature, 319, 774–776, © 1986, Nature [Nature Publishing Group].)
Figure 7.5 Disrupting the pattern of synaptic weights in a network results in a loss of the information stored across the synapses. (A) A hypothetical distribution of synaptic enhancements induced in a network by learning; lines are neuronal processes which intersect at synapses represented as circles; black circles are synapses potentiated by recent learning, gray circles are synapses already potentiated from an unrelated event; white circles are unpotentiated synapses. (top right) Randomly potentiating irrelevant synapses with high-frequency stimulation (HFS) after learning scrambles the pattern of learning-induced synaptic weights and disrupts memory storage; this is the experimental strategy used by Brun, Ytterbo, Morris, Moser, and Moser (2001). (top left) The reversal of learning-related synaptic enhancements should erase the information stored across the connections and cause retrograde amnesia. (Modified with permission from Brun et al., Retrograde amnesia for spatial memory induced by NMDA receptor-mediated long-term potentiation, Journal of Neuroscience, 21(1), 356–362, © 2001, Society for Neuroscience.) (B) Robust LTP induced in the dentate gyrus in vivo can be rapidly reversed 22 hours later by intrahippocampal infusion of the selective PKMζ antagonist “zeta inhibitory peptide,” or ZIP. (C) In parallel with the reversal of LTP, intrahippocampal infusions of ZIP caused abrupt and complete amnesia in a place-avoidance task in rats tested either 24 hours or 1 month after training. The avoidance memory of saline-infused animals remained intact. (Modified with permission from Pastalkova et al., Storage of spatial information by the maintenance mechanism of LTP, Science, 313, 1141–1144, © 2006, American Association for the Advancement of Science.)
in the primary motor cortex (M1) corresponding to the preferred reaching paw were substantially larger than fEPSPs from the hemisphere for the untrained paw (i.e., the “untrained” hemisphere) (figure 7.6A) (Rioult-Pedotti, Friedman, Hess, & Donoghue, 1998). Follow-up studies investigated the impact of skill learning on subsequent LTP and LTD and revealed a marked reduction in the amount of LTP and an enhancement in the magnitude of LTD in the “trained” portion of cortex (figure 7.6B) (Monfils & Teskey, 2004; Rioult-Pedotti, Friedman, & Donoghue, 2000), suggesting that learning had elevated the synapses in motor cortex closer to their ceiling for LTP expression and, concurrently, left more space for synaptic depression. The partial reduction, or “occlusion,” of LTP by learning suggests that skill learning and LTP engage a common neural mechanism. More recent work has shown that the synaptic modification range shifts upward to accommodate the synaptic enhancements a few weeks after learning, thereby restoring the capacity of the connections to express their previous levels of LTP and LTD (Rioult-Pedotti, Donoghue, & Dunaevsky, 2007).
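The logic of occlusion and of the later upward shift of the modification range can be made concrete with a small bookkeeping sketch. It assumes that synaptic strength is confined between a floor and a ceiling, that the LTP still available equals the distance to the ceiling, and that the LTD still available equals the distance to the floor. The class, its field names, and all numerical values are hypothetical and serve only to illustrate the argument.

```python
# Toy bookkeeping model of LTP occlusion; values are arbitrary and illustrative.
from dataclasses import dataclass


@dataclass
class Synapse:
    strength: float = 1.0  # current synaptic strength (arbitrary units)
    floor: float = 0.5     # lower bound of the modification range
    ceiling: float = 1.5   # upper bound of the modification range

    def ltp_available(self) -> float:
        """Room left for further potentiation."""
        return self.ceiling - self.strength

    def ltd_available(self) -> float:
        """Room left for depression."""
        return self.strength - self.floor

    def learn(self, gain: float) -> None:
        """Learning-induced enhancement, capped at the ceiling."""
        self.strength = min(self.ceiling, self.strength + gain)

    def shift_range(self, delta: float) -> None:
        """Move the whole modification range upward by `delta`."""
        self.floor += delta
        self.ceiling += delta


if __name__ == "__main__":
    synapse = Synapse()
    print("naive:", synapse.ltp_available(), synapse.ltd_available())
    synapse.learn(0.3)        # learning occludes LTP and leaves more room for LTD
    print("after learning:", synapse.ltp_available(), synapse.ltd_available())
    synapse.shift_range(0.3)  # a later range shift restores the original capacity
    print("after range shift:", synapse.ltp_available(), synapse.ltd_available())
```

In this sketch the enhancement itself persists after the range shift; only the room for further LTP and LTD returns to its starting values.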
Figure 7.6 Learning induces synaptic enhancements that occlude LTP in brain areas relevant to the type of information learned. (A) Learning a new motor skill enhanced fEPSP amplitude specifically in the forelimb region of primary motor cortex (M1) corresponding to the preferred reaching paw in trained rats; no enhancements were observed in the same area of M1 in untrained control animals. (Modified with permission from Rioult-Pedotti, Friedman, Hess, & Donoghue, Strengthening of horizontal cortical connections following skill learning, Nature Neuroscience, 1(3), 230–234, © 1998, Nature [Nature Publishing Group].) (B) The fEPSP enhancements associated with skill learning resulted in the partial occlusion of LTP in M1 in the “reaching” hemisphere compared to the “nonreaching” hemisphere in trained rats. (Modified with permission from Rioult-Pedotti, Friedman, & Donoghue, Learning-induced LTP in neocortex, Science, 290, 533–536, © 2000, American Association for the Advancement of Science.) (C) In vivo recording experiments in rats revealed that single-trial inhibitory avoidance (IA) training led to fEPSP enhancements in a subpopulation of recording electrodes in the hippocampus of trained animals relative to controls (who walked through the training apparatus without receiving a foot shock, i.e., the “Walk through” group). Data were collected 2 hours after conditioning. (D) Electrodes showing fEPSP enhancements upon IA training reached LTP saturation more rapidly and showed less LTP in response to repeated trains of HFS, demonstrating that this form of learning mimicked and occluded hippocampal LTP in vivo. (Modified with permission from Whitlock, Heynen, Shuler, & Bear, Learning induces long-term potentiation in the hippocampus, Science, 313, 1093–1097, © 2006, American Association for the Advancement of Science.)
Studies in the amygdala have also yielded strong evidence linking LTP and memory formation. Because of its well-characterized anatomical connectivity, the amygdala has allowed neuroscientists the opportunity to directly investigate associative synaptic plasticity between distinct inputs following associative learning. One of the most common experimental approaches has been to use Pavlovian fear-conditioning paradigms in which the aversive, fear-evoking stimulus of a foot shock (the US) is paired with a novel environmental cue, such as a tone (the CS). Research in the 1990s showed that repeatedly pairing a tone with a foot shock resulted in the strengthening of auditory thalamic inputs to the amygdala and increases in the amplitude of auditory-evoked responses when animals were replayed the tone after conditioning (McKernan & Shinnick-Gallagher, 1997; Rogan, Staubli, & LeDoux, 1997). Thus the initially weak tone representation became potentiated through its association with the foot shock. Similar to LTP, it was found that this form of learning resulted in the delivery of AMPA receptors to amygdalar synapses and that blocking
the synaptic delivery of AMPA receptors prevented the formation of the fear memory (Rumpel, LeDoux, Zador, & Malinow, 2005).

In addition to the amygdala, several studies have demonstrated learning-specific, LTP-like changes in the synaptic expression and phosphorylation of AMPA receptors in the hippocampus following tasks such as contextual fear conditioning, where animals learn to fear a context as opposed to a discrete tone, and inhibitory avoidance training (Shukla, Kim, Blundell, & Powell, 2007; Matsuo, Reijmers, & Mayford, 2008; Cammarota, Bernabeu, Levi De Stein, Izquierdo, & Medina, 1998; Bevilaqua, Medina, Izquierdo, & Cammarota, 2005; Whitlock, Heynen, Shuler, & Bear, 2006). Many of the downstream biochemical cascades initiated by inhibitory avoidance training are the same as those seen following LTP induction in the hippocampus (for review, see Izquierdo et al., 2006). Electrophysiological experiments have further confirmed the occurrence of LTP-like modification of hippocampal synapses following various learning tasks. In one such study, Sacchetti and colleagues showed that hippocampal slices obtained from rats after contextual fear conditioning showed fEPSP enhancements that partially occluded subsequent LTP, suggesting that contextual learning and LTP shared a common expression mechanism (Sacchetti et al., 2001, 2002). LTP-like enhancements in synaptic transmission have also been recorded in area CA1 following trace eyeblink conditioning, a form of hippocampal-dependent associative learning. Enhanced fEPSP responses were reported following this task in hippocampal slices prepared from trained rabbits (Power, Thompson, Moyer, & Disterhoft, 1997), as well as in the intact hippocampus of freely behaving mice (Gruart, Munoz, & Delgado-Garcia, 2006). Some of the most conclusive evidence demonstrating learning-induced LTP in the hippocampus comes from recent in vivo recording experiments in rats that were given inhibitory avoidance training (Whitlock et al., 2006). Multielectrode arrays were chronically implanted to record fEPSPs at several sites in the hippocampus of awake, behaving animals before and after inhibitory avoidance training. The training caused abrupt and long-lasting (>3 hr) enhancements of evoked fEPSPs in a subpopulation of the recording electrodes in trained animals relative to controls (figure 7.6C). Additional experiments demonstrated that electrodes showing training-related fEPSP enhancements expressed less subsequent LTP in response to HFS than neighboring electrodes that were not enhanced by training; that is, the learning-related enhancements partially occluded subsequent LTP (figure 7.6D). This demonstration of learning-induced fEPSP enhancements that occlude LTP in vivo provided long-awaited evidence that the strengthening of hippocampal synapses is a natural physiological occurrence following some forms of associative learning.

How does LTP influence hippocampal receptive fields?

If changes in synaptic transmission ultimately result in modified behavior, then they must change the way in which the brain structures that mediate those behaviors communicate with one another. A mechanistic understanding of how LTP contributes to behavior therefore requires a description of how changes in synaptic strength affect representations in neural networks. A well-studied experimental tool for understanding neural representations has been hippocampal place cells, first characterized in area CA1, which discharge only when an animal occupies a particular spatial location, the “place field” (O’Keefe & Dostrovsky, 1971). Neighboring place cells express distinct but overlapping place fields such that the entire surface of a recording environment is completely represented by a group of cells (O’Keefe, 1976). The spatial representations of place cells are extremely specific, with the cells firing at entirely unrelated locations from one recording environment to the next (O’Keefe & Conway, 1978), and can be incredibly stable, maintaining the same firing field locations for as long as the cells are identifiable (Thompson & Best, 1990). More recent advances in recording technology have enabled researchers to simultaneously record the activity of large ensembles of place cells (>100) as new map representations emerged during exploration of a novel environment (Wilson & McNaughton, 1993). Because the concerted activity of large groups of cells in the hippocampus will completely cover any environment encountered by an animal, it has been hypothesized that the hippocampus provides the neural substrate for an integrative “cognitive map,” which provides “an objective spatial framework within which the items and events of an organism’s experience are located and interrelated” (O’Keefe & Nadel, 1978). The remarkable spatial specificity and stability of place cells make them ideal candidates for contributing to a spatial memory system. Long-term potentiation is thought to fit into this framework by providing a synapse-specific mechanism enabling the long-lasting storage of spatial representations for a potentially very large number of environments. In this section we review studies that have begun to establish a link between LTP and place cell representations.

One of the earliest studies demonstrating a mechanistic relationship between LTP and place fields used mice carrying a CA1-specific deletion of the gene encoding the NMDA receptor subunit NR1 (McHugh, Blum, Tsien, Tonegawa, & Wilson, 1996). In parallel with the previously mentioned impairments in LTP and spatial learning in these mice, the authors found that the firing fields of CA1 place cells exhibited somewhat reduced spatial specificity, although the place fields did not disappear entirely (i.e., they were broader and had less well-defined boundaries than those of control animals; figure 7.7A). Place cells with overlapping place fields also showed reduced covariance of firing, implying a reduced capacity
Figure 7.7 Genetic deletion of the obligatory NR1 subunit of the NMDA receptor alters place cell properties without preventing the expression of place fields per se. (A) Examples of direction-specific CA1 place cell activity from CA1-specific NR1 knockout mice and control mice running on a one-dimensional linear track; the panels show the firing rates of cells as a function of the location of the animals on the track. In this example, the cells were virtually silent when the animals traversed the track in the upward direction, but fired in a spatially restricted manner as the animals ran back down. Place fields in CA1 knockout mice were stable but significantly larger than in controls. (Modified with permission from McHugh, Blum, Tsien, Tonegawa, & Wilson, Impaired hippocampal representation of space in CA1-specific NMDAR1 knockout mice, Cell, 87, 1339–1349, © 1996, Cell Press.) (B) The firing fields of CA1 place cells in CA3-specific NR1 knockout mice differed from controls only during specific environmental manipulations. There were no differences in place field properties when four out of four distal cues were present in the recording arena (“full cue” condition); however, when mice were returned to the arena with only one of four cues present (“partial cue”), CA3 knockout mice expressed significantly smaller CA1 place fields and had lower firing rates than controls. These experiments suggested a functional role for CA3 in pattern completion. (Modified with permission from Nakazawa et al., Requirement for hippocampal CA3 NMDA receptors in associative memory recall, Science, 297, 211–218, © 2002, American Association for the Advancement of Science.)
for coordinating ensemble codes for spatial location across cells. Considerable sparing of place-specific firing was also seen in mice with a specific deletion of the NR1 subunit in the CA3 subfield (Nakazawa et al., 2002). In these animals, the sharpness of firing fields in CA1 was not reduced at all. Impaired spatial firing appeared only under conditions where a substantial fraction of the landmarks in the environment were removed (figure 7.7B). Further insight was obtained in a study by Kentros and colleagues, where the authors used a pharmacological approach to compare the role of NMDA receptors in induction and maintenance of spatial representations (Kentros et al., 1998). They found that injecting rats with a selective NMDA-receptor antagonist prevented the maintenance of new place fields acquired in novel environments when animals were reexposed to the environments a day later. The drug treatment did not affect preexisting place fields in a familiar environment, and new place fields were expressed instantaneously as animals explored a novel environment. These observations, together with the spared spatial firing observed in the NR1 knockout
mice, suggest that the mechanism for the expression of place fields per se is NMDA-receptor independent. NMDA receptors may instead be necessary for maintaining place fields in fixed locations across different experiences in the environment.

NMDA receptor activation is also necessary for experience-dependent changes in place cell discharge properties, as revealed by experiments in which rats repeatedly traversed the length of a linear track. This behavior was associated with the asymmetric, backward expansion of CA1 place fields relative to the rat’s direction of motion, which was hypothesized to aid in predicting elements of upcoming spatial sequences before they actually occurred (Mehta, Barnes, & McNaughton, 1997; Mehta, Quirk, & Wilson, 2000). This form of behaviorally driven receptive field plasticity was hypothesized to arise from LTP-like synaptic enhancements between cells in CA3 and CA1 and, indeed, the effect was blocked in animals injected with NMDA receptor antagonists (Ekstrom, Meltzer, McNaughton, & Barnes, 2001). In addition to studies demonstrating a permissive role for NMDA-receptor activation in place cell plasticity, evidence supporting an instructive role for LTP in driving changes in place representations comes from a study by Dragoi and colleagues. It was found that inducing LTP in the hippocampus caused remapping of place cell firing fields in familiar environments, including the creation of new fields, the disappearance of others, and changes in the directional preferences of others (Dragoi, Harris, & Buzsaki, 2003). Additional work revealed that contextual fear conditioning, which itself induces LTP-like enhancements of hippocampal fEPSPs (Sacchetti et al., 2001, 2002), also results in the partial remapping of place fields in CA1 (Moita, Rosis, Zhou, LeDoux, & Blair, 2004), suggesting that synaptic plasticity and place field plasticity are merely different aspects of a common mechanism engaged by the hippocampus during associative contextual learning.

Synaptic plasticity and attractor dynamics in neural networks

Place cells are thought to be part of a hippocampal system for storage of episodic memories with a spatial component (S. Leutgeb et al., 2005; S. Leutgeb, J. K. Leutgeb, Moser, & Moser, 2005). Several studies over the years have revealed that place cells encode more than just space, including odors, textures, temporal sequences, and prior events (Hampson, Heyser, & Deadwyler, 1993; Moita et al., 2004; Wood, Dudchenko, & Eichenbaum, 1999; Wood, Dudchenko, Robitsek, & Eichenbaum, 2000; Young, Fox, & Eichenbaum, 1994), and the place cell network is known to support a number of discrete and graded representations in the same environment (Bostock, Muller, & Kubie, 1991; J. Leutgeb et al., 2005; Markus et al., 1995; Muller & Kubie, 1987). This multiplicity of the hippocampal spatial map contrasts strongly with the universal nature of representations one synapse upstream, in the superficial layers of the medial entorhinal cortex, which interfaces most of the external sensory information from the cortex to the hippocampus and back (Fyhn, Molden, Witter, Moser, & Moser, 2004). The key cell type among the entorhinal inputs to the hippocampus is the grid cell, which fires at sharply defined locations like place cells in the hippocampus but differs from such cells in that each cell has multiple firing locations and that the firing locations of each cell form a tessellating triangular pattern across the entire environment available to the animal (Hafting, Fyhn, Molden, Moser, & Moser, 2005). Different grid cells have nonoverlapping firing fields; that is, the grids are offset relative to each other (Hafting et al., 2005), but the grids of different colocalized cells keep a constant spatial relationship between different environments, implying that a single spatial map may be used in all behavioral contexts, very much unlike the recruitment of discrete and apparently nonoverlapping representations in the hippocampus (Fyhn, Hafting, Treves, Moser, & Moser, 2007). These observations, taken as a whole, suggest that self-location is maintained and perhaps generated in entorhinal cortex (McNaughton, Battaglia, Jensen, Moser, & Moser, 2006), whereas the role of the hippocampus is to differentiate between places and experiences associated with places, and to associate each of them to the particular features of each environment. Such a regional differentiation would be consistent with a critical role for the hippocampus in memory for individual episodes.

But how are associative memories encoded and retrieved in the place cell system? It is commonly believed that memories are encoded at the level of neural ensembles and that the ensembles are implemented in neural attractor networks (Amit, Gutfreund, & Sompolinsky, 1985, 1987; Hopfield, 1982). An attractor network has one or several preferred positions or volumes in the space of network states, such that when the system is started from any location outside the preferred positions, it will evolve until it reaches one of the attractor basins (figure 7.8A). It will then stay there until the system receives new input. These properties allow stored memories (the preferred positions of the system) to be recalled from degraded versions of the original input (positions that are slightly different from the preferred positions). Storing different places and episodes as discrete states in such a network keeps memories separate and avoids memory interference. Attractor networks could, in principle, be hardwired, but for the hippocampus, as well as any other memory-storing system, this is quite unlikely considering that thousands of new memories are formed in the system each day. It is more likely that representations evolve over time, with new states being formed each time a new event is experienced. The formation of hippocampal attractor states is thought to
Figure 7.8 Patterns of activity in cell assemblies with attractor dynamics. (A) Ambiguous patterns tend to converge to a familiar matching pattern (i.e., pattern completion) and, simultaneously, diverge away from interfering patterns (i.e., pattern separation). This nonlinear process can be illustrated with an illusion using an ambiguous visual object. The perceived image in A will tend to fluctuate between two familiar images (a chalice on the left, or two kissing faces on the right), instead of stabilizing on the ambiguous white object in the middle. (B) The presence of attractor states in a neural network favors the emergence of familiar patterns even when the initial input is severely degraded. Activating just a few cells in an attractor network (“partial” representations with just one or two black dots on the left and right examples in “Attractor 1” and “Attractor 2”) is sufficient to restore the full ensemble activity of cells participating in the representation (“complete” patterns shown in the middle of “Attractor 1” and “Attractor 2”). (C) Attractor networks are thought to aid in disambiguating similar patterns of input by favoring sharp transitions between network states as inputs are changed gradually, as in the study by Wills, Lever, Cacucci, Burgess, and O’Keefe (2005). It was found that spatial maps in the hippocampus snapped sharply from a “square-environment” representation to a “circle-environment” representation as a recording enclosure was gradually “morphed” from one shape to the other. (Modified with permission from S. Leutgeb, J. K. Leutgeb, Moser, & Moser, Place cells, spatial maps and the population code for memory, Current Opinion in Neurobiology, 15, 738–746, © 2005, Elsevier Ltd.)
be based on LTP- and LTD-like synaptic modifications representations during progressive equal-step transforma-
between the cells that participate in the individual represen- tion of the recording environment, using so-called morph
tations and between these ensembles and external signals boxes. Recording in CA1, Wills and colleagues trained rats
providing information about the features of the environment in a square and a circular version of a box with flexible walls
or episode for which a representation is generated. until place cell representations in the two environments were
Unfortunately, there is limited direct experimental evi- very different (Wills, Lever, Cacucci, Burgess, & O’Keefe,
dence for LTP and LTD in attractor dynamics. Several 2005). The rats were then exposed to several intermediate
recent studies have suggested that the hippocampus has shapes. A sharp transition from squarelike representations to
attractor properties, however. For example, place cells keep circlelike representations was observed near the middle
their location of firing after removal of a significant subset between the familiar shapes, as predicted if the network had
of the landmarks that defined the original training environ- discrete attractor-based representations corresponding to
ment—for example, when a cue card is removed or the lights the trained shapes (figure 7.8C ). Parallel work by Leutgeb
are turned off (Muller & Kubie, 1987; O’Keefe & Conway, and colleagues showed that the representations are not
1978; Quirk, Muller, & Kubie, 1990). The persistence of always discrete ( J. Leutgeb et al., 2005). Under conditions
the place fields suggests that representations can be where the spatial reference frame is constant, place cells in
activated even under severely degraded input conditions CA3 and CA1 assimilate gradual or moderate changes in
(as schematized in figure 7.8B). However, such experiments the environment into the preexisting representations. It was
do not rule out the possibility that firing is controlled observed that stable states can be attained along the entire
by subtle cues that are still present in the deprived continuum between two preestablished representations,
version of the environment. In response to this concern, as long as the spatial environment remains unchanged.
more recent experiments have measured hippocampal place Adding the dimension of time, this ability to represent
continua may allow hippocampal networks to encode and retrieve sequential inputs as uninterrupted episodes. The existence of both discrete and continuous representations and their dependence on the exact experience in the environment are consistent with the existence of attractors in the hippocampus, but the attractors must be dynamic, implying a possible role for LTP and LTD in their formation and maintenance.

Where should we begin the study of synaptic plasticity in hippocampal attractor dynamics? Theoretical models have pointed to the neural architecture of CA3 as a good candidate (Marr, 1971; McNaughton & Morris, 1987). The dense and modifiable recurrent circuitry of this system (Amaral & Witter, 1989; Lorente de Nó, 1934) and the sparse firing of the pyramidal cells in this area (Barnes, McNaughton, Mizumori, Leonard, & Lin, 1990; S. Leutgeb, J. K. Leutgeb, Treves, Moser, & Moser, 2004) are properties that would be expected if the system were to form rapid distinguishable representations that could be recalled in the presence of considerable noise. Widespread collaterals interconnect pyramidal cells along nearly the entire length of the CA3 (Ishizuka, Weber, & Amaral, 1990; Li, Somogyi, Ylinen, & Buzsaki, 1994). On average, each pyramidal neuron makes synapses with as many as 4% of the pyramidal cells in the ipsilateral CA3, and more than three-quarters of the excitatory synapses on a CA3 pyramidal cell are from other CA3 pyramidal neurons (Amaral, Ishizuka, & Claiborne, 1990). The recurrent synapses exhibit LTP (Zalutsky & Nicoll, 1990), enabling the formation of an extensive number of interconnected cell groups in the network. In agreement with these ideas, CA3 cells show remarkably persistent and coherent firing after removal of significant parts of the original sensory input (Lee, Yoganarasimha, Rao, & Knierim, 2004; S. Leutgeb et al., 2004; Vazdarjanova & Guzowski, 2004). NMDA-receptor-dependent plasticity in CA3 is necessary for this neural reactivation process as well as the ability to retrieve spatial memories from small subsets of the cues of the original environment (see figure 7.7B) (Nakazawa et al., 2002). Formation of discrete representations is apparent in the same network as a nearly complete replacement of the active cell population in CA3 when animals are transferred between enclosures with common features (S. Leutgeb et al., 2004). Thus several studies suggest that CA3 has the predicted properties of an attractor network and that NMDA receptor-dependent long-term plasticity may underlie its functions in encoding and recall of memory.

Summary

There is still some work to do. Demonstrating that LTP plays a role in memory is only a first step to a deeper mechanistic understanding of how the brain achieves information storage and recall. The available data suggest that the question is no longer whether LTP is involved in memory, but how. A major challenge for future research will be to determine more exactly how LTP and LTD contribute to dynamic representation in the heavily interconnected neural networks of the hippocampus and elsewhere. The evidence for attractors is indirect, and we do not know, for example, what numbers of cells are involved in each representation, whether there are multiple representations, and, if there are, whether and how they overlap and interact. The mechanisms for maintaining and separating discrete representations, as well as the processes by which new information is assimilated into existing network states, are not known. LTP and LTD, as well as more short-term plasticity processes, are likely to play major functions in these processes, but how these functions are implemented in the network remains an enigma.

REFERENCES

Abraham, W. C., & Kairiss, E. W. (1988). Effects of the NMDA antagonist 2AP5 on complex spike discharge by hippocampal pyramidal cells. Neurosci. Lett., 89(1), 36–42.
Abraham, W. C., Logan, B., Greenwood, J. M., & Dragunow, M. (2002). Induction and experience-dependent consolidation of stable long-term potentiation lasting months in the hippocampus. J. Neurosci., 22(21), 9626–9634.
Abraham, W. C., & Williams, J. M. (2003). Properties and mechanisms of LTP maintenance. Neuroscientist, 9(6), 463–474.
Amaral, D. G., Ishizuka, N., & Claiborne, B. (1990). Neurons, numbers and the hippocampal network. Prog. Brain Res., 83, 1–11.
Amaral, D. G., & Witter, M. P. (1989). The three-dimensional organization of the hippocampal formation: A review of anatomical data. Neuroscience, 31(3), 571–591.
Amit, D. J., Gutfreund, H., & Sompolinsky, H. (1985). Storing infinite numbers of patterns in a spin-glass model of neural networks. Phys. Rev. Lett., 55(14), 1530–1533.
Amit, D. J., Gutfreund, H., & Sompolinsky, H. (1987). Information storage in neural networks with low levels of activity. Phys. Rev. A, 35(5), 2293–2303.
Andersen, P., Sundberg, S. H., Sveen, O., & Wigstrom, H. (1977). Specific long-lasting potentiation of synaptic transmission in hippocampal slices. Nature, 266(5604), 736–737.
Bannerman, D. M., Good, M. A., Butcher, S. P., Ramsay, M., & Morris, R. G. (1995). Distinct components of spatial learning revealed by prior training and NMDA receptor blockade. Nature, 378(6553), 182–186.
Barnes, C. A. (1979). Memory deficits associated with senescence: A neurophysiological and behavioral study in the rat. J. Comp. Physiol. Psychol., 93(1), 74–104.
Barnes, C. A., McNaughton, B. L., Mizumori, S. J., Leonard, B. W., & Lin, L. H. (1990). Comparison of spatial and temporal characteristics of neuronal activity in sequential stages of hippocampal processing. Prog. Brain Res., 83, 287–300.
Barrionuevo, G., & Brown, T. H. (1983). Associative long-term potentiation in hippocampal slices. Proc. Natl. Acad. Sci. USA, 80(23), 7347–7351.
amygdala, medial septum, and hippocampus of the rat. Behav. Neural Biol., 58(1), 16–26.
Jerusalinsky, D., Ferreira, M. B., Walz, R., Da Silva, R. C., Bianchin, M., Ruschel, A. C., et al. (1992). Amnesia by post-training infusion of glutamate receptor antagonists into the amygdala, hippocampus, and entorhinal cortex. Behav. Neural Biol., 58(1), 76–80.
Kelleher, R. J., 3rd, Govindarajan, A., & Tonegawa, S. (2004). Translational regulatory mechanisms in persistent forms of synaptic plasticity. Neuron, 44(1), 59–73.
Kentros, C., Hargreaves, E., Hawkins, R. D., Kandel, E. R., Shapiro, M., & Muller, R. V. (1998). Abolition of long-term stability of new hippocampal place cell maps by NMDA receptor blockade. Science, 280(5372), 2121–2126.
Krug, M., Lossner, B., & Ott, T. (1984). Anisomycin blocks the late phase of long-term potentiation in the dentate gyrus of freely moving rats. Brain Res. Bull., 13(1), 39–42.
Lee, I., Yoganarasimha, D., Rao, G., & Knierim, J. J. (2004). Comparison of population coherence of place cells in hippocampal subfields CA1 and CA3. Nature, 430(6998), 456–459.
Leutgeb, J. K., Leutgeb, S., Treves, A., Meyer, R., Barnes, C. A., McNaughton, B. L., et al. (2005). Progressive transformation of hippocampal neuronal representations in “morphed” environments. Neuron, 48(2), 345–358.
Leutgeb, S., Leutgeb, J. K., Barnes, C. A., Moser, E. I., McNaughton, B. L., & Moser, M. B. (2005). Independent codes for spatial and episodic memory in hippocampal neuronal ensembles. Science, 309(5734), 619–623.
Leutgeb, S., Leutgeb, J. K., Moser, M. B., & Moser, E. I. (2005). Place cells, spatial maps and the population code for memory. Curr. Opin. Neurobiol., 15(6), 738–746.
Leutgeb, S., Leutgeb, J. K., Treves, A., Moser, M. B., & Moser, E. I. (2004). Distinct ensemble codes in hippocampal areas CA3 and CA1. Science, 305(5688), 1295–1298.
Levy, W. B., & Steward, O. (1979). Synapses as associative memory elements in the hippocampal formation. Brain Res., 175(2), 233–245.
Li, X. G., Somogyi, P., Ylinen, A., & Buzsaki, G. (1994). The hippocampal CA3 network: An in vivo intracellular labeling study. J. Comp. Neurol., 339(2), 181–208.
Lisman, J. E. (1985). A mechanism for memory storage insensitive to molecular turnover: A bistable autophosphorylating kinase. Proc. Natl. Acad. Sci. USA, 82(9), 3055–3057.
Lorente de Nó, R. (1934). Studies on the structure of the cerebral cortex. II. Continuation of the study of the ammonic system. J. Psychol. Neurol., 46, 113–177.
Madison, D. V., Malenka, R. C., & Nicoll, R. A. (1991). Mechanisms underlying long-term potentiation of synaptic transmission. Annu. Rev. Neurosci., 14, 379–397.
Malenka, R. C., & Bear, M. F. (2004). LTP and LTD: An embarrassment of riches. Neuron, 44(1), 5–21.
Malenka, R. C., & Nicoll, R. A. (1999). Long-term potentiation--A decade of progress? Science, 285(5435), 1870–1874.
Malinow, R., Mainen, Z. F., & Hayashi, Y. (2000). LTP mechanisms: From silence to four-lane traffic. Curr. Opin. Neurobiol., 10(3), 352–357.
Malinow, R., & Malenka, R. C. (2002). AMPA receptor trafficking and synaptic plasticity. Annu. Rev. Neurosci., 25, 103–126.
Mansuy, I. M., Winder, D. G., Moallem, T. M., Osman, M., Mayford, M., Hawkins, R. D., et al. (1998). Inducible and reversible gene expression with the rtTA system for the study of memory. Neuron, 21(2), 257–265.
Maren, S. (2001). Neurobiology of Pavlovian fear conditioning. Annu. Rev. Neurosci., 24, 897–931.
Markus, E. J., Qin, Y. L., Leonard, B., Skaggs, W. E., McNaughton, B. L., & Barnes, C. A. (1995). Interactions between location and task affect the spatial and directional firing of hippocampal neurons. J. Neurosci., 15(11), 7079–7094.
Marr, D. (1971). Simple memory: A theory for archicortex. Philos. Trans. R. Soc. Lond. B Biol. Sci., 262(841), 23–81.
Martin, S. J., Grimwood, P. D., & Morris, R. G. (2000). Synaptic plasticity and memory: An evaluation of the hypothesis. Annu. Rev. Neurosci., 23, 649–711.
Matsuo, N., Reijmers, L., & Mayford, M. (2008). Spine-type-specific recruitment of newly synthesized AMPA receptors with learning. Science, 319(5866), 1104–1107.
Mayer, M. L., Westbrook, G. L., & Guthrie, P. B. (1984). Voltage-dependent block by Mg2+ of NMDA responses in spinal cord neurones. Nature, 309(5965), 261–263.
Mayford, M., & Kandel, E. R. (1999). Genetic approaches to memory storage. Trends Genet., 15(11), 463–470.
McHugh, T. J., Blum, K. I., Tsien, J. Z., Tonegawa, S., & Wilson, M. A. (1996). Impaired hippocampal representation of space in CA1-specific NMDAR1 knockout mice. Cell, 87(7), 1339–1349.
McKernan, M. G., & Shinnick-Gallagher, P. (1997). Fear conditioning induces a lasting potentiation of synaptic currents in vitro. Nature, 390(6660), 607–611.
McNaughton, B. L., Barnes, C. A., Rao, G., Baldwin, J., & Rasmussen, M. (1986). Long-term enhancement of hippocampal synaptic transmission and the acquisition of spatial information. J. Neurosci., 6(2), 563–571.
McNaughton, B. L., Battaglia, F. P., Jensen, O., Moser, E. I., & Moser, M. B. (2006). Path integration and the neural basis of the “cognitive map.” Nat. Rev. Neurosci., 7(8), 663–678.
McNaughton, B. L., Douglas, R. M., & Goddard, G. V. (1978). Synaptic enhancement in fascia dentata: Cooperativity among coactive afferents. Brain Res., 157(2), 277–293.
McNaughton, B. L., & Morris, R. G. M. (1987). Hippocampal synaptic enhancement and information storage within a distributed memory system. Trends Neurosci., 10(10), 408–414.
Mehta, M. R., Barnes, C. A., & McNaughton, B. L. (1997). Experience-dependent, asymmetric expansion of hippocampal place fields. Proc. Natl. Acad. Sci. USA, 94(16), 8918–8921.
Mehta, M. R., Quirk, M. C., & Wilson, M. A. (2000). Experience-dependent asymmetric shape of hippocampal receptive fields. Neuron, 25(3), 707–715.
Moita, M. A., Rosis, S., Zhou, Y., LeDoux, J. E., & Blair, H. T. (2004). Putting fear in its place: Remapping of hippocampal place cells during fear conditioning. J. Neurosci., 24(31), 7015–7023.
Monfils, M. H., & Teskey, G. C. (2004). Skilled-learning-induced potentiation in rat sensorimotor cortex: A transient form of behavioural long-term potentiation. Neuroscience, 125(2), 329–336.
Morris, R. G., Anderson, E., Lynch, G. S., & Baudry, M. (1986). Selective impairment of learning and blockade of long-term potentiation by an N-methyl-D-aspartate receptor antagonist, AP5. Nature, 319(6056), 774–776.
Moser, E. I., Krobert, K. A., Moser, M. B., & Morris, R. G. (1998). Impaired spatial learning after saturation of long-term potentiation. Science, 281(5385), 2038–2042.
Wilson, M. A., & McNaughton, B. L. (1993). Dynamics of the hippocampal ensemble code for space. Science, 261(5124), 1055–1058.
Wolfman, C., Fin, C., Dias, M., Bianchin, M., Da Silva, R. C., Schmitz, P. K., et al. (1994). Intrahippocampal or intra-amygdala infusion of KN62, a specific inhibitor of calcium/calmodulin-dependent protein kinase II, causes retrograde amnesia in the rat. Behav. Neural Biol., 61(3), 203–205.
Wood, E. R., Dudchenko, P. A., & Eichenbaum, H. (1999). The global record of memory in hippocampal neuronal activity. Nature, 397(6720), 613–616.
Wood, E. R., Dudchenko, P. A., Robitsek, R. J., & Eichenbaum, H. (2000). Hippocampal neurons encode information about different types of memory episodes occurring in the same location. Neuron, 27(3), 623–633.
Young, B. J., Fox, G. D., & Eichenbaum, H. (1994). Correlates of hippocampal complex-spike cell activity in rats performing a nonspatial radial maze task. J. Neurosci., 14(11, Pt. 1), 6553–6563.
Yuste, R., & Bonhoeffer, T. (2001). Morphological changes in dendritic spines associated with long-term synaptic plasticity. Annu. Rev. Neurosci., 24, 1071–1089.
Zalutsky, R. A., & Nicoll, R. A. (1990). Comparison of two forms of long-term potentiation in single hippocampal neurons. Science, 248(4963), 1619–1624.
Zhou, Q., Homma, K. J., & Poo, M. M. (2004). Shrinkage of dendritic spines associated with long-term depression of hippocampal synapses. Neuron, 44(5), 749–757.
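
The pattern completion and sharp state transitions attributed to attractor networks in this chapter (figure 7.8B and C) can be illustrated with a small Hopfield-style network. The sketch below is a generic textbook model and not the authors' model of the hippocampus: the Hebbian outer-product rule merely stands in for LTP-like plasticity, and the pattern names `square` and `circle`, the network size, and all helper functions are hypothetical choices made for illustration only.

```python
import random


def train_hebbian(patterns):
    """Build symmetric weights from +1/-1 patterns with a Hebbian outer-product rule."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    weights[i][j] += p[i] * p[j] / n
    return weights


def recall(weights, cue, max_sweeps=20):
    """Update units one at a time until the state stops changing (an attractor)."""
    state = list(cue)
    n = len(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(n):
            drive = sum(weights[i][j] * state[j] for j in range(n))
            new_value = 1 if drive >= 0 else -1
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:
            break
    return state


def degrade(pattern, n_flips, rng):
    """Return a copy of the pattern with n_flips randomly chosen units inverted."""
    noisy = list(pattern)
    for i in rng.sample(range(len(pattern)), n_flips):
        noisy[i] = -noisy[i]
    return noisy


def morph(start, end, n_from_end):
    """Copy `start`, then take the first n_from_end differing units from `end`."""
    differing = [i for i in range(len(start)) if start[i] != end[i]]
    mixed = list(start)
    for i in differing[:n_from_end]:
        mixed[i] = end[i]
    return mixed


rng = random.Random(0)
n_units = 64
# Two hypothetical stored states (assumed, orthogonal by construction).
square = [1 if i % 2 == 0 else -1 for i in range(n_units)]
circle = [1 if i < n_units // 2 else -1 for i in range(n_units)]
weights = train_hebbian([square, circle])

# Pattern completion: a cue with 10 of 64 units flipped.
cue = degrade(square, 10, rng)
print("degraded cue restored to square:", recall(weights, cue) == square)

# Gradual morphing of the cue from one stored pattern toward the other.
for n_from_circle in (4, 12, 20, 28):
    settled = recall(weights, morph(square, circle, n_from_circle))
    label = "square" if settled == square else "circle" if settled == circle else "other"
    print(n_from_circle, "of 32 differing units taken from circle ->", label)
```

In this toy example a degraded cue settles back on the stored pattern, and morphed cues settle on whichever stored pattern they overlap more, with no stable intermediate state. That is only loosely analogous to the abrupt switch between square-like and circle-like representations described for the morph-box experiments, and it does not capture the graded representations also reported in the chapter.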
Abstract  Plasticity is an integral property of a functioning brain throughout life. In the visual system, cortical plasticity is engaged for encoding the geometric regularities of the visual environment early in life, as well as for functionally adaptive changes in response to lesions and neurodegenerative diseases. In addition to the pliability during postnatal maturation and during the restoration of disrupted functions, the visual system also maintains remarkable plasticity for encoding the specific shapes of figures with which we become familiar. This is known as perceptual learning, and it is important for rapid recognition of the learned shapes in complex environments and for enhanced sensitivity to delicate nuances of the learned stimulus features. Moreover, the visual system also exhibits fast functional switching capabilities, whereby response properties of neurons are dynamically adjusted by top-down influences for efficient processing of behaviorally relevant stimuli. The dynamic nature of neuronal responses is tightly coupled with the long-term plasticity seen in perceptual learning, as repeated performance of the same perceptual task, and therefore repeated invocation of top-down influences specific to the task, can potentiate the dynamic changes useful for solving the perceptual tasks, leading to encoding and retrieving of the implicit memory formed during perceptual learning.

Wu Li  Beijing Normal University, Beijing, China
Charles D. Gilbert  The Rockefeller University, New York, New York

Our brain needs to constantly adapt to the environment and to assimilate knowledge about the external world by maintaining a certain degree of functional and architectural malleability. This notion has been appreciated for centuries. The idea that our perceptual and cognitive functions can be shaped by an individual’s experience was originally expressed by philosophers such as John Locke, who asserted that the human mind at birth is like a blank slate, and that all ideas and knowledge are derived from an individual’s experiences (Locke, 1689/1995). The earliest psychological inference and definition of cortical plasticity were made by William James (1890/1950), who compared the formation of habits and skills to the plastic changes of materials, and attributed behavioral changes to the plasticity of the brain. One of the most influential speculations about the neuronal substrates of cortical plasticity was vividly drawn by Santiago Ramón y Cajal (1911), who proposed that changes in connections between neurons are responsible for our ability to learn (see review, Jones, 1994). As for the rule governing wiring and rewiring between neurons, Donald Hebb theoretically postulated that neurons are wired together if they fire together (Hebb, 1949). This Hebbian rule of synaptic plasticity has been widely adopted into physiological, psychophysical, and computational studies of learning and memory. At the system and behavior levels, Jerzy Konorski (1948) distinguished plasticity from excitability as an independent property of the brain whereby “certain permanent functional transformations arise in particular systems of neurons as the result of appropriate stimuli.” Building on these earlier insights and speculations, the last half century has witnessed advances in our understanding of cortical plasticity in various respects, from different perspectives, and using a variety of approaches. This chapter focuses on cortical plasticity in the visual system.

Processing of visual information in the brain is distributed among more than 30 cortical areas (Van Essen, Anderson, & Felleman, 1992). These functionally specialized and hierarchically organized areas are interwired by feedforward and feedback connections, forming partially segregated modules and pathways for processing different attributes of visual stimuli. On the one hand, this specific connectivity has been genetically determined or innately hardwired for mediating both stimulus-driven bottom-up processes and behavior-driven top-down influences. On the other hand, accumulated evidence has revealed that visual experience can modify the preexisting functionality and connectivity of the visual system throughout life.

Plasticity in postnatal development

Early Development  The maturation process of the visual system continues well into postnatal periods in terms of both circuitry and functionality. The neural circuitry within a cortical area, such as the primary visual cortex (area V1), comprises two types of connections (for reviews see Gilbert, 1983; Callaway, 1998). The vertical connections, which link neurons across different cortical layers that represent the same visual field location, are responsible for processing local simple stimulus attributes. The horizontal connections, which extend parallel to the cortical surface
Figure 8.1 Contour integration. Within a complex background, those discrete line segments following the Gestalt law of continuity are easily grouped together, forming a visual contour. A contour consisting of more collinear lines is more salient than a shorter one (compare A with B); and the same array of collinear lines appears less salient when they are spaced further apart (compare B with C). (From W. Li, Piech, & Gilbert, 2008.)
in a complex background improves with age and does not approximate the adult’s level until adolescence (Kovacs, Kozma, Feher, & Benedek, 1999). Surface segmentation, another important intermediate-level visual function, also matures at a late age comparable to contour integration (Sireteanu & Rieth, 1992). Similar to contour integration, the process of partitioning visual images into segregated surfaces relies heavily on integration of information across a large visual field area. The late maturation of contour-integration and surface-segmentation capabilities suggests that natural scene geometries and regularities continue to shape neural circuitry as well as response properties of visual neurons during a very long period of time after birth.

Plasticity in response to lesions

The closure of critical periods does not necessarily mean that the neural connections and circuits in the adult brain have been completely fixed. Abnormal experiences like injuries and neurodegenerative diseases during adulthood can also trigger marked plastic reactions in the central nervous system.

Lesion Experiments  Pronounced changes were first reported in the spinal cord of adult animals after an injury to the peripheral nerves (Devor & Wall, 1978, 1981). Subsequently, striking reorganization in adult primary sensory cortices has also been widely demonstrated, including the somatosensory cortex in response to deafferentation of sensory input from a skin area (Rasmusson, 1982; Merzenich et al., 1983a, 1983b, 1984; Calford & Tweedale, 1988; Pons et al., 1991; Weiss, Miltner, Liepert, Meissner, & Taub, 2004), the primary auditory cortex in response to restricted cochlear lesions (Robertson & Irvine, 1989; Rajan, Irvine, Wise, & Heil, 1993), and the primary visual cortex in response to lesions on the retina (Kaas et al., 1990; Heinen & Skavenski, 1991; Chino, Kaas, Smith, Langston, & Cheng, 1992; Gilbert & Wiesel, 1992; Schmid, Rosa, Calford, & Ambler, 1996; Eysel et al., 1999; Calford et al., 2000; Giannikopoulos & Eysel, 2006). All these lesion-induced plastic changes have comparable effects in the relevant cortical regions: the cortical territory devoted to representing the deafferented region on the sensory surface (the skin, the cochlea, or the retina) becomes responsive to adjacent sensory surfaces spared from the lesion, a process referred to as cortical reorganization.

Here we use retinal lesions as an example. The retina is mapped point-by-point onto the primary visual cortex, generating a two-dimensional topographic map called the retinotopic map. A restricted lesion on the retina destroys the photoreceptors within a small area (figure 8.2A). This retinal scotoma cuts off visual input to the corresponding retinotopic region in V1, known as the lesion projection zone (LPZ), and silences neurons within that cortical region (figure 8.2B). After the lesion, continuous plastic changes in V1 have been observed within a period of time ranging from minutes to months (for example, see Gilbert & Wiesel, 1992). Within minutes after the lesion, a remarkable increase in RF sizes occurs for V1 neurons whose RFs are located near the boundary of the retinal scotoma. A couple of months after the retinal injury, the size of the LPZ dramatically shrinks (figure 8.2C): neurons within the original cortical LPZ regain responsiveness by shifting their RFs outside the retinal scotoma. This plastic change is not simply a consequence of a rearrangement of thalamocortical afferents; rather, it is cortically mediated through the long-range horizontal connections intrinsic to V1 (Gilbert & Wiesel, 1992; Darian-Smith & Gilbert, 1995; Calford, Wright, Metha, & Taglianetti, 2003).

Figure 8.2 Reorganization of V1 in response to retinal lesion. A retinal scotoma produced by focal laser lesion (A, the small gray area) creates a silent region in V1 (B, the gray area). During recovery (C), neurons within the cortical scotoma regain responsiveness to visual input from the retinal area surrounding the laser-induced scotoma. (Adapted from Gilbert, 1992.)

Even for the intact visual system in adults, a dramatic change in visual experience by itself can cause a large-scale functional reorganization of the visual cortex. V1 neurons in a cerebral hemisphere are driven by inputs from the contralateral visual field. After monkeys wore special spectacles for several months to reverse their left and right visual fields, some V1 cells began to respond to stimuli presented in both hemifields (Sugita, 1996).

Neurodegenerative Diseases  Similar to the retinal lesion experiments, macular degeneration (MD) has also been
somatosensory and auditory systems. The observed changes are analogous to the cortical reorganization observed in the primary somatosensory and auditory cortices in response to peripheral lesions. For example, training monkeys to perform a tactile frequency discrimination task using a restricted skin area induces remarkable reorganization of the primary somatosensory cortex, leading to a significant increase in the size and complexity of the territory representing the trained skin area (Recanzone, Merzenich, & Jenkins, 1992; Recanzone, Merzenich, Jenkins, Grajski, & Dinse, 1992). Similarly, training on an acoustic frequency discrimination task dramatically increases the cortical territory representing the trained frequencies in the primary auditory cortex (Recanzone, Schreiner, & Merzenich, 1993). This mechanism has been referred to as cortical recruitment, whereby a larger cortical region and thus a greater number of neurons are recruited to encode the trained stimuli. Nonetheless, it is still a matter of debate whether cortical recruitment is directly responsible for the improved discrimination ability, as the recruitment seems unnecessary for enhanced performance on acoustic frequency discrimination (Brown, Irvine, & Park, 2004). Moreover, overrepresentation of the familiar frequencies in the auditory cortex could even be detrimental to discrimination of the overrepresented frequencies (Han, Kover, Insanally, Semerdjian, & Bao, 2007).

In the visual system, an fMRI study has shown that practicing a coherent-motion detection task, in which a small proportion of randomly positioned dots move in the same direction among randomly moving dots, causes a significant enlargement of the cortical territory representing the trained stimulus in area MT, a cortical area involved in motion processing (Vaina, Belliveau, Roziers, & Zeffiro, 1998). However, cortical recruitment associated with perceptual training has so far not been documented in early visual areas (for an attempt to search for such a change in V1, see Crist, Li, & Gilbert, 2001). The lack of transfer or interference of learning across visual field locations and between visual stimuli also argues against cortical recruitment as an effective mechanism of visual perceptual learning, because recruiting by “robbing” adjacent cortical regions would inevitably interfere with processing of other stimuli. However, studies in search of the neural basis of perceptual learning have shown some other cortical changes that can better account for the observed learning effects.

Neuronal Mechanisms  The visual stimuli and tasks used for studies of perceptual learning can be roughly put into two categories: visual discrimination and visual detection or identification. In discrimination tasks, observers need to discriminate a subtle change in stimulus with respect to a reference dimension or attribute, such as an orientation discrimination task, to judge whether a line is slightly tilted to the left or to the right with respect to the vertical. In detection tasks, either a target presented alone near its contrast detection threshold or a target embedded in a background of noise or distracters needs to be identified as present or absent. Rather than recruiting more neurons, other potential mechanisms for improving performance on these tasks are to increase neuronal selectivity for the stimulus attribute that is relevant to the discrimination task, to enhance the signal-to-noise ratio by selectively boosting neuronal responsiveness to the familiar target, or to achieve automatization and accelerated processing speed by shifting cortical representation of the learned stimulus from higher to lower cortical areas. Neural correlates in all these respects have been found in visual cortical areas, including V1, the first stage of cortical visual processing.

Increased neuronal selectivity in discrimination learning  Simple discrimination tasks, such as orientation discrimination, only involve processing of a basic stimulus attribute. It has been shown that training monkeys on orientation discrimination selectively sharpens orientation-tuning functions of those V1 neurons whose RFs are at the trained visual field location and whose preferred orientations are close to the trained orientation (Schoups et al., 2001; but see Ghose, Yang, & Maunsell, 2002). Similar and stronger effects have also been observed in area V4, an intermediate stage in the visual pathway responsible for object recognition (Yang & Maunsell, 2004; Raiguel, Vogels, Mysore, & Orban, 2006). The theoretical interpretations of these observations are mixed. Intuitively, a sharpening of the orientation-tuning curve around the trained orientation would result in an increase in neuronal selectivity for the trained orientation, which would in turn benefit the discrimination task. This idea is supported by a computational study (Teich & Qian, 2003). Conversely, a modeling study argues that a sharpening of original tuning curves actually causes a general loss of information content conveyed by neuronal responses (Series, Latham, & Pouget, 2004).

Unlike discrimination of a simple stimulus attribute, some discrimination tasks require lateral integration of contextual information. The visual percept of a stimulus, as well as responses of visual neurons to the stimulus, can be modified by the global stimulus context within which the stimulus is displayed (for reviews see Gilbert, 1998; Albright & Stoner, 2002; Allman, Miezin, & McGuinness, 1985). This phenomenon, known as contextual modulation, takes place throughout visual cortical areas along the visual pathways, representing a general lateral integrative mechanism of visual processing. Contextual interactions seen in V1 indicate that V1 neurons are selective for more complex features in visual scenes in addition to simple stimulus attributes like contour orientation. It has been shown that extensive
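The opposing interpretations of tuning-curve sharpening discussed above can be made concrete with a small numerical sketch. The Python below is illustrative only and is not taken from any of the cited studies. It assumes a hypothetical population of neurons with Gaussian orientation-tuning curves, independent Poisson spike counts in a 1-s window, and arbitrary values for peak rate, baseline rate, and tuning width. Under those assumptions the Fisher information about orientation is the sum over neurons of f'(theta)^2 / f(theta), and narrowing every tuning curve raises it, in line with the intuitive view. The sketch deliberately omits noise correlations, whose impact is examined in the modeling study that reaches the opposite conclusion (Series, Latham, & Pouget, 2004), so it should not be read as settling the question.

```python
import math

# All parameter values below are assumptions chosen for illustration.

def tuning(theta, pref, width, peak=30.0, base=1.0):
    """Mean spike count of a model neuron with a Gaussian orientation-tuning curve.

    theta, pref and width are in degrees; orientation is periodic over 180 degrees.
    """
    d = (theta - pref + 90.0) % 180.0 - 90.0  # difference wrapped to [-90, 90)
    return base + peak * math.exp(-d * d / (2.0 * width * width))

def fisher_information(theta, prefs, width, eps=1e-3):
    """Fisher information about theta for independent Poisson neurons.

    For each neuron the contribution is f'(theta)^2 / f(theta); the slope
    f'(theta) is estimated with a central finite difference.
    """
    total = 0.0
    for pref in prefs:
        slope = (tuning(theta + eps, pref, width)
                 - tuning(theta - eps, pref, width)) / (2.0 * eps)
        total += slope ** 2 / tuning(theta, pref, width)
    return total

# Hypothetical population: preferred orientations every 2 degrees.
prefs = [2.0 * i for i in range(90)]

broad = fisher_information(45.0, prefs, width=20.0)  # before "sharpening"
sharp = fisher_information(45.0, prefs, width=10.0)  # after "sharpening"

print(f"broad tuning: {broad:.3f}, sharp tuning: {sharp:.3f}")
```

With these assumed parameters the sharper population carries more Fisher information about the stimulus orientation than the broader one. Whether the same holds once correlated variability and the circuit mechanism that produces the sharpening are included is exactly the point on which the cited studies disagree.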
Figure 8.4 Learning- and task-dependent changes in V1 associated with training on contour detection. Shown here are averaged population neuronal responses to visual contours consisting of 1, 3, 5, 7, and 9 collinear lines embedded in an array of randomly oriented lines (for example see figure 8.1). Time 0 indicates stimulus onset. (A) Neuronal responses in V1 of untrained monkeys are independent of contour lengths (the five peristimulus time histograms are superimposed), indicating the absence of contour information in V1 responses. (B) Over the course of training the animals on contour detection, a late response component associated with contour saliency emerges: the longer the contours, the stronger the neuronal responses. (C) In trained animals the contour-related V1 responses are much weakened when the animals perform tasks that are irrelevant to contour detection. (D) Contour-related responses disappear in the trained V1 region under anesthesia. (Adapted from Li, Piech, & Gilbert, 2008.) (See color plate 6.)
Similar to contour integration, the detection of a difference in texture between a small area and a large area surrounding it involves the horizontal integrative mechanisms. A study showed that a single session of training on such a surface segmentation task increases fMRI signals in early visual areas (Schwartz, Maquet, & Frith, 2002). A further study has shown that the maximal increases occur in the first couple of weeks of training before the subjects' detection performance reaches a plateau (Yotsumoto, Watanabe, & Sasaki, 2008). With prolonged training, the elevated fMRI signals drop back to the levels before training. This result is opposite to the electrophysiological finding that for monkeys extensively trained on contour detection, the learning-induced neuronal responses in V1 are retained (W. Li et al., 2008).

Training on detection of an isolated target near contrast threshold can also selectively boost activity in early visual cortex. After training human subjects to detect a near-threshold grating patch, the fMRI signals in V1 are significantly increased for the trained orientation (Furmanski, Schluppeck, & Engel, 2004). Enhancement of neuronal responsiveness associated with detection training has also been demonstrated in higher cortical areas along the visual processing streams. For instance, training monkeys to identify natural scene images that are degraded by noise specifically enhances V4 neuronal responses to those familiar and degraded pictures (Rainer, Lee, & Logothetis, 2004). In detection of coherent motion of dynamic random dots, an improvement in monkeys' performance is correlated with enhanced neuronal responses in areas MT and MST (Zohary, Celebrini, Britten, & Newsome, 1994; but see Law & Gold, 2008, which argues that learning-associated improvement in detection of coherent motion does not involve changes in MT, but rather relies largely on the stage that makes perceptual decisions).

Visual search can be taken as a special detection task in which a target is camouflaged in an array of similar distractors. Increased neuronal responsiveness in V1 has been reported to be associated with animals' familiarity with the target (Lee, Yang, Romero, & Mumford, 2002). In addition to heightened activity in early visual areas, learning to search for a simple geometric shape among distractors causes a concomitant decrease in fMRI signals in higher visual areas involved in shape processing (Sigman et al., 2005). This finding suggests that extensive training can shift cortical representation of the learned shape from higher to lower visual areas for more efficient and less effortful processing. This idea is further supported by the evidence that extensive training on a perceptual task significantly reduces activity in the frontoparietal cortical network for attentional control (Pollmann & Maertens, 2005; Sigman et al., 2005; Mukai et al., 2007).

Temporal code
In addition to the firing rates, changes in temporal response properties of neurons have also been suggested to be related to perceptual learning. In the primary somatosensory cortex, neuronal responses become more coherent with training on tactile frequency discrimination. This change correlates better with the improved discrimination ability than does cortical recruitment (Recanzone, Merzenich, & Schreiner, 1992). Likewise, in
Figure 8.5 Task-specific top-down influences on V1 responses. (A) Monkeys were trained to do two different discrimination tasks with identical stimulus patterns at the same visual field location. The stimuli consisted of five simultaneously presented lines: an optimally oriented line fixed in the RF center and flanked by four additional lines surrounding the RF. In different trials, the arrangement of the two side flankers (s1, s2) was randomly assigned from a set of five different configurations (illustrated in the cartoons at the bottom of B, labeled from −2 to +2). Each configuration differs from the others in the separation between the three side-by-side lines (in condition 0 the three lines were equidistant; in the other conditions either s1 or s2 was closer to the central line). In the same trials, the two end flankers (e1, e2) were also independently assigned a random configuration from a set of predefined arrangements, such that the end flankers were collinear with each other but misaligned with the central line to either side (the cartoons at the bottom of C). The animal was cued to perform either a bisection task based on the three side-by-side lines or a vernier task based on the three end-to-end lines, using the same set of five-line stimuli. (B) Responses of a V1 cell were examined as a function of the position of the two side flankers s1 and s2 when the animal either performed the bisection task, in which s1 and s2 were task-relevant, or performed the vernier task, in which the same s1 and s2 were task-irrelevant. (C) Responses of a V1 cell were examined as a function of the position of the two end flankers e1 and e2 when the animal either performed the vernier task, in which e1 and e2 were task-relevant, or performed the bisection task, in which the same e1 and e2 were task-irrelevant. (Adapted from W. Li, Piech, & Gilbert, 2004.) (See color plate 7.)
Moreover, similar to so-called state-dependent learning (for example, see Shulz, Sosnik, Ego, Haidarliu, & Ahissar, 2000), retrieval of the acquired neuronal response properties requires a recurrence of the same stimulus and task used for the training. Task-dependent modification of neuronal response properties has also been reported in auditory cortex (for review see Fritz, Elhilali, & Shamma, 2005). This cortical mechanism can account for the stimulus and task specificity of perceptual learning. The information related to a given stimulus attribute is represented at the level of subsets of inputs to a cell, which are gated by the top-down signals via interactions between feedback connections from higher cortical areas and intrinsic connections within V1. This mechanism enables multiple attributes to be represented by the same cells without cross talk, greatly expanding the information-processing capability of neurons. The fast functional switching or multiplexing capability of visual neurons under task-specific top-down control is tightly coupled with perceptual learning: repeated execution of the same perceptual task, and therefore repeated invoking of top-down influences specific to the task, can potentiate the dynamic changes useful for solving the perceptual task, leading to encoding and retrieval of the implicit memory formed during perceptual learning.

Epilogue

Visual cortical plasticity is not limited to postnatal development and to contingent reactions induced by anomalous experiences. It is a lifelong ongoing process accompanying visual perception, as shown in various cortical changes associated with perceptual learning. There has been considerable debate about the neural basis of perceptual learning regarding the cortical loci where the plastic changes occur, since conflicting results are often reported. To derive an unbiased point of view based on the mixed results, one must take into account the nature of visual perception, which, according to Helmholtz, is nothing more than our subjective ideas or inference derived from sensory stimulation (Helmholtz, 1866). It is now evident that the generation of visual percepts depends on information processing distributed across a large number of cortical areas, such as the visual areas dedicated to sensory processing, the attentional network engaged in top-down control, and the executive network involved in making perceptual decisions. Therefore, it is not surprising that changes associated with perceptual learning could be observed in any of these cortical areas. Another complication comes from the variety of possible visual stimuli and tasks, as well as the limitation of individual approaches used in different studies. The classical fable of "the blind men and the elephant" is always good to keep in mind when considering the vigorous debate on perceptual learning and on cortical plasticity.

REFERENCES
Adini, Y., Wilkonsky, A., Haspel, R., Tsodyks, M., & Sagi, D. (2004). Perceptual learning in contrast discrimination: The effect of contrast uncertainty. J. Vis., 4, 993–1005.
Ahissar, M., & Hochstein, S. (1993). Attentional control of early perceptual learning. Proc. Natl. Acad. Sci. USA, 90, 5718–5722.
Albright, T. D., & Stoner, G. R. (2002). Contextual influences on visual processing. Annu. Rev. Neurosci., 25, 339–379.
Allman, J., Miezin, F., & McGuinness, E. (1985). Stimulus specific responses from beyond the classical receptive field: Neurophysiological mechanisms for local-global comparisons in visual neurons. Annu. Rev. Neurosci., 8, 407–430.
Baker, C. I., Peli, E., Knouf, N., & Kanwisher, N. G. (2005). Reorganization of visual processing in macular degeneration. J. Neurosci., 25, 614–618.
Ball, K., & Sekuler, R. (1982). A specific and enduring improvement in visual motion discrimination. Science, 218, 697–698.
Bao, S., Chang, E. F., Woods, J., & Merzenich, M. M. (2004). Temporal plasticity in the primary auditory cortex induced by operant perceptual learning. Nat. Neurosci., 7, 974–981.
Brown, M., Irvine, D. R. F., & Park, V. N. (2004). Perceptual learning on an auditory frequency discrimination task by cats: Association with changes in primary auditory cortex. Cereb. Cortex, 14, 952–965.
Burkhalter, A., Bernardo, K. L., & Charles, V. (1993). Development of local circuits in human visual cortex. J. Neurosci., 13, 1916–1931.
Burton, H. (2003). Visual cortex activity in early and late blind people. J. Neurosci., 23, 4005–4011.
Calford, M. B., & Tweedale, R. (1988). Immediate and chronic changes in responses of somatosensory cortex in adult flying fox after digit amputation. Nature, 332, 446–448.
Calford, M. B., Wang, C., Taglianetti, V., Waleszczyk, W. J., Burke, W., & Dreher, B. (2000). Plasticity in adult cat visual cortex (area 17) following circumscribed monocular lesions of all retinal layers. J. Physiol., 524(Pt. 2), 587–602.
Calford, M. B., Wright, L. L., Metha, A. B., & Taglianetti, V. (2003). Topographic plasticity in primary visual cortex is mediated by local corticocortical connections. J. Neurosci., 23, 6434–6442.
Callaway, E. M. (1998). Local circuits in primary visual cortex of the macaque monkey. Annu. Rev. Neurosci., 21, 47–74.
Callaway, E. M., & Katz, L. C. (1990). Emergence and refinement of clustered horizontal connections in cat striate cortex. J. Neurosci., 10, 1134–1153.
Chino, Y. M., Kaas, J. H., Smith, E. L., Langston, A. L., & Cheng, H. (1992). Rapid reorganization of cortical maps in adult cats following restricted deafferentation in retina. Vis. Res., 32, 789–796.
Crist, R. E., Kapadia, M. K., Westheimer, G., & Gilbert, C. D. (1997). Perceptual learning of spatial localization: Specificity for orientation, position, and context. J. Neurophysiol., 78, 2889–2894.
Crist, R. E., Li, W., & Gilbert, C. D. (2001). Learning to see: Experience and attention in primary visual cortex. Nat. Neurosci., 4, 519–525.
Darian-Smith, C., & Gilbert, C. D. (1995). Topographic reorganization in the striate cortex of the adult cat and monkey is cortically mediated. J. Neurosci., 15, 1631–1647.
Devor, M., & Wall, P. D. (1978). Reorganization of spinal cord sensory map after peripheral nerve injury. Nature, 276, 75–76.
Merzenich, M. M., Kaas, J. H., Wall, J., Nelson, R. J., Sur, M., & Felleman, D. (1983a). Topographic reorganization of somatosensory cortical areas 3b and 1 in adult monkeys following restricted deafferentation. Neuroscience, 8, 33–55.
Merzenich, M. M., Kaas, J. H., Wall, J. T., Sur, M., Nelson, R. J., & Felleman, D. J. (1983b). Progression of change following median nerve section in the cortical representation of the hand in areas 3b and 1 in adult owl and squirrel monkeys. Neuroscience, 10, 639–665.
Merzenich, M. M., Nelson, R. J., Stryker, M. P., Cynader, M. S., Schoppmann, A., & Zook, J. M. (1984). Somatosensory cortical map changes following digit amputation in adult monkeys. J. Comp. Neurol., 224, 591–605.
Mukai, I., Kim, D., Fukunaga, M., Japee, S., Marrett, S., & Ungerleider, L. G. (2007). Activations in visual and attention-related areas predict and correlate with the degree of perceptual learning. J. Neurosci., 27, 11401–11411.
Pascual-Leone, A., Amedi, A., Fregni, F., & Merabet, L. B. (2005). The plastic human brain cortex. Annu. Rev. Neurosci., 28, 377–401.
Poggio, T., Fahle, M., & Edelman, S. (1992). Fast perceptual learning in visual hyperacuity. Science, 256, 1018–1021.
Pollmann, S., & Maertens, M. (2005). Shift of activity from attention to motor-related brain areas during visual learning. Nat. Neurosci., 8, 1494–1496.
Pons, T. P., Garraghty, P. E., Ommaya, A. K., Kaas, J. H., Taub, E., & Mishkin, M. (1991). Massive cortical reorganization after sensory deafferentation in adult macaques. Science, 252, 1857–1860.
Raiguel, S., Vogels, R., Mysore, S. G., & Orban, G. A. (2006). Learning to see the difference specifically alters the most informative V4 neurons. J. Neurosci., 26, 6589–6602.
Rainer, G., Lee, H., & Logothetis, N. K. (2004). The effect of learning on the function of monkey extrastriate visual cortex. PLoS Biol., 2, E44.
Rajan, R., Irvine, D. R. F., Wise, L. Z., & Heil, P. (1993). Effect of unilateral partial cochlear lesions in adult cats on the representation of lesioned and unlesioned cochleas in primary auditory cortex. J. Comp. Neurol., 338, 17–49.
Ramachandran, V. S., & Braddick, O. (1973). Orientation-specific learning in stereopsis. Perception, 2, 371–376.
Ramón y Cajal, S. (1911). Histologie du système nerveux de l’homme et des vertébrés. Madrid: Consejo Superior de Investigaciones Cientificas, reprinted 1972.
Rasmusson, D. D. (1982). Reorganization of raccoon somatosensory cortex following removal of the 5th digit. J. Comp. Neurol., 205, 313–326.
Recanzone, G. H., Merzenich, M. M., & Jenkins, W. M. (1992). Frequency discrimination training engaging a restricted skin surface results in an emergence of a cutaneous response zone in cortical area 3A. J. Neurophysiol., 67, 1057–1070.
Recanzone, G. H., Merzenich, M. M., Jenkins, W. M., Grajski, K. A., & Dinse, H. R. (1992). Topographic reorganization of the hand representation in cortical area 3B of owl monkeys trained in a frequency-discrimination task. J. Neurophysiol., 67, 1031–1056.
Recanzone, G. H., Merzenich, M. M., & Schreiner, C. E. (1992). Changes in the distributed temporal response properties of S1 cortical neurons reflect improvements in performance on a temporally based tactile discrimination task. J. Neurophysiol., 67, 1071–1091.
Recanzone, G. H., Schreiner, C. E., & Merzenich, M. M. (1993). Plasticity in the frequency representation of primary auditory cortex following discrimination training in adult owl monkeys. J. Neurosci., 13, 87–103.
Robertson, D., & Irvine, D. R. F. (1989). Plasticity of frequency organization in auditory cortex of guinea pigs with partial unilateral deafness. J. Comp. Neurol., 282, 456–471.
Saarinen, J., & Levi, D. M. (1995). Perceptual learning in vernier acuity: What is learned? Vis. Res., 35, 519–527.
Saffell, T., & Matthews, N. (2003). Task-specific perceptual learning on speed and direction discrimination. Vis. Res., 43, 1365–1374.
Salazar, R. F., Kayser, C., & Konig, P. (2004). Effects of training on neuronal activity and interactions in primary and higher visual cortices in the alert cat. J. Neurosci., 24, 1627–1636.
Schiltz, C., Bodart, J. M., Dubois, S., Dejardin, S., Michel, C., Roucoux, A., et al. (1999). Neuronal mechanisms of perceptual learning: Changes in human brain activity with training in orientation discrimination. Neuroimage, 9, 46–62.
Schmid, L. M., Rosa, M. G. P., Calford, M. B., & Ambler, J. S. (1996). Visuotopic reorganization in the primary visual cortex of adult cats following monocular and binocular retinal lesions. Cereb. Cortex, 6, 388–405.
Schoups, A., Vogels, R., & Orban, G. A. (1995). Human perceptual learning in identifying the oblique orientation: Retinotopy, orientation specificity and monocularity. J. Physiol. (Lond.), 483, 797–810.
Schoups, A., Vogels, R., Qian, N., & Orban, G. (2001). Practising orientation identification improves orientation coding in V1 neurons. Nature, 412, 549–553.
Schwartz, S., Maquet, P., & Frith, C. (2002). Neural correlates of perceptual learning: A functional MRI study of visual texture discrimination. Proc. Natl. Acad. Sci. USA, 99, 17137–17142.
Series, P., Latham, P. E., & Pouget, A. (2004). Tuning curve sharpening for orientation selectivity: Coding efficiency and the impact of correlations. Nat. Neurosci., 7, 1129–1135.
Shiu, L. P., & Pashler, H. (1992). Improvement in line orientation discrimination is retinally local but dependent on cognitive set. Percept. Psychophys., 52, 582–588.
Shulz, D. E., Sosnik, R., Ego, V., Haidarliu, S., & Ahissar, E. (2000). A neuronal analogue of state-dependent learning. Nature, 403, 549–553.
Sigman, M., Cecchi, G. A., Gilbert, C. D., & Magnasco, M. O. (2001). On a common circle: Natural scenes and Gestalt rules. Proc. Natl. Acad. Sci. USA, 98, 1935–1940.
Sigman, M., & Gilbert, C. D. (2000). Learning to find a shape. Nat. Neurosci., 3, 264–269.
Sigman, M., Pan, H., Yang, Y. H., Stern, E., Silbersweig, D., & Gilbert, C. D. (2005). Top-down reorganization of activity in the visual pathway after learning a shape identification task. Neuron, 46, 823–835.
Singer, W. (1999). Neuronal synchrony: A versatile code for the definition of relations? Neuron, 24, 49–65.
Sireteanu, R., & Rettenbach, R. (2000). Perceptual learning in visual search generalizes over tasks, locations, and eyes. Vis. Res., 40, 2925–2949.
Sireteanu, R., & Rieth, C. (1992). Texture segregation in infants and children. Behav. Brain Res., 49, 133–139.
Squire, L. R., Stark, C. E., & Clark, R. E. (2004). The medial temporal lobe. Annu. Rev. Neurosci., 27, 279–306.
9 Characterizing and Modulating Neuroplasticity of the Adult Human Brain

Alvaro Pascual-Leone
Berenson-Allen Center for Noninvasive Brain Stimulation, Department of Neurology, Beth Israel Deaconess Medical Center, Boston, Massachusetts

Abstract Neurons are highly specialized structures that are resistant to change but are engaged in distributed networks that do change dynamically over the lifespan. Changes in functional connectivity, for example by shifts in synaptic strength, can be followed by more stable structural changes. Therefore, the brain is continuously undergoing plastic remodeling. Plasticity is not an occasional state of the nervous system but is the normal ongoing state of the nervous system throughout the lifespan. It is not possible to understand normal psychological function or the manifestations or consequences of disease without invoking the concept of brain plasticity. The challenge is to understand the mechanisms and consequences of plasticity in order to modulate them, suppressing some and enhancing others, in order to promote adaptive brain changes. Behavioral, neurostimulation, and targeted neuropharmacological interventions can modulate plasticity and promote desirable outcomes for a given individual.

Human behavior is molded by environmental changes and pressures, physiological modifications, and experiences. The brain, as the source of human behavior, must thus have the capacity to dynamically change in response to shifting afferent inputs and efferent demands. However, individual neurons are highly complex and exquisitely optimized cellular elements, and their capacity for change and modification is necessarily very limited. Fortunately, these stable cellular elements are engaged into neural networks that assure functional stability while providing a substrate for rapid adaptation to shifting demands. Dynamically changing neural networks might thus be considered evolution's invention to enable the nervous system to escape the restrictions of its own genome (and its highly specialized cellular specification) and adapt fluidly and promptly to environmental pressures, physiological changes, and experiences.

Therefore, representation of function in the brain may be best conceptualized by the notion of distributed neural networks, a series of assemblies of neurons (nodes) that might be widely dispersed anatomically but are structurally interconnected, and that can be functionally integrated to serve a specific behavioral role. Such nodes can be conceptualized as operators that contribute a given computation independent of the input ("metamodal brain"; see Pascual-Leone & Hamilton, 2001). However, the computations at each node might also be defined by the inputs themselves. Inputs shift depending on the integration of a node in a distributed neural network, and the layered and reticular structure of the cortex with rich reafferent loops provides the substrate for rapid modulation of the engaged network nodes. Depending on behavioral demands, neuronal assemblies can be integrated into different functional networks by shifts in weighting of connections (functional and effective connectivity). Indeed, timing of interactions between elements of a network, beyond integrity of structural connections, might be a critical binding principle for the functional establishment of given network action and behavioral output. Such notions of dedicated, but multifocal, networks, which can dynamically shift depending on demands for a given behavioral output, provide a current resolution to the long-standing dispute between localizationists and equipotential theorists. Function comes to be identified with a certain pattern of activation of specific, spatially distributed, but interconnected neuronal assemblies in a specific time window and temporal order. In such distributed networks, specific nodes may be critical for a given behavioral outcome. Knowledge of such instances is clinically useful to explain findings in patients and localize their lesions, but it provides an oversimplified conceptualization of brain-behavior relations. In the setting of dynamically plastic neural networks, behavior following an insult is never simply the result of the lesion, but rather the consequence of how the rest of the brain is capable of sustaining function following a given lesion. Neural plasticity can confer no perceptible change in the behavioral output of the brain, can lead to changes demonstrated only under special testing conditions, or can cause behavioral changes that constitute symptoms of disease. There may be loss of a previously acquired
Figure 9.1 (A) Brain activation in fMRI while subjects performed the same rhythmic hand movement (under careful kinematic control) before and after repetitive transcranial magnetic stimulation (rTMS) of the contralateral motor cortex. Following sham rTMS (top row) there is no change in the significant activation of the motor cortex (M1) contralateral to the moving hand and of the rostral supplementary motor cortex (SMA). After M1 activity is suppressed using 1-Hz rTMS (1,600 stimuli, 90% of motor threshold intensity; middle row), there is an increased activation of the rostral SMA and of M1 ipsilateral to the moving hand. Increasing excitability in the contralateral M1 using high-frequency rTMS (20 Hz, 90% of motor threshold intensity, 1,600 stimuli; bottom row) results in a decrease in activation of rostral SMA. (See color plate 8.) (B) Areas of the brain showing differential movement-related responses and coupling after rTMS. Circle, square, and triangle symbols indicate sites in primary motor cortex (open symbols) that are more strongly coupled to activity in sensorimotor cortex (SM1), dorsal premotor cortex (PMd), and supplementary motor cortex (SMA) during a finger movement task after rTMS. X marks the site of stimulation with 1-Hz rTMS. Panel B is modified from Lee and colleagues (2003).
cortex; (c) right primary motor cortex; and (d) sham stimulation. Hilgetag and colleagues observed a clear extinction phenomenon for stimuli presented contralaterally to the stimulated hemisphere (right or left parietal cortex). However, the deficit was accompanied by increased detection for unilateral stimuli presented on the side of the stimulated hemisphere compared to baseline (figure 9.2B). None of the control stimulation sites had any effect on the detection performance. These insights can be translated to parietal-damaged patients with neglect, in whom rTMS to the undamaged (frequently left) hemisphere can alleviate hemi-inattention symptoms (Brighina et al., 2003; Oliveri et al., 1999).

Therefore, activity in neural networks is dynamically modulated, and this fact can be illustrated by the neurophysiological adaptations to focal brain disruptions or lesions. Behavioral outcome, however, does not map in a fixed manner to changes in activity in distributed networks. Thus changes in network activity can give rise to no behavioral change, behavioral improvements, or losses. The frequently held notion that the brain optimizes behavior is therefore not correct, for it implies, for example, that a lesion to the brain will always lead to a loss rather than enhancement of function. In fact, we have seen that this view is challenged by the conceptualization of the brain as endowed with dynamic plasticity.

However, the scope of possible dynamic changes across a given neural network is defined by existing connections. Genetically controlled aspects of brain development define neuronal elements and initial patterns of connectivity. Given such initial, genetically determined, individually different brain substrates, the same events will result in diverse consequences as plastic brain mechanisms act upon individually distinct neural substrates. Similarly, within each individual, differences across neural networks (e.g., visual system, auditory system, or language system) will also condition the range of plastic modification (Bavelier & Neville, 2002; Neville & Bavelier, 2002). Plastic changes across brain systems vary as a function of differences in patterns of existing connections and in molecular and genetically controlled factors across brain systems that define the range, magnitude, stability, and chronometry of plasticity.
Dynamic network changes can be followed by more stable plastic changes

Rapid, ongoing changes in neural networks in response to environmental influences (for example, by dynamic shifts in the strength of preexisting connections across distributed neural networks, changes in task-related cortico-cortical and cortico-subcortical coherence, or modifications of the mapping between behavior and neural activity) may be followed by the establishment of new connections through dendritic growth and arborization resulting in structural changes and establishment of new pathways.

These two steps of plasticity are illustrated by the following experiment (Pascual-Leone et al., 1995). Normal subjects were taught to perform with one hand a five-finger exercise on a piano keyboard connected to a computer through a musical interface. They were instructed to perform the sequence of finger movements fluently, without pauses, and without skipping any keys, while paying particular attention to keeping the interval between the individual key presses constant and the duration of each key press the same. A metronome gave a tempo of 60 beats per minute for which the subjects were asked to aim, as they performed the exercise under auditory feedback. Subjects were studied on five consecutive days, and each day they had a two-hour practice session followed by a test. The test consisted of the execution of 20 repetitions of the five-finger exercise. The number of sequence errors decreased, and the duration, accuracy, and variability of the intervals between key pushes (as marked by the metronome beats) improved significantly over the course of the five days. Before the first practice session on the first day of the experiment and daily thereafter, we used TMS to map the motor cortical areas targeting long finger flexor and extensor muscles bilaterally. As the subjects' performance improved, the threshold for TMS activation of the finger flexor and extensor muscles decreased steadily. Even considering this change in threshold, the size of the cortical representation for both muscle groups increased significantly (figure 9.3A, Week 1). Remarkably, this increase in size of the cortical output maps could be demonstrated only when the cortical mapping studies were conducted shortly after the practice session, but no longer the next day, after a night
Figure 9.3 (A) Cortical output maps for the finger flexors during acquisition of a five-finger movement exercise on a piano. There are marked changes of the output maps for finger flexors of the trained hand over the five weeks of daily practice (Monday to Friday). Note that there are two distinct processes in action, one accounting for the rapid modulation of the maps from Mondays to Fridays and the other responsible for the slow and more discrete changes in Monday maps over time. (Modified from Pascual-Leone, 1996; Pascual-Leone et al., 1995.) (B) Histogram displaying the size of the cortical output maps before (gray bars) and after exercise (black bars) in control subjects and subjects with a val66met polymorphism for BDNF (left side). Following exercise, control subjects had significantly larger representations than at baseline, whereas subjects with a Met allele did not show a significant change. This difference is further illustrated by the representative motor maps from control and val66met polymorphism subjects superimposed onto a composite brain MRI image of the cortex (right side). Sites from which TMS evoked criterion responses in the target muscle are marked in green; negative sites are marked in red. (Modified from Kleim et al., 2006.) (See color plate 10.)
of sleep and before the next day’s practice session. Interestingly, even such initial steps of experience- and practice-related plasticity seem critically regulated by genetic factors. Kleim and colleagues (2006) used TMS to map cortical motor output and show that training-dependent changes in motor-evoked potentials and motor map organization are reduced in subjects with a val66met polymorphism in the brain-derived neurotrophic factor (BDNF) gene, as compared to subjects without the polymorphism (figure 9.3B).

Once a near-perfect level of performance was reached at the end of a week of daily practice, subjects continued daily practice of the same piano exercise during the following four weeks (Group 1) or stopped practicing (Group 2) (Pascual-Leone, 1996). During the four weeks of follow-up (figure 9.3A, Weeks 2–5), cortical output maps for finger flexor and extensor muscles were obtained in all subjects on Mondays (before the first practice session of that week in Group 1) and on Fridays (after the last practice session for the week in Group 1). In the group that continued practicing (Group 1), the cortical output maps obtained on Fridays showed an initial peak and eventually a slow decrease in size despite continued performance improvement. However, the maps obtained on Mondays, before the practice session and following the weekend rest, showed a small change from baseline with a tendency to increase in size over the course of the study. In Group 2, the maps returned to baseline after the first week of follow-up and remained stable thereafter.

This experiment illustrates two distinct phases of modulation of motor output maps. The rapid time course in the initial modulation of the motor outputs, by which a certain region of motor cortex can reversibly increase its influence on a motoneuron pool, is most compatible with
Figure 9.4 A schematic diagram of the conceptualization of plasticity as the balance of plasticity-enhancing and plasticity-limiting mechanisms, which are dependent on different neuromodulators.

Two complementary mechanisms control plasticity

As indicated in the preceding section, dynamic network changes can lead to more stable plastic changes, which involve synaptic plasticity as well as dendritic arborization and network remodeling. Such changes might be conceptualized as the result of a balance between two complementary mechanisms, one promoting and the other limiting plasticity (figure 9.4). Both these mechanisms are critical in assuring that appropriate synapses are formed and unnecessary synapses are pruned in order to optimize functional systems necessary for cognition and behavior. Though the molecular mechanisms that contribute to plasticity are numerous and complex, the plasticity-promoting mechanism appears to be critically dependent on the neurotrophin BDNF (brain-derived neurotrophic factor) (Lu, 2003), while genes within the major histocompatibility complex (MHC) Class I appear to be involved in the plasticity-limiting mechanism (Boulanger, Huh, & Shatz, 2001; Huh et al., 2000).

At the synaptic level, mechanisms of long-term potentiation (LTP) and long-term depression (LTD) involve a series of induction and consolidation steps that are dependent on various structural changes and can be modified, increased, or suppressed by distinct modulatory influences (Lynch, Rex, & Gall, 2007). LTP is initiated by the influx of calcium through glutamate receptors in the postsynaptic density. Calcium-activated kinases and proteinases disassemble the cytoskeleton, made up of actin filaments cross-linked by spectrin and other proteins, that normally maintains the shape of the dendritic spines. Thus the spine becomes rounder and shorter, effectively enlarging the surface of the postsynaptic density, which can then accept a greater number of glutamate receptors and provide better access to proteins that enhance current flow through the receptors. In parallel, signaling from adhesion receptors, particularly integrins, and modulatory receptors, particularly BDNF, induces the rapid polymerization of actin and the formation of a new cytoskeleton. This polymerization of actin filaments consolidates the new dendritic spine morphology and thus the LTP. Despite the complexity of such a process and the numerous molecules involved, BDNF appears to be the most potent enhancer of plasticity discovered thus far, playing a critical role in LTP consolidation across multiple brain regions. BDNF has been shown to facilitate LTP in the visual cortex (Akaneya, Tsumoto, Kinoshita, & Hatanaka, 1997) and the hippocampus (Korte et al., 1995). At CA1 synapses, a weak tetanic stimulation, which in and of itself would only induce short-term potentiation of low magnitude, leads to strong LTP when paired with BDNF (Figurov, Pozzo-Miller, Olafsson, Wang, & Lu, 1996). During motor training, BDNF levels are elevated within motor cortex (Klintsova, Dickson, Yoshida, & Greenough, 2004), and human subjects who have a single nucleotide polymorphism in the BDNF gene (val66met) show reduced experience-dependent plasticity of the motor cortex following a voluntary motor task (Kleim et al., 2006).

In contrast, adenosine (Arai, Kessler, & Lynch, 1990) and ligands for integrins (Staubli, Vanderklish, & Lynch, 1990) block LTP when applied immediately after theta burst stimulation because of the disruption of actin polymerization and LTP consolidation. Along these lines, a blind screen for
genes involved in normal developmental activity-dependent remodeling of neuronal connectivity revealed a region of DNA better known for its role in immune functioning, namely Class I major histocompatibility complex (Class I MHC) (Corriveau, Huh, & Shatz, 1998). More recent studies suggest that MHC Class I genes are an integral part of an experience-dependent plasticity-limiting pathway (Syken, Grandpre, Kanold, & Shatz, 2006). Such negative modulators of synaptic plasticity are needed. Establishing and strengthening new synapses is an important part of developmental plasticity, but this has to be coupled with normal regressive events including activity-dependent synaptic weakening and elimination of inappropriate connections. Without these regressive events, superfluous synapses may persist and may impair normal neural development.

Therefore, different modulators, including BDNF on the one side and adenosine or MHC Class I genes on the other, serve complementary functions that lead to the development and rapid modulation of functional circuits across the whole brain. Such dynamic systems do harbor potential dangers, and disruption of these pathways or their relative balance may lead to severe pathological states. However, these opposing pathways offer the opportunity for interventions and thus for guiding plasticity for the benefit of individual subjects.

Plasticity as the cause of disease

Focal hand dystonia (Quartarone, Siebner, & Rothwell, 2006) may be a good example of pathological consequences of plasticity that can be promoted by suitable genetic predispositions, such as DYT-1 or others. Importantly, though, the mere induction of certain plastic changes is not sufficient to lead to disability. Similar plastic changes can be documented in patients with focal dystonia and proficient musicians (Quartarone et al., 2006; Rosenkranz, Williamon, & Rothwell, 2007). Furthermore, musicians can develop focal hand dystonia (Chamagne, 2003), and the underlying pathophysiology appears to be slightly different from that in other forms of dystonia, such as writer’s cramp (Rosenkranz et al., 2008). Perhaps “faulty” practice or excessive demand in the presence of certain predisposing factors may result in unwanted cortical rearrangement and lead to disease. It seems clear, though, that plastic changes in the brain do not speak to behavioral impact. Similar changes can be associated with behavioral advantages (as in the professional musicians) or neurological disability (as in the case of focal dystonia), presumably on the basis of modulatory influences from distributed neural activity.

Chronic, neuropathic pain syndromes have also been argued to represent “pathological” consequences of plasticity (Flor, 2008; Fregni, Pascual-Leone, & Freedman, 2007; Zhuo, 2008). Tinnitus may be the result of plasticity in the auditory system induced by abnormal cochlear input (Bartels, Staal, & Albers, 2007). Schizophrenia, depression, posttraumatic stress disorder, and attention-deficit/hyperactivity disorder are all conditions that may, in part, represent disorders of brain plasticity (Frost et al., 2004; Hayley, Poulter, Merali, & Anisman, 2005; Rapoport & Gogtay, 2008). Drug addiction and perhaps addictive behaviors in general are argued to represent examples of pathology as the consequence of plasticity (Kalivas & O’Brien, 2008; Kauer & Malenka, 2007). Alzheimer’s disease appears to be linked to abnormal synaptic plasticity that may in fact constitute a crucial initial step in the pathogenesis of the disease (Selkoe, 2008). Autism may be another example of plasticity-mediated pathology: genetic factors may lead to a predisposition such that developmentally mediated plasticity (possibly in itself controlled by abnormal regulators) results in pathological complex behaviors affecting social interactions, language acquisition, or sensory processing (Morrow et al., 2008).

Therefore, human behavior and the manifestations of human disease are ultimately heavily defined by brain plasticity. An initial, genetically determined neural substrate is modified during development and environmental interactions by plasticity. The processes of neural plasticity themselves can be normal, but may act upon an abnormal nervous system as a consequence of genetic or specific environmental factors. Alternatively, the mechanisms of plasticity themselves may be abnormal, potentially compounding the consequences of an abnormal substrate on the basis of a genetically determined “starting point” or environmental insult. In any case, interventions to guide behavior or treat pathological symptomatology might be more immediate in their behavioral repercussions and thus more effective if aimed at modulating plasticity than if intent on addressing underlying genetic predispositions.

Fragile X syndrome provides a suitable illustration for such notions (Bear, Dolen, Osterweil, & Nagarajan, 2008; O’Donnell & Warren, 2002; Penagarikano, Mulle, & Warren, 2007). The genetic mutation responsible for fragile X syndrome, in the FMR1 gene, leads to the absence of the encoded protein FMRP, which appears to play an important role in synaptic plasticity by regulating metabotropic-glutamate-receptor-dependent LTD. Thus in the absence of FMRP there is excessive experience-dependent LTD. Mouse models of fragile X syndrome have also demonstrated impairments in LTP, possibly as a result of immature development of dendritic spines. However, the application of BDNF to slices from FMR1 knockout mice fully restores LTP to normal levels (Lauterborn et al., 2007), and thus it might be possible to normalize cognitive function and behavior in patients with fragile X by pharmacologically “normalizing” the affected mechanisms of plasticity.
Figure 9.5 (A) Histogram illustrates the significant improvement in performance of the Purdue Pegboard task in stroke patients (on average 12 months after the stroke) following real (but not sham) slow-frequency repetitive transcranial magnetic stimulation (rTMS) to the unaffected hemisphere to decrease interhemispheric inhibition of the lesioned hemisphere and improve motor function. (Modified from Mansur et al., 2005.) (B) Serial assessments in patients with acute ischemic strokes undergoing 10 days of daily sessions of real or sham, fast rTMS over the affected motor cortex. Disability scales (Barthel Index and NIH Stroke Scale) measured before rTMS, at the end of the last rTMS session, and 10 days later show that real rTMS (filled symbols) improved patients’ scores significantly more than sham (open symbols). (Modified from Khedr, Ahmed, Fathy, & Rothwell, 2005.)
et al., 2004; Naeser et al., 2005). However, challenges for such approaches remain, as our understanding of the various issues involved and how to optimize and individualize the neuromodulatory interventions is still rather sketchy. In any case, neuromodulatory approaches based on brain stimulation techniques are certainly not the only potential avenues to guide plasticity with therapeutic intent. Behavioral interventions, including technology-supported approaches, such as robotic or computerized task training, as well as pharmacological methods, might be equally effective.

A most intriguing question to consider is the possibility of similarly modulating plasticity in the attempt to promote functional gains in normal subjects (Canli et al., 2007; de Jongh, Bolt, Schermer, & Olivier, 2008; Farah et al., 2004; Lanni et al., 2008). Might it, for example, be possible to promote skill acquisition or verbal or nonverbal learning by enhancing certain plastic processes and suppressing others? This type of question raises important ethical issues, but also offers the potential for interventions that might be applicable in educational settings and translationally to patients. For example, consistent with the findings in the recovery of hand motor function after a stroke, noninvasive cortical stimulation that suppresses excitability in the M1 ipsilateral or enhances excitability in the M1 contralateral to a training hand might result in varying degrees of improvement in motor function in healthy humans.

Anodal transcranial direct current stimulation (tDCS) applied over M1 to increase its excitability before or during practice can lead to improvements in implicit motor learning as measured with the serial reaction time task (Nitsche et al., 2003), performance of a visuomotor coordination task (Antal et al., 2004) and a sequential finger movement task (Vines, Nair, & Schlaug, 2006), and performance of the Jebsen-Taylor Hand Function Test (JTT) (Boggio et al., 2006). Similarly, the application of 1-Hz rTMS to suppress excitability of M1 ipsilateral to a training hand results in improvements in motor sequence learning (Kobayashi, Hutchinson, Théoret, Schlaug, & Pascual-Leone, 2004). However, such effects might be task and condition specific. For example, learning of a more complex finger tracking task was not modified by the same 1-Hz rTMS to suppress excitability of M1 ipsilateral to a training hand (Carey, Fregni, & Pascual-Leone, 2006), and the beneficial effects of anodal tDCS to the contralateral hand in the JTT were limited to the nondominant hand in young healthy adults and the elderly (Boggio et al., 2006).
Hilgetag, C. C., Theoret, H., & Pascual-Leone, A. (2001). Enhanced visual spatial attention ipsilateral to rTMS-induced “virtual lesions” of human parietal cortex. Nat. Neurosci., 4(9), 953–957.
Huh, G. S., Boulanger, L. M., Du, H., Riquelme, P. A., Brotz, T. M., & Shatz, C. J. (2000). Functional requirement for Class I MHC in CNS development and plasticity. Science, 290, 2155–2159.
Jenkins, I. H., Brooks, D. J., Nixon, P. D., Frackowiak, R. S., & Passingham, R. E. (1994). Motor sequence learning: A study with positron emission tomography. J. Neurosci., 14, 3775–3790.
Kalivas, P. W., & O’Brien, C. (2008). Drug addiction as a pathology of staged neuroplasticity. Neuropsychopharmacology, 33(1), 166–180.
Kapur, N. (1996). Paradoxical functional facilitation in brain-behaviour research: A critical review. Brain, 119(Pt. 5), 1775–1790.
Karni, A., Meyer, G., Jezzard, P., Adams, M. M., Turner, R., & Ungerleider, L. G. (1995). Functional MRI evidence for adult motor cortex plasticity during motor skill learning. Nature, 377, 155–158.
Karni, A., Meyer, G., Rey-Hipolito, C., Jezzard, P., Adams, M. M., et al. (1998). The acquisition of skilled motor performance: Fast and slow experience-driven changes in primary motor cortex. Proc. Natl. Acad. Sci. USA, 95, 861–868.
Kauer, J. A., & Malenka, R. C. (2007). Synaptic plasticity and addiction. Nat. Rev. Neurosci., 8(11), 844–858.
Khedr, E. M., Ahmed, M. A., Fathy, N., & Rothwell, J. C. (2005). Therapeutic trial of repetitive transcranial magnetic stimulation after acute ischemic stroke. Neurology, 65(3), 466–468.
Kleim, J. A., Chan, S., Pringle, E., Schallert, K., Procaccio, V., Jimenez, R., et al. (2006). BDNF val66met polymorphism is associated with modified experience-dependent plasticity in human motor cortex. Nat. Neurosci., 9(6), 735–737.
Kleim, J. A., Hogg, T. M., VandenBerg, P. M., Cooper, N. R., Bruneau, R., & Remple, M. (2004). Cortical synaptogenesis and motor map reorganization occur during late, but not early, phase of motor skill learning. J. Neurosci., 24(3), 628–633.
Klintsova, A. Y., Dickson, E., Yoshida, R., & Greenough, W. T. (2004). Altered expression of BDNF and its high-affinity receptor TrkB in response to complex motor learning and moderate exercise. Brain Res., 1028, 92–104.
Kobayashi, M., Hutchinson, S., Théoret, H., Schlaug, G., & Pascual-Leone, A. (2004). Repetitive TMS of the motor cortex improves ipsilateral sequential simple finger movements. Neurology, 62(1), 91–98.
Korte, M., Carroll, P., Wolf, E., Brem, G., Thoenen, H., & Bonhoeffer, T. (1995). Hippocampal long-term potentiation is impaired in mice lacking brain-derived neurotrophic factor. Proc. Natl. Acad. Sci. USA, 92, 8856–8860.
Lanni, C., Lenzken, S. C., Pascale, A., Del Vecchio, I., Racchi, M., Pistoia, F., et al. (2008). Cognition enhancers between treating and doping the mind. Pharmacol. Res., 57(3), 196–213.
Lauterborn, J. C., et al. (2007). Brain-derived neurotrophic factor rescues synaptic plasticity in a mouse model of fragile X syndrome. J. Neurosci., 27, 10685–10694.
Lee, L., Siebner, H. R., Rowe, J. B., Rizzo, V., Rothwell, J. C., Frackowiak, R. S., et al. (2003). Acute remapping within the motor system induced by low-frequency repetitive transcranial magnetic stimulation. J. Neurosci., 23(12), 5308–5318.
Liepert, J., Storch, P., Fritsch, A., & Weiller, C. (2000). Motor cortex disinhibition in acute stroke. Clin. Neurophysiol., 111, 671–676.
Lu, B. (2003). BDNF and activity-dependent synaptic modulation. Learn. Memory, 10, 86–98.
Lynch, G., Rex, C. S., & Gall, C. M. (2007). LTP consolidation: Substrates, explanatory power, and functional significance. Neuropharmacology, 52, 12–23.
Mansur, C. G., Fregni, F., Boggio, P. S., Riberto, M., Gallucci-Neto, J., Santos, C. M., et al. (2005). A sham stimulation-controlled trial of rTMS of the unaffected hemisphere in stroke patients. Neurology, 64(10), 1802–1804.
Martin, P. I., Naeser, M. A., Theoret, H., Tormos, J. M., Nicholas, M., Kurland, J., et al. (2004). Transcranial magnetic stimulation as a complementary treatment for aphasia. Sem. Speech Lang., 25(2), 181–191.
Morrow, E. M., Yoo, S. Y., Flavell, S. W., Kim, T. K., Lin, Y., Hill, R. S., et al. (2008). Identifying autism loci and genes by tracing recent shared ancestry. Science, 321(5886), 218–223.
Murase, N., Duque, J., Mazzocchio, R., & Cohen, L. G. (2004). Influence of interhemispheric interactions on motor function in chronic stroke. Ann. Neurol., 55(3), 400–409.
Naeser, M. A., Martin, P. I., Nicholas, M., Baker, E. H., Seekins, H., Kobayashi, M., et al. (2005). Improved picture naming in chronic aphasia after TMS to part of right Broca’s area: An open-protocol study. Brain Lang., 93(1), 95–105.
Neville, H., & Bavelier, D. (2002). Human brain plasticity: Evidence from sensory deprivation and altered language experience. Prog. Brain Res., 138, 177–188.
Nitsche, M. A., Schauenburg, A., Lang, N., Liebetanz, D., Exner, C., Paulus, W., et al. (2003). Facilitation of implicit motor learning by weak transcranial direct current stimulation of the primary motor cortex in the human. J. Cogn. Neurosci., 15(4), 619–626.
Nudo, R. J. (2006). Mechanisms for recovery of motor function following cortical damage. Curr. Opin. Neurobiol., 16(6), 638–644.
O’Donnell, W. T., & Warren, S. T. (2002). A decade of molecular studies of fragile X syndrome. Annu. Rev. Neurosci., 25, 315–338.
Oliveri, M., Rossini, P. M., Traversa, R., Cicinelli, P., Filippi, M. M., Pasqualetti, P., et al. (1999). Left frontal transcranial magnetic stimulation reduces contralesional extinction in patients with unilateral right brain damage. Brain, 122, 1731–1739.
Oliviero, A., Strens, L. H., Di Lazzaro, V., Tonali, P. A., & Brown, P. (2003). Persistent effects of high frequency repetitive TMS on the coupling between motor areas in the human. Exp. Brain Res., 149, 107–113.
Pascual-Leone, A. (1996). Reorganization of cortical motor outputs in the acquisition of new motor skills. In J. Kimura & H. Shibasaki (Eds.), Recent advances in clinical neurophysiology (pp. 304–308). Amsterdam: Elsevier Science.
Pascual-Leone, A., Amedi, A., Fregni, F., & Merabet, L. B. (2005). The plastic human brain cortex. Annu. Rev. Neurosci., 28, 377–401.
Pascual-Leone, A., & Hamilton, R. (2001). The metamodal organization of the brain. Prog. Brain Res., 134, 427–445.
Pascual-Leone, A., Nguyet, D., Cohen, L. G., Brasil-Neto, J. P., Cammarota, A., & Hallett, M. (1995). Modulation of muscle responses evoked by transcranial magnetic stimulation during the acquisition of new fine motor skills. J. Neurophysiol., 74(3), 1037–1045.
10 Exercising Your Brain:
Training-Related Brain Plasticity
Daphne Bavelier, C. Shawn Green, and Matthew W. G. Dye

Daphne Bavelier and Matthew W. G. Dye Department of Brain and Cognitive Sciences, University of Rochester, Rochester, New York
C. Shawn Green Department of Psychology, University of Minnesota, Minneapolis, Minnesota

Abstract Learning and brain plasticity are fundamental properties of the nervous system, and they hold considerable promise when it comes to learning a second language faster, maintaining our perceptual and cognitive skills as we age, or recovering lost functions after brain injury. Learning is critically dependent on experience and the environment that the learner has to face. A central question then concerns the types of experience that favor learning and brain plasticity. Existing research identifies three main challenges in the field. First, not all improvements in performance are durable enough to be relevant. Second, the conditions that optimize learning during the acquisition phase are not necessarily those that optimize retention. Third, learning is typically highly specific, showing little transfer from the trained task to even closely related tasks. Against these limiting factors, the emergence of complex learning environments provides promising new avenues when it comes to optimizing learning in real-world settings.

The ability to learn is fundamentally important to the survival of all animals. Brain plasticity, together with the learning it enables, therefore embodies a pivotal evolutionary force. The human species appears remarkable in this respect, as more than a century of research has demonstrated that humans possess the ability to acquire virtually any skill given appropriate training. Yet, while the exceptional capacity of humans to learn should certainly reassure those seeking to design educational or rehabilitative training programs, there are still several key obstacles that need to be overcome before these programs can reach their full potential.

The first is that brain plasticity is typically highly specific. While individuals trained on a task will improve on that very task, other tasks, even closely related ones, often show little or no improvement. Obviously, this obstacle potentially limits the benefits of learning-based interventions, be they educational or clinical. After all, it is of little use to improve the performance of a stroke patient on a visual motion task in the laboratory if this same training will not allow her to effectively see moving cars as she tries to safely cross the street.

The second obstacle is that while brain plasticity is typically adaptive and beneficial, it can also be maladaptive, dramatically so at times, as when expert string musicians suffer from dystonia or motor weaknesses in their fingers as a result of extensive practice with their instruments.

Finally, and subsumed in the first two obstacles, is the fact that we are still missing the recipe for successful brain plasticity intervention at the practical level. Our current understanding of the causal relationship between one type of training experience and the functional changes it induces through brain plasticity is still very much incomplete.

However, progress is being made in each of these areas. In particular, research in recent years has revealed the potential benefits of what are sometimes termed complex learning environments. These appear to promote behaviorally beneficial plastic changes at a more general level than previously seen. This chapter provides an overview of these recent advances.

Specificity of learning

In the field of learning, transfer of learning from the trained task to even other very similar tasks is generally the exception rather than the rule. This fact is well documented in the “perceptual learning” literature. For instance, Fiorentini and Berardi (1980) trained subjects to discriminate between two complex gratings that differed only in the relative spatial phase of the two component sinusoids (figure 10.1A). Performance on this task improved very rapidly over the course of a single training session and remained consistently high when subjects were tested on two subsequent days. However, when the gratings were rotated by 90 degrees or the spatial frequency was doubled, no evidence of transfer was observed (figure 10.1B). Specificity has also been demonstrated in the discrimination of oriented texture objects, where learning is specific to the location and orientation of the trained stimuli (Karni & Sagi, 1991), in the discrimination of dot motion direction, where the learning is specific to the direction and speed of the trained stimuli (Ball & Sekuler, 1982; Saffell & Matthews, 2003), and in some types of hyperacuity tasks, where in addition to being specific for location and orientation, learning can even be specific for the trained eye (Fahle, 2004).
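To make concrete what it means for two complex gratings to differ "only in the relative spatial phase of the two component sinusoids," the following is a minimal numerical sketch, not a reconstruction of the stimuli actually used. The choice of a third-harmonic second component, its amplitude ratio, the 0 and 180 degree phase values, and the function names `compound_grating` and `amplitude_at` are all illustrative assumptions that do not come from the chapter.

```python
import math

def compound_grating(n_samples, cycles, rel_phase_deg, harmonic=3, amp_ratio=1 / 3):
    """One-dimensional luminance profile of a two-component grating.

    Assumed for illustration only: the second component sits at
    `harmonic` times the fundamental frequency, with relative amplitude
    `amp_ratio`, and is shifted by `rel_phase_deg` degrees.
    """
    phase = math.radians(rel_phase_deg)
    profile = []
    for i in range(n_samples):
        x = i / n_samples  # position across the profile, in [0, 1)
        fundamental = math.sin(2 * math.pi * cycles * x)
        second = amp_ratio * math.sin(2 * math.pi * harmonic * cycles * x + phase)
        profile.append(fundamental + second)
    return profile

def amplitude_at(profile, k):
    """Amplitude of the k-cycle Fourier component of the profile."""
    n = len(profile)
    re = sum(v * math.cos(2 * math.pi * k * i / n) for i, v in enumerate(profile))
    im = sum(v * math.sin(2 * math.pi * k * i / n) for i, v in enumerate(profile))
    return 2 * math.hypot(re, im) / n

# Two hypothetical stimuli: identical components, different relative phase.
a = compound_grating(256, cycles=2, rel_phase_deg=0)
b = compound_grating(256, cycles=2, rel_phase_deg=180)

# The component amplitudes match, yet the luminance profiles differ.
print(amplitude_at(a, 2), amplitude_at(b, 2))
print(amplitude_at(a, 6), amplitude_at(b, 6))
print(max(abs(p - q) for p, q in zip(a, b)))
```

Under these assumptions the two profiles contain the same sinusoidal components at the same amplitudes, so an observer can only tell them apart from the waveform shape. In the same sketch, the transfer conditions described above would correspond to doubling `cycles` or, for a two-dimensional image built from the profile, swapping rows and columns.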
Figure 10.2 Participants’ performance on the letter-number sequencing test (a measure of working memory skills) and the paper folding and cutting test (a measure of visuospatial constructive skills). Participants were tested shortly after listening to either an up-tempo sonata of Mozart in a major key, which conveyed a mood of happiness, or a slow-tempo adagio of Albinoni in a minor key, which conveyed a mood of sadness. Participants performed better on both tests after listening to the Mozart piece compared to the Albinoni piece. This work illustrates that the “Mozart effect” has little to do with learning per se. Rather, music listening seems to affect performance for better or for worse on a wide variety of tests by changing arousal and mood just before testing. Asterisks denote statistical significance. (Adapted from Schellenberg, Nakata, Hunter, & Tomato, 2007, figure 2; Thompson, Schellenberg, & Husain, 2001, figure 1.)

influences performance. For example, pop music such as “Country House” by Blur led to a greater spatial IQ enhancement than a piece by Mozart (Schellenberg & Hallam, 2005). Further confirming the arousal-mood hypothesis, listening to a high-tempo piece by Mozart was found to lead to better verbal IQ measures than listening to a slower piece by Albinoni (figure 10.2; Schellenberg, Nakata, Hunter, & Tomato, 2007).

Along the same lines, studies that have examined the impact of playing violent video games on aggressive behavior may suffer from the same weakness, as the tests used to assess changes in the dependent variables of interest (behavior, cognition, affect, etc.) are typically given within minutes of the end of exposure to the violent video games. Given that violent video games are known to trigger a host of transient physiological changes associated with increased arousal and stress (i.e., “fight-or-flight” responses), it is important to demonstrate that any changes in behavior or cognition are not likewise transient in nature. It is interesting to note that while several recent papers in this field have reported changes in aggressive cognition and affect as well as desensitization to violence immediately following 30 minutes of exposure to violent video games, the same studies failed to find a significant relationship between these variables and being a regular player of violent video games, suggesting that the effects may indeed be fleeting rather than constituting true learned aggression effects (Carnagey & Anderson, 2005; Carnagey, Anderson, & Bushman, 2007).

The second class of effects that may masquerade as experience-dependent learning consists of effects caused by hidden or unmeasured variables that are unrelated to the experience of interest. While these effects may represent learning, they do not represent experience-dependent learning. For instance, it is well documented that individuals who have an active interest taken in their performance tend to improve more than individuals who have no such interest taken, an effect often dubbed the Hawthorne effect (Lied & Karzandjian, 1998). This effect can lead to powerful improvements in performance that have little to do with the specific cognitive training regimen being studied, but instead reflect social and motivational factors that influence performance. In the same vein, the mere presence of mental or physical stimulation may lead to performance changes in groups that are chronically understimulated (as may be the case with the institutionalized elderly), which again would not be considered experience-dependent learning as it is not dependent on the type of experience.

A related issue arises when researchers attempt to infer the presence of experience-dependent learning by examining behavioral differences in groups that perform various activities as part of their everyday lives (for instance, athletes, musicians, or video game players). The obvious concern here is population bias: in other words, inherent differences in abilities may lead to the differences in the activities experienced, rather than the other way around. For example, individuals born with superior hand-eye coordination may be quite successful at baseball and thus preferentially tend to play baseball, while individuals born with poor hand-eye coordination may tend to avoid playing baseball. A hypothetical study that examined differences in hand-eye coordination between baseball players and nonplayers may observe a difference in hand-eye coordination, but it would be erroneous to link baseball experience to superior hand-eye coordination when a population bias was truly at the root of the effect.

Training studies aiming to establish experience-dependent learning should therefore demonstrate (1) benefits that go beyond the temporary arousal or mood changes an experience can induce, and (2) a clear causal link between the specific training experience and learning. The effect of training should be measured at least a full day after completion of training to ensure that it is a robust learning effect. As illustrated by the Mozart effect, training participants for 20 minutes and immediately showing changes in measures of their performance does not mean that a long-lasting alteration of performance has taken place. Furthermore, to establish a definitive causal link between a given form of experience and any enhancement in skills, it is necessary not only to train nonexperts on the experience in question and
faster visual reaction times and better spatial orienting abilities. Lum, Enns, and Pratt (2002),
McAuliffe (2004), and Nougier, Azemar, and Stein (1992) observed similar sports-related
differences in a Posner cuing task, while Kida, Oda, and Matsumura (2005) demonstrated that
trained baseball players respond faster than novices in a go-no-go task (press the button if you
see color A, do not press the button if you see color B), but interestingly show no enhancements
in simple reaction time tasks (press a button when a light turns on). Unfortunately, no training
studies are available at this point to establish a causal link between these performance
enhancements and the specific physical activity under investigation. The possibility that aerobic
exercise of any sort may enhance cognitive abilities has received much attention lately with
respect to aging. Consistently positive results have been reported in many cross-sectional
studies comparing older adults who normally exercise with those who do not. Enhancements have
been documented in tasks as varied as dual-task performance or executive attention/distractor
rejection (for recent reviews see Colcombe et al., 2003; Hillman, Erickson, & Kramer, 2008;
Kramer & Erickson, 2007). More training studies are needed to unambiguously establish the causal
effect of aerobic exercise on perception and cognition. Yet, taken together, studies of the
effect of athletic training and exercise on perception and cognition are tantalizing, and they
have prompted renewed interest in demonstrating a causal link between the physical nature of the
training regimen and enhancement of cognitive skills.

Perhaps the most popular training regimen over the past decade has been video games. The
possibility that perceptual and cognitive abilities are enhanced in video game players has
attracted much attention (for a review, see Green & Bavelier, 2006b). Indeed, the variety of
different skills and the degree to which they are modified in video game players appear
remarkable. These include improved hand-eye coordination (Griffith, Voloschin, Gibb, & Bailey,
1983), increased processing in the periphery (Green & Bavelier, 2006c), enhanced mental rotation
skills (Sims & Mayer, 2002), greater divided attention abilities (Greenfield, DeWinstanley,
Kilpatrick, & Kaye, 1994), faster reaction times (Castel, Pratt, & Drummond, 2005), and even
job-specific skills such as laparoscopic manipulation (Rosser, Lynch, Cuddihy, Gentile, &
Merrell, 2007) and airplane piloting procedures (Gopher, Weil, & Bareket, 1994). Although
intriguing, this literature has little to say about learning per se unless the causal effect of
game playing is unambiguously established. So far, only a few studies have established a causal
link between video game play and long-lasting changes in performance. Among these is a series of
studies that provide compelling evidence that playing action video games, such as first-person
perspective shooter games, promotes widespread changes ranging from early sensory functions to
higher cognitive functions in adults.

Playing action video games improves fundamental properties of vision (Green & Bavelier, 2007;
Li, Polat, Makous, & Bavelier, in press). One visual ability often diminished in patients with
poor vision, such as amblyopes or older adults (Bonneh, Sagi, & Polat, 2007), is the ability to
read small print, with letters appearing unstable and jumbled. The tendency for the
resolvability of letters to be adversely affected by near neighbors, termed crowding, is
typically evaluated by asking subjects to identify the orientation of a letter flanked by
distractors, and by determining the smallest distance between target and distractors at which
subjects can still correctly identify the target (figure 10.4A). Individuals with better vision
can tolerate distractors being brought nearer to the target while still maintaining
high-accuracy performance. To establish the causal effect of action video game playing on this
visual skill, a training study was carried out whereby subjects were randomly assigned to one of
two training groups: a group trained on action video games (e.g., Unreal Tournament) or a group
trained on a control game (e.g., Tetris). Each group was tested pre- and post-training on the
crowding task. Participants trained on the action game improved significantly more than those
trained on the control game (figure 10.4B). The inclusion of a control game group allows us to
measure any possible improvements due to test-retest (i.e., familiarity with the task) or to
Hawthorne-like effects (Lied & Karzandjian, 1998). Finally, the control games were chosen to be
as pleasurable and engrossing as the experimental training games in order to minimize
differences in arousal across groups. Critically, posttraining evaluation was always performed
at least a day after the completion of the training phase.

Playing action video games was also shown to enhance several different aspects of visual
selective attention. Action game training improves the ability of young adults to search their
visual environment for a prespecified target, to monitor moving objects in a complex visual
scene, and to process a fast-paced stream of visual information (Feng, Spence, & Pratt, 2007;
Green & Bavelier, 2003, 2006c, 2006d). In one such experiment, the efficiency with which
attention is distributed across the visual field was measured with a visual search task called
the Useful Field of View paradigm (Ball, Beard, Roenker, Miller, & Griggs, 1988). This task is
akin to looking for a set of keys on a cluttered desk. Subjects are asked to localize a briefly
presented peripheral target in a field of distracting objects; accuracy of performance is
recorded (figure 10.5A). Training on an action video game for just 10 hours improved performance
on that task by about 30%, an improvement far greater than that induced by training on a control
game (figure 10.5B). In a related study, Feng and colleagues (2007) showed that performance on
the Useful Field of View task differs across gender, with males showing an advantage. Yet, after
10 hours of action game training, this gender difference was
such as those derived from connectionism or machine learning, provide some clues about the
factors that facilitate bottom-up learning based upon the statistics of the input. Recently, the
framework of Bayesian inference has been proposed to provide a good first-order model of how
subjects learn to optimize behavior in dynamic complex tasks, be they perceptual or cognitive in
nature (Courville, Daw, & Touretzky, 2006; Ernst & Banks, 2002; Orbán, Fiser, Aslin, & Lengyel,
2008; Tenenbaum, Griffiths, & Kemp, 2006). Another key feature of recent advances has been the
realization that actions and the feedback they provide about the next step to be computed can
greatly reduce the computational load of a task, as well as facilitate learning and
generalization (Ballard, Hayhoe, Pook, & Rao, 1997; Taagten, 2005). Finally, symbolic cognitive
architectures such as SOAR and ACT-R provide insights into how knowledge representations should
be structured to explain the acquisition of abstract systems of knowledge, and possibly transfer
of knowledge across these systems (Anderson et al., 2004; Lehman, Laird, & Rosenbloom, 1998).
Based on this variety of theoretical approaches, one can begin to identify characteristics
inherent to complex training regimens that seem more likely to be at the root of general
learning. These include, but are not limited to, (1) level of representation, (2) task
difficulty, (3) goals, action, and feedback, and (4) motivation and arousal.

Levels of Representation

Learning is more likely to be flexible and general if it occurs at the level of richly
structured representations that contribute to a wide array of behaviors, rather than if it
changes neural networks whose functions are highly specialized. The field of perceptual learning
has identified task difficulty as one of the main factors controlling the level of
representation at which learning occurs. In their reverse hierarchy theory of perceptual
learning, Ahissar and Hochstein (2004) hypothesize that learning is a top-down guided process,
where learning occurs at the highest level of representation that is sufficient for the given
task. Easy tasks can be learned at a reasonably high level of representation that may be shared
with many other tasks, allowing for sizable learning transfer. When tasks become exceedingly
difficult (at least in the perceptual domain, such as in Vernier acuity tasks near the
hyperacuity range), lower levels of representation with better signal-to-noise ratios are
required for adequate task performance. In such cases, only tasks that make use of this
low-level neural network, down to the specific retinal location and stimulus orientation, will
benefit. Although the reverse hierarchy theory was developed to account for perceptual learning
effects, it aligns well with the more general proposal that transfer of learned knowledge to
different tasks and contexts will be more likely when learning and inference operate at higher
levels of representation.

A key factor in ensuring flexible learning is high variability. Variability is important both at
the level of the exemplars to be learned and the context in which they appear (Schmidt & Bjork,
1992). For example, subjects learn to recognize objects in a more flexible way if the objects
are presented in a highly variable context (Brady & Kersten, 2003). High contextual variability
ensures that subjects learn to ignore the specifics of the objects, such as those brought about
by changes in view, lighting, camouflage, or shape, and rather learn to extract more general
principles about object category. Statistical approaches such as mutual information show that
subjects implicitly develop knowledge of the fragments or chunks that carry information about
the categories to be learned (Hegdé, Bart, & Kersten, 2008; Orbán et al., 2008). A key issue
then arises as to when these informative fragments allow for learning that generalizes as
compared to learning that is item specific. Work on object classification and artificial grammar
learning shows that low input variability induces learning at levels of representation that are
specific to the items being learned, and thus too rigid to generalize to new stimuli. High
variability is crucial in ensuring that the newly learned informative fragments be at levels of
representation that can flexibly recombine (Gomez, 2002; Onnis, Monaghan, Christiansen, &
Chater, 2004; Reeler, Newport, & Aslin, 2008). Research on the video game Tetris and its effect
on mental rotation illustrates this point well. Even though mental rotation is at a premium in
Tetris, expert Tetris players have been found to exhibit mental rotation capacities similar to
those of naïve subjects, except when tested on Tetris or Tetris-like shapes (Sims & Mayer,
2002). The use of a limited number of shapes in Tetris allows the learner to memorize spatial
configurations and moves (Destefano & Gray, 2007). This approach allows for the development of
excellent expertise at the game itself, but what is learned in this low-variability game is less
likely to generalize to other environments. By this view, an efficient scheme to enforce mental
rotation learning would be to use a highly variable set of objects preventing learning of
specific configurations.

Task Difficulty

The proposal that task difficulty controls the type and rate of learning is implicit in all
theories of learning. The perceptual learning literature nicely illustrates the impact of
manipulating task difficulty appropriately (Sireteanu & Rettenbach, 1995, 2000). In particular,
when it comes to promoting learning transfer, harder tasks are at a disadvantage. For example,
in a task where participants had to view arrays of oriented lines and determine which contained
a single oddly oriented line, task difficulty was manipulated by limiting exposure time (Ahissar
& Hochstein, 1997). With practice, the minimal exposure time that could be tolerated by the
participants decreased substantially. Interestingly, when the task was started at a
The importance of reward in learning is already supported by neurophysiology studies that show
that the brain systems thought to convey the utility of reward, such as the ventral tegmental
area and the nucleus basalis, play a large role in producing plastic changes in sensory areas.
In particular, when specific auditory tones are paired with stimulation of either of these
structures, the area of primary auditory cortex that represents the given tone increases
dramatically in size (Bao, Chan, & Merzenich, 2001; Kilgard & Merzenich, 1998). Interestingly,
at least some of the brain areas known to be sensitive to reward have been shown to be extremely
active when individuals play action video games. For instance, Koepp and colleagues (1998)
demonstrated that roughly the same amount of dopamine is released in the basal ganglia when
playing an action video game as when methamphetamines are injected intravenously. Determining
the exact role of reward-processing areas in the promotion of learning and neural plasticity
will continue to be an area of active research.

Motivation and Arousal

Motivation is a critical component of most major theories of learning, with motivation level
being posited to depend highly on an individual’s internal belief about her ability to meet the
current challenge. Vygotsky’s (1978) concept of a zone of proximal development matches well with
the skill-learning literature discussed previously. According to this theory, motivation is
highest and learning is most efficient when tasks are made just slightly more difficult than can
be matched by the individual’s current ability. Tasks that are much too difficult or much too
easy will lead to lower levels of motivation and thus substantially reduced learning. This is
not to say that learning will never occur if the task is too difficult or too easy (Amitay et
al., 2006; Seitz & Watanabe, 2003; Watanabe, Nanez, & Sasake, 2001), but learning rate should be
at a maximum when the task is challenging, yet still doable.

Like motivation, arousal is a key component of many learning theories. The Yerkes-Dodson law
predicts that learning is an inverted U-shaped function of arousal level (Yerkes & Dodson,
1908). Training paradigms that lead to low levels of arousal will tend to lead to low amounts of
learning, as will training paradigms that lead to excessively high levels of arousal
(Frankenhaeuser & Gardell, 1976). Between these extremes there is an arousal level that leads to
a maximum amount of learning, which no doubt differs greatly between individuals. Interestingly,
video games are known to elicit both the autonomic responses (Hebert, Beland, Dionne-Fournelle,
Crete, & Lupien, 2005; Segal & Dietz, 1991; Shosnik, Chatterton, Swisher, & Park, 2000) and
neurophysiological responses (Koepp et al., 1998) that are characteristic of arousal. These
responses represent a salient difference between traditional learning paradigms and video game
play. In the same vein, although again with barn owls, Bergan, Ro, Ro, & Knudsen (2005) observed
that adult owls who were forced to hunt (an activity that involves motivation and arousal) while
wearing displacing prisms demonstrated significant learning compared to adult owls who wore the
prisms for the same period of time, but who were fed dead prey. The latter failed to adapt to
the displacing prism.

Conclusions

The field of experience-dependent plasticity is rapidly expanding, thanks in part to new
technologies. Cognitive training on handheld devices and job-related training in immersive
environments are now within the reach of most institutions, if not individuals. This trend is
exciting because the most successful interventions, when it comes to ameliorating deficits in
patients or enhancing skills in an educational context, rely on complex training regimens. These
regimens require the simultaneous use of perceptual, attentional, memory, and motor skills to
trigger learning that goes beyond the specifics of the training regimen itself. New technologies
are perfectly positioned to enhance the development of such complex learning environments. For
all the excitement, challenges lie ahead. First among these is developing an understanding of
which ingredients should be included in training regimens in order to promote widespread
learning. Studies of the neural bases of arousal, motivation, and reward processing hold promise
in that respect. Second, although the type of improvement desired is usually clear, as when
educators or rehabilitation therapists state their goals for a student or a patient,
identifying the cognitive component of a training regimen aimed at realizing those goals is not
always so straightforward. At first glance, playing action video games does not appear to be a
mind-enhancing activity. Yet it seems to generate beneficial effects for perception, attention,
and decision making beyond what one may have expected. In contrast, the game Tetris clearly
requires mental rotation, and yet it does not lead to a general benefit in mental rotation
skill. Cognitive analysis is needed to determine the level of representation at which the
learning is most likely to occur given the nature of the training regimen. We are coming to
understand more about the conditions necessary to develop interventions that will lead to
generalizable learning effects, and these hold promise for benefiting individuals and the
societies within which they live.

Acknowledgments

This research was supported by grants to DB from the National Institutes of Health (EY016880 and
CD04418) and the Office of Naval Research (N00014-07-1-0937). We also thank Bjorn
Hubert-Wallander for help in figure preparation and manuscript preparation.
Griffith, J. L., Voloschin, P., Gibb, G. D., & Bailey, J. R. (1983). Differences in eye-hand motor coordination of video-game users and non-users. Percept. Mot. Skills, 57, 155–158.
Hebert, S., Beland, R., Dionne-Fournelle, O., Crete, M., & Lupien, S. J. (2005). Physiological stress response to video-game playing: The contribution of built-in music. Life Sci., 76, 2371–2380.
Hegdé, J., Bart, E., & Kersten, D. (2008). Fragment-based learning of visual object categories. Curr. Biol., 18(8), 597–601.
Herzog, M. H., & Fahle, M. (1997). The role of feedback in learning a vernier discrimination task. Vis. Res., 37, 2133–2141.
Hetland, L. (2000). Learning to make music enhances spatial reasoning. J. Aesthetic Educ., 34, 179–238.
Hillman, C. H., Erickson, K. I., & Kramer, A. F. (2008). Be smart, exercise your heart: Exercise effects on brain and cognition. Nat. Rev. Neurosci., 9, 58–65.
Ho, Y. C., Cheung, M. C., & Chan, A. S. (2003). Music training improves verbal but not visual memory: Cross-sectional and longitudinal explorations in children. Neuropsychology, 17(3), 439–450.
Karni, A., & Sagi, D. (1991). Where practice makes perfect in texture discrimination: Evidence for primary visual cortex plasticity. Proc. Natl. Acad. Sci. USA, 88(11), 4966–4970.
Kida, N., Oda, S., & Matsumura, M. (2005). Intensive baseball practice improves the go/nogo reaction time, but not the simple reaction time. Cogn. Brain Res., 22(2), 257–264.
Kilgard, M., & Merzenich, M. (1998). Cortical map reorganization enabled by nucleus basalis activity. Science, 279, 1714–1718.
Kioumourtzoglou, E., Kourtessis, T., Michalopoulou, M., & Derri, V. (1998). Differences in several perceptual abilities between experts and novices in basketball, volleyball, and water-polo. Percept. Mot. Skills, 86(3, Pt. 1), 899–912.
Koepp, M., Gunn, R., Lawrence, A., Cunningham, V., Dagher, A., Jones, T., et al. (1998). Evidence for striatal dopamine release during a video game. Nature, 393, 266–268.
Kramer, A. F., & Erickson, K. I. (2007). Capitalizing on cortical plasticity: Influence of physical activity on cognition and brain function. Trends Cogn. Sci., 11(8), 342–348.
Lehman, J., Laird, J., & Rosenbloom, P. (1998). A gentle introduction to Soar: An architecture for human cognition. In D. Scarborough & S. Sternberg (Eds.), Methods, models and conceptual issues (2nd ed., Vol. 4, pp. 211–254). Boston: MIT Press.
Li, R., Polat, U., Makous, W., & Bavelier, D. (in press). Enhancing the contrast sensitivity function through action video game training. Nat. Neurosci.
Lied, T. R., & Karzandjian, V. A. (1998). A Hawthorne strategy: Implications for performance measurement and improvement. Clin. Performance Qual. Health Care, 1998(6), 4.
Linkenhoker, B. A., & Knudsen, E. I. (2002). Incremental training increases the plasticity of the auditory space map in adult barn owls. Nature, 419(6904), 293–296.
Liu, Z., & Weinshall, D. (2000). Mechanisms of generalization in perceptual learning. Vis. Res., 40(1), 97–109.
Lum, J., Enns, J., & Pratt, J. (2002). Visual orienting in college athletes: Explorations of athlete type and gender. Res. Q. Exerc. Sport, 73(2), 156–167.
Martin, T. A., Keating, J. G., Goodkin, H. P., Bastian, A. J., & Thach, W. T. (1996). Throwing while looking through prisms. II. Specificity and storage of multiple gaze-throw calibrations. Brain, 119(Pt. 4), 1199–1211.
McAuliffe, J. (2004). Differences in attentional set between athletes and nonathletes. J. Gen. Psychol., 131(4), 426–437.
McCutcheon, L. E. (2000). Another failure to generalize the Mozart effect. Psychol. Rep., 87, 325–330.
Mollon, J. D., & Danilova, M. V. (1996). Three remarks on perceptual learning. Spatial Vis., 10(1), 51–58.
Nougier, V., Azemar, G., & Stein, J. (1992). Covert orienting to central visual cues and sport practice relations in the development of visual attention. J. Exp. Child Psychol., 54, 315–333.
Olesen, P., Westerberg, H., & Klingberg, T. (2004). Increased prefrontal and parietal activity after training of working memory. Nat. Neurosci., 7(1), 75–79.
Onnis, L., Monaghan, P., Christiansen, M. H., & Chater, N. (2004). Variability is the spice of learning, and a crucial ingredient for detecting and generalising in nonadjacent dependencies. In Proceedings of the Annual Conference of the Cognitive Science Society. Mahwah, NJ: Erlbaum.
Orbán, G., Fiser, J., Aslin, R., & Lengyel, M. (2008). Bayesian learning of visual chunks by human observers. Proc. Natl. Acad. Sci. USA, 105(7), 2745–2750.
Pashler, H., & Baylis, G. (1991). Procedural learning. 2. Intertrial repetition effects in speeded choice tasks. J. Exp. Psychol. Learn. Mem. Cogn., 17, 33–48.
Ponzi, A. (2008). Dynamical model of salience gated working memory, action selection and reinforcement based on basal ganglia and dopamine feedback. Neural Net., 21(2–3), 322–330.
Proteau, L. (1992). On the specificity of learning and the role of visual information for movement control. In L. Proteau & D. Elliott (Eds.), Vision and motor control (Vol. 85, pp. 67–103). Amsterdam: North Holland.
Rauscher, F. H., Shaw, G. L., & Ky, K. N. (1993). Music and spatial task performance. Nature, 365(6447), 611.
Rauscher, F. H., Shaw, G. L., Levine, L. J., Wright, E. L., Dennis, W. R., & Newcomb, R. L. (1997). Music training causes long-term enhancement of preschool children’s spatial-temporal reasoning. Neurol. Res., 19(1), 2–8.
Redding, G. M., Rossetti, Y., & Wallace, B. (2005). Applications of prism adaptation: A tutorial in theory and method. Neurosci. Biobehav. Rev., 29(3), 431–444.
Redding, G. M., & Wallace, B. (2006). Generalization of prism adaptation. J. Exp. Psychol. Hum. Percept. Perform., 32(4), 1006–1022.
Reeler, P., Newport, E. L., & Aslin, R. N. (2008). The role of distributional information in linguistic categories. Paper presented at the Boston University Conference on Language Development, Boston.
Rieser, J. J., Pick, H. L., Jr., Ashmead, D. H., & Garing, A. E. (1995). Calibration of human locomotion and models of perceptual-motor organization. J. Exp. Psychol. Hum. Percept. Perform., 21(3), 480–497.
Rosser, J. C., Jr., Lynch, P. J., Cuddihy, L., Gentile, D. A., Klonsky, J., & Merrell, R. (2007). The impact of video games on training surgeons in the 21st century. Arch. Surg., 142(2), 181–186.
Saffell, T., & Matthews, N. (2003). Task-specific perceptual learning on speed and direction discrimination. Vis. Res., 43(12), 1365–1374.
Schellenberg, E. G. (2004). Music lessons enhance IQ. Psychol. Sci., 15(8), 511–514.
Schellenberg, E. G. (2006). Exposure to music: The truth about the consequences. In G. McPherson (Ed.), The child as musician: A handbook of musical development. Oxford, UK: Oxford University Press.
11 Profiles of Development and Plasticity in Human Neurocognition
Courtney Stevens and Helen Neville

Courtney Stevens, Willamette University, Salem, Oregon
Helen Neville, Department of Psychology, University of Oregon, Eugene, Oregon

Abstract

We describe changes in neural organization and related aspects of processing after naturally
occurring alterations in auditory, visual, and language experience. The results highlight the
considerable differences in the degree and time periods of neuroplasticity displayed by
different subsystems within vision, hearing, language, and attention. We also describe results
showing the two sides of neuroplasticity, that is, the capability for enhancement and the
vulnerability to deficit. Finally, we describe several intervention studies in which we have
targeted systems that display more neuroplasticity and show significant improvements in
cognitive function and related aspects of brain organization.

Extensive research on animals has elucidated both genetic and environmental factors that
constrain and shape neuroplasticity (Hunt et al., 2005; Garel, Huffman, & Rubenstein, 2003;
Bishop et al., 1999; Bishop, 2003). Such research, together with noninvasive neuroimaging and
genetic sequencing techniques, has guided a burgeoning literature characterizing the nature,
time course, and mechanisms of neuroplasticity in humans (Pascual-Leone, Amedi, Fregni, &
Merabet, 2005; Bavelier & Neville, 2002; Movshon & Blakemore, 1974). Electron microscopic
studies of synapses and neuroimaging studies of metabolism and of gray and white matter
development in the human brain reveal a generally prolonged postnatal development that
nonetheless displays considerable regional variability in time course (Chugani, Phelps, &
Mazziotta, 1987; Huttenlocher & Dabholkar, 1997; Neville, 1998; Webb, Monk, & Nelson, 2001). In
general, development across brain regions follows a hierarchical progression in which primary
sensory areas mature before parietal, prefrontal, and association regions important for
higher-order cognition (Giedd et al., 1999; Gogtay et al., 2004). Within each region there is a
pattern of prominent overproduction of synapses, dendrites, and gray matter that is subsequently
pruned back to about 50% of the maximum value, which is reached at different ages in different
regions. The prolonged developmental time course and considerable pruning of connections are
considered major forces that permit and constrain human neuroplasticity.

Recently an additional factor that appears to be important has been identified. The occurrence
of polymorphisms in some genes is widespread in humans and rhesus monkeys but apparently not in
other primate species. Polymorphisms provide the capability for environmental modification of
the effects of gene expression (gene × environment interactions), and such effects have been
observed in rhesus monkeys and humans (Suomi, 2003, 2004, 2006; Sheese, Voelker, Rothbart, &
Posner, 2007; Bakermans-Kranenberg, Van Ijzendoom, Pijlman, Mesman, & Femmie, 2008).

For several years we have employed psychophysical, electrophysiological (ERP), and magnetic
resonance imaging (MRI) techniques to study the development and plasticity of the human brain.
We have studied deaf and blind individuals, people who learned their first or second spoken or
signed language at different ages, and children of different ages and of different cognitive
capabilities. As detailed in the sections that follow, in each of the brain systems examined in
this research (including those important in vision, audition, language, and attention), we
observe the following characteristics:

• Different brain systems and subsystems and related sensory and cognitive abilities display
different degrees and time periods (“profiles”) of neuroplasticity. These may depend on the
variable time periods of development and redundant connectivity displayed by different brain
regions.
• Neuroplasticity within a system acts as a double-edged sword, conferring the possibility for
either enhancement or deficit.
• Multiple mechanisms both support and constrain modifiability across different brain systems
and subsystems.

In the sections that follow, we describe our research on neuroplasticity within vision,
audition, language, and attention. In each section, we note different profiles of plasticity
observed in the system, situations in which enhancements
A 1981; Hollants-Gilhuijs, Ruijter, & Spekreijse, 1998a, 1998b;
Packer, Hendrickson, & Curcio, 1990). Further, in develop-
mental studies using the color and motion stimuli described
previously and in Armstrong and colleagues (2002), we
observed that while children aged 6–19 years show responses
to color stimuli that are very similar to adults, their ERPs to
the motion stimuli are delayed in latency relative to those
for adults (Coch, Skendzel, Grossi, & Neville, 2005; Mitchell
& Neville, 2004). Together, these anatomical, chemical, and
developmental mechanisms could render the dorsal pathway
more modifiable by experience and more likely to display
either enhanced or deficient processing.
B In addition to enhanced dorsal pathway functioning we
have recently observed that deaf (but not hearing) partici-
pants recruit a large, additional network of supplementary
cortical areas when processing far peripheral relative to
central flickering visual stimuli (Scott, Dow, & Neville, 2003,
see figure 11.2). These include contralateral primary audi-
tory cortex (figure 11.2). Studies of a mouse model of
congenital deafness suggest that altered subcortical-cortical
connectivity could account for such changes (Hunt et al.,
2005). In deaf but not hearing mice the retina projects to
the medial (auditory) geniculate nucleus as well as the lateral
(visual) geniculate nucleus. In our study of deaf humans we
also observe significant increases in anterior, primary visual
cortex and regions associated with multisensory integration
(STS), motion processing (MT/MT+), and attention (posterior
parietal and anterior cingulate regions) (Dow, Scott, Stevens,
& Neville, 2006; Scott et al., 2003; Scott, Dow, Stevens, &
Neville, under review). In a separate study, we used structural
equation modeling to estimate the strength of cortical
connections between early visual areas (V1/V2), area MT/MST,
and part of the posterior parietal cortex (PPC) (Bavelier et al.,
2000). During attention to the center the connectivity was
comparable across groups, but during the attend-periphery
condition the effective connectivity between MT/MST and PPC
was increased in the deaf as compared with the hearing subjects.
The findings of increased activation and effective connectivity
between visual areas and areas important in attention suggest
that the enhanced responsiveness to peripheral motion in deaf
individuals may be in part linked to increases in attention
(see next section for further discussion).

Figure 11.1 Performance on two visual tasks for deaf participants
(gray bars) and dyslexic participants (white bars) relative to matched
control groups. The zero line represents performance of the respective
control groups. (A) On a central visual field contrast sensitivity
task, neither deaf nor dyslexic participants differed from matched
controls. (B) On a peripheral visual field motion detection task, deaf
participants showed enhancements (P < .001) and dyslexic participants
showed deficits (P < .01) relative to matched controls. (Data
from Stevens & Neville, 2006.)

the central visual field are more strongly genetically specified,
whereas connections within the portions of the visual
system that represent the visual periphery contain redundant
connections that can be shaped by experience over a longer
developmental time course (Chalupa & Dreher, 1991). A
molecular difference has also been observed between the two
visual pathways. In cats and monkeys the dorsal pathway
has a greater concentration of the Cat-301 antigen, a
molecule hypothesized to play a role in stabilizing synaptic
connections by means of experience-dependent plasticity
(DeYoe, Hockfield, Garren, & Van Essen, 1990; Hockfield,
1983). Moreover, recent anatomical studies in nonhuman
primates (Falchier, Clavagnier, Barone, & Kennedy, 2002;
Rockland & Ojima, 2003) and neuroimaging studies of
humans (Eckert et al., 2008) report cross-modal connections
between primary auditory cortex and the portion of primary
visual cortex that represents the periphery (anterior calcarine
sulcus). In addition there is considerable, though not
unequivocal, evidence indicating that the dorsal pathway
matures more slowly than the ventral pathway (Hickey,

Audition

To test whether the specificity of plasticity observed in the
visual system generalizes to other sensory systems, we have
conducted studies of the effects of visual deprivation on the
development of the auditory system. Although less is known
about the organization of the auditory system, as in the
visual system there are large (magno) cells in the medial
geniculate nucleus that conduct faster than the smaller
(parvo) cells, and recent evidence suggests that there may be
dorsal and ventral auditory processing streams with different
functional specializations (Rauschecker, 1998). Furthermore,
animal and human studies of blindness have reported
changes in the parietal cortex (i.e., dorsal pathway) as a
result of visual deprivation (Hyvarinen & Linnankoski, 1981;
Pascual-Leone et al., 2005; Weeks et al., 2000).

To determine whether similar patterns of plasticity occur
following auditory and visual deprivation, we developed an
auditory paradigm similar to one of the visual paradigms
employed in our studies of deaf adults. Participants detected
infrequent pitch changes in a series of tones that were
preceded by different interstimulus intervals (Röder, Rösler,
Hennighausen, & Näcker, 1996). Congenitally blind participants
were faster at detecting the target and displayed ERPs
that were less refractory, that is, recovered amplitude faster
than normally sighted participants. These results parallel
those of our study showing faster amplitude recovery of
the visual ERP in deaf than in hearing participants (Neville
et al., 1983) and suggest that rapid auditory and visual
processing may show specific enhancements following
sensory deprivation.

Similar to the two sides of plasticity observed in the
dorsal visual pathway, the refractory period for rapidly presented
acoustic information, which is enhanced in the blind,
shows deficits in many developmental disorders (Bishop &
McArthur, 2004; Tallal & Piercy, 1974; Tallal, 1975, 1976).
In a study of children with specific language impairment
(SLI), we observed that auditory ERPs were smaller (i.e.,
more refractory) than in controls at short interstimulus intervals
(Neville, Coffey, Holcomb, & Tallal, 1993). This finding
suggests that in audition, as in vision, neural subsystems that
display more neuroplasticity show both greater potential for
enhancement and also greater vulnerability to deficit under
other conditions.

The mechanisms that give rise to greater modifiability of
rapid auditory processing are as yet unknown. However,
as mentioned earlier, some changes might be greater for
magnocellular layers of the medial geniculate nucleus. For
example, magno cells in both the lateral and medial geniculate
nucleus are smaller than normal in dyslexia (Galaburda
& Livingstone, 1993; Galaburda, Menard, & Rosen, 1994).
Rapid auditory processing, including the recovery cycles of
neurons, might also engage aspects of attention to a greater
degree than other aspects of auditory processing. In the case
of congenital blindness, changes in auditory processing may
be facilitated by compensatory reorganization. A number of
studies have confirmed that visual areas are functionally
involved in nonvisual tasks in congenitally blind adults
(Cohen, Weeks, Celnik, & Hallett, 1999; Sadato et al., 1996).
More recently, studies have reported highly differentiated
auditory language processing in primary visual cortex in
congenitally blind humans (Burton et al., 2002; Röder, Stock,
Bien, Neville, & Rösler, 2002). Thus aspects of auditory processing
that either depend upon or can recruit multimodal,
attentional, or normally visual regions may show greater
degrees of neuroplasticity. Parallel studies of animals have
revealed information about mechanisms underlying this type
of change. For example, in blind mole rats, normally transient,
weak connections between the ear and primary visual
cortex become stabilized and strong (Bavelier & Neville,
2002; Cooper, Herbin, & Nevo, 1993; Doron & Wollberg,
1994; Heil, Bronchti, Wollberg, & Scheich, 1991).

Language

It is reasonable to hypothesize that the same principles that
characterize neuroplasticity of sensory systems, including
different profiles, degrees, and mechanisms of plasticity,
also characterize language. Here, we focus on the subsystems
of language examined in our studies of neuroplasticity,
including those supporting semantics, syntax, and speech
segmentation.

Several ERP and fMRI studies have described the nonidentical
neural systems that mediate semantic and syntactic
processing. For example, semantic violations in sentences
elicit a bilateral negative potential that is largest around
400 ms following the semantic violation (Kutas & Hillyard,
1980; Neville, Nicol, Barss, Forster, & Garrett, 1991;
Newman, Ullman, Pancheva, Waligura, & Neville, 2007). In
contrast, syntactic violations elicit a biphasic response consisting
of an early, left-lateralized anterior negativity (LAN)
followed by a later, bilateral positivity, peaking over posterior
sites ∼600 ms after the violation (P600; Friederici, 2002;
Neville et al., 1991). The LAN is hypothesized to index more
automatic aspects of the processing of syntactic structure and
the P600 to index later, more controlled processing of syntax
associated with attempts to recover the meaning of syntactically
anomalous sentences. These neurophysiological
markers of language processing show a degree of biological
invariance as they are also observed when deaf and hearing
native signers process American Sign Language (ASL)
(Capek, 2004; Capek et al., under review). While spoken and
signed language processing share a number of modality-independent
neural substrates, there is also specialization
based on language modality. The processing of ASL, for
example, is associated with additional and/or greater recruitment
of right-hemisphere structures, perhaps owing to the
use of spatial location and motion in syntactic processing in
ASL (Capek et al., 2004; Neville et al., 1998). In support of
this hypothesis, we have recently shown that syntactic violations
in ASL elicit a more bilateral anterior negativity
for violations of spatial syntax, whereas a left-lateralized
anterior negativity is observed for other classes of syntactic
violations in ASL (Capek et al., under review).

We conducted a series of ERP studies to develop a neural
index of one aspect of phonological processing: speech
segmentation. By 100 ms after word onset, syllables at the
beginning of a word elicit a larger negativity than acoustically
similar syllables in the middle of the word (Sanders &
Neville, 2003a). This effect has been demonstrated with
natural speech and with synthesized nonsense speech in
which only newly learned lexical information could be used
for segmentation (Sanders, Newport, & Neville, 2002). The
early segmentation ERP effect resembles the effect of temporally
selective attention, which allows for the preferential
processing of information presented at specific time points
in rapidly changing streams, and it has also been shown to
modulate early (100 ms) auditory ERPs (Lange, Rösler, &
Röder, 2003; Lange & Röder, 2005; Sanders & Astheimer,
in press). Thus the neural mechanisms of speech segmentation
may rely on the deployment of temporally selective
attention during speech perception to aid in processing the
most relevant rapid acoustic changes.

To the extent that language is made up of distinct neural
subsystems, it is possible that, as in vision and audition, these
subsystems show different profiles of neuroplasticity. In
support of this hypothesis, behavioral studies of language
proficiency in second-language learners document that phonology
and syntax are particularly vulnerable following
delays in second-language acquisition (Johnson & Newport,
1989). In several studies, we have examined whether delays
in second-language exposure are also associated with differences
in the neural mechanisms underlying these different
language subsystems. In one study, we compared the ERP
responses to semantic and syntactic errors in English among
Chinese/English bilinguals who were first exposed to English
at different ages (Weber-Fox & Neville, 1996). Accuracy in
judging the grammaticality of the different types of syntactic
sentences and their associated ERPs were affected by delays
in second-language exposure as short as 4–6 years. By comparison,
the N400 response and the behavioral accuracy in
detecting semantic anomalies were altered only in subjects
who were exposed to English after 11–13 years of age. In
studies of the effects of delayed second-language acquisition
on indices of speech segmentation, second-language learners
who were exposed to their second language late in life (>14
years) show a delay in the ERP measure of speech segmentation
when processing their second language (Sanders &
Neville, 2003b).

Many deaf children are born to hearing parents and,
because of their limited access to the spoken language that
surrounds them, do not have full access to a first language
until exposed to a signed language, which often occurs very
late in development. Behavioral studies of deaf individuals
with delayed exposure to sign language indicate that with
increasing age of acquisition, proficiency in sign language
decreases (Mayberry & Eichen, 1991; Mayberry, 1993;
in peripheral auditory space, and ERPs revealed a sharper
tuning of early spatial attention mechanisms (the N1 attention
effect) (Röder et al., 1999). In a recent study of adults
blinded later in life, we observed possible limits on the time
periods during which these early mechanisms of attention
are enhanced (Fieger, Röder, Teder-Sälejärvi, Hillyard, &
Neville, 2006). Whereas adults blinded later in life showed
similar behavioral improvements in peripheral auditory
attention, these improvements were mediated by changes in
the tuning of later ERP indices of attention, several hundred
milliseconds after stimulus onset (i.e., P300). There were
no group differences in the early (N1) attention effects. If
the early neural mechanisms of selective attention can be
enhanced after altered experience, it is possible that, as with
other systems that display a high degree of neuroplasticity,
attention may be particularly vulnerable during development.
In line with this hypothesis, recent behavioral studies
suggest that children at risk for school failure, including
those with poor language or reading abilities or from lower
socioeconomic backgrounds, exhibit deficits in aspects of
attention including filtering and noise exclusion (Atkinson,
1991; Cherry, 1981; Farah et al., 2006; Lipina, Martelli,
Vuelta, & Colombo, 2005; Noble, Norman, & Farah, 2005;
Sperling, Lu, Manis, & Seidenberg, 2005; Stevens, Sanders,
Andersson, & Neville, 2006; Ziegler, Pech-Georgel, George,
Alanio, & Lorenzi, 2005). These attentional deficits span
linguistic and nonlinguistic domains within the auditory and
visual modalities, suggesting that the deficits are both domain
general and pansensory. In order to determine whether
these attentional deficits can be traced to the earliest effects
of attention on sensorineural processing, we have recently
used ERPs to examine the neural mechanisms of selective
attention in typically developing, young children and in
groups of children at risk for school failure.

These studies were modeled after those we and others
have used with adults (Hillyard et al., 1973; Neville &
Lawson, 1987a; Röder et al., 1999; Woods, 1990). The task
was designed to be difficult enough to demand focused selective
attention, while keeping the physical stimuli, arousal
levels, and task demands constant. Two different children’s
stories were presented concurrently from speakers to the left
and right of the participant. Participants were asked to
attend to one story and ignore the other. Superimposed on
the stories were probe stimuli to which ERPs were recorded.
Adults tested with this paradigm showed typical N1 attention
effects (Coch, Sanders, & Neville, 2005). Children, who
showed a different ERP morphology to the probe stimuli,
also showed early attentional modulation within the first
100 ms of processing. This attentional modulation was an
amplification of the broad positivity occurring in this time
window. In a later study (Sanders, Stevens, Coch, & Neville,
2006), we found that this attention effect was complete by
200 ms in older children age 6–8 years but prolonged
through 300 ms in children age 3–5. These data suggest that
with sufficient attentional cues, children as young as three
years of age are able to attend selectively to an auditory
stream and that doing so alters neural activity within 100 ms
of processing.

We have employed this paradigm to examine the timing
and mechanisms of selective auditory attention in children
with specific language impairment (SLI) aged six to eight
years and typically developing (TD) control children matched
for age, gender, nonverbal IQ, and socioeconomic status
(SES) (Stevens, Sanders, & Neville, 2006). As shown in figure
11.3A,C, by 100 ms, typically developing children in this
study showed an amplification of the sensorineural response
to attended as compared to unattended stimuli, just as
observed in our larger samples of typically developing children.
In contrast, children with SLI showed no evidence of
sensorineural modulation with attention, despite behavioral
performance indicating that they were performing the task
as directed (figure 11.3B,D). Moreover, the group differences
were specific to signal enhancement (figure 11.4, left).

In a related line of research, we examined the neural
mechanisms of selective attention in children from different
socioeconomic backgrounds. Previous behavioral studies
indicated that children from lower socioeconomic backgrounds
experience difficulty with selective attention, particularly
in tasks of executive function and tasks that require
filtering irrelevant information or suppressing prepotent
responses (Farah et al., 2006; Lupien, King, Meaney, &
McEwen, 2001; Mezzacappa, 2004; Noble et al., 2005;
Noble, McCandliss, & Farah, 2007). Using the same
selective auditory attention ERP task described earlier, we
observed differences in the neural mechanisms of selective
attention in children from different socioeconomic backgrounds
(Stevens, Lauinger, & Neville, in press). Specifically,
children whose mothers had lower levels of education
(no college experience) showed reduced effects of selective
attention on neural processing compared to children whose
mothers had higher levels of education (at least some college)
(figure 11.5). These differences were related specifically to a
reduced ability to filter irrelevant information (i.e., to suppress
the response to ignored sounds) (figure 11.4, right) and
could not be accounted for by differences in receptive language
skill. Thus the mechanism implicated in attention
deficits in children from lower socioeconomic backgrounds
(i.e., distractor suppression) was not the same as the mechanism
implicated in children with SLI, who showed a deficit
in signal enhancement of stimuli in the attended channel
(Stevens, Sanders, & Neville, 2006). Similar results have
been reported by other research groups (D’Angiulli,
Herdman, Stapells, & Hertzman, 2008). Taken together,
these studies point to the two sides of the plasticity of early
mechanisms of attention, which show both enhancements
and vulnerabilities in different populations.
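
The attention effect reported in these studies is summarized as the difference between responses to attended and unattended probes, taken as mean ERP amplitude in the 100 to 200 ms window, with group values shown with the standard error of the mean. The sketch below illustrates that arithmetic only. It is not the authors' analysis pipeline: the function names, the sampling grid, and the amplitude values are hypothetical, and it assumes each participant contributes one averaged waveform per condition.

```python
from math import sqrt
from statistics import mean, stdev


def window_mean(waveform, times_ms, start_ms=100, end_ms=200):
    """Mean amplitude of one averaged waveform for samples in [start_ms, end_ms)."""
    samples = [v for v, t in zip(waveform, times_ms) if start_ms <= t < end_ms]
    if not samples:
        raise ValueError("no samples fall inside the window")
    return mean(samples)


def attention_effect(attended, unattended, times_ms):
    """Attended-minus-unattended mean amplitude for one participant."""
    return window_mean(attended, times_ms) - window_mean(unattended, times_ms)


def group_summary(effects):
    """Group mean and standard error of the mean across participants."""
    sem = stdev(effects) / sqrt(len(effects)) if len(effects) > 1 else 0.0
    return mean(effects), sem


# Hypothetical data: two participants, one sample every 50 ms from 0 to 250 ms.
times = [0, 50, 100, 150, 200, 250]
attended = [[0.0, 0.5, 2.0, 3.0, 1.0, 0.2], [0.0, 0.4, 1.0, 2.0, 0.8, 0.1]]
unattended = [[0.0, 0.5, 1.0, 2.0, 0.9, 0.2], [0.0, 0.3, 1.0, 1.0, 0.7, 0.1]]

effects = [attention_effect(a, u, times) for a, u in zip(attended, unattended)]
group_mean, group_sem = group_summary(effects)
print(effects, group_mean, group_sem)
```

Keeping the two window means separate, rather than storing only their difference, matters for the comparison drawn in the text: a reduced attention effect can reflect a smaller response to attended probes (signal enhancement) or a larger response to unattended probes (distractor suppression), and the difference score alone cannot distinguish the two.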
Figure 11.3 Grand average event-related potentials (ERPs)
for attended and unattended stimuli (A) in typically developing
children (P = .001) and (B) in children with specific language
impairment (P > 0.4). Voltage map of the attention effect
(Attended-Unattended) shows (C) in typically developing children a large,
broadly distributed effect and (D) in children with specific language
impairment no modulation with attention. (Data from Stevens,
Sanders, & Neville, 2006. Image reproduced with permission from
Brain Research.)

Figure 11.4 Mean amplitude of the ERP from 100 to 200 ms of
responses to unattended and attended probes. Error bars represent
standard error of the mean. Left panel shows data from typically
developing children (TD) and children with specific language
impairment (SLI). The two groups did not differ in the magnitude
of response to unattended stimuli. However, typically developing
children showed a larger amplitude response (i.e., better signal
enhancement) than children with SLI to attended stimuli. Right
panel shows data from children from higher versus lower socioeconomic
backgrounds. Children from different socioeconomic backgrounds
did not differ in the magnitude of response to attended
stimuli. However, children from lower socioeconomic backgrounds
showed a larger response (i.e., poorer filtering) to unattended stimuli
compared to children from higher socioeconomic backgrounds.
(Data from Stevens, Sanders, & Neville, 2006, Brain Research, and
Stevens, Lauinger, & Neville, in press, Developmental Science.)
Several mechanisms might underlie the plasticity of
attention. Whereas the research described previously
focused on sustained, selective attention, research in cognitive
science and cognitive neuroscience has also identified
several different subsystems, or components of attention
(Coull, Frith, Frackowiak, & Grasby, 1996; Raz & Buhle,
2006; Shipp, 2004). These components of attention depend
upon different neural substrates and neurotransmitters
(Bush, Luu, & Posner, 2000; Gomes, Molholm, Christodoulou,
Ritter, & Cowan, 2000; Posner & Petersen, 1990) and
mature along different timetables (Andersson & Hugdahl,
1987; Doyle, 1973; Geffen & Wale, 1979; Hiscock &
Kinsbourne, 1980; Pearson & Lane, 1991; Rueda, Fan,
et al., 2004; Rueda, Posner, Rothbart, & Davis-Stober,
2004; Schul, Townsend, & Stiles, 2003). Sustained, selective
attention shows a particularly long time course of development.
The abilities both to selectively attend to relevant
stimuli and to successfully ignore irrelevant stimuli improve
progressively with increasing age across childhood (Cherry,
1981; Geffen & Sexton, 1978; Geffen & Wale, 1979; Hiscock
& Kinsbourne, 1980; Lane & Pearson, 1982; Maccoby &
Konrad, 1966; Sexton & Geffen, 1979; Zukier & Hagen,
1978). Further, there is some evidence that background
noise creates greater interference effects for younger
children than for adolescents or adults (Elliott, 1979;
Ridderinkhof & van der Stelt, 2000). In a review of both
behavioral and ERP studies of the development of selective
attention, Ridderinkhof and van der Stelt (2000) proposed
that the abilities to select among competing stimuli and to
preferentially process more relevant information are essentially
available in very young children, but that the speed
and efficiency of these behaviors and the systems contributing
to these abilities improve as children develop. Additionally,
since the key sources of selective attention within the
parietal and frontal lobes constitute parts of the dorsal
pathway, similar chemical and anatomical factors noted in
the section on vision may contribute to the plasticity of attention
in a similar way. In addition, recent evidence suggests
that there are considerable genetic effects on attention
(Bell et al., 2008; Fan, Fossella, Sommer, Wu, & Posner,
2003; Posner, Rothbart, & Sheese, 2007; Rueda, Rothbart,
McCandliss, Saccamanno, & Posner, 2005) and that these
may also be modified by environmental input epigenetically
(Bakermans-Kranenberg et al., 2008; Sheese et al., 2007;
unpublished observations from our lab).

Figure 11.5 Grand average evoked potentials for attended and
unattended stimuli in children from higher socioeconomic backgrounds
(upper panel) and lower socioeconomic backgrounds
(lower panel). The effect of attention on sensorineural processing
was significantly larger in children from higher socioeconomic
backgrounds (P = .001). (Data from Stevens, Lauinger, & Neville,
in press, Developmental Science.)

Interventions

As described in the preceding section, selective attention
influences early sensory processing across a number of
domains. In our most recent research, we have been investigating
the possibility that attention itself might be trainable,
and that this training can impact processing in a number of
different domains. Indeed, in his seminal work Principles of
Psychology, William James raised the idea of attention training
for children, proposing that this would be “the education par
excellence” (James, 1890, italics in original). While James went
on to say that such an education is difficult to define and
bring about, attention training has recently been implemented
in curricula for preschool and school-age children
(Bodrova & Leong, 2007; Chenault, Thomson, Abbott, &
Berninger, 2006; Diamond, Barnett, Thomas, & Munro,
2007; Rueda et al., 2005). These programs are associated
with improvements in behavioral and neurophysiological
indices of attention, as well as in measures of academic
outcomes and nonverbal intelligence. Furthermore, one
program showed that attention training translated to
increased benefits of a subsequent remedial writing intervention
for adolescents with dyslexia (Chenault et al., 2006).

Recent proposals suggest that some interventions designed
to improve language skills might also target or train selective
attention (Gillam, 1999; Gillam, Loeb, & Friel-Patti, 2001;
Gillam, Crofford, Gale, & Hoffman, 2001; Hari & Renvall,
2001). We have tested this hypothesis in a series of intervention
studies. In this research, we have documented changes
in the neural mechanisms of selective attention following
Figure 11.6 ERP responses to attended and ignored auditory
stimuli in typically developing (TD) children and children with
specific language impairment (SLI) before and after six weeks of
daily, 100-minute computerized language training. Grand average
evoked potentials for attended and unattended stimuli are collapsed
across linguistic and nonlinguistic probes. Voltage maps show magnitude
and distribution of the attention effect (attended-unattended)
during the 100–200-ms time window. Following training, both
children with SLI (P < .05) and typically developing children (P < .1)
showed evidence of increased effects of attention on sensorineural
processing. These changes were larger than those made in a
no-treatment control group (P < .01), who showed no change in
the effects of attention on sensorineural processing when retested
after a comparable time period (P = .96).
Figure 11.7 Grand average ERP waveforms from the selective
auditory attention paradigm show the effects of attention on sensorineural
processing in kindergarten children of diverse early
reading ability across the first semester of kindergarten. Top row
shows data from pretest, and bottom row shows data from posttest
for five-year-old kindergarten children on track (OT) in early
literacy skills or at risk (AR) for reading difficulty. The OT group
received eight weeks of kindergarten between pretest and posttest.
The AR group received eight weeks of kindergarten with 45
minutes of daily, supplemental instruction with the Early Reading
Intervention (ERI). Voltage map indicates the magnitude and distribution
of the attention effect (Attended-Unattended). Changes
in the effects of attention differed from pretest to posttest in the two
groups (P < .05), with the OT group showing no change (P = .92)
and the AR group showing a significant increase in the attention
effect (P < .01). At pretest, the OT group tended to have a larger
attention effect than the AR group (P = .06). At posttest, the AR
group had a nonsignificantly larger attention effect than the OT
group (P = .17). (See color plate 12.)

Figure 11.8 Functional MRI activations for letter > false font
while performing a 1-back task in adults and kindergarten children
of diverse reading ability across the first semester of formal reading
instruction. (A) Adults performing the task displayed activation in
classic left temporoparietal regions. (B) In contrast, at the beginning
of kindergarten, children on track in early literacy skills (upper
panel) showed bilateral temporoparietal activation, and children at
risk for reading difficulty (lower panel) showed no regions of greater
activation. (C) Following one semester of kindergarten and, for
children in the at-risk group, daily supplemental instruction with
the Early Reading Intervention, on-track children showed left-lateralized
activation in temporoparietal regions, and at-risk children
showed bilateral temporoparietal activation and large activation of
frontal regions, including the ACC. The left hemisphere is displayed
on the left. In the upper left corner are example stimuli.
(See color plate 13.)
Andersson, B., & Hugdahl, K. (1987). Effects of sex, age, and
forced attention on dichotic listening in children: A longitudinal
study. Dev. Neuropsychol., 3(3–4), 191–206.
Armstrong, B., Hillyard, S. A., Neville, H. J., & Mitchell,
T. V. (2002). Auditory deprivation affects processing of motion,
but not color. Cogn. Brain Res., 14(3), 422–434.
Atkinson, J. (1991). Review of human visual development: Crowding
and dyslexia. In J. Cronly-Dillon & J. Stein (Eds.), Vision and
visual dysfunction (Vol. 13, pp. 44–57). London: Nature Publishing
Group.
Atkinson, J. (1992). Early visual development: Differential
functioning of parvocellular and magnocellular pathways. Eye,
6, 129–135.
Atkinson, J., King, J., Braddick, O., Nokes, L., Anker, S., &
Braddick, F. (1997). A specific deficit of dorsal stream function
in Williams’ syndrome. Neuroreport, 8(8), 1919–1922.
Baizer, J. S., Ungerleider, L. G., & Desimone, R.
(1991). Organization of visual inputs to the inferior temporal
and posterior parietal cortex in macaques. J. Neurosci., 11,
168–190.
Bakermans-Kranenberg, M., & Van Ijzendoom, M. H.
(2006). Gene-environment interaction of the dopamine D4
receptor (DRD4) and observed maternal insensitivity predicting
externalizing behavior in preschoolers. Dev. Psychobiol., 48,
406–409.
Bakermans-Kranenberg, M., Van Ijzendoom, M. H., Pijlman,
F. T. A., Mesman, J., & Femmie, J. (2008). Experimental
evidence for differential susceptibility: Dopamine D4 receptor
polymorphism (DRD4 VNTR) moderates intervention effects
on toddlers’ externalizing behavior in a randomized controlled
trial. Dev. Psychol., 44, 293–300.
Bavelier, D., Brozinsky, C., Tomann, A., Mitchell, T., Neville,
H., & Liu, G. (2001). Impact of early deafness and early exposure
to sign language on the cerebral organization for motion processing.
J. Neurosci., 21(22), 8931–8942.
Bavelier, D., & Neville, H. J. (2002). Cross-modal plasticity:
Where and how? Nat. Rev. Neurosci., 3, 443–452.
Bavelier, D., Tomann, A., Hutton, C., Mitchell, T., Liu, G.,
Corina, D., et al. (2000). Visual attention to the periphery is
enhanced in congenitally deaf individuals. J. Neurosci., 20(17),
1–6.
Bell, T., Batterink, L., Currin, L., Pakulak, E., Stevens, C.,
& Neville, H. (2008). Genetic influences on selective auditory
attention as indexed by ERPs. Poster presented at the Cognitive
Neuroscience Society, San Francisco.
Bishop, D. (2003). Genetic and environmental risks for specific
language impairment in children. Int. J. Pediatr. Otorhinolaryngol.,
67, S143–157.
Bishop, D., Bishop, S. J., Bright, P., James, C., Delaney, T., &
Tallal, P. (1999). Different origin of auditory and phonological
processing problems in children with language impairment:
Evidence from a twin study. J. Speech Lang. Hear. Res., 42,
155–168.
Bishop, D., & McArthur, G. M. (2004). Immature cortical
responses to auditory stimuli in specific language impairment:
Evidence from ERPs to rapid tone sequences. Dev. Sci., 7,
F11–18.
Bodrova, E., & Leong, D. (2007). Tools of the mind: The Vygotskian
approach to early childhood education (2nd ed.). Upper Saddle River,
NJ: Pearson Education.
Burton, H., Synder, A., Conturo, T., Akbudak, E., Ollinger,
J., & Raichle, M. (2002). A fMRI study of verb generation to auditory
nouns in early and late blind. Poster presented at the Society for
Neuroscience, Orlando, FL.
Bush, G., Luu, P., & Posner, M. I. (2000). Cognitive and emotional
influences in anterior cingulate cortex. Trends Cogn. Sci., 4(6),
215–222.
Capek, C. (2004). The cortical organization of spoken and signed sentence
processing in adults. Unpublished doctoral dissertation, University
of Oregon, Eugene.
Capek, C., Bavelier, D., Corina, D., Newman, A. J., Jezzard, P.,
& Neville, H. J. (2004). The cortical organization for audio-visual
sentence comprehension: An fMRI study at 4 Tesla. Cogn.
Brain Res., 20(2), 111–119.
Capek, C., Corina, D., Grossi, G., McBurney, S. L.,
Mitchell, T. V., Neville, H. J., et al. (under review).
Semantic and syntactic processing in American Sign Language:
Electrophysiological evidence.
Capek, C., Corina, D., Grossi, G., McBurney, S. L., Neville, H.
J., Newman, A. J., et al. (in preparation). American Sign Language
sentence processing: ERP evidence from adults with different
ages of acquisition.
Chalupa, L. M., & Dreher, B. (1991). High precision systems
require high precision “blueprints”: A new view regarding
the formation of connections in the mammalian visual system.
J. Cogn. Neurosci., 3(3), 209–219.
Chenault, B., Thomson, J., Abbott, R. D., & Berninger,
V. W. (2006). Effects of prior attention training on child dyslexics’
response to composition instruction. Dev. Neuropsychol., 29(1),
243–260.
Cherry, R. (1981). Development of selective auditory attention
skills in children. Percept. Mot. Skills, 52, 379–385.
Chugani, H. T., Phelps, M. E., & Mazziotta, J. C. (1987). Positron
emission tomography study of human brain functional
development. Ann. Neurol., 22, 487–497.
Coch, D., Sanders, L. D., & Neville, H. J. (2005). An event-related
potential study of selective auditory attention in children
and adults. J. Cogn. Neurosci., 17(4), 605–622.
Coch, D., Skendzel, W., Grossi, G., & Neville, H. (2005).
Motion and color processing in school-age children and adults:
An ERP study. Dev. Sci., 8(4), 372–386.
Cohen, L. G., Weeks, R. A., Celnik, P., & Hallett, M.
(1999). Role of the occipital cortex during Braille reading in
subjects with blindness acquired late in life. J. Neurosci., 45,
451–460.
Cooper, H., Herbin, M., & Nevo, E. (1993). Visual system of a
naturally microphthalmic mammal: The blind mole rat, Spalax
ehrenbergi. J. Comp. Neurol., 328, 313–350.
Corbetta, M., Miezin, F., Dobmeyer, S., Shulman, G., &
Petersen, S. E. (1990). Attentional modulation of neural
processing of shape, color, and velocity in humans. Science, 248,
1556–1559.
Cornelissen, P., Richardson, A., Mason, A., Fowler, S., &
Stein, J. (1995). Contrast sensitivity and coherent motion detection
measured at photopic luminance levels in dyslexics and
controls. Vis. Res., 35(10), 1483–1494.
Coull, J. T., Frith, C. D., Frackowiak, R. S., & Grasby,
P. M. (1996). A fronto-parietal network for rapid visual information
processing: A PET study of sustained attention and working
memory. Neuropsychologia, 34, 1085–1095.
D’Angiulli, A., Herdman, A., Stapells, D., & Hertzman,
C. (2008). Children’s event-related potentials of auditory
selective attention vary with their socioeconomic status.
Neuropsychology, 22, 293–300.
Hiscock, M., & Kinsbourne, M. (1980). Asymmetries of selective listening and attention switching in children. Dev. Psychol., 16(1), 70–82.
Hockfield, S. (1983). A surface antigen expressed by a subset of neurons in the vertebrate central nervous system. Proc. Natl. Acad. Sci. USA, 80(18), 5758–5761.
Hollants-Gilhuijs, M. A. M., Ruijter, J. M., & Spekreijse, H. (1998a). Visual half-field development in children: Detection of colour-contrast-defined forms. Vis. Res., 38(5), 645–649.
Hollants-Gilhuijs, M. A. M., Ruijter, J. M., & Spekreijse, H. (1998b). Visual half-field development in children: Detection of motion-defined forms. Vis. Res., 38(5), 651–657.
Hunt, D. L., King, B., Kahn, D. M., Yamoah, E. N., Shull, G. E., & Krubitzer, L. (2005). Aberrant retinal projections in congenitally deaf mice: How are phenotypic characteristics specified in development and evolution? Anat. Rec. A Discov. Mol. Cell Evol. Biol., 287(1), 1051–1066.
Huttenlocher, P. R., & Dabholkar, A. S. (1997). Regional differences in synaptogenesis in human cerebral cortex. J. Comp. Neurol., 387, 167–178.
Hyvarinen, L., & Linnankoski, I. (1981). Modification of parietal association cortex and functional blindness after binocular deprivation in young monkeys. Exp. Brain Res., 42, 1–8.
James, W. (1890). Principles of psychology. New York: Henry Holt.
Johnson, J., & Newport, E. (1989). Critical period effects in second language learning: The influence of maturational state on the acquisition of English as a second language. Cogn. Psych., 21, 60–99.
Kondo, M., Gray, L., Pelka, G., Christodoulou, J., Tam, P., & Hannan, A. (2008). Environmental enrichment ameliorates a motor coordination deficit in a mouse model of Rett syndrome – Mecp2 gene dosage effects and BDNF expression. Eur. J. Neurosci., 27, 3342–3350.
Kutas, M., & Hillyard, S. A. (1980). Reading senseless sentences: Brain potentials reflect semantic incongruity. Science, 207, 203–204.
Lane, D., & Pearson, D. (1982). The development of selective attention. Merrill-Palmer Q., 28(3), 317–337.
Lange, K., & Röder, B. (2005). Orienting attention to points in time improves stimulus processing both within and across modalities. J. Cogn. Neurosci., 18(5), 715–729.
Lange, K., Rösler, F., & Röder, B. (2003). Early processing stages are modulated when auditory stimuli are presented at an attended moment in time: An event-related potential study. Psychophysiology, 40, 806–817.
Lipina, S., Martelli, M., Vuelta, B., & Colombo, J. (2005). Performance on the A-not-B task of Argentinian infants from unsatisfied and satisfied basic needs homes. Interamerican J. Psychol., 39, 49–60.
Livingstone, M., & Hubel, D. (1988). Segregation of form, color, movement and depth: Anatomy, physiology, and perception. Science, 240, 740–749.
Lovegrove, W., Martin, F., & Slaghuis, W. (1986). A theoretical and experimental case for a visual deficit in specific reading disability. Cogn. Neuropsychol., 3, 225–267.
Luck, S. J., Woodman, G. F., & Vogel, E. K. (2000). Event-related potential studies of attention. Trends Cogn. Sci., 4(11), 432–440.
Lupien, S. J., King, S., Meaney, M. J., & McEwen, B. S. (2001). Can poverty get under your skin? Basal cortisol levels and cognitive function in children from low and high socioeconomic status. Dev. Psychopathol., 13, 653–676.
Maccoby, E., & Konrad, K. (1966). Age trends in selective listening. J. Exp. Child Psychol., 3, 113–122.
Mangun, G., & Hillyard, S. (1990). Electrophysiological studies of visual selective attention in humans. In A. Scheibel & A. Wechsler (Eds.), Neurobiology of higher cognitive function (pp. 271–295). New York: Guilford.
Mayberry, R. (1993). First-language acquisition after childhood differs from second-language acquisition: The case of American Sign Language. J. Speech Hear. Res., 36(6), 1258–1270.
Mayberry, R. (2003). Age constraints on first versus second language acquisition: Evidence for linguistic plasticity and epigenesis. Brain Lang., 87(3), 369–384.
Mayberry, R., & Eichen, E. (1991). The long-lasting advantage of learning sign language in childhood: Another look at the critical period for language acquisition. J. Mem. Lang., 30, 486–512.
Mayberry, R., Lock, E., & Kazmi, H. (2002). Linguistic ability and early language exposure. Nature, 417, 38.
Merigan, W. H. (1989). Chromatic and achromatic vision of macaques: Role of the P pathway. J. Neurosci., 9(3), 776–783.
Merigan, W. H., & Maunsell, J. (1990). Macaque vision after magnocellular lateral geniculate lesions. Visual Neurosci., 5, 347–352.
Mezzacappa, E. (2004). Alerting, orienting, and executive attention: Developmental properties and sociodemographic correlates in epidemiological sample of young, urban children. Child Dev., 75(5), 1373–1386.
Mills, D. L., Coffey-Corina, S. A., & Neville, H. J. (1993). Language acquisition and cerebral specialization in 20-month-old infants. J. Cogn. Neurosci., 5(3), 317–334.
Mills, D. L., Coffey-Corina, S. A., & Neville, H. J. (1997). Language comprehension and cerebral specialization from 13 to 20 months. Dev. Neuropsychol., 13(3), 397–445.
Mitchell, T. V., & Neville, H. J. (2004). Asynchronies in the development of electrophysiological responses to motion and color. J. Cogn. Neurosci., 16(8), 1–12.
Movshon, J. A., & Blakemore, C. (1974). Functional reinnervation in kitten visual cortex. Nature, 251(5474), 504–505.
Neville, H. J. (1998). Human brain development. In M. Posner & L. Ungerleider (Eds.), Fundamental Neuroscience (pp. 1313–1338). New York: Academic Press.
Neville, H. J., Bavelier, D., Corina, D., Rauschecker, J., Karni, A., Lalwani, A., et al. (1998). Cerebral organization for language in deaf and hearing subjects: Biological constraints and effects of experience. Proc. Natl. Acad. Sci. USA, 95(3), 922–929.
Neville, H. J., Coffey, S. A., Holcomb, P. J., & Tallal, P. (1993). The neurobiology of sensory and language processing in language-impaired children. J. Cogn. Neurosci., 5(2), 235–253.
Neville, H. J., & Lawson, D. (1987a). Attention to central and peripheral visual space in a movement detection task: An event-related potential and behavioral study. I. Normal hearing adults. Brain Res., 405, 253–267.
Neville, H. J., & Lawson, D. (1987b). Attention to central and peripheral visual space in a movement detection task: An event-related potential and behavioral study. II. Congenitally deaf adults. Brain Res., 405, 268–283.
Neville, H. J., Nicol, J., Barss, A., Forster, K., & Garrett, M. (1991). Syntactically based sentence processing classes: Evidence from event-related brain potentials. J. Cogn. Neurosci., 3, 155–170.
Neville, H. J., Schmidt, A., & Kutas, M. (1983). Altered visual-evoked potentials in congenitally deaf adults. Brain Res., 266, 127–132.
Shipp, S. (2004). The brain circuitry of attention. Trends Cogn. Sci., 8(5), 223–230.
Simmons, D., Kame’enui, E., Harn, B., Coyne, M., Stoolmiller, M., Edwards, L., et al. (2007). Attributes of effective and economic kindergarten reading intervention: An examination of instructional time and design specificity. J. Learn. Disabil., 40, 331–347.
Simmons, D., Kame’enui, E. J., Stoolmiller, M., Coyne, M. D., & Harn, B. (2003). Accelerating growth and maintaining proficiency: A two-year intervention study of kindergarten and first-grade children at risk for reading difficulties. In B. Foorman (Ed.), Preventing and remediating reading difficulties: Bringing science to scale (pp. 197–228). Timonium, MD: York Press.
Sperling, A., Lu, Z.-l., Manis, F. R., & Seidenberg, M. S. (2003). Selective magnocellular deficits in dyslexia: A “phantom contour” study. Neuropsychologia, 41(10), 1422–1429.
Sperling, A., Lu, Z., Manis, F. R., & Seidenberg, M. S. (2005). Deficits in perceptual noise exclusion in developmental dyslexia. Nat. Neurosci., 8, 862–863.
Stevens, C., Fanning, J., Coch, D., Sanders, L., & Neville, H. (2008). Neural mechanisms of selective auditory attention are enhanced by computerized training: Electrophysiological evidence from language-impaired and typically developing children. Brain Res., 1205, 55–69.
Stevens, C., Currin, J., Paulsen, D., Harn, B., Chard, D., Larsen, D., et al. (2008). Kindergarten children at-risk for reading failure: Electrophysiological measures of selective auditory attention before and after the early reading intervention. Poster presented at the Cognitive Neuroscience Society, San Francisco.
Stevens, C., Harn, H., Chard, D., Currin, J., Parisi, D., & Neville, H. (in press). Examining the role of attention and instruction in at-risk kindergarteners: Electrophysiological measures of selective auditory attention before and after an early literacy intervention. J. Learn. Disabil.
Stevens, C., Lauinger, B., & Neville, H. (in press). Differences in the neural mechanisms of selective attention in children from different socioeconomic backgrounds: An event-related brain potential study. Dev. Sci.
Stevens, C., & Neville, H. (2006). Neuroplasticity as a double-edged sword: Deaf enhancements and dyslexic deficits in motion processing. J. Cogn. Neurosci., 18(5), 701–704.
Stevens, C., Sanders, L., Andersson, A., & Neville, H. (2006). Vulnerability and plasticity of selective auditory attention in children: Evidence from language-impaired and second-language learners. Poster presented at the Cognitive Neuroscience Society, San Francisco.
Stevens, C., Sanders, L., & Neville, H. (2006). Neurophysiological evidence for selective auditory attention deficits in children with specific language impairment. Brain Res., 1111, 143–152.
Suomi, S. (2003). Gene-environment interactions and the neurobiology of social conflict. Ann. NY Acad. Sci., 1008, 132–139.
Suomi, S. (2004). How gene-environment interactions can influence emotional development in rhesus monkeys. In C. Garcia-Coll, E. L. Bearer, & R. M. Lerner (Eds.), Nature and nurture: The complex interplay of genetic and environmental influences on human behavior and development (pp. 35–51). Mahwah, NJ: Lawrence Erlbaum.
Suomi, S. (2006). Risk, resilience, and gene × environment interactions in rhesus monkeys. Ann. NY Acad. Sci., 1094, 52–62.
Talcott, J. B., Hansen, P. C., Assoku, E. L., & Stein, J. F. (2000). Visual motion sensitivity in dyslexia: Evidence for temporal and energy integration deficits. Neuropsychologia, 38(7), 935–943.
Tallal, P. (1975). Perceptual and linguistic factors in the language impairment of developmental dysphasics: An experimental investigation with the Token Test. Cortex, 11, 196–205.
Tallal, P. (1976). Rapid auditory processing in normal and disordered language development. J. Speech Hear. Res., 19, 561–571.
Tallal, P., & Piercy, M. (1974). Developmental aphasia: Rate of auditory processing and selective impairment of consonant perception. Neuropsychologia, 12, 83–93.
Ungerleider, L., & Haxby, J. V. (1994). “What” and “where” in the human brain. Curr. Opin. Neurobiol., 4, 157–165.
Ungerleider, L., & Mishkin, M. (1982). Two cortical visual systems. In D. J. Ingle, M. A. Goodale, & R. J. Mansfield (Eds.), Analysis of visual behavior (pp. 549–586). Cambridge, MA: MIT Press.
Webb, S. J., Monk, C. S., & Nelson, C. A. (2001). Mechanisms of postnatal neurobiological development: Implications for human development. Dev. Neuropsychol., 19(2), 147–171.
Weber-Fox, C. M., & Neville, H. J. (1996). Neural systems for language processing: Effects of delays in second language exposure. Brain Cogn., 30, 264–265.
Weeks, R., Horwitz, B., Aziz-Sultan, A., Tian, B., Wessinger, C. M., Cohen, L. G., et al. (2000). A positron emission tomographic study of auditory localization in the congenitally blind. J. Neurosci., 20(7), 2664–2672.
Woods, D. (1990). The physiological basis of selective attention: Implications of event-related potential studies. In J. Rohrbaugh, R. Parasuraman, & R. Johnson (Eds.), Event-related brain potentials: Issues and interdisciplinary vantages. New York: Oxford Press.
Yamada, Y., Stevens, C., & Neville, H. (under review). Developmental changes in cortical activations during visual letter processing in kindergarteners: An fMRI study.
Yamada, Y., Stevens, C., Sabourin, L., Klein, S. A., Dow, M., Paulsen, D., et al. (2008). Changes in cortical activations during visual letter processing across the kindergarten year: A longitudinal fMRI study. Poster presented at the Cognitive Neuroscience Society, San Francisco.
Zeki, S., Watson, J. D. G., Lueck, C. J., Friston, K. J., Kennard, C., & Frackowiak, R. S. J. (1991). A direct demonstration of functional specialization in human visual cortex. J. Neurosci., 11(3), 641–649.
Ziegler, J. C., Pech-Georgel, C., George, F., Alanio, F. X., & Lorenzi, C. (2005). Deficits in speech perception predict language learning impairment. Proc. Natl. Acad. Sci. USA, 102(39), 14110–14115.
Zukier, H., & Hagen, J. W. (1978). The development of selective attention under distracting conditions. Child Dev., 49, 870–873.
13 Kastner, McMains, and Beck
14 Corbetta, Sylvester, and Shulman
17 Karnath
18 Robertson
19 Maunsell
Finally, the chapters in this section describe new insights into the allocation of attention to nonspatial features, such as color and direction of motion. Recent studies have shown that attention can be directed to specific feature values across the visual field, and not just at attended locations. Indeed, the operation of feature-based attention sometimes precedes and guides the allocation of space-based attention. We expect that the next five years will lead to new insights into how these different varieties of attention work together in the service of perception and behavior.
Anne Treisman, Psychology Department, Princeton University, Princeton, New Jersey

Abstract This chapter reviews research on attention using behavioral and psychological methods. It attempts to illustrate what was learned through these tools alone and what is gained when tools from cognitive neuroscience are added. The psychological approaches defined many of the theoretical issues, such as the nature of the overloads that make attention necessary, the level of selection, the method of selection (enhancement of attended stimuli or suppression of unattended ones), the targets of selection (locations, objects, or attributes), the ways in which attention is controlled, and the role of attention in solving the feature-binding problem. Psychology also developed many of the paradigms used to probe the underlying mechanisms that are now being confirmed by converging evidence from brain imaging and from studies of brain-damaged patients. Theories of attention have evolved from the early sequential “pipeline” model of processing to a more flexible and interactive model with parallel streams specializing in different forms of perceptual analysis, iterative cycles of processing, and reentry to earlier levels. Attentional selection takes many forms and applies at many levels. We learn as much from exploring the constraints on flexibility (what cannot be done) as from discovering what can.

While watching a movie of a basketball game and counting the passes made by one of the teams, participants completely miss seeing a large black gorilla walk through the game, even though it is clearly visible if attended to (Simons & Chabris, 1999; see also Neisser & Becklen, 1975). Why should we still be interested in purely psychological studies like this one? If, as Minsky said, the mind is what the brain does, then that is what, as psychologists, we are interested in, and it would be foolish not to use the tools from neuroscience. However, brain-imaging data and findings with neurological patients depend critically for their interpretation on the designs of the behavioral tasks being performed. We can directly observe actions, or we can measure brain activation, but by putting them together we further constrain the possible theories. This chapter is intended to set the scene, both historically and conceptually, for the subsequent chapters exploring neuroscientific approaches to attention in more detail.

The traditional questions in attention research mostly started as di- or trichotomies: “Is it x, or y, or z?” For example, does attention select the relevant stimuli early or late in perceptual processing? The answer typically proves to be “both or all the above.” Simplistic questions have evolved into attempts to specify when each answer applies and why. I select eight such issues to discuss here, using mostly psychological methods and bringing in neural evidence where it can decide questions that otherwise could not be answered. I also introduce many experimental paradigms that have been used to study attention. The goal has been to “bottle” the wide range of everyday phenomena encompassed by the label “attention” and bring them into the controlled conditions necessary for scientific study.

Why is attention limited?

As the gorilla example that opened this chapter suggests, attention seems to be severely limited. Other examples abound, as we show in later sections. We typically see only four items in a brief visual flash (Woodworth, 1938). We can follow the content of only one auditory message at a time (Broadbent, 1958). We can track only four moving circles among other identical circles moving in random directions (Pylyshyn & Storm, 1988). Why do these limits arise? There are three general ideas about their nature, and all could play a part.

Structural Interference One hypothesis is that limits arise only when two concurrent tasks use the same specialized subsystems. Proponents compared the interference between tasks that seemed likely to share common mechanisms and tasks that did not: for example, both were speech shadowing or one was piano playing (Allport, Antonis, & Reynolds, 1972), both were visual or one was auditory (Treisman & Davies, 1973), both used verbal rehearsal or one used imagery (Brooks, 1968). The results clearly showed more interference between tasks that were more similar. When they were sufficiently different, they were sometimes combined without impairment.

Different attributes like color, motion, and shape are processed by at least partially separate systems (e.g., Corbetta, Miezin, Dobmeyer, Shulman, & Petersen, 1991). Thus the structural interference view predicts little difficulty
Certain stimuli may have privileged access to attention, bypassing the structural limits: In a dichotic listening task, participants’ own names sometimes broke through from the unattended message (Moray, 1959). Emotional stimuli, like a snake (Ohman, Flykt, & Esteves, 2001) or an angry face, are more likely to be seen than neutral stimuli (Eastwood, Smilek, & Merikle, 2001; for a review see Vuilleumier, 2005). The advantage extends also to guns and other nonevolutionary stimuli, suggesting that attention can be drawn to learned categories of fearful stimuli (Blanchette, 2006). Emotional stimuli may directly activate a separate pathway to the ventral prefrontal cortex and the amygdala (e.g., Yamasaki, LaBar, & McCarthy, 2002), although, contrary to some prior claims, some attentional resources are still needed for their detection (Pessoa, Kastner, & Ungerleider, 2002).

General Resources There are also more general limits to attention. Kahneman (1973) argued for a limited pool of resources or “effort.” He showed that a secondary task (monitoring a stream of visual letters) was impaired when combined with a primary task of adding one to each of a string of auditory digits, although these two tasks are unlikely to share the same brain systems. Kahneman used the size of the pupil as an online index of effort, having previously shown that it correlates closely with difficulty across a wide range of tasks. Interference with visual letter detection was maximal when the memory load was highest and effort, as indexed by the pupil, was at a peak (see figure 12.1).

[Figure 12.1 plot: percent failure against time (sec), across the Listen and Report phases.]

Figure 12.1 A measure of perceptual deficit and the pupillary response to a digit-transformation task. Black symbols show percent missed letters in a rapid visual sequence while participants listened to four digits, adding one to each and reporting the results (at rate of 1 per second). Errors increase with each extra digit at intake; they are highest at the time when participants are doing the mental addition; and they decrease as each transformed digit is reported. Open symbols show the size of the pupil, reflecting the amount of effort or resources being used. Note that the pupil index has about a 2-second lag behind the mental processing. (Modified with permission from Kahneman, 1973.)

Purely psychological studies are handicapped in determining what information is extracted from unattended messages by the fact that observable responses are needed as evidence, so that limits could arise from our inability to carry out simultaneous actions or to remember stimuli that we did in fact observe. Brain imaging allows us to monitor the incidental processing of unattended stimuli as they are presented, and it may give us more sensitive indications of where the limits arise. Results have cast additional doubt on claims that only structural interference matters. Rees, Frith, and Lavie (1997) observed fMRI activity in area MT produced by irrelevant moving dots surrounding a central, task-relevant word. Two tasks differing in difficulty were used to assess the effects of load in processing the central word: case discrimination (easy) or detecting a bisyllabic word (more difficult). Discriminating phonology is unlikely to involve area MT, yet fMRI activation to the irrelevant dots was reduced during the more difficult word task. Thus attention limits appear between two very different forms of visual perception. However, when auditory words replaced the visual ones, task difficulty in the auditory word task had no impact on visual activation of area MT. Psychological tests provided converging evidence: With visual words, the difficult task reduced the motion aftereffect generated by the irrelevant dots, whereas with auditory words it was unaffected. Resources are at least partly shared across very different tasks within vision, suggesting resource limits rather than structural interference within the visual modality, but may be separate across different modalities. An interesting exception concerns spatial coding, where a shared representation of space may create some overlap in resources (e.g., Spence & Driver, 1996).
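The logic of the Rees, Frith, and Lavie (1997) comparison can be laid out as a small truth table. The sketch below is only an illustration of the argument as summarized above, not a model from the original study: the three hypothesis functions and their names are assumptions introduced here, and each simply states whether raising the difficulty of the word task should reduce the MT response to the irrelevant moving dots.

```python
# Pattern reported in the text: does a harder word task reduce the
# MT response to irrelevant motion? (keyed by modality of the words)
OBSERVED = {"visual": True, "auditory": False}


def structural_interference(word_modality: str) -> bool:
    """Limits arise only when tasks share a specialized subsystem.
    The word tasks are assumed not to use area MT, so load never matters."""
    return False


def single_general_pool(word_modality: str) -> bool:
    """One undifferentiated pool of resources: load always drains MT."""
    return True


def modality_specific_pools(word_modality: str) -> bool:
    """Resources shared within vision but separate across modalities."""
    return word_modality == "visual"


# Hypothetical labels for the three candidate accounts.
HYPOTHESES = {
    "structural_interference": structural_interference,
    "single_general_pool": single_general_pool,
    "modality_specific_pools": modality_specific_pools,
}


def consistent_hypotheses(observed: dict) -> list:
    """Return the names of hypotheses whose predictions match every observation."""
    return [
        name
        for name, predict in HYPOTHESES.items()
        if all(predict(modality) == reduced for modality, reduced in observed.items())
    ]


print(consistent_hypotheses(OBSERVED))
```

Under these simplified encodings only the modality-specific account matches both observations, which mirrors the conclusion that resources are shared within vision but may be separate across modalities.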
Behavioral Coherence The premotor theory of attention suggests that attention is simply a preparation for response, selecting the goal of an intended action (Rizzolatti, Riggio, Dascola, & Umilta, 1987). Attention is facilitated when the actions afforded by the stimulus are compatible with the response required (e.g., Craighero, Fadiga, Rizzolatti, & Umilta, 1999; Tucker & Ellis, 1998). Hand location can facilitate detection of targets near the hand (Reed, Grubb, & Steele, 2006). Spatial attention normally follows a saccadic eye movement and can be triggered by subliminal stimulation of neurons that control that saccade (Moore & Armstrong, 2003). Attentional systems differ for space that is within reach and space that is beyond it (Ladavas, 2002). Actions may themselves affect the way attention is deployed. If a salient target like a unique color need only be detected, it “pops out” of the display; but if the goal is to touch the object, focused attention is required (Song & Nakayama, 2006). Although motor performance clearly interacts with attention, it seems unlikely that intended actions are the only limits to attention. Even when we passively watch stimuli go by (e.g., at the movies), we do select a subset of the information that reaches the senses.

two separately located colored ovals increased the fMRI activation produced by either a house or a face stimulus that shared the same location.

A number of findings also favor object selection. When overlapping objects share the same general location, like the basketball game and the gorilla described earlier, attention can still be very efficient. Selection may be guided by properties of the attended object, perhaps a color or a range of spatial frequencies, or simply by the collinearity and spatial continuity of its contours. Attention spreads more easily within than between objects (Duncan, 1984; Egly, Driver, & Rafal, 1994; Tipper & Behrmann, 1996; see figure 12.2). Patients with neglect due to parietal damage who are oblivious to the left side of space often also neglect the left side of objects (e.g., Halligan & Marshall, 1994). A dramatic
and on decision criteria. Shaw (1984) found that in detection of luminance increments, attention load affected only the criterion, but in a letter localization task it also affected d′. However, Hawkins and colleagues (1990) found that spatial cuing affected both d′ and criterion in a luminance detection task.

A related question, best answered with neural measures, is whether attention produces a multiplicative effect on the signal (gain control) or simply changes the baseline activity on which the signal is superimposed. Hillyard, Vogel, and Luck (1998) concluded that the gain control model fits best in early spatial selection. But there is also evidence in some conditions for changes in baseline activity (Kastner, Pinsk, De Weerd, Desimone, & Ungerleider, 1999; Chawla, Rees, & Friston, 1999).

Inhibition Whether inhibition is invoked in an attention task may depend on how distracting the irrelevant stimuli would otherwise be. More active suppression is needed when the target and distractors are superimposed rather than spatially separated. Participants name one of two superimposed pictures more slowly when the currently relevant picture was the irrelevant one on the previous trial, suggesting that it was inhibited when it was irrelevant and the inhibition then had to be removed (Tipper, 1985). Thus this negative priming paradigm may show aftereffects of inhibition. A reduction in Stroop interference demonstrates the online inhibition of an irrelevant object, not just the aftereffects of inattention (Wühr & Frings, 2008; see figure 12.3).

Inhibition may also be used to prevent rechecking the same locations or stimuli that have already proved fruitless. Thus responses are slower at locations that have previously received attention, an effect known as inhibition of return (Posner & Snyder, 1975). Probe items in search get slightly slower responses when they appear in locations previously occupied by nontarget items (Cepeda, Cave, Bichot, & Kim, 1998; Klein & MacInnes, 1999). In the Marking Paradigm (Watson & Humphreys, 1997), a subset of distractors is shown in advance of the full search display. This procedure eliminates their contribution to search latencies, producing efficient feature search of just the items that appear in the final display instead of what would otherwise be a slower search for a conjunction target.

Functional MRI measures offer more direct evidence of inhibition, suggesting that attention to a stimulus at the fovea strongly suppresses baseline activity in brain areas responding to other spatial locations (Smith, Singh, & Greenlee, 2000). Event-related potential (ERP) differences between attended and unattended items show both inhibition of irrelevant items (shown in the P1 ERP component) and facilitation of relevant ones (shown in the N1 component) (Luck et al., 1994). When participants must bind features to identify the target, both a P1 and an N1 are shown, whereas when the presence of a color is sufficient, only the N1 (facilitation) effect remains (Luck & Hillyard, 1995). Again, inhibition is used only when distractors would otherwise cause interference.

Changes of Tuning or Selectivity Receptive field sizes can change, shrinking with attention to give finer selectivity (Moran & Desimone, 1985). Selectivity to particular features can also be sharpened. For example, when participants attended to the direction in which objects were rotated, the selectivity of fMRI in area LOC to orientation differences was increased relative to when they attended to the color of a central dot (Murray & Wojciulik, 2004). (Again, see chapter 19 by Maunsell for evidence from neural recordings.)
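The three kinds of modulation discussed in this section (multiplicative gain, a shift in baseline, and sharper tuning) make different predictions about how the attended response departs from the unattended one across stimulus values. The following is a toy sketch for illustration only: the Gaussian tuning curve, the function names, and every parameter value are assumptions introduced here and are not taken from any of the studies cited.

```python
import math

BASELINE = 0.1  # hypothetical spontaneous activity
WIDTH = 30.0    # hypothetical tuning width, e.g. in degrees of orientation


def driven(stimulus, preferred=0.0, width=WIDTH):
    """Stimulus-driven part of the response of a hypothetical unit."""
    return math.exp(-((stimulus - preferred) ** 2) / (2.0 * width ** 2))


def unattended(stimulus):
    return BASELINE + driven(stimulus)


def gain_control(stimulus, gain=1.5):
    """Attention multiplies the driven signal; baseline is untouched."""
    return BASELINE + gain * driven(stimulus)


def baseline_shift(stimulus, shift=0.2):
    """Attention adds a constant, whatever the stimulus."""
    return BASELINE + shift + driven(stimulus)


def sharpened(stimulus, factor=0.5):
    """Attention narrows the tuning curve around the preferred value."""
    return BASELINE + driven(stimulus, width=WIDTH * factor)


# Attention effect (attended minus unattended) at three stimulus values.
for s in (0.0, 30.0, 60.0):
    u = unattended(s)
    print(s, gain_control(s) - u, baseline_shift(s) - u, sharpened(s) - u)
```

In this sketch the gain effect scales with the driven response, the baseline effect is the same at every stimulus value, and sharpening leaves the peak unchanged while reducing responses to non-preferred values. Comparing effects across stimulus values, rather than at a single value, is what separates the three accounts.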
Figure 12.3 The task is to name the color of the square shape (which was yellow). Stroop interference is greater from the word “green” inside the relevant object than it is if the word is presented in the background where the O’s are shown, and it is reduced further when presented in the irrelevant red circle, where the X’s are shown in the figure. (Modified with permission from Wühr & Frings, 2008.)

Does attention act early or late?

In the early days of attention research, information processing was seen as a pipeline of successive stages, the output of each becoming the input of the next, with information of increasing complexity abstracted at each level. Attention could potentially select between outputs at any level to determine which should be passed on to the next. This model has been replaced by a more interactive system with reentry to early levels and extensive lateral communication between separate parallel streams of analysis, dealing with different types of information: “what?” in the ventral versus “where?” in the dorsal areas (Ungerleider & Mishkin, 1982), or objects and events for conscious representation in the ventral pathway and the online control of actions in the dorsal pathway (Milner & Goodale, 1995). Within each pathway, selection can occur at various levels, depending on the task,
the load, and the degree to which concurrent tasks engage the same subsystems. A recent framework that captures this flexibility is the reverse hierarchy theory of Hochstein and Ahissar (2002), in which an initial feedforward sweep through the sequence of visual areas takes place automatically, followed by optional controlled processing that may return to lower areas as required by the task (see figure 12.4). Access to awareness is initially at the highest levels of representation where receptive fields are large and discrimination is categorical.

Without assuming that the two must be correlated in a fixed order, we can still ask about either the level or the time at which selection is made. The early-late dichotomy was actually always a “straw” question. Proponents of early selection did not deny that selection could also occur late. The real question was “Can attention act early (Broadbent, 1958), or is all perceptual processing automatic, with attention selecting only at the level of memory and response?” (Deutsch & Deutsch, 1963). On one hand, proponents of late selection argued that attention limits are determined by decision effects alone: More stimuli lead to increased uncertainty and increased chances that noise will exceed a response criterion (e.g., Bundesen, 1990; Kinchla, 1974; Palmer, 1995). On the other hand, behavioral tests showed that selection

In the so-called psychological refractory period, attention limits seem to arise late. When separate speeded responses are required to two stimuli presented in close succession, the response to the second is typically delayed, reflecting an attentional bottleneck (Welford, 1952). Pashler (1993, 1994) used evidence of underadditivity of factors contributing to the two reaction times to locate the point at which overlap in processing becomes impossible. He found convincing evidence that the bottleneck arises not in perception but in central decision and response selection (see figure 12.5). The fact that attention limits can and sometimes do arise at late stages does not refute the claim that they can also act early.

A coherent account relates the level of selection to the level at which the potential overload occurs. If perception is demanding, selection needs to be early, whereas if the perceptual load is low, early selection may be not only unnecessary but actually impossible. Lavie and Tsal (1994) and Lavie (1995) showed that interference from a flanking distractor decreased, and, by inference, early selection efficiency increased, as the attended task became more difficult (see figure 12.6). But if the load arises in the control systems that direct attention, high load may reduce the efficiency of early selection and increase the effects of irrelevant stimuli. Using a dual task where participants were to remember the
based on properties that are processed early (simple physical order of four digits, presented either in random orders (high
characteristics like location, color, auditory pitch) was more working memory load) or in a fixed regular order (low load),
efficient than selection based on properties presumably pro- while classifying famous names printed over irrelevant dis-
cessed only later (semantic content, abstract categories). This tractor faces, de Fockert, Rees, Frith, and Lavie (2001) found
conclusion was true even when the load on responses and that incongruent faces produced more interference in the
memory was minimized (e.g., Treisman & Riley, 1969). high load condition, presumably because working memory
Figure 12.5 Psychological refractory period. (A) Objective sequence of events (S1, stimulus 1; R1, response 1; SOA, stimulus onset asynchrony). (B) Observed reaction time to second stimulus is delayed as the interval between the tasks is reduced. The slope approaches −1, indicating that (on average) the second response cannot be produced until a certain time after S1. (C) Hypothesized stages of mental processing: A, perceptual processing; B, central decision time; C, response programming time. Stage B forms a bottleneck where two separate decisions cannot overlap with each other. Other stages can operate in parallel. (From Pashler, 1994, with permission.)
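The central-bottleneck account summarized in figure 12.5 can be made concrete with a small numerical sketch. The Python below is an illustration only, not Pashler's own formulation: the function name and the stage durations (in milliseconds) are hypothetical, chosen so that the predicted reaction time to the second stimulus falls with a slope of −1 at short SOAs and levels off once the bottleneck is free.

```python
def rt2_from_s2_onset(soa, a1, b1, a2, b2, c2):
    """Predicted reaction time to stimulus 2, measured from S2 onset (ms).

    a*, b*, c* are hypothetical durations of the perceptual (A), central
    decision (B), and response programming (C) stages for tasks 1 and 2.
    """
    # All times below are measured from S1 onset.
    b1_done = a1 + b1   # moment at which task 1 releases the central bottleneck
    a2_done = soa + a2  # perception of S2 runs in parallel with task 1
    # Stage B of task 2 needs both its perceptual input and a free bottleneck.
    b2_start = max(a2_done, b1_done)
    r2_time = b2_start + b2 + c2
    return r2_time - soa  # re-express relative to S2 onset


if __name__ == "__main__":
    stages = dict(a1=100, b1=150, a2=100, b2=150, c2=100)  # assumed values
    for soa in (0, 50, 100, 150, 300, 600):
        print(soa, rt2_from_s2_onset(soa, **stages))
```

With these assumed values, the predicted RT2 shortens by one millisecond for every millisecond added to the SOA until the SOA reaches a1 + b1 − a2 (here 150 ms); beyond that point it stays at a2 + b2 + c2, because stage B of the second task no longer has to wait.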
involves the same frontal lobe executive system as attentional selection. Brain imaging provided converging evidence: Activation in the fusiform face area was higher when working memory load was high, making selection inefficient.
Implicit Processing
In distinguishing the level at which selection is made, we must also distinguish implicit processing from explicit accessibility. In the 1960s and 1970s, it was often assumed that perceptual processing was fully reflected in conscious experience and that behavioral responses were a reliable guide to the information available. This assumption was challenged by an early finding (Corteen & Dunn, 1974): Shock-associated words in an unattended message produced a galvanic skin response without also being consciously detected. Since then, other examples of implicit processing have been documented, using both indirect behavioral evidence, such as priming, interference, or emotional responses, and direct neural measures of responses to unattended stimuli.
Patients with unilateral neglect due to a right parietal lesion often show indirect evidence that they have identified stimuli that they are unable to report (e.g., McGlinchey-Berroth, 1997). There are also many instances of implicit perception in normal participants. When participants judged the relative length of the arms of a cross, an additional unexpected stimulus was often simply not seen (Mack & Rock, 1998). Yet a subsequent word-completion task showed priming from visual words to which the participant was “inattentionally blind” (see figure 12.7). A smiling face and the participant’s own name were among other stimuli that were also detected, presumably because of their subjective importance. Implicit processing of surprising complexity and
[Figure panel labels: Fixation; Selective attention task; Response probe; time axis. Panel C instruction: “Judge which arm is longer.”]
showed attention effects as early in the ascending visual pathways as the lateral geniculate (O’Connor, Fukui, Pinsk, & Kastner, 2002). Because of the multiple connections back from higher areas and the low temporal resolution of fMRI, it is ambiguous here whether attention affects a first pass through the visual hierarchy or acts only through reentrant connections. However, combined with temporal evidence from ERPs and single-neuron recordings, both reflecting early cortical activity, the occurrence of early, short-latency attentional modulation is now clearly established.
Implicit processing of unattended items complicates the attentional story, suggesting that attention can block access to consciousness without blocking all forms of perceptual processing. The fact that this result can occur does not imply that it always does. The perceptual load was low in most experiments that showed implicit effects, and when it was raised, the implicit effects disappeared (Neumann & DeSchepper, 1992; Rees et al., 1997). But we still need to explain why attention should limit conscious access when the perceptual load is low. Interference may be greater from consciously perceived objects than from implicitly distinguished stimuli.
The role of attention in feature binding
The visual system comprises many specialized areas coding different aspects of the scene. This modularity poses the binding problem: to specify how the information is recombined in the correct conjunctions, such as a red shirt and blue pants rather than an illusory blue shirt. Behavioral results (Treisman & Gelade, 1980) suggest that we bind features by focusing attention on each object in turn. Evidence includes findings that when attention is prevented, binding errors or illusory conjunctions are frequently perceived (Treisman & Schmidt, 1982); a spatial precue helps detection of a conjunction much more than of a feature target (Treisman, 1988); boundaries between groups defined only by conjunctions are hard to detect, whereas those between features are easy (Treisman & Gelade, 1980); visual search depends on focused
Figure 12.8 Negative priming. Participants judge whether the green shape on the left matches the white shape on the right while ignoring the red shape on the left. They have no explicit memory for the unattended shapes, yet when one reappears as the shape to be attended, responses are slightly slowed, as though it had previously been inhibited or labeled “irrelevant” and the label had to be cleared when the shape became relevant. This negative priming effect can last across hundreds of intervening trials and days or weeks of delay. (After DeSchepper & Treisman, 1996.)
Figure 12.9 The attentional blink. (A) Participants monitor a rapid visual sequence for two different targets. For example, T1 might be a white letter in a black string, and T2 might be a letter X. (B) The open circles give the detection rates when participants do both tasks, and the black circles show performance on T2 in the control condition in which they ignore the first target. In the combined tasks, T2 is very likely to be missed if it occurs within a few hundred milliseconds of T1, suggesting that detecting a first target makes the participant refractory to detecting a second for the next few hundred milliseconds. (From Shapiro & Raymond, 1994, with permission.)
mainly within receptive fields. Since these increase in size with the level in the hierarchy, the further apart the stimuli, the higher the level of processing at which they compete (Kastner et al.). However, there may also be attention limits outside the classical receptive field. The early ERP effects of attention reflect selection between stimuli in different visual hemifields and therefore hemispheres of the brain (Van Voorhis & Hillyard, 1977). Another issue for the biased competition theory is to specify how local neurons “know” whether their activity is produced by parts of the same object, which should cooperate, or by different objects that should compete. This “knowledge” may require substantial top-down control of local competition.
Focused versus distributed attention
There is a paradox in attention research: Many studies show sharp limits such that only three or four objects can be tracked through space (Pylyshyn & Storm, 1988) or identified at a glance (Woodworth, 1938); change detection in an alternating pair of otherwise identical scenes is surprisingly difficult (Rensink, O’Regan, & Clark, 1997); and accurate binding depends on focused attention. Yet natural scenes can be rapidly and effortlessly monitored for semantic targets (Potter, 1975). While doing an attention-demanding task at the fovea, participants failed to discriminate which side of a peripheral circle was red, yet they easily detected an unknown animal target in a peripheral natural scene (Li, VanRullen, Koch, & Perona, 2002). This finding raises the question whether attention limits apply primarily to simplified laboratory stimuli, while in the natural world information is easily absorbed and understood. Can these contradictions be resolved?
Using natural scenes, Oliva and Torralba (2006) showed that the gist can be inferred from a combination of statistical properties. Chong and Treisman (2003) suggested that the global deployment of attention generates a statistical mode of processing. We confirmed a finding by Ariely (2001) that observers can accurately estimate the mean size of elements and showed that such estimating happens automatically when attention is distributed over the display (see figure 12.10). Combining these findings with the idea that the parallel intake of sets of diagnostic features could mediate detection of familiar objects even before those features are bound may resolve the paradox (Treisman, 2006). Participants, shown rapid sequences of natural scenes, were able to detect target animals quite well, while often being unable to specify which animal or where in the picture it appeared (Evans & Treisman, 2005). If two successive targets had to be identified, a severe attentional blink was incurred, but when the targets could simply be detected, the blink disappeared. If features must be bound to identify or locate a target, focused attention is required, but when only the gist is needed, statistical processing and parallel feature detection may provide sufficient information about most redundant natural scenes.
Figure 12.10 Implicit priming from the mean size of the previewed display. The task is to judge whether the two circles in the final display are the same size or different. Participants respond a little faster if one or both matches the mean size of the preceding prime display, and faster even than when they match one of the presented sizes. It seems that participants automatically compute the mean size of an array of circles. (Experiment described in Chong & Treisman, 2001.) [Figure labels: time axis; display durations of 500 ms and 300 ms; final display shown until response, “Same” or “Different.”]
How does attention relate to consciousness?
Some theories equate attention and consciousness. This approach is probably misleading. Not everything that receives attention reaches awareness. We look at an ambiguous figure and experience only one interpretation. We may attend to a spatial location and show implicit priming without becoming aware that anything was there (Marcel, 1983a). Attention can facilitate unconscious perception in blindsight patients (Kentridge, Heywood, & Weiskrantz, 2004) and in normal participants (Kentridge, Nijboer, & Heywood, 2008). Attention limits appear even in unconscious perception: Kahneman and Chajczyk (1983) found “dilution” of interference when a neutral word was presented together with the irrelevant color name in a Stroop color-naming task. Bahrami, Lavie, and Rees (2007) found that load effects in a foveal task modulated V1 activation produced by unseen pictures of tools in the periphery.
Thus attention is not sufficient to ensure consciousness. Is it necessary? Can we be conscious of something that was not attended? There is an ambiguity here. I can be conscious of an unattended voice, without identifying the words that are spoken. Am I conscious of the stimulus? We are probably never conscious of every property, even of fully attended stimuli (for example, that this cat is smaller than the Eiffel Tower). We become explicitly aware of just a small fraction of the possible propositions that could be formulated about an object or event that we are observing. With unattended objects, we lose those aspects for which capacity was overloaded, but we may retain some information.
Attention, then, seems to be neither necessary nor sufficient for conscious awareness, although the two are normally highly correlated. What is necessary for conscious experience? The idea of reentry is much in the air these days. Several authors propose that the initial registration of stimuli consists of a rapid feedforward sweep through the visual areas without conscious awareness, then a possible return back to the early levels to check a tentative identification for selected elements against the sensory data (Damasio, 1989; Hochstein & Ahissar, 2002; Lamme & Roelfsema, 2000; Marcel, 1983b). Binding may also depend on reentry to early visual areas to ensure fine spatial resolution (Treisman, 1996). We become conscious of objects or events only if the initial match is confirmed. Supporting evidence comes from “object substitution” masking, in which a mask that begins at the same time as the target but outlasts it can render a stimulus invisible in a search array, even with
Figure 12.11 Object substitution masking. The display contains up to 16 rings, half of which have a vertical bar across the bottom. The target is singled out by four dots, as shown, which also serve as the mask. Observers indicate whether the target contains the vertical bar. The sequence begins with a combined display of the target, mask, and distractors for 45 ms and continues with a display of the mask alone for durations of 0, 45, 90, 135, or 180 ms. The four dots produce no masking if they end with the display, but if they continue after it disappears, they render the stimulus that they surround invisible. The suggestion is that visual processing will not reach conscious awareness unless a reentry check confirms the information extracted on a first pass through the visual system. In the case illustrated, the dots remain alone in the location of one of the Q’s and are substituted for it in conscious perception, whereas in the other locations there are no alternative stimuli to compete. (From Di Lollo, Enns, & Rensink, 2000.)
Conclusions
Psychological studies posed many of the relevant questions, outlined possible mechanisms, and developed experimental paradigms to capture different aspects of what is meant by attention. The data provide constraints, ruling out many possible accounts. Neuroscience has added powerful tools to cast votes on issues that remained controversial, or sometimes to reframe the questions in ways that more closely match the way the brain functions.
Some suggest that attention acts primarily through biases on intrinsic competitive local interactions. Others see it arising primarily or only at the decision level, following parallel perceptual processing. Still others (myself included) suggest that conscious perception, detailed localization, and binding of features may depend on focused attention through reentrant pathways. An initial rapid pass through the visual hierarchy provides the global framework and gist of the scene and may prime target objects through the features that are detected. Attention is then focused back to early areas to allow a serial check of the initial rough bindings and to form the representations that are consciously experienced. The impact of neuroscience is obvious in these developments, but so is the ingenious and careful use of psychological paradigms to tease apart the mechanisms controlling our perception and action.
Acknowledgments The work was supported by NIH grant number 2RO1 MH 058383-04A1; by the Israeli Binational Science Foundation, grant number 1000274; and by NIH grant R01 MH62331.
REFERENCES
Allport, D. A., Antonis, B., & Reynolds, P. (1972). Division of attention: Disproof of single channel hypothesis. Q. J. Exp. Psychol., 24(May), 225–235.
Ariely, D. (2001). Seeing sets: Representation by statistical properties. Psychol. Sci., 12(2), 157–162.
Ashbridge, E., Walsh, V., & Cowey, A. (1997). Temporal aspects of visual search studied by transcranial magnetic stimulation. Neuropsychologia, 35(8), 1121–1131.
Bahrami, B., Lavie, N., & Rees, G. (2007). Attentional load modulates responses of human primary visual cortex to invisible stimuli. Curr. Biol., 17(6), 509–513.
Blanchette, I. (2006). Snakes, spiders, guns, and syringes: How specific are evolutionary constraints on the detection of threatening stimuli? Q. J. Exp. Psychol., 59(8), 1484–1504.
Blaser, E., Pylyshyn, Z. W., & Holcombe, A. O. (2000). Tracking an object through feature space. Nature, 408(6809), 196–199.
Broadbent, D. E. (1958). Perception and communication. New York: Pergamon Press.
Brooks, L. R. (1968). Spatial and verbal components of act of recall. Can. J. Psychol., 22(5), 349–368.
Bulakowski, P. F., Bressler, D. W., & Whitney, D. (2007). Shared attentional resources for global and local motion processing. J. Vis., 7(10), 1–10.
Bundesen, C. (1990). A theory of visual-attention. Psychol. Rev., 97(4), 523–547.
Cepeda, N. J., Cave, K. R., Bichot, N. P., & Kim, M. S. (1998). Spatial selection via feature-driven inhibition of distractor locations. Percept. Psychophys., 60(5), 727–746.
Chawla, D., Rees, G., & Friston, K. J. (1999). The physiological basis of attentional modulation in extrastriate visual areas. Nat. Neurosci., 2(7), 671–676.
Chong, S. C., & Treisman, A. (2001). Representation of statistical properties. J. Vis., 1(3), 54.
Chong, S. C., & Treisman, A. (2003). Representation of statistical properties. Vis. Res., 43(4), 393–404.
Corbetta, M., Kincade, J. M., Ollinger, J. M., McAvoy, M. P., & Shulman, G. L. (2000). Voluntary orienting is dissociated from target detection in human posterior parietal cortex. Nat. Neurosci., 3(3), 292–297.
Corbetta, M., Miezin, F. M., Dobmeyer, S., Shulman, G. L., & Petersen, S. E. (1991). Selective and divided attention during visual discriminations of shape, color, and speed: Functional-anatomy by positron emission tomography. J. Neurosci., 11(8), 2383–2402.
Corbetta, M., Shulman, G. L., Miezin, F. M., & Petersen, S. E. (1995). Superior parietal cortex activation during spatial attention shifts and visual feature conjunction. Science, 270(5237), 802–805.
Corteen, R. S., & Dunn, D. (1974). Shock-associated words in a nonattended message: Test for momentary awareness. J. Exp. Psychol., 102(6), 1143–1144.
Craighero, L., Fadiga, L., Rizzolatti, G., & Umilta, C. (1999). Action for perception: A motor-visual attentional effect. J. Exp. Psychol. Hum. Percept. Perform., 25(6), 1673–1692.
Damasio, A. R. (1989). Time-locked multiregional retroactivation: A systems-level proposal for the neural substrates of recall and recognition. Cognition, 33(1–2), 25–62.
de Fockert, J. W., Rees, G., Frith, C. D., & Lavie, N. (2001). The role of working memory in visual selective attention. Science, 291(5509), 1803–1806.
DeSchepper, B., & Treisman, A. (1996). Visual memory for novel shapes: Implicit coding without attention. J. Exp. Psychol. Learn. Mem. Cogn., 22(1), 27–47.
Desimone, R., & Duncan, J. (1995). Neural mechanisms of selective visual-attention. Annu. Rev. Neurosci., 18, 193–222.
D’Esposito, M., Detre, J. A., Alsop, D. C., Shin, R. K., Atlas, S., & Grossman, M. (1995). The neural basis of the central executive system of working-memory. Nature, 378(6554), 279–281.
Deutsch, J. A., & Deutsch, D. (1963). Attention: Some theoretical considerations. Psychol. Rev., 70(1), 80–90.
Di Lollo, V., Enns, J. T., & Rensink, R. A. (2000). Competition for consciousness among visual events: The psychophysics of reentrant visual processes. J. Exp. Psychol. Gen., 129(4), 481–507.
Donner, T. H., Kettermann, A., Diesch, E., Ostendorf, F., Villringer, A., & Brandt, S. A. (2002). Visual feature and conjunction searches of equal difficulty engage only partially overlapping frontoparietal networks. Neuroimage, 15(1), 16–25.
Downing, P., Liu, J., & Kanwisher, N. (2001). Testing cognitive models of visual attention with fMRI and MEG. Neuropsychologia, 39(12), 1329–1342.
Duncan, J. (1984). Selective attention and the organization of visual information. J. Exp. Psychol. Gen., 113(4), 501–517.
Eastwood, J. D., Smilek, D., & Merikle, P. M. (2001). Differential attentional guidance by unattended faces expressing positive and negative emotion. Percept. Psychophys., 63(6), 1004–1013.
Egly, R., Driver, J., & Rafal, R. D. (1994). Shifting visual-attention between objects and locations: Evidence from normal and parietal lesion subjects. J. Exp. Psychol. Gen., 123(2), 161–177.
Evans, K. K., & Treisman, A. (2005). Perception of objects in natural scenes: Is it really attention free? J. Exp. Psychol. Hum. Percept. Perform., 31(6), 1476–1492.
Fan, J., McCandliss, B. D., Sommer, T., Raz, A., & Posner, M. I. (2002). Testing the efficiency and independence of attentional networks. J. Cogn. Neurosci., 14(3), 340–347.
Freedman, D. J., Riesenhuber, M., Poggio, T., & Miller, E. K. (2002). Visual categorization and the primate prefrontal cortex: Neurophysiology and behavior. J. Neurophysiol., 88(2), 929–941.
Halligan, P. W., & Marshall, J. C. (1994). Toward a principled explanation of unilateral neglect. Cogn. Neuropsychol., 11(2), 167–206.
Hawkins, H. L., Hillyard, S. A., Luck, S. J., Downing, C. J., Mouloua, M., & Woodward, D. P. (1990). Visual-attention modulates signal detectability. J. Exp. Psychol. Hum. Percept. Perform., 16(4), 802–811.
Hillyard, S. A., Vogel, E. K., & Luck, S. J. (1998). Sensory gain control (amplification) as a mechanism of selective attention: Electrophysiological and neuroimaging evidence. Philos. Trans. R. Soc. London B Biol. Sci., 353(1373), 1257–1270.
Hochstein, S., & Ahissar, M. (2002). View from the top: Hierarchies and reverse hierarchies in the visual system. Neuron, 36(5), 791–804.
Hoffman, J. E., & Nelson, B. (1981). Spatial selectivity in visual-search. Percept. Psychophys., 30(3), 283–290.
Hopfinger, J. B., Buonocore, M. H., & Mangun, G. R. (2000). The neural mechanisms of top-down attentional control. Nat. Neurosci., 3(3), 284–291.
Ivry, R. B., & Robertson, L. C. (1998). The two sides of perception. Cambridge, MA: MIT Press.
Kahneman, D. (1973). Attention and effort. Englewood Cliffs, NJ: Prentice-Hall.
Kahneman, D., & Chajczyk, D. (1983). Tests of the automaticity of reading: Dilution of Stroop effects by color-irrelevant stimuli. J. Exp. Psychol. Hum. Percept. Perform., 9(4), 497–509.
Kastner, S., De Weerd, P., Pinsk, M. A., Elizondo, M. I., Desimone, R., & Ungerleider, L. G. (2001). Modulation of sensory suppression: Implications for receptive field sizes in the human visual cortex. J. Neurophysiol., 86(3), 1398–1411.
Kastner, S., Pinsk, M. A., De Weerd, P., Desimone, R., & Ungerleider, L. G. (1999). Increased activity in human
Prinzmetal, W., Nwachuku, I., Bodanski, L., Blumenfeld, L., & Shimizu, N. (1997). The phenomenology of attention. 2. Brightness and contrast. Consciousness Cogn., 6(2–3), 372–412.
Pylyshyn, Z. W., & Storm, R. W. (1988). Tracking multiple independent targets: Evidence for a parallel tracking mechanism. Spatial Vis., 3(3), 179–197.
Reed, C. L., Grubb, J. D., & Steele, C. (2006). Hands up: Attentional prioritization of space near the hand. J. Exp. Psychol. Hum. Percept. Perform., 32(1), 166–177.
Rees, G., Frith, C. D., & Lavie, N. (1997). Modulating irrelevant motion perception by varying attentional load in an unrelated task. Science, 278(5343), 1616–1619.
Rensink, R. A., O’Regan, J. K., & Clark, J. J. (1997). To see or not to see: The need for attention to perceive changes in scenes. Psychol. Sci., 8(5), 368–373.
Reynolds, J. H., Chelazzi, L., & Desimone, R. (1999). Competitive mechanisms subserve attention in macaque areas V2 and V4. J. Neurosci., 19(5), 1736–1753.
Rizzolatti, G., Riggio, L., Dascola, I., & Umilta, C. (1987). Reorienting attention across the horizontal and vertical meridians: Evidence in favor of a premotor theory of attention. Neuropsychologia, 25(1A), 31–40.
Robertson, L., Treisman, A., Friedman-Hill, S., & Grabowecky, M. (1997). The interaction of spatial and object pathways: Evidence from Balint’s syndrome. J. Cogn. Neurosci., 9(3), 295–317.
Robertson, L. C., & Lamb, M. R. (1991). Neuropsychological contributions to theories of part whole organization. Cogn. Psych., 23(2), 299–330.
Schneider, W., & Shiffrin, R. M. (1977). Controlled and automatic human information-processing. 1. Detection, search, and attention. Psychol. Rev., 84(1), 1–66.
Scholl, B. J., Pylyshyn, Z. W., & Feldman, J. (2001). What is a visual object? Evidence from target merging in multiple object tracking. Cognition, 80(1–2), 159–177.
Serences, J. T., & Boynton, G. M. (2007). The representation of behavioral choice for motion in human visual cortex. J. Neurosci., 27(47), 12893–12899.
Shapiro, K. L., & Raymond, J. E. (1994). Temporal allocation of visual attention: Inhibition or interference? In D. Dagenbach & T. Carr (Eds.), Inhibitory processes in attention, memory and language (pp. 151–188). New York: Academic Press.
Shaw, M. L. (1984). Division of attention among spatial locations: A fundamental difference between detection of letters and detection of luminance increments. In H. Bouma & D. Bouwhuis (Eds.), Attention and performance X (pp. 109–121). Hillsdale, NJ: Erlbaum.
Simons, D. J., & Chabris, C. F. (1999). Gorillas in our midst: Sustained inattentional blindness for dynamic events. Perception, 28(9), 1059–1074.
Smith, A. T., Singh, K. D., & Greenlee, M. W. (2000). Attentional suppression of activity in the human visual cortex. NeuroReport, 11(2), 271–277.
Song, J. H., & Nakayama, K. (2006). Role of focal attention on latencies and trajectories of visually guided manual pointing. J. Vis., 6(9), 982–995.
Spelke, E., Hirst, W., & Neisser, U. (1976). Skills of divided attention. Cognition, 4(3), 215–230.
Spence, C., & Driver, J. (1996). Audiovisual links in endogenous covert spatial attention. J. Exp. Psychol. Hum. Percept. Perform., 22(4), 1005–1030.
Stroop, J. R. (1935). Studies of interference in serial verbal reactions. J. Exp. Psychol., 18, 643–662.
Tipper, S. P. (1985). The negative priming effect: Inhibitory priming by ignored objects. Q. J. Exp. Psychol. [A], 37(4), 571–590.
Tipper, S. P., & Behrmann, M. (1996). Object-centered not scene-based visual neglect. J. Exp. Psychol. Hum. Percept. Perform., 22(5), 1261–1278.
Treisman, A. (1969). Strategies and models of selective attention. Psychol. Rev., 76, 282–299.
Treisman, A. (1988). Features and objects: The 14th Bartlett Memorial Lecture. Q. J. Exp. Psychol. [A], 40(2), 201–237.
Treisman, A. (1993). The perception of features and objects. In A. Baddeley & L. Weiskrantz (Eds.), Attention: Selection, awareness and control: A tribute to Donald Broadbent (pp. 5–35). Oxford, UK: Clarendon Press.
Treisman, A. (1996). The binding problem. Curr. Opin. Neurobiol., 6(2), 171–178.
Treisman, A. (2006). How the deployment of attention determines what we see. Visual Cogn., 14(4–8), 411–443.
Treisman, A., & Davies, A. (1973). Divided attention to ear and eye. In S. Kornblum (Ed.), Attention and performance IV (pp. 101–117). New York: Academic Press.
Treisman, A. M., & Gelade, G. (1980). Feature-integration theory of attention. Cogn. Psych., 12(1), 97–136.
Treisman, A., & Gormican, S. (1988). Feature analysis in early vision: Evidence from search asymmetries. Psychol. Rev., 95(1), 15–48.
Treisman, A. M., & Riley, J. G. A. (1969). Is selective attention selective perception or selective response: A further test. J. Exp. Psychol., 79, 27–34.
Treisman, A., & Sato, S. (1990). Conjunction search revisited. J. Exp. Psychol. Hum. Percept. Perform., 16(3), 459–478.
Treisman, A., & Schmidt, H. (1982). Illusory conjunctions in the perception of objects. Cogn. Psych., 14(1), 107–141.
Tucker, M., & Ellis, R. (1998). On the relations between seen objects and components of potential actions. J. Exp. Psychol. Hum. Percept. Perform., 24(3), 830–846.
Ungerleider, L. G., & Mishkin, M. (1982). Two cortical visual systems. In D. J. Ingle, M. A. Goodale, & R. J. W. Mansfield (Eds.), Analysis of visual behavior (pp. 549–586). Cambridge, MA: MIT Press.
Van Voorhis, S., & Hillyard, S. A. (1977). Visual evoked-potentials and selective attention to points in space. Percept. Psychophys., 22(1), 54–62.
Vuilleumier, P. (2005). How brains beware: Neural mechanisms of emotional attention. Trends Cogn. Sci., 9(12), 585–594.
Walsh, V., & Cowey, A. (1998). Magnetic stimulation studies of visual cognition. Trends Cogn. Sci., 2(3), 103–110.
Ward, L. M. (1982). Determinants of attention to local and global features of visual forms. J. Exp. Psychol. Hum. Percept. Perform., 8(4), 562–581.
Watson, D. G., & Humphreys, G. W. (1997). Visual marking: Prioritizing selection for new objects by top-down attentional inhibition of old objects. Psychol. Rev., 104(1), 90–122.
Welford, A. T. (1952). The psychological refractory period and the timing of high-speed performance: A review and a theory. Br. J. Psychol., 43, 219.
Wojciulik, E., & Kanwisher, N. (1999). The generality of parietal involvement in visual attention. Neuron, 23(4), 747–764.
Wolfe, J. M., Cave, K. R., & Franzel, S. L. (1989). Guided search: An alternative to the feature integration model for visual-search. J. Exp. Psychol. Hum. Percept. Perform., 15(3), 419–433.
13 Mechanisms of Selective Attention
in the Human Visual System:
Evidence from Neuroimaging
Sabine Kastner, Stephanie A. McMains, and Diane M. Beck
Sabine Kastner and Stephanie A. McMains: Department of Psychology, Princeton Neuroscience Institute,
Princeton University, Princeton, New Jersey
Diane M. Beck: Department of Psychology, University of Illinois, Urbana-Champaign, Champaign, Illinois

Abstract  In this chapter, we review evidence from functional brain imaging revealing that attention
operates at various processing levels within the visual system including the lateral geniculate
nucleus of the thalamus and the striate and extrastriate cortex. Attention modulates visual
processing by enhancing neural responses to attended stimuli, attenuating responses to ignored
stimuli, and increasing baseline activity in the absence of visual stimulation. These mechanisms
operate dynamically on spatial locations, entire objects, or particular features, which constitute the
units of selection. At intermediate cortical processing stages such as areas V4 and MT, the filtering
of unwanted information is achieved by resolving competitive interactions among multiple
simultaneously present stimuli. Together, these mechanisms allow us to select relevant information
from the cluttered visual world in which we live to guide behavior.

Natural visual scenes are cluttered and contain many different objects. However, the capacity of the
visual system to process information about multiple objects at any given moment in time is limited
(e.g., Broadbent, 1958). Hence, attentional mechanisms are needed to select relevant information and
to filter out irrelevant information from cluttered visual scenes. Selective visual attention is a
broad term that refers to a variety of different behavioral phenomena. Directing attention to a
spatial location has been shown to improve the accuracy and speed of subjects’ responses to target
stimuli that occur in that location (Posner, 1980). Attention also increases the perceptual
sensitivity for the discrimination of target stimuli (Lu & Dosher, 1998), increases contrast
sensitivity (Cameron, Tai, & Carrasco, 2002; Carrasco, Giordano, & McElree, 2004), reduces the
interference caused by distracters (Shiu & Pashler, 1995), and improves acuity (Carrasco, Loula, &
Ho, 2006; Yeshurun & Carrasco, 1998).

The two most common behavioral paradigms employed to study visual attention are the spatial cuing
paradigm that probes attention to a single location or stimulus (Posner, 1980) and the visual search
task that probes attention in the presence of distracters (Treisman & Gelade, 1980; Wolfe, Cave, &
Franzel, 1989). In the spatial cuing paradigm, subjects are instructed to maintain fixation and to
direct attention covertly, that is, without shifting their gaze, to a peripheral target location,
which is indicated by a cue. After a variable delay, a target stimulus, which subjects are required
to detect, is presented briefly. On some trials, known as valid trials, the target appears at the
cued (i.e., attended) location, and on other trials, known as invalid trials, the target appears at
an uncued (i.e., unattended) location. The typically observed response difference in detecting
stimuli on valid and invalid trials is thought to reflect the effects of attention on selected
locations in space. In visual search tasks, subjects are given an array of stimuli (e.g., circles of
different colors) and asked to report if a particular target stimulus (e.g., a red circle) is present
in the array. Several factors affect performance in this task, such as the number of features that
the target shares with other elements in the array. If the target (e.g., red circle) has a unique
feature, such as being a different color from the distracters (e.g., green circles), the search is
completed quickly, regardless of the number of elements in the array. This phenomenon is known as
pop-out or efficient search. For other search arrays, where the target is defined by a conjunction of
features (e.g., red horizontal line) that are shared by the distracters (e.g., red vertical and green
horizontal lines), search time increases as a function of the number of elements in the array. This
phenomenon is known as inefficient search, and the increase in search times is thought to reflect a
serial search through the array for the target. However, under some circumstances, only a subset of
the array needs to be searched. Simple features, such as color, can be used to guide search to just
those elements that share a particular target feature (Wolfe et al., 1989). Visual search tasks have
a clearer relationship than spatial cuing paradigms with our everyday
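The two behavioral measures described for these paradigms can be illustrated with a short sketch. It is only an illustration under stated assumptions: the function names are hypothetical, the cuing effect is taken as a simple difference of mean response times, search efficiency is summarized as the least-squares slope of response time against array size, and all numbers are made up rather than taken from any study.

```python
from statistics import mean


def validity_effect(valid_rts, invalid_rts):
    """Cuing effect: mean response time on invalid trials minus valid trials (ms)."""
    return mean(invalid_rts) - mean(valid_rts)


def search_slope(set_sizes, mean_rts):
    """Least-squares slope of mean response time against array size (ms per item)."""
    mx, my = mean(set_sizes), mean(mean_rts)
    num = sum((x - mx) * (y - my) for x, y in zip(set_sizes, mean_rts))
    den = sum((x - mx) ** 2 for x in set_sizes)
    return num / den


# Made-up illustrative response times in ms, not data from any study.
set_sizes = [4, 8, 16]
popout_rts = [450, 452, 451]       # roughly flat: efficient ("pop-out") search
conjunction_rts = [500, 600, 800]  # grows with array size: inefficient search

print("validity effect (ms):", validity_effect([300, 310, 290], [340, 350, 330]))
print("pop-out slope (ms/item):", search_slope(set_sizes, popout_rts))
print("conjunction slope (ms/item):", search_slope(set_sizes, conjunction_rts))
```

A slope near zero corresponds to the efficient case described above, and a clearly positive slope to the inefficient case in which search time grows with the number of elements.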
Figure 13.2 Attentional response modulation in the visual system. Attention effects that were obtained
in the experiments presented in figure 13.1 were quantified by defining several indices: (A)
attentional enhancement index (AEI), (B) attentional suppression index (ASI), (C) baseline modulation
index (BMI). For all indices, larger values indicate larger effects of attention. Index values were
computed for each subject based on normalized and averaged signals obtained in the different
attention conditions and are presented as averaged index values from four subjects (for index
definitions, see O’Connor et al., 2002). In visual cortex, attention effects increased from early to
later processing stages. Attention effects in the LGN were larger than in V1. Vertical bars indicate
standard error of the mean across subjects. (From O’Connor et al., 2002.)

neural responses to unattended stimuli should be attenuated depending on the attentional load
necessary to process the attended stimulus. This idea was tested by using the checkerboard paradigm
described earlier while subjects performed either an easy (low-load) attention task or a hard
(high-load) attention task at fixation and ignored the peripheral checkerboard stimuli. Relative to
the easy-task condition, mean fMRI signals evoked by the high-contrast and by the low-contrast
stimuli decreased significantly in the hard-task condition across the visual system with the smallest
effects in early visual cortex and the largest effects in LGN and extrastriate cortex (figure
13.1B,E). Taken together, these findings suggest that neural activity evoked by ignored stimuli is
attenuated at several stages of visual processing as a function of the load of attentional resources
engaged elsewhere (O’Connor et al., 2002; Rees, Frith, & Lavie, 1997; Schwartz et al., 2005).
Attentional-load-dependent suppression of unattended stimuli may be a neural correlate for behavioral
effects such as reduction of interference caused by distracters (Shiu & Pashler, 1995). Further,
there is evidence that the enhancement of activity at an attended location and the suppression of
activity at unattended locations operate in a push-pull fashion and thus represent codependent
mechanisms (Pinsk, Doniger, & Kastner, 2004; Schwartz et al., 2005).

An important component of the Posner task is the cuing period during which subjects deploy attention
to a location in space at which visual stimuli are expected to occur. A neural correlate of
cue-related activity has been found in physiology studies demonstrating that spontaneous (baseline)
firing rates were 30–40% higher for neurons in areas V2 and V4 when the animal was cued to attend
covertly to a location within the neuron’s receptive field (RF) before the stimulus was presented
there, that is, in the absence of visual stimulation (Lee, Williford, & Maunsell, 2007; Luck,
Chelazzi, Hillyard, & Desimone, 1997; but see McAdams & Maunsell, 1999). This increased baseline
activity has been interpreted as a direct demonstration of a top-down signal that feeds back from
higher-order to lower-order areas. In the latter areas, this feedback signal appears to bias neurons
representing the attended location, thereby favoring stimuli that will appear there at the expense of
those appearing at unattended locations.

To investigate attention-related baseline increases in the human visual system in the absence of
visual stimulation, fMRI activity was measured while subjects were cued to covertly direct attention
to the periphery of the left or right visual hemifield and to expect the onset of a stimulus
(Kastner, Pinsk, De Weerd, Desimone, & Ungerleider, 1999; O’Connor et al., 2002; Sylvester, Shulman,
Jack, & Corbetta, 2007). The expectation period, during which subjects were attending to the
periphery without receiving visual input, was followed by attended presentations of a high-contrast
checkerboard. During the attended presentations, subjects counted the occurrences of luminance
changes. Relative to the preceding blank period in which subjects maintained fixation at the center
of the screen and did not attend to the periphery, fMRI signals increased during the expectation
period in the LGN and the striate and extrastriate cortex (figures 13.1C,F, 13.2C). This elevation of
baseline activity was followed by a further response increase evoked by the visual stimuli (figure
13.1C).

Similar to response modulation, the magnitude of increases in baseline activity depends on several
variables, including the expected task difficulty (Ress, Backus, & Heeger, 2000) or the expected
presence or absence of distracter stimuli (Serences, Yantis, Culberson, & Awh, 2004). Early studies
have found evidence that baseline increases are feature specific; that is, they are stronger during
the expectation of a preferred compared to a nonpreferred stimulus feature in areas that
preferentially process a particular stimulus feature (e.g., color in area V4 or motion in area MT)
(Chawla, Rees, & Friston, 1999; Shulman et al., 1999). However, more
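The indices summarized in figure 13.2 are defined in O’Connor et al. (2002), not in this chapter. As a rough sketch of how an index of this general kind can be computed, the code below uses a normalized contrast between two conditions, (a - b) / (a + b). That form, the function name, and the numbers are assumptions made for illustration and should not be read as the published definitions.

```python
def modulation_index(response_a, response_b):
    """Normalized contrast (a - b) / (a + b) between two condition responses.

    Assumes both responses are non-negative signal amplitudes; larger values
    mean a larger difference in favor of condition a.
    """
    total = response_a + response_b
    if total == 0:
        raise ValueError("responses sum to zero; the index is undefined")
    return (response_a - response_b) / total


# Hypothetical percent-signal-change values, for illustration only.
attended, unattended = 1.5, 1.0
print(modulation_index(attended, unattended))
```

Under this assumed form, an enhancement-style index would contrast attended with unattended responses, a suppression-style index would contrast easy-task with hard-task responses to the ignored stimulus, and a baseline-style index would contrast the expectation period with the preceding blank period.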
Figure 13.3 Object-based attention in visual cortex. (A) Example stimulus from experiment 2 of the
study by O’Craven, Downing, and Kanwisher (1999) demonstrating object-based attention. The stimuli
consisted of overlapping house and face stimuli. On each trial either the house or the face stimulus
moved while subjects performed a consecutive repetition-detection task on either the direction of
motion of the moving stimuli or the position of the stationary object, which was offset slightly from
trial to trial. (B) Averaged fMRI signals (n = 4), computed as percent signal change, for the
fusiform face area (FFA) and the parahippocampal place area (PPA). Solid lines represent activity
when subjects attended to the static stimulus, and dotted lines represent trials during which
subjects attended to the moving stimulus. Activity was higher in the FFA when “faceness” was the
irrelevant property of the attended object (Attend Moving, Face Moving) than when it was a property
of the unattended object (Attend Moving, House Moving). The response pattern of the PPA was identical
for “houseness.” (Modified figure 42.1, Freiwald & Kanwisher, 2004.)

attention can also bias processing in favor of a particular stimulus attribute, or feature. In
experiments investigating feature-based attention, stimuli are typically composed of multiple
stimulus features (e.g., colored shapes moving in different directions), and subjects are cued to
attend to a particular feature dimension (e.g., the color red) while ignoring the other dimensions.
In monkey physiology studies, neural responses increased when the feature in the RF matched the cued
feature regardless of where the animal was attending. Feature-based attention effects have been
observed in area V4 for several feature dimensions (for a review see Maunsell & Treue, 2006)
including color (Motter, 1994), luminance (Motter, 1994), and orientation (Haenny, Maunsell, &
Schiller, 1988), and in MT for direction of motion (Martinez-Trujillo & Treue, 2004; Treue &
Martinez-Trujillo, 1999). In addition, the temporal characteristics of feature-based and space-based
attention were investigated in area V4 (Hayden & Gallant, 2005). Feature-based attention effects were
found to be sustained throughout the visual evoked responses, whereas space-based attention effects
were more transient, peaking in the later portion of the response. These findings suggest that
space- and feature-based attention effects rely on different neural mechanisms.

In human neuroimaging studies, where activity is measured at the level of entire areas within neural
networks, researchers have taken advantage of the functional specialization of visual cortex in
studying feature-based attention (Beauchamp, Cox, & DeYoe, 1997; Buechel et al., 1998; Clark et al.,
1997; Corbetta, Miezin, Dobmeyer, Shulman, & Petersen, 1991; McMains et al., 2007; O’Craven, Rosen,
Kwong, Treisman, & Savoy, 1997; Serences & Boynton, 2007; Sohn, Chong, Papathomas, & Vidnyanszky,
2005). In one such study, Corbetta and colleagues investigated attention to shape, color, or speed
and observed enhanced activity in visual regions specialized for processing the attended feature,
including enhanced activity in the posterior fusiform gyrus for attention to shape, area V4 for
attention to color, and area MT for attention to motion.

Feature-based attention mechanisms have also been investigated in the presence of distracters. The
observation that neural responses to a selected feature increase regardless of where the animal
attends has led to the hypothesis that feature-based attention may operate globally throughout the
visual field (Bichot, Rossi, & Desimone, 2005; Martinez-Trujillo & Treue, 2004; McAdams & Maunsell,
2000; Saenz, Buracas, & Boynton, 2002; Serences & Boynton, 2007). If one considers a visual search
task where subjects are looking for a red circle in an array of colored shapes, it will certainly be
advantageous from a computational point of view to increase neural responses to any red items,
thereby marking the candidate target stimuli and restricting the remaining search to the subset of
red shapes to ultimately find the circle. This approach is opposed to space- and object-based
attention, which are both inherently tied to a spatial location. This hypothesis has been tested in a
physiology study where monkeys performed a visual search task (Bichot et al., 2005). Neuronal
responses were enhanced when the stimulus inside the RF was the same color or shape as the target
stimulus. This result occurred throughout the search period, regardless of where the monkey was
attending.

A similar effect has been observed in humans in an experimental design with two stimuli, one
presented in each visual hemifield (Saenz et al., 2002). The attended stimulus consisted of two
overlapping dot patterns, one moving upward and the other moving downward. The stimulus in the
unattended hemifield always moved in the same direction (e.g., downward). Functional MRI signals were
measured in the retinotopic representation of the unattended stimulus while subjects alternated
between attending to the same direction as the distracter (i.e., downward) or to the opposite
direction
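The computational advantage mentioned for the red-circle search example can be made concrete with a small sketch. It is a toy illustration, not a model from the chapter: the `Item` type, the `guided_search` function, and the example display are all hypothetical, and marking by color is reduced to a simple filter followed by a serial scan of the marked subset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    color: str
    shape: str


def guided_search(items, target_color, target_shape):
    """Return (found, inspected), scanning only items that share the target color."""
    # Global feature "marking": every item with the target color becomes a candidate,
    # wherever it sits in the display.
    candidates = [item for item in items if item.color == target_color]
    inspected = 0
    for item in candidates:  # serial scan restricted to the marked subset
        inspected += 1
        if item.shape == target_shape:
            return True, inspected
    return False, inspected


# Hypothetical display: five items, two of them red.
display = [
    Item("green", "circle"),
    Item("red", "square"),
    Item("green", "square"),
    Item("red", "circle"),
    Item("green", "circle"),
]
print(guided_search(display, "red", "circle"))
```

In this toy display only the two red items are inspected, rather than all five, which is the sense in which marking by a simple feature restricts the remaining search.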
It is important to note that the suppressive (competitive)
interactions across visual cortex discussed thus far occurred
automatically and in the absence of attentional allocation
to the stimuli. In fact, in these paradigms, participants
were engaged in an attention-demanding task at fixation.
Thus neural competition would appear to be pervasive in
the representation of cluttered visual scenes. In order to
overcome this less than optimal representation of objects
within a visual scene, there need to be mechanisms by which
this ongoing competition among multiple stimuli can be
resolved. The allocation of top-down attention is one such
mechanism.
stimuli were identical, the suppression was considerably reduced relative to the heterogeneous
conditions. This result suggests that grouping by similarity represents a bottom-up bias that may
influence or even determine the amount of competition among items.

Although grouping is arguably the most well-known of the perceptual organization processes, there are
a number of other processes critical to our ability to segment and organize a scene. Before objects
can be grouped together, the visual system must decide what regions in the scene constitute potential
objects; that is, it must segment figure from ground (Rubin, 1958). Further complicating this process
is the fact that some potential objects may be partially occluded from view, requiring the visual
system to infer the presence of objects on the basis of the information present in the scene; that
is, it must rely on visual interpolation mechanisms (Palmer, 1999). Both of these processes were
probed in a second study (McMains & Kastner, 2007), using the Kanizsa illusion (Kanizsa, 1976). In
the Kanizsa illusion, four circular “Pacman” items, also called inducers, are aligned to form an
illusory square (figure 13.4C) that is perceived as a single foreground element with the inducers
lying behind it. When the four inducers are rotated inward, the illusion occurs as a result of the
assignment of the L-shaped borders to a common object, but does not occur when they are rotated
outward. Based on the hypothesis that the degree of perceptual organization in a visual display
should determine the degree of competition, it was predicted that when the four inducers were rotated
inward giving rise to the illusion and were thus part of a single object, they should not compete
with each other. Alternatively, if the four inducers were rotated outward, thereby disrupting the
illusion, they would be treated as four separate objects, which compete for neural representation
independently. As predicted, the competitive interactions were significantly reduced across visual
cortex when the four inducers formed a single foreground object, but not when they were rotated
outward representing four separate items (figure 13.4F).

Thus far, a number of perceptual organization processes, such as grouping by the Gestalt factor of
similarity and figure-ground segmentation and visual interpolation processes related to illusory
contour formation, have been shown to affect the outcome of competitive interactions, thereby
suggesting that any form of perceptual organization might influence the ongoing competition (Beck &
Kastner, 2007; McMains & Kastner, 2007).

The relationship between perceptual organization principles and competition can be interpreted in at
least two ways. Competition may be influenced by mechanisms that mediate perceptual organization from
elsewhere in the cortex. These mechanisms could boost the activity related to the set of stimuli as
it enters V4, effectively counteracting any competition that may have occurred between stimuli. Such
a perspective is consistent with effects of Gestalt grouping and figure-ground segmentation found in
early visual cortex (Kapadia, Ito, Gilbert, & Westheimer, 1995; Kastner et al., 2000; Lamme, 1995;
Nothdurft et al., 1999; Qiu et al., 2007; Zhou, Friedman, & von der Heydt, 2000). Alternatively, the
degree to which perceptual organization occurs may be a consequence of competitive interactions. As
mentioned, the response of V4 neurons to a pair of stimuli is best described as a weighted average of
the responses to the two stimuli when presented alone (Luck et al., 1997; Reynolds et al., 2000). If
the two stimuli that comprise the pair are identical, as in the grouping-by-similarity study, then
the weighted-average model would predict that the response to the pair should be indistinguishable
from the response to each of the individual stimuli (Reynolds et al.). Thus there may not be any need
to appeal to additional grouping mechanisms to explain these findings. Instead, the reduced
competition present in the displays with identical items, relative to the one with different stimuli,
may simply be the result of the averaging procedure performed by the neurons in areas such as V4. If
less competition is evoked by items that are perceptually organized, then there is no need to select
or filter any one of them, and instead the items are processed as a group. Importantly, these are not
mutually exclusive possibilities. Further, it is unlikely that these interactions can be explained by
a unified neural mechanism. Rather, the variety of perceptual organization principles may rely on a
variety of underlying neural processes.

Relation of Bottom-Up and Top-Down Mechanisms
The studies described thus far suggest that both top-down and bottom-up processes can bias
competitive interactions in visual cortex. How might these processes interact? Evidence comes from a
physiology study (Qiu et al., 2007) in which the effects of attention on image segmentation processes
were probed in V2 neurons. Neurons in V2 have previously been found to integrate contextual
information from beyond their small RFs in order to signal when a border in their RF belongs to an
attended object, an effect termed border ownership (Zhou et al., 2000). Consistent with previous
studies (Driver et al., 1992; Kastner et al., 2000; Lamy, Segal, & Ruderman, 2006; C. Moore & Egeth,
1997), V2 neurons signaled border ownership for both attended and unattended figures, suggesting that
attention is not necessary for figure-ground segmentation (Qiu et al.). When the interaction of
attention and border ownership was investigated, the magnitudes of the attention effects were found
to be predicted by the neurons’ border ownership responses, such that attention effects were larger
when the object that owned the border in the RF was attended to compared to when a different figure
of equal distance away from the RF was attended to (figure 13.5E). These results were interpreted
using a novel framework, the “interface
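The weighted-average account of V4 responses to stimulus pairs, discussed earlier in this section, can be written out in a few lines. This is a minimal sketch under stated assumptions: the function name, the single weight parameter `w`, and the firing rates are hypothetical, and the chapter does not specify how the weights are set.

```python
def pair_response(r1, r2, w=0.5):
    """Weighted average of the responses to each stimulus when presented alone.

    r1, r2: responses to stimulus 1 and stimulus 2 alone (e.g., spikes/s).
    w: weight on stimulus 1, assumed to lie in [0, 1]; stimulus 2 gets 1 - w.
    """
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    return w * r1 + (1.0 - w) * r2


# Hypothetical firing rates in spikes/s, for illustration only.
preferred, nonpreferred = 40.0, 10.0

# Two different stimuli: the pair response falls between the two single responses.
print(pair_response(preferred, nonpreferred))
# Two identical stimuli: the pair response equals the single-stimulus response
# for any weight, so no suppression is predicted.
print(pair_response(preferred, preferred, w=0.25))
# A weight shifted toward stimulus 1 pulls the pair response toward r1.
print(pair_response(preferred, nonpreferred, w=0.75))
```

The second call shows the point made in the text: with identical stimuli, the averaging itself predicts a pair response indistinguishable from the response to one stimulus, without appeal to a separate grouping mechanism.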
effects in simultaneously presented stimulus displays. NeuroImage, 30(2), 506–511.
Brefczynski, J. A., & DeYoe, E. A. (1999). A physiological correlate of the “spotlight” of visual
attention. Nat. Neurosci., 2(4), 370–374.
Broadbent, D. E. (1958). Perception and communication. Oxford, UK: Oxford University Press.
Buechel, C., Josephs, O., Rees, G., Turner, R., Frith, C. D., & Friston, K. J. (1998). The functional
anatomy of attention to visual motion: A functional MRI study. Brain, 121(Pt. 7), 1281–1294.
Buffalo, E. A., Bertini, G., Ungerleider, L. G., & Desimone, R. (2005). Impaired filtering of
distracter stimuli by TE neurons following V4 and TEO lesions in macaques. Cereb. Cortex, 15(2),
141–151.
Cameron, E. L., Tai, J. C., & Carrasco, M. (2002). Covert attention affects the psychometric function
of contrast sensitivity. Vis. Res., 42(8), 949–967.
Carrasco, M., Loula, F., & Ho, Y. X. (2006). How attention enhances spatial resolution: Evidence from
selective adaptation to spatial frequency. Percept. Psychophys., 68(6), 1004–1012.
Carrasco, M., Giordano, A. M., & McElree, B. (2004). Temporal performance fields: Visual and
attentional factors. Vis. Res., 44(12), 1351–1365.
Chawla, D., Rees, G., & Friston, K. J. (1999). The physiological basis of attentional modulation in
extrastriate visual areas. Nat. Neurosci., 2(7), 671–676.
Chelazzi, L., Duncan, J., Miller, E. K., & Desimone, R. (1998). Responses of neurons in inferior
temporal cortex during memory-guided visual search. J. Neurophysiol., 80(6), 2918–2940.
Clark, V. P., Parasuraman, R., Keil, K., Kulansky, R., Fannon, S., Maisog, J. M., et al. (1997).
Selective attention to face identity and color studied with fMRI. Hum. Brain Mapp., 5, 293–297.
Cook, E., & Maunsell, J. (2002). Attentional modulation of behavioral performance and neuronal
responses in middle temporal and ventral intraparietal areas of macaque monkey. J. Neurosci., 22(5),
1994–2004.
Corbetta, M., Miezin, F. M., Dobmeyer, S., Shulman, G. L., & Petersen, S. E. (1991). Selective and
divided attention during visual discrimination of shape, color, and speed: Functional anatomy by
positron emission tomography. J. Neurosci., 11(8), 2383–2402.
Crick, F. H. C. (1984). Function of the thalamic reticular complex: The searchlight hypothesis. Proc.
Natl. Acad. Sci. USA, 81(14), 4586–4590.
Desimone, R., & Duncan, J. (1995). Neural mechanisms of selective visual attention. Annu. Rev.
Neurosci., 18, 193–222.
Driver, J., Baylis, G. C., & Rafal, R. D. (1992). Preserved figure-ground segregation and symmetry
perception in visual neglect. Nature, 360, 73–75.
Duncan, J. (1984). Selective attention and the organization of visual information. J. Exp. Psychol.
Gen., 113(4), 501–517.
Freiwald, W. A., & Kanwisher, N. G. (2004). Visual selective attention: Insights from brain imaging
and neurophysiology. In M. S. Gazzaniga (Ed.), The Cognitive Neurosciences (3rd ed., pp. 575–588).
Cambridge, MA: MIT Press.
Haenny, P. E., Maunsell, J. H., & Schiller, P. H. (1988). State dependent activity in monkey visual
cortex. II. Retinal and extraretinal factors in V4. Exp. Brain Res., 69(2), 245–259.
Han, S., Jiang, Y., Mao, L., Humphreys, G. W., & Gu, H. (2005). Attentional modulation of perceptual
grouping in human visual cortex: Functional MRI studies. Hum. Brain Mapp., 25(4), 424–432.
Hayden, B. Y., & Gallant, J. L. (2005). Time course of attention reveals different mechanisms for
spatial and feature-based attention in area V4. Neuron, 47, 637–643.
Heinze, H. J., Luck, S. J., Munte, T. F., Gos, A., Mangun, G. R., & Hillyard, S. A. (1994). Attention
to adjacent and separate positions in space: An electrophysiological analysis. Percept. Psychophys.,
56(1), 42–52.
Hopf, J. M., Boehler, C. N., Luck, S., Tsotsos, J. K., Heinze, H. J., & Schoenfeld, M. A. (2006).
Direct neurophysiological evidence for spatial suppression surrounding the focus of attention in
vision. Proc. Natl. Acad. Sci. USA, 103(4), 1053–1058.
Kanizsa, G. (1976). Subjective contours. Sci. Am., 234, 48–52.
Kapadia, M., Ito, M., Gilbert, C., & Westheimer, G. (1995). Improvement in visual sensitivity by
changes in local context: Parallel studies in human observers and in V1 of alert monkeys. Neuron, 15,
843–856.
Kastner, S., De Weerd, P., Desimone, R., & Ungerleider, L. G. (1998). Mechanisms of directed attention
in the human extrastriate cortex as revealed by functional MRI. Science, 282(5386), 108–111.
Kastner, S., De Weerd, P., Pinsk, M. A., Elizondo, M. I., Desimone, R., & Ungerleider, L. G. (2001).
Modulation of sensory suppression: Implications for receptive field sizes in the human visual cortex.
J. Neurophysiol., 86(3), 1398–1411.
Kastner, S., De Weerd, P., & Ungerleider, L. G. (2000). Texture segregation in the human visual
cortex: A functional MRI study. J. Neurophysiol., 83, 2453–2457.
Kastner, S., Nothdurft, H., & Pigarev, I. (1999). Neuronal responses to motion and orientation
contrast in cat striate cortex. Visual Neurosci., 16, 587–600.
Kastner, S., Pinsk, M. A., De Weerd, P., Desimone, R., & Ungerleider, L. G. (1999). Increased activity
in human visual cortex during directed attention in the absence of visual stimulation. Neuron, 22(4),
751–761.
Knierim, J. J., & Van Essen, D. C. (1992). Neuronal responses to static texture patterns in area V1 of
the alert macaque monkey. J. Neurophysiol., 67(4), 961–980.
Lamme, V. A. (1995). The neurophysiology of figure-ground segregation in primary visual cortex. J.
Neurosci., 15, 1605–1615.
Lamy, D., Segal, H., & Ruderman, L. (2006). Grouping does not require attention. Percept. Psychophys.,
68(1), 17–31.
Lavie, N., & Tsal, Y. (1994). Perceptual load as a major determinant of the locus of selection in
visual attention. Percept. Psychophys., 56(2), 183–197.
Lee, J., Williford, T., & Maunsell, J. H. (2007). Spatial attention and the latency of neuronal
responses in macaque area V4. J. Neurosci., 27(36), 9632–9637.
Li, Z. (1999). Contextual influences in V1 as a basis for pop-out and asymmetry in visual search.
Proc. Natl. Acad. Sci. USA, 96(18), 10530–10535.
Lu, Z. L., & Dosher, B. A. (1998). External noise distinguishes attention mechanisms. Vis. Res.,
38(9), 1183–1198.
Luck, S. J., Chelazzi, L., Hillyard, S. A., & Desimone, R. (1997). Neural mechanisms of spatial
selective attention in areas V1, V2, and V4 of macaque visual cortex. J. Neurophysiol., 77, 24–42.
Mack, A., Tang, B., Tuma, R., Kahn, S., & Rock, I. (1992). Perceptual organization and attention.
Cogn. Psych., 24(4), 475–501.
Snowden, R. J., Treue, S., Erickson, R. G., & Andersen, R. A. (1991). The response of area MT and V1
neurons to transparent motion. J. Neurosci., 11(9), 2768–2785.
Sohn, W., Chong, S. C., Papathomas, T. V., & Vidnyanszky, Z. (2005). Cross-feature spread of global
attentional modulation in human area MT+. NeuroReport, 16(12), 1389–1393.
Sylvester, C. M., Shulman, G. L., Jack, A. I., & Corbetta, M. (2007). Asymmetry of anticipatory
activity in visual cortex predicts the locus of attention and perception. J. Neurosci., 27(52),
14424–14433.
Tootell, R., Hadjikhani, N., Hall, E., Marrett, S., Vanduffel, W., Vaughan, J., et al. (1998). The
retinotopy of visual spatial attention. Neuron, 21, 1409–1422.
Treisman, A., & Gelade, G. (1980). A feature-integration theory of attention. Cogn. Psych., 12, 97–136.
Treue, S., & Martinez-Trujillo, J. C. (1999). Feature-based attention influences motion processing
gain in macaque visual cortex. Nature, 399(6736), 575–579.
Wertheimer, M. (1923). Laws of organization in perceptual forms. In A source book of Gestalt
psychology. London: W. Ellis (1938).
Wolfe, J. M., Cave, K. R., & Franzel, S. L. (1989). Guided search: An alternative to the feature
integration model for visual search. J. Exp. Psychol. Hum. Percept. Perform., 15(3), 419–433.
Yeshurun, Y., & Carrasco, M. (1998). Attention improves or impairs visual performance by enhancing
spatial resolution. Nature, 396(6706), 72–75.
Zhou, H., Friedman, H. S., & von der Heydt, R. (2000). Coding of border ownership in monkey visual
cortex. J. Neurosci., 20(17), 6594–6611.
Maurizio Corbetta, Chad M. Sylvester, and Gordon L. Shulman: Departments of Neurology, Radiology, and
Anatomy and Neurobiology, Washington University School of Medicine, St. Louis, Missouri

Abstract  This chapter is concerned with the attentional mechanisms that ensure that behavior is
directed toward important stimuli in the environment. We review the evidence for a coherent neural
network in dorsal parietal and frontal cortex that sends top-down signals, reflecting both the
location and features of task-relevant objects, that bias processing in sensory regions such as
occipital cortex. Top-down signals for location aid selection of an object by changing neural
activity throughout an occipital retinotopic map, not just at the attended location, resulting in a
relative increase in activity at that location in the map. The overall behavioral goals that
determine which objects are selected are not set within the frontoparietal network, but reflect the
interaction of networks involved in reward, memory, and executive control. These networks may provide
inputs to dorsal frontoparietal regions that are transformed into biasing signals.

“Attention” broadly refers to a set of mechanisms that allow people to selectively perceive and
respond to events that are relevant to their behavioral goals. Because of the importance of this
function for many aspects of human behavior, discussions of attention crop up in treatments of such
diverse topics as language, memory, emotion, perception, and motor control. Here we concentrate on
the selection of objects in the environment for action. Chapter 13 in this book (Kastner, McMains, &
Beck) discussed how attending to an object biases sensory evoked activity in sensory cortex and how
these biases are thought to result in selective processing of the object (Desimone & Duncan, 1995).

Because complex goal-directed behaviors reflect the interaction of many different brain systems, it
is not possible to speak of a single attentional control system. Different brain networks may be
recruited when formulating behavioral goals, assessing those goals with respect to current knowledge
of the environment, retrieving relevant information from memory, and integrating all of these
influences into specific biasing signals that can be sent down to sensory and motor systems (hence
the term “top-down” biases). In this review, we suggest that a set of dorsal frontoparietal regions
constitute a dorsal frontoparietal attention network that performs this final integration step. While
this network operates irrespective of the criteria used to select stimuli (e.g., location, features)
or responses (e.g., effector), subregions may show specializations for particular attributes, as we
will discuss. We will also briefly review the coordination of this network with other networks
involved in assessing value, generating and maintaining goals, accessing information in working
memory, and retrieving information from long-term memory (figure 14.1).

The dorsal frontoparietal attention network

Definition: Functional Connectivity and Anticipatory Signals
While the involvement of different brain regions in different functions is not controversial, it may
not seem justified to segregate sets of regions into coherent brain networks, such as a dorsal
frontoparietal network. However, an important development over the last decade has been the
refinement of physiological techniques for identifying related sets of brain regions. One such
technique, called functional connectivity magnetic resonance imaging (fcMRI), measures the temporal
correlation of the blood-oxygenation-level dependent (BOLD) signal across multiple regions (Biswal,
Yetkin, Haughton, & Hyde, 1995). Related regions show strong low-frequency (< 0.1 Hz) correlations
over time, even when the subject is lying at rest with no task or stimulation (resting-state fcMRI).
The origin of these correlations is still controversial, but they are thought to reflect both
anatomical and functional factors. Several studies in the last five years have identified a number of
resting-state networks that correspond to regions that are coactivated when subjects perform a task
(Damoiseaux et al., 2006; Fox, Corbetta, Snyder, Vincent, & Raichle, 2006; Fox et al., 2005;
Fransson, 2005; Greicius, Krasnow, Reiss, & Menon, 2003; Hampson, Peterson, Skudlarski, Gatenby, &
Gore, 2002). Relevant to this discussion is the strong correlation between the frontal eye field
(FEF) at the intersection of superior frontal sulcus and precentral sulcus, and regions within the
intraparietal sulcus (IPS). IPS and FEF represent the core regions of the dorsal attention network
(figure 14.2). These regions show spontaneous correlation also with visual areas like
MT+ and V7. Other networks involved in regulation of attention are shown in figure 14.2 and will not be further considered in this chapter. A right-hemisphere-dominant ventral frontoparietal attention network, with core regions in right temporoparietal junction and ventral frontal cortex, is involved in stimulus-driven reorienting and resetting task-relevant networks; its physiological properties have been recently reviewed (Corbetta, Patel, & Shulman, 2008). A bilateral “default” network is consistently deactivated during goal-directed behavior (Mazoyer et al., 2001; Raichle et al., 2001; Shulman et al., 1997) and may be important in filtering information from internal task-irrelevant processes.

Figure 14.2 Functional connectivity by fMRI (fcMRI) defines separate dorsal and ventral networks. (A) Dorsal attention and default networks. The map indicates regions that showed significant positive correlations with three (red) or four (yellow) of the seed regions in the dorsal attention network (IPS, FEF, V7, MT+). The dorsal network is largely reproduced in the resting state FC maps. Regions that show significant negative correlations with three (green) or four (blue) of the seed regions are also shown and roughly reproduce the default network, possibly indicating a push-pull relationship between the two networks. (B) Ventral attention network. Five ventral regions (R TPJ, R VFC, R MFG, R PrCe) were used as seeds for an FC analysis. Regions showing consistent positive correlations largely reproduce the ventral network, but negative correlations in default regions are not observed. The posterior MFG near the inferior frontal sulcus appears to be connected to both networks. (He et al., 2007.) (See color plate 15.)

Perhaps the most direct method for observing sensory biases in isolation is to provide subjects with a cue telling them to attend to a specific location in space or a visual feature and to measure the resulting physiological signals prior to the onset of a target stimulus (Corbetta, Kincade, Ollinger, McAvoy, & Shulman, 2000; Hopfinger, Buonocore, & Mangun, 2000; Kastner, Pinsk, De Weerd, Desimone, & Ungerleider, 1999; N. Muller, Bartelt, Donner, Villringer, & Brandt, 2003; Serences, Yantis, Culberson, & Awh, 2004; Sylvester, Shulman, Jack, & Corbetta, 2007). Cuing experiments in humans have routinely observed preparatory or endogenous activations, that is, activations not
driven by a sensory stimulus, in dorsal parietal regions of IPS, extending medially into superior parietal lobule, and in dorsal precentral sulcus at the intersection with the superior frontal sulcus (FEF; figure 14.3A). As noted earlier, these regions also show strong resting-state fcMRI. Dorsal frontoparietal activations are observed whether the cue stimulus is visual (Corbetta et al., 2000; Hopfinger et al.; Kastner et al.) or auditory (Sylvester et al.) and are sustained as attention is maintained over extended durations (Corbetta, Kincade, & Shulman, 2002). Subtle but consistent topographic differences have been reported between regions that encode a cue and maintain attention (Woldorff et al., 2004). Preparatory activations in parietal and other regions predict performance on subsequent targets (Pessoa & Padmala, 2005; Sapir, d’Avossa, McAvoy, Shulman, & Corbetta, 2005). Finally, purely endogenous activations in dorsal frontoparietal regions are spatially selective, with greater activity following a cue in contralateral FEF and IPS (Sylvester et al.), as expected if these regions control the spatial selection of information.

In some studies of spatial cuing, endogenous cue-related responses in dorsal frontoparietal regions are accompanied by spatially selective endogenous activation of retinotopic occipital cortex (Kastner et al., 1999; Sylvester et al., 2007),
but not in other studies (Corbetta, Tansy et al., 2005; figure 14.3A). Although the reasons for this variation are not well understood, it may reflect the degree to which selection of an object or stimulus is limited by perceptual factors, which may be associated with endogenous modulation of visual cortex, as opposed to factors related to memory or stimulus-response translation.

Therefore dorsal frontal and parietal regions in IPS and FEF that are coactivated by cues to attend to a visual object form a distinct network in resting-state fcMRI studies (Fox et al., 2006, 2005) and can be considered a separate functional network. Interestingly, consistent with the activation results, this network is not correlated under resting conditions with regions in the occipital lobe, except for human MT+. The interaction of dorsal frontoparietal cortex with occipital cortex is highly task contingent. In a later section, we discuss a possible mechanism for flexibly changing the effects of signals in one area (e.g., FEF) on those in another (e.g., V4). These human imaging studies of preparatory and resting-state activity in dorsal frontoparietal regions in humans are complemented by monkey single-unit studies showing anticipatory signals for spatial attention in FEF (Kodaka, Mikami, & Kubota, 1997) and LIP (Bisley & Goldberg, 2003), as well as resting-state functional connectivity between FEF and LIP in anesthetized monkeys (Vincent et al., 2007).

Causality of Top-Down Biases from Dorsal Attention Network onto Visual Cortex
Physiological studies are generally correlational, demonstrating a relationship between a spatial or temporal pattern of neural activity and some task or behavioral parameters. Although cues to attend can produce endogenous signals in both dorsal frontoparietal and sensory cortex, and these signals may be predictive of behavioral performance, these results do not imply a causal influence of control regions on data-processing regions. Several recent studies, however, have provided evidence for this proposition. Moore and colleagues showed in monkey that stimulation of R FEF modulated sensory-evoked activity in V4 neurons whose receptive fields matched the movement field of the stimulated FEF site (Moore & Armstrong, 2003). Stimulation also changed the psychophysical threshold for detection of a low-contrast stimulus presented within the movement field of the stimulated site in FEF (Moore & Fallah, 2001). In humans, Ruff and colleagues showed that TMS of human FEF produced BOLD activation in peripheral V1–V4, independently of whether a stimulus was present (Ruff et al., 2006; figure 14.4A). Correspondingly, stimulation enhanced the perceived contrast of peripheral stimuli. These studies show that activity in FEF can produce the physiological changes in occipital cortex and the changes in behavioral performance that are expected for a brain region involved in top-down control of spatial attention.

However, these results were not obtained under physiological conditions, and more importantly they do not demonstrate an asymmetry in the interaction between occipital and dorsal frontoparietal regions. A recent study using Granger causality analysis meets both these objections (Bressler, Tang, Sylvester, Shulman, & Corbetta, 2008). The authors compared the degree to which endogenous preparatory signals in occipital cortex temporally predicted signals in dorsal frontoparietal regions over and above the prediction based on the dorsal frontoparietal regions themselves, and vice versa. They found that signals in dorsal frontoparietal regions strongly predicted occipital signals (top-down direction), and this prediction was significantly greater than that from occipital signals to dorsal frontoparietal signals (bottom-up direction). In fact, the latter predictability did not exceed a baseline level of predictability between any two voxels in the brain. These results support the hypothesis that under physiological conditions, following a cue to attend to a location, dorsal frontoparietal regions modulate occipital regions (figure 14.1).

Topographic Organization of Maps in Dorsal Attention Network
To understand how control is implemented, a helpful hint is the functional organization of an area or system, that is, the parameters that are coded in the pattern of neural activity. Recent studies have begun to detail the organization of the spatially selective neurons in frontal and parietal cortex that may be the source of top-down biases to visual cortex. Human parietal and frontal cortex appears to contain topographic maps of contralateral
visual space. The initial report of a single topographic map of the contralateral hemifield in human parietal cortex by Sereno, Pitzalis, and Martinez (2001) has been followed by several studies that have found multiple maps (Hagler, Riecke, & Sereno, 2007; Schluppeck, Glimcher, & Heeger, 2005; Silver, Ress, & Heeger, 2005), including a report of five contiguous maps along IPS (Swisher, Halko, Merabet, McMains, & Somers, 2007). Unlike early retinotopic occipital cortex, some of these regions may also show substantial nontopographic activations to ipsilateral stimuli (Jack et al., 2007). In the animal single-unit literature, the evidence for topographic maps in parietal areas such as LIP is inconsistent (Ben Hamed, Duhamel, Bremmer, & Graf, 2001; Platt & Glimcher, 1998), although a recent monkey fMRI study reported clear evidence for a hemifield map in both hemispheres (Gaurav Patel, Larry Snyder, and Maurizio Corbetta, personal communication).

Models of spatial selection propose that objects in a scene are represented in a topographically organized “salience” map and that the most activated object or location in the map is selected as part of a “winner-take-all” process (Koch & Ullman, 1985; Wolfe, 1994). The salience of an object in the map is partly determined by its sensory properties (e.g., high contrast) and by whether its features or location are relevant to the current task and, as such, have been biased by preparatory activity. Selection is therefore based on the magnitude of an object’s activation relative to the activation of all other objects in the scene. A single-unit study in area LIP (Bisley & Goldberg, 2003), a likely homologue of human IPS regions, supports the idea that selection is based on the activity in one spatially selective set of neurons relative to that of another. The duration for which a monkey attended to the location of a flashed distracter object, presented in the opposite hemifield from an attended target location, was predicted from the duration for which LIP activity from the neurons responding to the distracter location was greater than the activity from neurons responding to the target location. In other words, the locus of attention in frontoparietal control regions may be coded by a difference signal between the level of anticipatory activity at the attended location versus unattended locations in different parts of the map.

Mechanisms: Coding the Locus of Attention Based on Relative Activity Within a Map
The importance of relative rather than absolute activity extends to top-down biases in visual cortex, where attending to a location changes activity not only at the attended location
of the retinotopic visual map but also throughout the map. While preparatory increases are observed at the attended location in an occipital retinotopic map (Hopfinger et al., 2000; Kastner et al., 1999; N. Muller et al., 2003; Serences, Yantis, et al., 2004; Sylvester et al., 2007), preparatory decreases are observed at unattended locations in the map (Silver, Ress, & Heeger, 2007; Sylvester, Jack, Corbetta, & Shulman, 2008).

Mapwide changes in occipital areas support the hypothesis that the operative spatial signal determining the locus of attention and salience is relative, such as a difference signal, but more direct evidence has recently been reported. Most studies of attentional modulations compare the activations from an object when it is attended versus unattended. The assumption is that the two situations are analogous to when attended and unattended objects are simultaneously present. However, this assumption is only correct when the activations from the two objects are uncorrelated over time. In fact, a recent fMRI study has shown that preparatory signals between homotopic locations of occipital retinotopic maps, as well as between left and right IPS and FEF, are highly correlated over trials (Sylvester et al., 2007). Correlated activity is significant but less strong at nonhomotopic locations (e.g., between fovea and periphery) or across separate areas (e.g., between FEF and V3A). Correlated neural activity across separate parts of V1 following stimulus presentation has also been reported (Chen, Geisler, & Seidemann, 2006), suggesting that the BOLD correlations reflect neural rather than or in addition to hemodynamic factors. While the neural causes of the BOLD correlations are not known, they might reflect nonspatial signals that carry information about the upcoming stimulus features or task or overall changes in arousal. As a result of the correlated signal, the absolute BOLD signal at the attended location in occipital and dorsal frontoparietal cortex is only a moderate trial-to-trial predictor of the direction of attention. Predictability in both occipital and dorsal frontoparietal areas is greatly improved by subtracting out the common “noise,” that is, taking the difference between activity at the attended location in the map and the homotopic location in the opposite hemisphere (Sestieri et al., 2008; Sylvester et al., 2007; figure 14.3B). The biological relevance of the difference signal is demonstrated by the fact that performance for subsequent targets is better predicted in V3A by the magnitude of the difference signal than by the magnitude of the absolute signal at the attended location.

The importance of relative rather than absolute signal levels qualifies the association of sustained BOLD activity with the maintenance of attention (Silver, Ress, & Heeger, 2007). In one fMRI study, while the absolute signal at the cued location in occipital retinotopic maps decreased over the course of the cue period, the difference between the signal at that location and at the homotopic location, as well as the predictability of the locus of attention, increased over the same period, reflecting the larger signal decreases at the homotopic uncued location (Sylvester et al., 2007). Therefore, the maintenance of attention may be reflected in the sustained magnitude of a relative signal, not an absolute signal.

Finally, the mapwide distribution of preparatory activity in retinotopic cortex reflects not simply the location of the attended object, but also the computational demands of the task. Serences, Yantis, and colleagues (2004) demonstrated that when subjects expected a target stimulus to be surrounded by closely spaced distracter objects, preparatory signals increased, even when task difficulty remained constant. However, the relationship between the additional preparatory signal at the attended location and the suppression of distractor information at nontarget locations was unclear. A recent study (Sylvester et al., 2008) has shown that preparatory activity at nontarget locations depends on whether noise at those locations can adversely affect performance. Sylvester and colleagues compared mapwide changes in activity when subjects performed a coarse orientation discrimination on a low-contrast, near-threshold Gabor patch and an equally difficult but fine orientation discrimination on a high-contrast, suprathreshold Gabor patch. The location of the Gabor was cued, and expected contrast was blocked, allowing subjects to optimally adjust the distribution of attention to both the attended location and the nature of the discrimination. In the low-contrast condition, performance was partly limited by noise at nontarget locations that created spurious false alarms, but in the high-contrast condition, performance was only limited by noise at the target location. Even though the spatial distribution of the task stimulus was identical in the two conditions, preparatory signal decreases at nontarget locations in retinotopic occipital maps were greater when subjects expected a low-contrast than high-contrast Gabor, reflecting the need to suppress noise at these locations and “mark” the target location by creating a steep target-nontarget location gradient. No effects of expected contrast were observed at the cued location.

How are these task-dependent changes in the mapwide distribution of preparatory signals controlled? Sylvester and colleagues (2008) reported that regions in FEF and IFS (inferior frontal sulcus), but not IPS, showed additive effects of expected target location and contrast, with greater activations when contralateral locations were cued but also when low-contrast stimuli were expected at either contralateral or ipsilateral locations. The additive contrast and cue location signals in FEF and IFS were combined to produce the interacting, mapwide changes observed in retinotopic occipital cortex, although the manner in which this process occurred was unclear. Interestingly, stimulation of FEF by TMS, in the absence of a visual stimulus, decreases activity in portions of early visual cortex corresponding to the central
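
The difference-signal account summarized in this section (Sylvester et al., 2007) can be illustrated with a toy simulation. The sketch below is not the authors' analysis: it assumes a hypothetical map with only two homotopic locations, a fixed attentional gain, and Gaussian noise with a large component shared by both locations. All function names and parameter values are illustrative assumptions. It shows why subtracting the homotopic signal can make the cued side more predictable than the absolute signal at one location.

```python
import random


def simulate_trials(n_trials=2000, gain=1.0, common_sd=2.0, private_sd=0.5, seed=0):
    """Return (attend_left, left_signal, right_signal) tuples for a toy two-location map.

    All parameters are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    trials = []
    for _ in range(n_trials):
        attend_left = rng.random() < 0.5
        # Fluctuation shared by both homotopic locations on this trial
        # (standing in for arousal or other nonspatial signals).
        common = rng.gauss(0.0, common_sd)
        left = common + rng.gauss(0.0, private_sd) + (gain if attend_left else 0.0)
        right = common + rng.gauss(0.0, private_sd) + (0.0 if attend_left else gain)
        trials.append((attend_left, left, right))
    return trials


def decoding_accuracy(trials, gain=1.0):
    """Fraction of trials whose cued side is decoded correctly by each rule."""
    absolute_hits = 0
    difference_hits = 0
    for attend_left, left, right in trials:
        # Absolute rule: look only at the left location and compare it with
        # the midpoint between its attended and unattended mean levels.
        absolute_hits += (left > gain / 2.0) == attend_left
        # Difference rule: subtract the homotopic location, which cancels
        # the shared fluctuation.
        difference_hits += (left - right > 0.0) == attend_left
    n = len(trials)
    return absolute_hits / n, difference_hits / n


if __name__ == "__main__":
    absolute_acc, difference_acc = decoding_accuracy(simulate_trials())
    print(f"absolute signal:   {absolute_acc:.2f}")
    print(f"difference signal: {difference_acc:.2f}")
```

Under these assumptions the shared fluctuation dominates the absolute signal, so the absolute rule performs only modestly above chance, while the difference rule is limited only by the smaller location-specific noise.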
suggesting that feature-based selection may operate in part by biasing sensory activity to an object, which evokes shifts of spatial attention or eye movements to the object (Shih & Sperling, 1996). This interaction suggests important links between or within brain regions involved in feature-based selection and overt or covert spatial selection.

Accordingly, feature-selective selection signals have been found in parietal and FEF cortex. Fast-latency single-unit responses to a particular color, for example, have been reported in FEF following extended experience with targets defined by that color in multiobject displays (Bichot, Schall, & Thompson, 1996). Imaging studies have reported dorsal frontoparietal activity during cuing of features in addition to spatial location (Shulman et al., 1999). Explicit comparisons of the activations to cues for location and color indicate common activity in many dorsal frontoparietal regions, but also activations in subregions that are greater for location than color (Giesbrecht, Woldorff, Song, & Mangun, 2003; Slagter et al., 2007). Similarly, studies comparing motion and color cues have reported dorsal frontoparietal activations that are greater for motion than color (Mangun & Fannon, 2007; Shulman, D’Avossa, Tansy, & Corbetta, 2002), and Mangun and colleagues have suggested that the location-selective regions are similar to the motion-selective regions (Mangun, Fannon, Geng, & Saron, 2009). These studies indicate that preparing to select a visual attribute activates dorsal frontoparietal regions that generalize across visual dimensions as well as subregions that show specificity for some dimensions. Finally, during sustained attention to a visual stimulus, feature-specific attentional modulations (for example, modulations that are specific to a particular direction of motion) have been reported in parietal cortex and FEF, in addition to retinotopic occipital cortex, using multivoxel classification techniques (Serences & Boynton, 2007).

While feature-based and location-based selection can be coordinated, as in the case of visual search discussed previously, feature-based selection need not drive an oculomotor mechanism. Selection by hierarchical scale is one example in which eye movements do not aid changes in selection, yet preparatory activations are observed in IPS, with larger preparatory activations during selection of local scales in left than right IPS (Weissman & Woldorff, 2005). Therefore, while eye movements, feature-based selection, and object-based selection sometimes operate in a coordinated fashion, premotor theories that explain selection of task-relevant visual stimuli solely in terms of the signals that drive or are observed within oculomotor mechanisms remain incomplete. Dorsal frontoparietal regions include both oculomotor-based and non-oculomotor-based selection mechanisms.

Finally, some transient neural signals appear to generalize across virtually any change in selection criteria. Transient “switch” signals are observed most strongly in medial parietal areas such as precuneus, rather than IPS, and in frontal areas such as FEF, whether attention is switched between stimuli in different locations (Yantis et al., 2002), between superimposed objects (Serences, Schwarzbach, Courtney, Golay, & Yantis, 2004), between stimuli in different modalities (Shomstein & Yantis, 2004), or between superimposed random-dot arrays with different visual features (Liu, Slotnick, Serences, & Yantis, 2003). Switch signals have been hypothesized to enable networks to “settle” into a state appropriate to the newly attended object (Serences & Yantis, 2006).

Establishing goals in frontoparietal network

In the laboratory, instructions to direct selective attention are provided by symbolic or nonsymbolic cues (arrows, symbols, sensory cues, etc.). In real life, however, selective attention is controlled by a complex combination of signals (goals, desires, memories), which sit in the background of awareness while guiding behavior and are generated in distributed brain systems that interact with the frontoparietal network.

Several classes of internal signals can drive selective attention. If we take the simple case of “reaching for a cup in the cupboard to drink some water,” orienting to the cupboard and selecting the appropriate action to reach for the cup engages over time the frontoparietal dorsal attention network. But this preparatory activity reflects the interaction of the dorsal network with other neural systems. For instance, this behavior is motivated by an error signal in the hypothalamus indicating a mismatch in the concentration of blood (or osmolarity), which we subjectively perceive as thirst. The overall organization of the behavior (“reach for the cup in the cupboard”) may be built from learned patterns that can be loaded in working memory. Signals from long-term memory indicate the kitchen’s layout and the position of the cups in the cupboard. Therefore a network controlling spatial attention and target selection should interact over time with other neural systems monitoring the internal milieu and expected reward, working memory, and long-term memory. The actual sequence of activity is currently unknown because of the lack of suitable methods for tracking the temporal evolution of neural activity in distributed neural networks. However, the available evidence does show that the spatial attention system is jointly activated with other cognitive systems (working memory/executive control; reward; long-term memory) under conditions in which these systems provide inputs for the spatial selection of objects and responses. This idea is presented in figure 14.1, and some of the evidence is presented in the following sections.

Reward/Value Signals and the Limbic System
Neurons in areas involved in the control of spatial attention and eye movements carry information about the amount of
level control. Moreover, the same regions formed a network in a resting-state fcMRI study that was distinct from a frontoparietal network. Dosenbach and colleagues suggested that the latter network, which only partially overlapped the dorsal frontoparietal network involved in stimulus selection, was involved in moment-to-moment task adjustments. While the functional analysis of the cingulo-opercular network is just beginning and its role in behavioral performance is not well understood, these results support a functional subdivision between a dorsal frontoparietal attention network and a high-level ACC/FO network that maintains an abstract specification of a task.

During task performance, task information is thought to be maintained in “working memory,” a short-term storage in which information can be easily manipulated and accessed. The neural structures underlying visual working memory have been investigated under conditions in which this task information is highly constrained and specific (e.g., the position of each cup in the cupboard), with parietal regions showing load-dependent activity and a correlation of this activity with memory performance (Marois, Chun, & Gore, 2004; Todd, Fougnie, & Marois, 2005; Vogel, McCollough, & Machizawa, 2005). Early studies showed a substantial overlap of frontoparietal regions involved in maintaining spatial attention at a location and maintaining spatial working memory (Cabeza & Nyberg, 1997; Corbetta, Kincade, & Shulman, 2002), with behavioral results clearly showing that one is important for the other (Awh & Jonides, 2001).

More recent studies by Nobre and colleagues indicate overlapping mechanisms for orienting to objects in spatial working memory and orienting to objects in the environment (Lepsien, Griffin, Devlin, & Nobre, 2005; Nobre et al., 2004). In a “pre-cue” environment condition, a cue directed attention to a location before the onset of an array. In a working-memory “retro-cue” condition, subjects loaded the same array into working memory, and then saw a cue indicating a location in the array. In both cases subjects decided whether a test object was present in the stimulus array. Both pre- and retro-cues facilitated the identification of objects that were respectively presented in the visual field or in working memory, and correspondingly, the standard frontoparietal network was also recruited in both conditions. However, in addition, “retro-cues” activated to a greater degree medial and lateral prefrontal cortex, while “pre-cues” activated visual cortex prior to stimulus presentation. Prefrontal activation precedes posterior activation in the memory condition (Lepsien et al.). These findings suggest, as in the case of reward and motivational signals discussed earlier, that prefrontal regions putatively involved in working memory coactivate with the dorsal frontoparietal network when spatial attention is directed to a memory representation.

Long-Term Memory and Hippocampus
Long-term memory signals also guide spatial orienting under ecological conditions. Once a goal is established, prior knowledge about the surrounding environment guides the execution of complex behavior in a nearly effortless manner. In reaching for a cup, we direct attention to the location of the cabinet based on prior knowledge about the layout of the kitchen as well as the location of a particular cup in the cabinet.

Behavioral studies have shown that implicit memory derived from previous exposure to a particular stimulus configuration can facilitate performance during visual search of a target among distractors. These results indicate influences of long-term memory on orienting as we move and act in the environment (Chun & Jiang, 2003). There is also evidence that semantic knowledge, such as the association of a word with an object, can facilitate the detection of that object (Moores, Laiti, & Chelazzi, 2003).

A recent study showed that attention can be directed to specific objects in a visual scene based on memory and that the frontoparietal attention network is recruited when attention is guided by memory similarly to the way in which it is guided by explicit visual cues (Summerfield, Lepsien, Gitelman, Mesulam, & Nobre, 2006). Summerfield and colleagues asked subjects to learn the location of a target object in several visual scenes. The next day they were asked to detect the same target object in previously learned or novel scenes. Performance was facilitated when subjects knew where to look based on prior experience or a visual cue. Interestingly, the time course of this memory-based facilitation was quite rapid (∼100 ms), consistent with an automatic deployment of attention. The frontoparietal attention network was recruited similarly in the memory and visual cuing conditions, whereas in the memory condition additional regions related to memory retrieval were engaged (parahippocampal, retrosplenial, hippocampus). Interestingly, the degree of hippocampus activation across subjects correlated with the strength of the spatial facilitation in the memory condition, consistent with a large literature in rats associating the hippocampus and related structures to spatial memory (O’Keefe, Burgess, Donnett, Jeffery, & Maguire, 1998; Wilson & Tonegawa, 1997).

Physiological mechanisms for top-down influences and selection of visual objects

The preceding evidence indicates that the dorsal frontoparietal attention network for stimulus selection is recruited under different task conditions, and that correspondingly different neural systems are coactivated with that network. Although coactivation does not imply a functional interaction, it seems likely that distributed networks underlying
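
One way to move from coactivation toward directed interaction is the Granger-causality logic described earlier in the chapter (Bressler et al., 2008): a signal X "Granger-causes" Y if the past of X improves the prediction of Y beyond what the past of Y already provides. The sketch below is a deliberately minimal illustration of that logic and is not the analysis used in that study. It assumes a lag of one sample, two simulated time series with hypothetical coefficients, and an index defined as the log ratio of residual variances. All names and numbers are illustrative assumptions.

```python
import math
import random


def simulate_regions(n_samples=3000, seed=1):
    """Simulate two hypothetical time series with a one-way lagged influence.

    `frontoparietal` evolves on its own; `occipital` depends on its own past
    and on the previous frontoparietal sample. Coefficients are assumptions.
    """
    rng = random.Random(seed)
    frontoparietal, occipital = [0.0], [0.0]
    for _ in range(n_samples - 1):
        fp_prev, occ_prev = frontoparietal[-1], occipital[-1]
        frontoparietal.append(0.6 * fp_prev + rng.gauss(0.0, 1.0))
        occipital.append(0.4 * occ_prev + 0.5 * fp_prev + rng.gauss(0.0, 1.0))
    return frontoparietal, occipital


def _center(values):
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def _resid_var_one(y, p):
    """Residual variance of least-squares fit y ~ b * p (no intercept)."""
    b = _dot(p, y) / _dot(p, p)
    return sum((yi - b * pi) ** 2 for yi, pi in zip(y, p)) / len(y)


def _resid_var_two(y, p, q):
    """Residual variance of least-squares fit y ~ b1 * p + b2 * q (no intercept)."""
    spp, sqq, spq = _dot(p, p), _dot(q, q), _dot(p, q)
    spy, sqy = _dot(p, y), _dot(q, y)
    det = spp * sqq - spq * spq
    b1 = (spy * sqq - sqy * spq) / det
    b2 = (sqy * spp - spy * spq) / det
    return sum((yi - b1 * pi - b2 * qi) ** 2 for yi, pi, qi in zip(y, p, q)) / len(y)


def granger_index(source, target):
    """Lag-1 index: ln(var of target given own past / var given own past + source past)."""
    s, t = _center(source), _center(target)
    now, target_past, source_past = t[1:], t[:-1], s[:-1]
    restricted = _resid_var_one(now, target_past)
    full = _resid_var_two(now, target_past, source_past)
    return math.log(restricted / full)


if __name__ == "__main__":
    fp, occ = simulate_regions()
    print(f"frontoparietal -> occipital: {granger_index(fp, occ):.3f}")
    print(f"occipital -> frontoparietal: {granger_index(occ, fp):.3f}")
```

In this simulated example the index is clearly larger in the frontoparietal-to-occipital direction because the influence was built in that way; with real BOLD data, lag choice, hemodynamic variability, and appropriate baselines would all need separate treatment.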
Bichot, N. P., Schall, J. D., & Thompson, K. G. (1996). Visual feature selectivity in frontal eye fields induced by experience in mature macaques. Nature, 381(6584), 697–699.
Bisley, J. W., & Goldberg, M. E. (2003). Neuronal activity in the lateral intraparietal area and spatial attention. Science, 299(5603), 81–86.
Biswal, B., Yetkin, F., Haughton, V., & Hyde, J. (1995). Functional connectivity in the motor cortex of resting human brain using echo-planar MRI. Magn. Reson. Med., 34, 537–541.
Bressler, S. L., Tang, W., Sylvester, C. M., Shulman, G. L., & Corbetta, M. (2008). Top-down control of human visual cortex by frontal and parietal cortex in anticipatory visual spatial attention. J. Neurosci., 28(40), 10056–10061.
Bruce, C. J., & Goldberg, M. E. (1985). Primate frontal eye fields. I. Single neurons discharging before saccades. J. Neurophysiol., 53, 603–635.
Bushnell, M. C., Goldberg, M. E., & Robinson, D. L. (1981). Behavioral enhancement of visual responses in monkey cerebral cortex. I. Modulation in posterior parietal cortex related to selective visual attention. J. Neurophysiol., 46(4), 755–772.
Cabeza, R., & Nyberg, L. (1997). Imaging cognition: An empirical review of PET studies with normal subjects. J. Cogn. Neurosci., 9, 1–26.
Chen, Y., Geisler, W. S., & Seidemann, E. (2006). Optimal decoding of correlated neural population responses in the primate visual cortex. Nat. Neurosci., 9(11), 1412–1420.
Chun, M. M., & Jiang, Y. (2003). Implicit, long-term spatial contextual memory. J. Exp. Psychol. Learn. Mem. Cogn., 29(2), 224–234.
Dorris, M. C., & Munoz, D. P. (1998). Saccadic probability influences motor preparation signals and time to saccadic initiation. J. Neurosci., 18(17), 7015–7026.
Dosenbach, N. U., Fair, D. A., Miezin, F. M., Cohen, A. L., Wenger, K. K., Dosenbach, R. A., et al. (2007). Distinct brain networks for adaptive and stable task control in humans. Proc. Natl. Acad. Sci. USA, 104(26), 11073–11078.
Dosenbach, N. U., Visscher, K. M., Palmer, E. D., Miezin, F. M., Wenger, K. K., Kang, H. C., et al. (2006). A core system for the implementation of task sets. Neuron, 50(5), 799–812.
Engel, A. K., Fries, P., & Singer, W. (2001). Dynamic predictions: Oscillations and synchrony in top-down processing. Nat. Rev. Neurosci., 2(10), 704–716.
Folk, C. L., Remington, R. W., & Johnston, J. C. (1992). Involuntary covert orienting is contingent on attentional control settings. J. Exp. Psychol. Hum. Percept. Perform., 18(4), 1030–1044.
Fox, M. D., Corbetta, M., Snyder, A. Z., Vincent, J. L., & Raichle, M. E. (2006). Spontaneous neuronal activity distinguishes human dorsal and ventral attention systems. Proc. Natl. Acad. Sci. USA, 103(26), 10046–10051.
Fox, M. D., Snyder, A. Z., Vincent, J. L., Corbetta, M., Van Essen, D. C., & Raichle, M. E. (2005). The human brain is intrinsically organized into dynamic, anticorrelated functional networks. Proc. Natl. Acad. Sci. USA, 102(27), 9673–9678.
Fransson, P. (2005). Spontaneous low-frequency BOLD signal fluctuations: An fMRI investigation of the resting-state default mode of brain function hypothesis. Hum. Brain Mapp., 26(1), 15–29.
Corbetta, M., Akbudak, E., Conturo, T. E., Snyder, A. Z., Fries, P. (2005). A mechanism for cognitive dynamics: Neuronal
Ollinger, J. M., Drury, H. A., et al. (1998). A common communication through neuronal coherence. Trends Cogn. Sci.,
network of functional areas for attention and eye movements. 9(10), 474–480.
Neuron, 21, 761–773. Fuster, J. M. (1985). The prefrontal cortex and temporal integra-
Corbetta, M., Kincade, J. M., Ollinger, J. M., McAvoy, tion. In E. G. Jones & A. Peters (Eds.), Cerebral Cortex (Vol. 4,
M. P., & Shulman, G. L. (2000). Voluntary orienting is dissoci- pp. 151–177). New York: Plenum Press.
ated from target detection in human posterior parietal cortex. Giesbrecht, B., Woldorff, M. G., Song, A. W., & Mangun,
Nat. Neurosci., 3, 292–297. G. R. (2003). Neural mechanisms of top-down control during
Corbetta, M., Kincade, J. M., & Shulman, G. L. (2002). Neural spatial and feature attention. NeuroImage, 19(3), 496–512.
systems for visual orienting and their relationship with working Gold, J. I., & Shadlen, M. N. (2003). The influence of
memory. J. Cogn. Neurosci., 14(3), 508–523. behavioral context on the representation of a perceptual decision
Corbetta, M., Kincade, M. J., Lewis, C., Snyder, A. Z., & in developing oculomotor commands. J. Neurosci., 23(2),
Sapir, A. (2005). Neural basis and recovery of spatial 632–651.
attention deficits in spatial neglect. Nat. Neurosci., 8(11), Greicius, M. D., Krasnow, B., Reiss, A. L., & Menon,
1603–1610. V. (2003). Functional connectivity in the resting brain: A network
Corbetta, M., Miezin, F. M., Shulman, G. L., & Petersen, analysis of the default mode hypothesis. Proc. Natl. Acad. Sci. USA,
S. E. (1993). A PET study of visuospatial attention. J. Neurosci., 100(1), 253–258.
13(3), 1202–1226. Hagler, D. J., Jr., Riecke, L., & Sereno, M. I. (2007). Parietal
Corbetta, M., Patel, G., & Shulman, G. L. (2008). The reori- and superior frontal visuospatial maps activated by pointing and
enting system of the human brain: From environment to theory saccades. NeuroImage, 35(4), 1562–1577.
of mind. Neuron, 58(3), 306–324. Hampson, M., Peterson, B. S., Skudlarski, P., Gatenby,
Corbetta, M., & Shulman, G. L. (2002). Control of goal-directed J. C., & Gore, J. C. (2002). Detection of functional connectivity
and stimulus-driven attention in the brain. Nat. Rev. Neurosci., using temporal correlations in MR images. Hum. Brain Mapp.,
3(3), 201–215. 15(4), 247–262.
Corbetta, M., Tansy, A. P., Stanley, C. M., Astafiev, S. V., He, B. J., Snyder, A. Z., Vincent, J. L., Epstein, A., Shulman,
Snyder, A. Z., & Shulman, G. L. (2005). A functional MRI G. L., & Corbetta, M. (2007). Breakdown of functional con-
study of preparatory signals for spatial location and objects. nectivity in frontoparietal networks underlies behavioral deficits
Neuropsychologia, 43(14), 2041–2056. in spatial neglect. Neuron, 53(6), 905–918.
Damoiseaux, J. S., Rombouts, S. A., Barkhof, F., Scheltens, P., Hoffman, J. E., & Subramaniam, B. (1995). The role of visual
Stam, C. J., Smith, S. M., et al. (2006). Consistent resting-state attention in saccadic eye movements. Percept. Psychophys., 57,
networks across healthy subjects. Proc. Natl. Acad. Sci. USA, 787–795.
103(37), 13848–13853. Hopfinger, J. B., Buonocore, M. H., & Mangun, G. R. (2000).
Desimone, R., & Duncan, J. (1995). Neural mechanisms of The neural mechanisms of top-down attentional control. Nat.
selective visual attention. Annu. Rev. Neurosci., 18, 193–222. Neurosci., 3, 284–291.
human frontal eye fields as revealed by fMRI. J. Neurophysiol., 77, 3386–3390.
Platt, M. L., & Glimcher, P. W. (1998). Response fields of intraparietal neurons quantified with multiple saccadic targets. Exp. Brain Res., 121(1), 65–75.
Platt, M. L., & Glimcher, P. W. (1999). Neural correlates of decision variables in parietal cortex. Nature, 400(6741), 233–238.
Posner, M. I., & Petersen, S. E. (1990). The attention system of the human brain. Annu. Rev. Neurosci., 13, 25–42.
Posner, M. I., Walker, J. A., Friedrich, F. J., & Rafal, R. D. (1984). Effects of parietal injury on covert orienting of attention. J. Neurosci., 4(7), 1863–1874.
Raichle, M. E., MacLeod, A. M., Snyder, A. Z., Powers, W. J., Gusnard, D. A., & Shulman, G. L. (2001). Inaugural article: A default mode of brain function. Proc. Natl. Acad. Sci. USA, 98(2), 676–682.
Rizzolatti, G., Riggio, L., Dascola, I., & Umiltà, C. (1987). Reorienting attention across the horizontal and vertical meridians: Evidence in favor of a premotor theory of attention. Neuropsychologia, 25(1a), 31–40.
Rossi, A. F., Bichot, N. P., Desimone, R., & Ungerleider, L. G. (2007). Top down attentional deficits in macaques with lesions of lateral prefrontal cortex. J. Neurosci., 27(42), 11306–11314.
Ruff, C. C., Blankenburg, F., Bjoertomt, O., Bestmann, S., Freeman, E., Haynes, J.-D., et al. (2006). Concurrent TMS-fMRI and psychophysics reveal frontal influences on human retinotopic visual cortex. Curr. Biol., 16(15), 1479–1488.
Saenz, M., Buracas, G. T., & Boynton, G. M. (2002). Global effects of feature-based attention in human visual cortex. Nat. Neurosci., 5, 631–632.
Sapir, A., d’Avossa, G., McAvoy, M., Shulman, G. L., & Corbetta, M. (2005). Brain signals for spatial attention predict performance in a motion discrimination task. Proc. Natl. Acad. Sci. USA, 102(49), 17810–17815.
Sauseng, P., Klimesch, W., Stadler, W., Schabus, M., Doppelmayr, M., Hanslmayr, S., et al. (2005). A shift of visual spatial attention is selectively associated with human EEG alpha activity. Eur. J. Neurosci., 22(11), 2917–2926.
Schluppeck, D., Glimcher, P., & Heeger, D. J. (2005). Topographic organization for delayed saccades in human posterior parietal cortex. J. Neurophysiol., 94(2), 1372–1384.
Serences, J. T., & Boynton, G. M. (2007). The representation of behavioral choice for motion in human visual cortex. J. Neurosci., 27(47), 12893–12899.
Serences, J. T., Schwarzbach, J., Courtney, S. M., Golay, X., & Yantis, S. (2004). Control of object-based attention in human cortex. Cereb. Cortex, 14(12), 1346–1357.
Serences, J. T., Shomstein, S., Leber, A. B., Golay, X., Egeth, H. E., & Yantis, S. (2005). Coordination of voluntary and stimulus-driven attentional control in human cortex. Psychol. Sci., 16(2), 114–122.
Serences, J. T., & Yantis, S. (2006). Selective visual attention and perceptual coherence. Trends Cogn. Sci., 10(1), 38–45.
Serences, J. T., Yantis, S., Culberson, A., & Awh, E. (2004). Preparatory activity in visual cortex indexes distractor suppression during covert spatial orienting. J. Neurophysiol., 92(6), 3538–3545.
Sereno, M. I., Pitzalis, S., & Martinez, A. (2001). Mapping of contralateral space in retinotopic coordinates by a parietal cortical area in humans. Science, 294(5545), 1350–1354.
Sestieri, C., Sylvester, C. M., Jack, A. I., d’Avossa, G., Shulman, G. L., & Corbetta, M. (2008). Independence of anticipatory signals for spatial attention from number of nontarget stimuli in the visual field. J. Neurophysiol., 100(2), 829–838.
Sheliga, B. M., Riggio, L., & Rizzolatti, G. (1994). Orienting of attention and eye movements. Exp. Brain Res., 98, 507–522.
Shepherd, M., Findlay, J. M., & Hockey, R. J. (1986). The relationship between eye movements and spatial attention. Q. J. Exp. Psychol., 38, 475–491.
Shih, S., & Sperling, G. (1996). Is there feature-based attentional selection in visual search? J. Exp. Psychol. Hum. Percept. Perform., 22(3), 758–779.
Shomstein, S., & Yantis, S. (2004). Control of attention shifts between vision and audition in human cortex. J. Neurosci., 24(47), 10702–10706.
Shulman, G. L., D’Avossa, G., Tansy, A. P., & Corbetta, M. (2002). Two attentional processes in the parietal lobe. Cereb. Cortex, 12(11), 1124–1131.
Shulman, G. L., Fiez, J. A., Corbetta, M., Buckner, R. L., Miezin, F. M., Raichle, M. E., et al. (1997). Common blood flow changes across visual tasks. II. Decreases in cerebral cortex. J. Cogn. Neurosci., 9, 648–663.
Shulman, G. L., Ollinger, J. M., Akbudak, E., Conturo, T. E., Snyder, A. Z., Petersen, S. E., et al. (1999). Areas involved in encoding and applying directional expectations to moving objects. J. Neurosci., 19(21), 9480–9496.
Silver, M. A., Ress, D., & Heeger, D. J. (2005). Topographic maps of visual spatial attention in human parietal cortex. J. Neurophysiol., 94(2), 1358–1371.
Silver, M. A., Ress, D., & Heeger, D. J. (2007). Neural correlates of sustained spatial attention in human early visual cortex. J. Neurophysiol., 97(1), 229–237.
Slagter, H. A., Giesbrecht, B., Kok, A., Weissman, D. H., Kenemans, J. L., Woldorff, M. G., et al. (2007). fMRI evidence for both generalized and specialized components of attentional control. Brain Res., 1177, 90–102.
Small, D. M., Gitelman, D., Simmons, K., Bloise, S. M., Parrish, T., & Mesulam, M. M. (2005). Monetary incentives enhance processing in brain regions mediating top-down control of attention. Cereb. Cortex, 15(12), 1855–1865.
Snyder, L. H., Batista, A. P., & Andersen, R. A. (1997). Coding of intention in the posterior parietal cortex. Nature, 386, 167–170.
Stuss, D. T., & Benson, D. F. (1986). The frontal lobes. New York: Raven Press.
Summerfield, J. J., Lepsien, J., Gitelman, D. R., Mesulam, M. M., & Nobre, A. C. (2006). Orienting attention based on long-term memory experience. Neuron, 49(6), 905–916.
Sweeney, J. A., Mintum, M. A., Kwee, S., Wiseman, M. B., Brown, D. L., Rosenberg, D. R., et al. (1996). Positron emission tomography study of voluntary saccadic eye movement and spatial working memory. J. Neurophysiol., 75, 454–468.
Swisher, J. D., Halko, M. A., Merabet, L. B., McMains, S. A., & Somers, D. C. (2007). Visual topography of human intraparietal sulcus. J. Neurosci., 27(20), 5326–5337.
Sylvester, C. M., Jack, A. I., Corbetta, M., & Shulman, G. L. (2008). Anticipatory suppression of nonattended locations in visual cortex marks target location and predicts perception. J. Neurosci., 28(26), 6549–6556.
Sylvester, C. M., Shulman, G. L., Jack, A. I., & Corbetta, M. (2007). Asymmetry of anticipatory activity in visual cortex predicts the locus of attention and perception. J. Neurosci., 27(52), 14424–14433.
Taylor, K., Mandon, S., Freiwald, W. A., & Kreiter, A. K. (2005). Coherent oscillatory activity in monkey area V4 predicts
15 Spatiotemporal Analysis of Visual Attention
Jens-Max Hopf, Hans-Jochen Heinze, Mircea A. Schoenfeld, and Steven A. Hillyard
Abstract
The fact that perception of a visual event can be improved by focusing attention upon its spatial location has been documented in numerous experiments going back more than a century. It is also well established that visual attention can be selectively allocated to nonspatial stimulus features or to entire objects as integrated feature ensembles. In the quest to identify the neural mechanisms that underlie this perceptual selectivity in human observers, neuroimaging methods, including noninvasive recordings of event-related potentials (ERPs) and event-related magnetic fields (ERMFs), have provided valuable insights. Contributions from the ERP and ERMF methodologies have been particularly important for revealing the time course and rapid coordination of the underlying selection processes. The ever-expanding body of research in this field has made it abundantly clear that visual attention does not rely upon a unitary neural mechanism, but instead that multiple selection processes cooperate in a flexible manner to guarantee the adaptability of attention to a wide range of circumstances. Here we outline some of the principles underlying the flexible coordination of space-, feature-, and object-based selection processes that have emerged from recent studies, with particular emphasis on the contributions of ERP and ERMF recordings in human observers.
Jens-Max Hopf, Hans-Jochen Heinze, and Mircea A. Schoenfeld: Department of Neurology, Otto von Guericke University; Leibniz Institute for Neurobiology, Magdeburg, Germany
Steven A. Hillyard: Department of Neuroscience, University of California, San Diego, California
Spatial attention
Event-related potential (ERP) recordings have shown how spatial attention influences sensory processing in the visual cortex when attention is directed to briefly flashed stimuli in one visual field while equivalent stimuli in the opposite field are ignored (for reviews see Hillyard & Anllo-Vento, 1998; Hopfinger, Luck, & Hillyard, 2004; Luck, Woodman, & Vogel, 2000; Mangun, 1995). The general finding has been that attended stimuli elicit enlarged early ERP components in the visual cortex during the interval 80–200 ms after stimulus onset relative to unattended stimuli. These amplitude enhancements typically occur without a significant change in component latencies or scalp topographies, suggesting that spatial attention operates by simply controlling the gain of visual input during early processing stages (Hillyard, Vogel, & Luck, 1998); such a sensory gain control mechanism is consistent with observations from single-unit recordings in monkeys (Lee, Williford, & Maunsell, 2007; Luck, Chelazzi, Hillyard, & Desimone, 1997; Maunsell & Cook, 2002). The earliest consistent ERP modulations produced by spatial attention were found to be amplitude enhancements of the initial positive (P1) and subsequent negative (N1) components of the ERP elicited at latencies of around 80–100 ms and 130–200 ms, respectively (see figure 15.1). While these two components were generally modulated in tandem, there is increasing evidence that they reflect different aspects of attentional selection (reviewed in Hopfinger et al.), with the initial P1 modulation reflecting location selection per se and the subsequent N1 modulation reflecting discriminative processing of the stimulus within the focus of attention (Hopf, Vogel, Woodman, Heinze, & Luck, 2002; Vogel & Luck, 2000). The amplitudes of both the P1 and N1 components have been found to covary with the amount of processing resources that are voluntarily allocated to a spatial location (Handy & Mangun, 1997, 2000; Mangun & Hillyard, 1990). Moreover, reflexive orienting to the location of a nonpredictive cue was found to enhance the P1 amplitude to a subsequent co-localized target, suggesting that at least initially the same modulatory processes of location selection are engaged during voluntary and reflexive orienting (Hopfinger & Mangun, 1998, 2001; Hopfinger & Ries, 2005).
The Profile of the Spatial Focus of Attention
The focus of spatial attention has been likened to a spotlight (Posner, 1980), a zoom lens (Eriksen & Yeh, 1985), or a Gaussian gradient (Downing & Pinker, 1985; Shulman, Wilson, & Sheehy, 1985), which enhances processing of visual stimuli within a circumscribed region of space. While there is general agreement that the size of this attended region may be adjusted voluntarily, it has long been debated whether the spotlight of spatial attention has a unitary “beam” or whether it can be divided flexibly to disparate
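As a rough illustration of the sensory gain control account described under Spatial attention, the following Python sketch models an ERP as the sum of a positive P1 and a negative N1 deflection and applies a multiplicative attention gain. The Gaussian component shapes, the peak latencies (100 ms and 170 ms), the amplitudes, and the gain of 1.5 are illustrative assumptions rather than values taken from the studies cited. The sketch is meant only to show that a pure gain change scales component amplitudes while leaving their peak latencies unchanged.

```python
import math


def erp(t_ms, gain=1.0):
    """Toy ERP (arbitrary units) at t_ms after stimulus onset.

    Assumed shapes: a Gaussian P1 (positive, peak at 100 ms) and a
    Gaussian N1 (negative, peak at 170 ms). All parameters are hypothetical.
    """
    p1 = 1.0 * math.exp(-((t_ms - 100.0) ** 2) / (2 * 12.0 ** 2))
    n1 = -1.5 * math.exp(-((t_ms - 170.0) ** 2) / (2 * 20.0 ** 2))
    # Attention is modeled as a single multiplicative gain on the whole response.
    return gain * (p1 + n1)


def peak(times, values, lo, hi, sign):
    """Return (latency, amplitude) of the largest deflection of the given sign in [lo, hi] ms."""
    window = [(sign * v, t, v) for t, v in zip(times, values) if lo <= t <= hi]
    _, latency, amplitude = max(window)
    return latency, amplitude


times = list(range(0, 301))                       # 0-300 ms in 1 ms steps
unattended = [erp(t, gain=1.0) for t in times]
attended = [erp(t, gain=1.5) for t in times]      # hypothetical attention gain

p1_unatt = peak(times, unattended, 60, 130, +1)
p1_att = peak(times, attended, 60, 130, +1)
n1_unatt = peak(times, unattended, 130, 220, -1)
n1_att = peak(times, attended, 130, 220, -1)

print("P1 unattended/attended (latency, amplitude):", p1_unatt, p1_att)
print("N1 unattended/attended (latency, amplitude):", n1_unatt, n1_att)
```

Under these assumptions the attended and unattended waveforms peak at the same latencies and differ only in amplitude, which is the signature the ERP studies above take as evidence for gain control.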
Figure 15.2 (A) Stimuli used in the study of Hopf and colleagues (2006). Subjects searched for a unique red C (shown in black) that randomly appeared at one of nine item locations in the right lower visual quadrant. On 50% of the trials a small white ring (the probe) was presented at the center position 250 ms after search frame onset (frame-probe, or FP, trials). On the remaining trials no probe appeared (frame-only, or FO, trials). On trials with a probe, the target could appear either at the probe’s location (probe-distance 0, or PD0, shown on the middle left) or at a location one through four items away from the probe (PD1–PD4) (the situation for PD3 is illustrated on the middle right). The relative timing of search-frame and probe presentation is shown below. (B) ERMF response to the probe (FP-minus-FO difference waves) for the five different probe distances (ERMF responses were collapsed across equivalent probe distances toward the horizontal and vertical meridian). A substantial reduction of the probe response was observed when attention was focused next to the probe (PD1) relative to when attention was focused at the probe’s location (PD0) or two to four items away (PD2–PD4). This narrow zone of sensory attenuation surrounding the target is further illustrated in the bar graph showing the average size of the probe-related ERMF response between 130 and 150 ms. A source localization analysis of the surround attenuation effect (PD1-minus-PD4 difference) is shown in the lower right, which revealed strongest modulations in early visual cortex areas and smaller effects in lateral and dorsal extrastriate areas. (Adapted from Hopf et al., 2006.)
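The pattern described in this caption, with the probe response strongest at PD0, reduced at PD1, and recovering at PD2–PD4, is what a center-surround profile would produce. A minimal Python sketch, assuming the profile can be approximated by a difference of two Gaussians: the function name, widths, and weights below are hypothetical choices made only to reproduce the qualitative shape, not parameters estimated by Hopf and colleagues (2006).

```python
import math


def attention_profile(distance, center_gain=1.0, center_width=0.5,
                      surround_gain=0.6, surround_width=0.9):
    """Relative modulation of the probe response at a distance (in items) from the target.

    Difference of Gaussians: a narrow facilitatory center minus a broader,
    weaker inhibitory surround. Zero means no modulation. All parameter
    values are illustrative assumptions.
    """
    center = center_gain * math.exp(-distance ** 2 / (2 * center_width ** 2))
    surround = surround_gain * math.exp(-distance ** 2 / (2 * surround_width ** 2))
    return center - surround


# Probe distances PD0..PD4, as in the caption.
profile = {f"PD{d}": attention_profile(d) for d in range(5)}
for label, value in profile.items():
    print(f"{label}: {value:+.4f}")
```

With these assumed parameters the modulation is positive at PD0, most negative at PD1, and returns toward zero from PD2 to PD4. A single Gaussian (setting `surround_gain=0`) would instead fall off monotonically with distance.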
suggesting that the attended location was surrounded by a narrow zone of relative inhibition. The spatial distribution of attention around a search target thus appears to resemble a Mexican hat profile rather than a simple monotonic gradient. Such a profile may be advantageous in attenuating the most deleterious noise directly adjacent to the target. Studies using ERPs (Slotnick, Hopfinger, Klein, & Sutter, 2002; Slotnick, Schwarzbach, & Yantis, 2003) and fMRI (Müller & Kleinschmidt, 2004) have provided converging evidence and have shown further that surround inhibition may also arise under conditions of sustained focusing.
Locus of Spatial Selection in the Visual System
To specify the anatomical locus of spatial selection in the visual pathways, recordings of the early sensory ERP/ERMF components have been combined with functional brain-imaging methods that provide high spatial resolution. These combined analyses have revealed that the amplitude modulations of the P1 and N1 components produced by spatial attention in the interval 80–200 ms take place in extrastriate cortical areas of both the dorsal and ventral streams (Di Russo, Martinez, & Hillyard, 2003; Heinze, Mangun et al., 1994; Hopf et al., 2002; Mangun, Buonocore, Girelli, & Jha, 1998; Martinez et al., 1999; Noesselt et al., 2002; Woldorff et al., 1997). Notably, in none of these studies were modulatory effects due to attention observed in the initial feedforward sweep of processing in the primary visual cortex, which is reflected in the early ERP/ERMF component known as C1 (50–80 ms) (Aine, Supek, & George, 1995; Clark, Fan, & Hillyard, 1995; Clark & Hillyard, 1996; Di Russo, Martinez, Sereno, Pitzalis, & Hillyard, 2002; Foxe & Simpson, 2002; Olson, Chun, & Allison, 2001). Instead, modulations of V1 activity caused by attention were found to appear after a considerable delay, at latencies of around 150–250 ms, well after the onset of modulatory effects in
ERPs known as the selection negativities or selection positivities, which are typically enlarged in response to stimuli having the attended feature value (Anllo-Vento, Luck, & Hillyard, 1998; Baas, Kenemans, & Mangun, 2002; Eimer, 1997b; Harter & Aine, 1984; Kenemans, Lijffijt, Camfferman, & Verbaten, 2002; Martinez, Di Russo, Anllo-Vento, & Hillyard, 2001; Wijers, Mulder, Okita, Mulder, & Scheffers, 1989). These feature-selective modulations generally occur at longer latencies (120–300 ms) than the initial effects of spatial attention, but their timing can vary in a flexible manner depending upon feature discriminability and task demands. For example, Schoenfeld and colleagues (2007) found that ERMF and ERP modulations associated with selection between feature dimensions (color versus motion) occurred with earlier onset than the modulations typically reported for selection within a feature dimension (e.g., one color versus another color).
In an early ERP investigation, Hillyard and Münte (1984) obtained evidence that selection of a relevant stimulus color did not take place outside the spatial focus of attention. In their design, red and blue bars were flashed in random order to both the right and left visual fields, and one color in one field was designated as relevant. It was found that all stimuli in the attended field elicited the enhanced P1/N1 components characteristic of spatial attention, whereas stimuli of the attended color elicited an enlarged selection negativity only in the attended field. This result (confirmed by Anllo-Vento & Hillyard, 1996) suggested that nonspatial feature selection was hierarchically dependent upon the prior selection of location, a notion in line with psychophysical experiments showing that location selection has priority in visual attention (e.g., Cave & Pashler, 1995; Treisman & Gelade, 1980; Tsal & Lamy, 2000).
In contrast to the conclusions of Hillyard and Münte (1984), however, subsequent studies using a variety of methodologies have found that paying attention to a nonspatial stimulus feature does enhance neural responses to that feature even outside the spatial focus of attention. In single-unit recordings from the motion-selective area MT in the monkey, for example, Treue and Martinez-Trujillo (Martinez-Trujillo & Treue, 2004; Treue & Martinez-Trujillo, 1999) found that cells tuned to an attended direction of motion showed enhanced firing even when attention was directed to that direction of motion outside the cell’s receptive field. The degree of enhancement increased as a function of the similarity between the attended motion direction and the cell’s directional preference in a multiplicative way, leading to the proposal that attention operates by increasing the “feature-similarity gain” within the visual cortex (see also Maunsell & Treue, 2006; McAdams & Maunsell, 1999; Motter, 1994). Analogous observations were made using fMRI in human observers (Saenz et al., 2002, 2003). These observers attended to one feature value in a mixed display that was presented to one visual field. In some studies the display was a field of intermingled horizontally and vertically moving dots, and in other studies it was intermingled red and green dots. The general finding was that a larger BOLD signal was elicited in the visual cortex by the attended than the unattended feature value even for stimuli presented to the visual field opposite to where attention was being directed, thereby supporting the concept of “global feature-based attention.” A further demonstration that feature-selective attention is not location bound has recently been provided in a study of SSVEPs in humans (Müller et al., 2006). Müller and colleagues presented subjects with superimposed random dot arrays of two colors that flickered at different rates (7.0 and 11.7 Hz) while changing position randomly. The frequency-tagged SSVEP to each color could therefore be measured independently under conditions where location could not be used as a basis for selection. It was found that the color-specific SSVEP was enhanced when that color was attended versus when the other color was attended (figure 15.3), indicating again that attention can select specific color values independently of their particular location.
These demonstrations of global feature-based attention appear to conflict with the original finding of Hillyard and Münte (1984) that color selection was suppressed outside the focus of spatial attention. A critical difference in experimental design that may explain these disparate results, however, is that Hillyard and Münte presented stimuli briefly and intermittently, while all the studies that observed global feature selection presented the stimuli continuously. This difference suggests that the attended feature must actually be present in the display in order for global feature selection to override spatial selection.
Evidence that the relative priority of location- and feature-based selection may be flexibly adjusted according to task demands comes from ERP/MEG recordings in a visual search task (Hopf et al., 2004). Subjects searched for a simple color-orientation conjunction among distracters, half of which shared an orientation feature with the target and half of which did not (figure 15.4A). It was found that a lateralized brain response indicating the presence of the relevant orientation feature preceded the N2pc response indicating the spatial localization of the conjunction target by about 30 ms (figure 15.4B). These observations were taken to indicate that visual search involves a short phase of parallel, location-independent feature selection that occurs prior to the localization and selection of the target, a sequence of operations proposed by many influential theories of visual search (Cave, 1999; Treisman & Sato, 1990; Wolfe & Bennet, 1997).
Figure 15.4 (A) Stimuli from the study of Hopf, Boelmans, Schoenfeld, Luck, and Heinze (2004). Search frames consisted of distinctively colored Cs (red and green, shown as black and dashed, respectively), one presented in the left and one in the right visual field, surrounded by blue distracter Cs on each side. The red C served as target for half of the trial blocks and the green C for the other half. Subjects had to discriminate the orientation of the target C (here the red C, shown in black, in the left VF), whose gap always varied left-right. In contrast, distracters of one visual field were either oriented left-right, as was the target (relevant orientation distracters, RODs), or up-down (irrelevant orientation distracters). The location of RODs was varied relative to the location of the target item, such that RODs appeared (i) on the target side only, (ii) on the nontarget side only, (iii) on both sides, or (iv) on neither side (control condition). (B) ERP responses elicited by LVF targets. Waveforms of the different ROD distributions (i–iii, solid tracings) are separately overlaid with the control condition (iv, dashed tracings). The topographical maps show the distribution of the corresponding voltage difference. The arrows highlight an enhanced negativity between ∼140 and 300 ms that appears contralateral to the location of the RODs independent of the target’s location in the left VF.
Single-unit recording studies in monkeys (Bichot, Rossi, & Desimone, 2005) have provided additional evidence for parallel feature selection in visual search. Monkeys were trained to search for a target defined by its color or form (or a combination of both) among many colored items of various forms. Search was performed under free gaze conditions, and the cell-firing responses in area V4 were analyzed during intermediate fixations along the search path toward the target, but before the target was fixated. It was observed that cell firing increased whenever a change in fixation brought a distracter that possessed one of the target’s defining features into the cell’s receptive field. These results showed that any item with a relevant feature was highlighted in area V4 even before ultimate target identification; this adds to the ERP/MEG evidence that attention to features acts in a “global” way prior to target selection in visual search.
A recent SSVEP study in humans (Andersen, Hillyard, & Müller, 2008) obtained evidence that the multiple features of an attended stimulus are selected and facilitated in a parallel, additive fashion. Subjects viewed a display containing 150 red and 150 blue bars, with half of each color oriented horizontally and half vertically, all of which were randomly intermixed and moving unpredictably. On each trial subjects were cued to attend to one of these four types of bars, each of which flickered at a different rate and thus elicited its own frequency-tagged SSVEP. It was found that SSVEP amplitudes were largest to the bars of the attended color and orientation, intermediate to the bars having one of the two attended features, and smallest to the bars that lacked either attended feature. Most importantly, the enhancement of the SSVEP amplitude to the attended conjunction stimulus was equal to the sum of the individual feature enhancements. This evidence for parallel, additive feature enhancement is consistent with the mechanism proposed by “guided search” theories (Wolfe et al., 1989) to account for the rapid identification of feature conjunction targets.
These physiological studies in monkeys and humans show that the time course and priority order of feature- and location-based attention effects can be flexibly adjusted depending on task demands and the particular selection operations that are required. The selection of nonspatial features is frequently delayed relative to the selection of locations but can be considerably accelerated for simple selections between feature dimensions. Moreover, feature selection may precede the selection of location in visual search tasks where the decoding of features is given priority to guide the subsequent spatial focusing of attention. In the next section we describe how such task-dependent flexibility also applies to the relation between feature- and object-based selection operations.
Object-based selection
A large body of psychophysical research indicates that attention can select entire objects for preferential processing (Driver, Davis, Russell, Turatto, & Freeman, 2001; Duncan & Nimmo-Smith, 1996; Egly, Driver, & Rafal, 1994; Scholl, 2001), which is not surprising given the ecological relevance of the multitudinous objects in the environment. It is important to realize, however, that object-based attention is a heterogeneous concept, and it is debated whether elementary perceptual groupings (grouped array representations) or more abstract (spatially invariant) forms of representation serve as the objects of attention (Kramer, Weber, & Watson, 1997; Luck & Vecera, 2002; Mozer & Vecera, 2005; Vecera, 1997; Vecera & Farah, 1994; Weber, Kramer, & Miller, 1997). Another problem for isolating the neural mechanisms of object-based attention is that it appears far from trivial to rule out contributions from location- and feature-based mechanisms.
Valdes-Sosa and colleagues (Pinilla, Cobo, Torres, & Valdes-Sosa, 2001; Valdes-Sosa, Bobes, Rodriguez, & Pinilla, 1998) developed an elegant paradigm for studying object-based attention, which involves the competitive selection of perceived surfaces formed by moving dot arrays. The stimuli were two counterrotating random dot arrays that gave the impression of two superimposed surfaces rotating in opposite directions. This paradigm offers the opportunity to investigate the neural basis of object (surface) selection unconfounded by spatial attention, since the two rotating dot displays are superimposed. When attention was directed endogenously to one of the surfaces, observers could judge the direction of a brief translation of that surface and a second translation of the same surface much more accurately than translations of the uncued surface. Paralleling this perceptual selection, ERP recordings showed that occipital P1 and N1 (N200) components elicited by translations of the unattended surface were suppressed relative to those of the attended surface. This finding was taken to indicate that attention favors processing of the attended surface by attenuating the object representation of the unattended surface. A subsequent study (Rodriguez & Valdes-Sosa, 2006) carried out current source localization of the ERP modulation reflecting this motion-based surface selection and found that the associated N200 component was generated in part in human MT+, which corresponds to recent findings from a neurophysiological study of surface
Figure 15.5 Experimental design in study by Martinez and colleagues (2006). During each run,
either two horizontal or two vertical bars were presented continuously on the screen. Subjects were
cued by a pair of arrows near fixation to attend covertly to one of the four visual quadrants.
Stimuli were brief (100 ms) offsets of the corners of the bars, leaving either a concave (standard)
or convex (target) edge. Corner offsets occurred in random order in all four quadrants with ISIs of
400–600 ms. Subjects responded to detections of targets in the attended quadrant, which occurred
with a probability of 0.2. In the example shown, the upper left quadrant was attended; thus the
unattended lower left quadrant belonged to the attended object when the bars were vertical but not
when the bars were horizontal.
which is then transmitted to the modules encoding the other features of the object. The resulting
activation of the entire network of specialized modules underlies the binding of features into an
integrated perceptual object. A key prediction of the integrated competition model is that
directing attention to one feature of an object will result in the activation of its other features,
including those that are irrelevant to the task at hand. Such an effect was demonstrated in an fMRI
study by O’Craven, Downing, & Kanwisher (1999), who presented subjects with superimposed
transparent pictures of houses and faces, with one moving and the other being stationary. On
different runs subjects selectively attended to houses, to faces, or to stimulus motion itself. It
was found that neural activity was increased not only in the critical area specific to the attended
stimulus feature (the fusiform face area for faces; the parahippocampal place area for houses; area
MT+ for motion), but also in the area encoding the task-irrelevant feature of the attended object.
These data provided strong support for object-based selection in the context of the integrated
competition model.
While the study of O’Craven and colleagues was elegantly designed, it remains possible that
location cues played some role, as the objects were not perfectly overlapping and target-defining
details might have been spatially selected. To verify a mechanism of object-based selection, it is
important to eliminate potential contributions from other possible reference frames. In this
respect, a convincing behavioral demonstration of object-based attention that eliminated all
possible alternative explanations was carried out by Blaser, Pylyshyn, & Holcombe (2000). Subjects
had to track the identity of one of two spatially superimposed Gabor patches, which continuously
changed along feature dimensions of orientation, color, and spatial frequency. This approach not
only eliminated the possibility of location-based selection, but also ruled out featural identity as
a basis for tracking the target. The observation that subjects were still able to track the identity
of the Gabor patch over time confirmed that visual attention had been directed to an object-based
representation.
The key finding of O’Craven and colleagues (1999) was that directing attention to one feature of
an object activated the neural representations of its other features, including those not relevant
to the current task. Given the low time resolution of fMRI, however, it was not possible to
determine whether the activation of irrelevant features occurs rapidly enough to participate in the
feature-binding process that underlies the perception of the integrated object. This
Figure 15.7 (A) Experimental design of study by Schoenfeld and colleagues (2003). On each trial a
random half of the dots on the screen moved left and half right for 300 ms. Subjects were cued to
attend to either the left- or right-moving dots and reported occasional targets of higher velocity.
On a random basis, either the left- or right-moving dots could change color (CC) to red or remain
white. (B) The sensory effect of color is shown in the dotted ERMF and ERP waveforms: these are
difference waves formed by subtracting responses to the no CC condition from the condition where
the CC belonged to the unattended dots. The attention-related activation of the irrelevant color
feature is represented by the solid waveforms, formed by subtracting responses when the CC belonged
to the unattended dots minus when the CC belonged to the attended dots. Note that the
attention-produced enhancement of the irrelevant color feature lags the initial sensory response to
color by 30–50 ms. (C) The neural generators of this increased color signal at 220–300 ms were
localized to the ventral occipital cortex (area V4v), known to be a specialized module for color
processing (McKeefry & Zeki, 1997). This localization was confirmed by BOLD signal activations
(irregular areas on brain slices) in a parallel experiment using the same design with fMRI.
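The difference-wave logic in the caption of figure 15.7 can be illustrated with a minimal sketch. It assumes that each condition is stored as a trials × time array, that the attention effect is taken as attended minus unattended, and that onset is defined by a simple amplitude threshold. The data are synthetic and noise-free, and the function names, the 180 ms and 220 ms step onsets, and the threshold are hypothetical choices for illustration, not values or methods from the study.

```python
import numpy as np

def difference_wave(cond_a, cond_b):
    """Trial-average each condition (trials x time) and return mean(a) - mean(b)."""
    return cond_a.mean(axis=0) - cond_b.mean(axis=0)

def onset_latency(wave, times, threshold):
    """Return the first time at which |wave| exceeds threshold, or None."""
    idx = np.flatnonzero(np.abs(wave) > threshold)
    return int(times[idx[0]]) if idx.size else None

# Synthetic toy data (3 trials x 200 samples, 2 ms steps); not real recordings.
times = np.arange(0, 400, 2)                    # ms after the color change
sensory_step = (times >= 180).astype(float)     # hypothetical sensory response to color
attention_step = 0.5 * (times >= 220)           # hypothetical attention-related enhancement

no_cc = np.zeros((3, times.size))
unattended_cc = np.tile(sensory_step, (3, 1))
attended_cc = np.tile(sensory_step + attention_step, (3, 1))

# Sensory effect: CC on unattended dots minus no CC.
sensory_effect = difference_wave(unattended_cc, no_cc)
# Attention effect: CC on attended dots minus CC on unattended dots (assumed direction).
attention_effect = difference_wave(attended_cc, unattended_cc)

lag_ms = (onset_latency(attention_effect, times, 0.25)
          - onset_latency(sensory_effect, times, 0.25))
```

With these toy inputs the attention effect begins 40 ms after the sensory effect, which falls within the 30–50 ms lag the caption describes. Real ERP data would need baseline correction and a noise-robust onset criterion, both omitted here.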
Castiello, U., & Umilta, C. (1992). Splitting focal attention. J. Exp. Psychol. Hum. Percept. Perform., 18, 837–848.
Cave, K. R. (1999). The FeatureGate model of visual selection. Psychol. Res., 62, 182–194.
Cave, K. R., & Pashler, H. (1995). Visual selection mediated by location: Selecting successive visual objects. Percept. Psychophys., 57, 421–432.
Cave, K. R., & Zimmerman, J. M. (1997). Flexibility in spatial attention before and after practice. Psychol. Sci., 8, 399–403.
Chawla, D., Rees, G., & Friston, K. J. (1999). The physiological basis of attentional modulation in extrastriate visual areas. Nat. Neurosci., 2, 671–676.
Clark, V. P., Fan, S., & Hillyard, S. A. (1995). Identification of early visual evoked potential generators by retinotopic and topographic analyses. Hum. Brain Mapp., 2, 170–187.
Clark, V. P., & Hillyard, S. A. (1996). Spatial selective attention affects early extrastriate but not striate components of the visual evoked potential. J. Cogn. Neurosci., 8, 387–402.
Connor, C. E., Gallant, J. L., Preddie, D. C., & Van Essen, D. C. (1996). Responses in area V4 depend on the spatial relationship between stimulus and attention. J. Neurophysiol., 75, 1306–1308.
Connor, C. E., Preddie, D. C., Gallant, J. L., & Van Essen, D. C. (1997). Spatial attention effects in macaque area V4. J. Neurosci., 19, 3201–3214.
Corbetta, M., Miezin, F. M., Dobmeyer, S., Shulman, G. L., & Petersen, S. E. (1990). Attentional modulation of neural processing of shape, color, and velocity in humans. Science, 248, 1556–1559.
Corbetta, M., Miezin, F. M., Dobmeyer, S., Shulman, G. L., & Petersen, S. E. (1991). Selective and divided attention during visual discriminations of shape, color, and speed: Functional anatomy by positron emission tomography. J. Neurosci., 11, 2383–2402.
Cutzu, F., & Tsotsos, J. K. (2003). The selective tuning model of attention: Psychophysical evidence for a suppressive annulus around an attended item. Vis. Res., 43, 205–219.
Di Russo, F., Martinez, A., & Hillyard, S. A. (2003). Source analysis of event-related cortical activity during visuo-spatial attention. Cereb. Cortex, 13, 486–499.
Di Russo, F., Martinez, A., Sereno, M. I., Pitzalis, S., & Hillyard, S. A. (2002). Cortical sources of the early components of the visual evoked potential. Hum. Brain Mapp., 15, 95–111.
Downing, P. E., & Pinker, S. (1985). The spatial structure of visual attention. In M. I. Posner, & O. S. Marin (Eds.), Attention and Performance XI (pp. 171–188). Hillsdale, NJ: Erlbaum.
Driver, J., Davis, G., Russell, C., Turatto, M., & Freeman, E. (2001). Segmentation, attention and phenomenal visual objects. Cognition, 80, 61–95.
Duncan, J., Humphreys, G. W., & Ward, R. M. (1997). Competitive brain activity in visual attention. Curr. Opin. Neurobiol., 7, 255–261.
Duncan, J., & Nimmo-Smith, I. (1996). Objects and attributes in divided attention: Surface and boundary systems. Percept. Psychophys., 58, 1076–1084.
Egly, R., Driver, J., & Rafal, R. D. (1994). Shifting visual attention between objects and locations: Evidence from normal and parietal lesion subjects. J. Exp. Psychol. Gen., 123, 161–177.
Eimer, M. (1997a). Attentional selection and attentional gradients: An alternative method for studying transient visual-spatial attention. Psychophysiology, 34, 365–376.
Eimer, M. (1997b). An event-related potential (ERP) study of transient and sustained visual attention to color and form. Biol. Psychol., 44, 143–160.
Eimer, M. (1999). Attending to quadrants and ring-shaped regions: ERP effects of visual attention in different spatial selection tasks. Psychophysiology, 36, 491–503.
Eimer, M. (2000). An ERP study of sustained spatial attention to stimulus eccentricity. Biol. Psychol., 52, 205–220.
Eriksen, C. W., & Hoffman, J. E. (1972). Temporal and spatial characteristics of selective encoding from visual displays. Percept. Psychophys., 12, 201–204.
Eriksen, C. W., & Yeh, Y.-Y. (1985). Allocation of attention in the visual field. J. Exp. Psychol. Hum. Percept. Perform., 11, 583–597.
Fallah, M., Stoner, G. R., & Reynolds, J. H. (2007). Stimulus-specific competitive selection in macaque extrastriate visual area V4. Proc. Natl. Acad. Sci. USA, 104, 4165–4169.
Foxe, J. J., & Simpson, G. V. (2002). Flow of activation from V1 to frontal cortex in humans: A framework for defining “early” visual processing. Exp. Brain Res., 142, 139–150.
Gilbert, C. D., & Sigman, M. (2007). Brain states: Top-down influences in sensory processing. Neuron, 54, 677–696.
Grill-Spector, K. (2003). The neural basis of object perception. Curr. Opin. Neurobiol., 13, 159–166.
Hahn, S., & Kramer, A. F. (1998). Further evidence for the division of attention between noncontiguous locations. Visual Cogn., 5, 217–256.
Handy, T. C., Kingstone, A., & Mangun, G. R. (1996). Spatial distribution of visual attention: Perceptual sensitivity and response latency. Percept. Psychophys., 58, 613–627.
Handy, T. C., & Mangun, G. R. (1997). Early attention selection: Electrophysiological evidence for modulation by perceptual load. Manuscript.
Handy, T. C., & Mangun, G. R. (2000). Attention and spatial selection: Electrophysiological evidence for modulation by perceptual load. Percept. Psychophys., 62, 175–185.
Harter, M. R., & Aine, C. (1984). Brain mechanisms of visual selective attention. In R. Parasuraman, & D. R. Davies (Eds.), Varieties of attention (pp. 293–321). New York: Academic Press.
He, X., Fan, S., Zhou, K., & Chen, L. (2004). Cue validity and object-based attention. J. Cogn. Neurosci., 16, 1085–1097.
Heinze, H. J., Luck, S. J., Münte, T. F., Gös, A., Mangun, G. R., & Hillyard, S. A. (1994). Attention to adjacent and separate positions in space: An electrophysiological analysis. Percept. Psychophys., 56, 42–52.
Heinze, H. J., Mangun, G. R., Burchert, W., Hinrichs, H., Scholz, M., Münte, T. F., et al. (1994). Combined spatial and temporal imaging of brain activity during visual selective attention in humans. Nature, 372, 543–546.
Helmholtz, H. V. (1896). Handbuch der physiologischen Optik. Hamburg: Verlag von Leopold Voss.
Henderson, J. M., & MacQuistan, A. D. (1993). The spatial distribution of attention following an exogenous cue. Percept. Psychophys., 53, 221–230.
Hillyard, S. A., & Anllo-Vento, L. (1998). Event-related brain potentials in the study of visual selective attention. Proc. Natl. Acad. Sci. USA, 95, 781–787.
Hillyard, S. A., & Münte, T. F. (1984). Selective attention to color and location: An analysis with event-related brain potentials. Percept. Psychophys., 36, 185–198.
Hillyard, S. A., Vogel, E. K., & Luck, S. J. (1998). Sensory gain control (amplification) as a mechanism of selective attention: Electrophysiological and neuroimaging evidence. Philos. Trans. R. Soc. London B Biol. Sci., 353, 1257–1270.
selection processes in striate and extrastriate visual areas. Vis. Res., 41, 1437–1457.
Martinez, A., Ramanathan, D. S., Foxe, J. J., Javitt, D. C., & Hillyard, S. A. (2007). The role of spatial attention in the selection of real and illusory objects. J. Neurosci., 27, 7963–7973.
Martinez, A., Teder-Salejarvi, W., & Hillyard, S. A. (2007). Spatial attention facilitates selection of illusory objects: Evidence from event-related brain potentials. Brain Res., 1139, 143–152.
Martinez, A., Teder-Salejarvi, W., Vazquez, M., Molholm, S., Foxe, J. J., Javitt, D. C., et al. (2006). Objects are highlighted by spatial attention. J. Cogn. Neurosci., 18, 298–310.
Martinez-Trujillo, J. C., & Treue, S. (2004). Feature-based attention increases the selectivity of population responses in primate visual cortex. Curr. Biol., 14, 744–751.
Maunsell, J. H., & Cook, E. P. (2002). The role of attention in visual processing. Philos. Trans. R. Soc. Lon. B Biol. Sci., 357, 1063–1072.
Maunsell, J. H., & Treue, S. (2006). Feature-based attention in visual cortex. Trends Neurosci., 29, 317–322.
McAdams, C. J., & Maunsell, J. H. R. (1999). Effects of attention on orientation-tuning functions of single neurons in macaque cortical area V4. J. Neurosci., 19, 431–441.
McKeefry, D. J., & Zeki, S. (1997). The position and topography of the human colour centre as revealed by functional magnetic resonance imaging. Brain, 120, 2229–2242.
McMains, S. A., & Somers, D. C. (2004). Multiple spotlights of attentional selection in human visual cortex. Neuron, 42, 677–686.
McMains, S. A., & Somers, D. C. (2005). Processing efficiency of divided spatial attention mechanisms in human visual cortex. J. Neurosci., 25, 9444–9448.
Mehta, A. D., Ulbert, I., & Schroeder, C. E. (2000a). Intermodal selective attention in monkeys. I. Distribution and timing of effects across visual areas. Cereb. Cortex, 10, 343–358.
Mehta, A. D., Ulbert, I., & Schroeder, C. E. (2000b). Intermodal selective attention in monkeys. II. Physiological mechanisms of modulation. Cereb. Cortex, 10, 359–370.
Mishra, J., & Hillyard, S. A. (2008). Endogenous attention selection during binocular rivalry at early stages of visual processing. Vis. Res. [Epub ahead of print. doi: 10.1016/j.visres.2008.02.018.]
Mitchell, J. F., Stoner, G. R., & Reynolds, J. H. (2004). Object-based attention determines dominance in binocular rivalry. Nature, 429, 410–413.
Motter, B. (1993). Focal attention produces spatially selective processing in visual cortical areas V1, V2, and V4 in the presence of competing stimuli. J. Neurophysiol., 70, 909–919.
Motter, B. C. (1994). Neural correlates of attentive selection for color or luminance in extrastriate area V4. J. Neurosci., 14, 2178–2189.
Mounts, J. R. (2000). Attentional capture by abrupt onsets and feature singletons produces inhibitory surrounds. Percept. Psychophys., 62, 1485–1493.
Mozer, M. C., & Vecera, S. P. (2005). Object- and space-based attention. In L. Itti, G. Rees, & J. K. Tsotsos (Eds.), Neurobiology of attention (pp. 130–134). Burlington, MA: Elsevier/Academic Press.
Müller, M. M., Andersen, S., Trujillo, N. J., Valdes-Sosa, P., Malinowski, P., & Hillyard, S. A. (2006). Feature-selective attention enhances color signals in early visual areas of the human brain. Proc. Natl. Acad. Sci. USA, 103, 14250–14254.
Müller, M. M., Malinowski, P., Gruber, T., & Hillyard, S. A. (2003). Sustained division of the attentional spotlight. Nature, 424, 309–312.
Müller, N. G., Bartelt, O. A., Donner, T. H., Villringer, A., & Brandt, S. A. (2003). A physiological correlate of the “zoom lens” of visual attention. J. Neurosci., 23, 3561–3565.
Müller, N. G., & Kleinschmidt, A. (2003). Dynamic interaction of object- and space-based attention in retinotopic visual areas. J. Neurosci., 23, 9812–9816.
Müller, N. G., & Kleinschmidt, A. (2004). The attentional “spotlight’s” penumbra: Center-surround modulation in striate cortex. NeuroReport, 15, 977–980.
Müller, N. G., Mollenhauer, M., Rösler, A., & Kleinschmidt, A. (2005). The attentional field has a Mexican hat distribution. Vis. Res., 45, 1129–1137.
Noesselt, T., Hillyard, S., Woldorff, M., Schoenfeld, A., Hagner, T., Jancke, L., et al. (2002). Delayed striate cortical activation during spatial attention. Neuron, 35, 575–587.
O’Craven, K. M., Downing, P. E., & Kanwisher, N. (1999). fMRI evidence for objects as the units of attentional selection. Nature, 401, 584–587.
O’Craven, K. M., Rosen, B. R., Kwong, K. K., Treisman, A., & Savoy, R. L. (1997). Voluntary attention modulates fMRI activity in human MT-MST. Neuron, 18, 591–598.
Olson, I. R., Chun, M. M., & Allison, T. (2001). Contextual guidance of attention: Human intracranial event-related potential evidence for feedback modulation in anatomically early temporally late stages of visual processing. Brain, 124, 1417–1425.
Pinilla, T., Cobo, A., Torres, K., & Valdes-Sosa, M. (2001). Attentional shifts between surfaces: Effects on detection and early brain potentials. Vis. Res., 41, 1619–1630.
Posner, M. I. (1980). Orienting of attention. Q. J. Exp. Psychol., 32, 3–25.
Reynolds, J. H., Alborzian, S., & Stoner, G. R. (2003). Exogenously cued attention triggers competitive selection of surfaces. Vis. Res., 43, 59–66.
Rodriguez, V., & Valdes-Sosa, M. (2006). Sensory suppression during shifts of attention between surfaces in transparent motion. Brain Res., 1072, 110–118.
Roelfsema, P. R., Lamme, V. A. F., & Spekreijse, H. (1998). Object-based attention in the primary visual cortex of macaque monkey. Nature, 395, 376–381.
Roelfsema, P. R., Tolboom, M., & Khayat, P. S. (2007). Different processing phases for features, figures, and selective attention in the primary visual cortex. Neuron, 56, 785–792.
Saenz, M., Buracas, G. T., & Boynton, G. M. (2002). Global effects of feature-based attention in human visual cortex. Nat. Neurosci., 5, 631–632.
Saenz, M., Buracas, G. T., & Boynton, G. M. (2003). Global feature-based attention for motion and color. Vis. Res., 43, 629–637.
Schoenfeld, M., Hopf, J. M., Martinez, A., Mai, H., Sattler, C., Gasde, A., et al. (2007). Spatio-temporal analysis of feature-based attention. Cereb. Cortex, 17, 2468–2477.
Schoenfeld, M. A., Tempelmann, C., Martinez, A., Hopf, J. M., Sattler, C., Heinze, H. J., et al. (2003). Dynamics of feature binding during object-selective attention. Proc. Natl. Acad. Sci. USA, 100, 11806–11811.
Scholl, B. J. (2001). Objects and attention: The state of the art. Cognition, 80, 1–46.
Serences, J. T., Yantis, S., Culberson, A., & Awh, E. (2004). Preparatory activity in visual cortex indexes distractor suppression during covert spatial orienting. J. Neurophysiol., 92, 3538–3545.
Shipp, S. (2007). Structure and function of the cerebral cortex. Curr. Biol., 17, R443–449.
16 Integration of Conflict Detection and Attentional Control Mechanisms: Combined ERP and fMRI Studies
George R. Mangun, Clifford D. Saron, and Bong J. Walsh
abstract Attention involves powerful top-down mechanisms for the control of information
processing in the brain, including specialized systems in the frontal and parietal cortex. These
networks for attentional control are sensitive to momentary goals, enabling flexibility in behavior
under changing conditions. One such influence is that which arises when competing inputs or
responses are in conflict. A prominent system for conflict detection, cognitive control, and
behavioral adjustment involves the anterior cingulate cortex and dorsal lateral prefrontal cortex.
Here we describe how these attentional and conflict-resolution systems interact, and provide a
synthesis and model of how these interactions support information processing. We show that conflict
detected in the anterior cingulate system influences the activity of the frontal-parietal attention
network to modulate attentional selection. As a result, when conditions lead to uncertainty that
could be detrimental to successful performance, the brain combines information to strategically
alter performance on a moment-to-moment basis.

Visual selective attention is a powerful cognitive ability that aids in the perception of the
world around us (see Treisman, chapter 12 in this volume). Directing spatial attention to a location
in the visual field facilitates processing of stimuli appearing at the attended location: reaction
times (RT) are faster and discrimination accuracy is enhanced for events at attended versus
unattended locations (e.g., Luck et al., 1994). In line with this observed RT pattern, neural
responses to attended and ignored stimuli are modulated (see Maunsell, chapter 19) to provide a
competitive advantage for attention events (see Kastner, McMains, and Beck, chapter 13).

Attentional control

Models of attention have distinguished between top-down and bottom-up influences on the focus of
attention. In one prominent framework, the “sources” of attentional control and the resultant
influence at a “site” of action, such as within the perceptual system, outline the push-pull between
top-down and bottom-up information (e.g., Posner & Petersen, 1990; Serences et al., 2005). The
concept that attention involves the interactions of neural systems that generate attentional control
signals with other systems that are influenced by those signals remains at the core of most current
models of voluntary (goal-directed) attention (e.g., Bundesen, Habekost, & Kyllingsbaek, 2005).
Research in animals, patients with neurological dysfunction, and healthy human subjects using
electromagnetic recording, neuronal stimulation, deactivation, neuroimaging, and transcranial
magnetic stimulation suggests that voluntary control of visual attention involves a complex network
of widely distributed areas, including superior frontal cortex, posterior parietal cortex,
posterior-superior temporal cortex, and thalamic and midbrain structures (e.g., Bisley & Goldberg,
2006; Bushnell, Goldberg, & Robinson, 1981; Corbetta, Kincade, Ollinger, McAvoy, & Shulman, 2000;
Thiebaut de Schotten et al., 2005; Goldberg & Bruce, 1985; Hopf & Mangun, 2000; Hopfinger,
Buonocore, & Mangun, 2000; Hung, Driver, & Walsh, 2005; Knight, Grabowecky, & Scabini, 1995;
McAlonan, Cavanaugh, & Wurtz, 2006; Mesulam, 1981; Miller, 2000; Rorden, Fruhmann Berger, &
Karnath, 2006). In humans, the functional anatomy of this attentional control network has been
identified by combining event-related fMRI methods with tasks that temporally separate preparatory
attentional control from target-related activity. For example, using spatial cuing paradigms, it
has been possible to demonstrate activity in a frontal-parietal attention system related to
top-down attentional control (e.g., Corbetta et al., 2000; Hopfinger et al., 2000; Kastner, Pinsk,
De Weerd, Desimone, & Ungerleider, 1999) and to distinguish this activity from activity in visual
cortex and the motor system (see Corbetta, Sylvester, & Shulman, chapter 14).

George R. Mangun, Clifford D. Saron, and Bong J. Walsh Center for Mind and Brain, University of
California, Davis, California
distracter information, much in the way suggested by Egner and Hirsch (2005). Using event-related
fMRI, Weissman, Mangun, and Woldorff (2002) demonstrated similar activations in the
frontal-parietal attention network to attention-directing cues and subsequent targets containing
incompatible local and global information. This type of information can be interpreted as evidence
that during the incompatible trials, the voluntary attention system was engaged to focus attention
on the information at the relevant level (global or local) and suppress the irrelevant, distracting
information.
Together these studies suggest a close association between the conflict-control system
(ACC-DLPFC) and attentional control networks (frontal-parietal network). However, there is less
direct evidence that increased conflict leads to modulations of activity in the frontal-parietal
attentional control network. To demonstrate such a relationship, it would be necessary to show that
increased conflict leading to activity in the ACC on a trial (trial N) would lead to increased
activity in the frontal-parietal attention network on the subsequent trial (trial N + 1). Further,
such effects should be strongly correlated with attentional selectivity demonstrated behaviorally
and in selective stimulus processing in visual cortex.

Integrating brain networks for conflict processing and attention

Using functional magnetic resonance imaging (fMRI), it was investigated whether the ACC
conflict-detection system interacts with the frontal-parietal attentional control system when
spatial cuing leads to conflicts in attentional orienting (Walsh, 2008). The working hypothesis was
that conflict in one trial should result in adjustments in frontal-parietal top-down attentional
control to reduce conflict in a subsequent trial.
The paradigm was a hybrid of those used in conflict-control studies and those typical of spatial
attention studies (figure 16.1). Cues consisting of two vertical lines directed the subjects’
covert spatial attention (100% instructive) to either the right or the left visual field in order to
discriminate the features of an upcoming target. The longer of the two lines in the cue indicated
that spatial attention should be directed to the hemifield on the side of the longer line. The
amount of conflict generated by the cues was systematically manipulated by varying the difference
in length of two lines (see figure 16.1B), in line with evidence that requiring
near-threshold-level perceptual judgments can result in the generation of conflict (Szmalec et al.,
2008). As a result, low, medium, and high conflict-generating cues were created. Bilateral targets
(200 ms duration) followed the cues after a delay period (1500 ms), and these targets were then
masked at offset by pattern masks (300 ms duration). Subjects were required to respond by pressing
one of two buttons with their right hand to indicate whether the target in the attended field was
horizontal or vertical (attended and unattended targets could both be the same orientation or each
be different orientations).
As has been observed in numerous prior studies that have used signal-processing methods to
decompose cue from target activity (e.g., Ollinger, Shulman, & Corbetta, 2001; Woldorff et al.,
2004) in studies of spatial attention (e.g., Corbetta et al., 2000; Hopfinger et al., 2000; see
Corbetta et al., chapter 14, this volume, for a review), the frontal-parietal attentional control
system was activated in response to the attention-directing cues (collapsed across level of
conflict) (figure 16.2A), but not to targets. In addition, in visual cortical regions, in response
to cues, there was a significant activation of the visual cortex contralateral to the direction
attention was cued (left versus right) (figure 16.2B). Targets also activated visual cortex, and
these sensory-evoked activations were modulated by the direction of covert spatial attention such
that responses to the bilateral targets were larger on the hemisphere contralateral to the attended
hemifield (not shown in figures). Therefore, independent of cue conflict, the frontal-parietal
attentional control network was activated when subjects acted on the cue instructions, and this
activation led to changes in visual cortex.

Figure 16.1 (A) Stimulus sequence. Subjects received a cue that directed attention to left or
right, or to neither side (neutral cues). Following a cue-to-target stimulus onset asynchrony (SOA),
targets were presented. Targets were circular Gabor gratings (1.5° diameter, located bilaterally
5.4° from fixation in the upper left and right visual fields), each of which could be horizontal or
vertical in orientation. Subjects had to discriminate the grating orientation at the attended
location. (B) Sample cue types. The length of each line ranged from 1.1° to 1.7°, depending on the
type of cue. Neutral cues (to which subjects did not shift attention) were similar to attend cues,
except the vertical lines were of equal length (1.1°) and had short (0.2°) horizontal lines on the
top and bottom of each vertical line to distinguish them from attend cues. Subjects were instructed
to shift attention to the hemifield indicated by the cue (shift left if left line is longer, shift
right if right line is longer). They were told to respond with a button press indicating if the
gratings of the subsequent target were vertical or horizontal.
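The cue manipulation in figure 16.1B can be made concrete with a small sketch. It assumes only what the text states: the longer line indicates the side to attend, equal lines are neutral, and a smaller length difference makes the judgment harder and so generates more conflict. The function names and the 0.2° and 0.4° bin boundaries are hypothetical, since the chapter does not give the length differences used for each conflict level.

```python
def cue_direction(left_len_deg, right_len_deg):
    """Side indicated by a two-line cue; equal lengths are treated as neutral."""
    if left_len_deg == right_len_deg:
        return "neutral"
    return "left" if left_len_deg > right_len_deg else "right"

def conflict_level(left_len_deg, right_len_deg, bounds=(0.2, 0.4)):
    """Bin a cue by length difference; smaller differences mean higher conflict.

    The bounds are hypothetical illustration values, not taken from the study.
    """
    if left_len_deg == right_len_deg:
        return "neutral"
    # Round to avoid floating-point noise such as 1.3 - 1.1 == 0.19999999999999996.
    diff = round(abs(left_len_deg - right_len_deg), 6)
    if diff <= bounds[0]:
        return "high"
    if diff <= bounds[1]:
        return "medium"
    return "low"

# Example cues with line lengths inside the 1.1-1.7 degree range given in figure 16.1.
examples = [(1.7, 1.1), (1.1, 1.5), (1.1, 1.3), (1.1, 1.1)]
labels = [(cue_direction(l, r), conflict_level(l, r)) for l, r in examples]
```

Here the largest difference (1.7° versus 1.1°) is an easy, low-conflict cue to the left, while the smallest (1.1° versus 1.3°) is a high-conflict cue to the right.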
The goal in this work was to investigate whether differences in conflict in attentional orienting
(a function of cue discriminability) led to systematic changes in the frontal-parietal control
regions. This line of argument, however, is dependent on the paradigm and cue discriminability
manipulation resulting in conflict that activates the ACC and triggers cognitive control. To
establish this relationship, it is important to demonstrate that the cue manipulation resulted in
standard effects on behavior and brain activity (e.g., Kerns et al., 2004). Analyses of the reaction
times (RT) and accuracy in detecting targets at the cued locations were broken down as a function
of whether the trial was preceded by a high versus low conflict trial (defined by whether the cue
was high versus low conflict in the preceding trial). Reaction times were faster (p < .001) and
accuracy was higher (p < .005) for detecting targets when preceded by high-conflict trials versus
low-conflict trials. Thus the expected pattern of improved speed and accuracy of behavioral
responses in trials following high versus low conflict trials was observed in this spatial cuing
paradigm. This behavioral signature suggests that conflict resulted in increased attentional
control that was manifest as improvements in behavior (e.g., Kerns et al., 2004).
Next, whole-brain voxel-wise analyses of the fMRI BOLD responses were conducted to determine
whether the observed behavioral adjustments were mediated by the well-known mechanisms involving
the ACC. Activity associated with high-conflict cues compared to activity associated with
low-conflict attend cues was found to include the dorsal anterior cingulate cortex (dACC), as well
as small regions of DLPFC, right lateral parietal cortex, and left anterior insula (figure 16.3).
Within the dACC region identified by the contrast, the amplitudes of BOLD signals were observed to
be parametrically related to degree of conflict engendered by the cues (i.e., as a function of cue
discriminability). These findings were in line with numerous prior studies of the role of the ACC
in conflict detection and indicate that the cue manipulation was effective in activating the ACC
(Botvinick, Nystrom, Fissell, Carter, & Cohen, 1999; MacDonald et al., 2000; Liston, Matalon, Hare,
Davidson, & Casey, 2006).

Figure 16.3 (A) Activity in dorsal anterior cingulate cortex (dACC) related to cue-induced
conflict (contrast between high- and low-conflict cue trials; see figure 16.1B). (B) Within the
dACC region of interest (ROI) shown in A, there was a parametric increase in fMRI BOLD signal
(plotted as beta values) as a function of increasing cue-related conflict.

Finally, electrophysiological measures have previously been related to conflict detection and
should therefore be elicited in the present hybrid design. Studies using ERP have identified an
anterior midline negativity (referred to as the N2 component) that is generated in the ACC and
increases in amplitude to stimuli that are more likely to induce conflict (van Veen & Carter, 2002a,
2002b; Donkers & van Boxtel, 2004; Szmalec et al., 2008). In a subset of the same subjects tested
in the fMRI study, ERPs were recorded from 128 channels as they performed the task shown in figure
16.1 (Walsh, 2008). As expected, the amplitude of the N2 component of the evoked potential at
frontocentral electrode
cues to neutral cues were used as regions of interest (ROIs).
The hemodynamic responses in these regions of interest
could then be investigated as a function of attention or
conflict in the current trial (trial N) and the next trial (trial
N + 1). These results are shown in figure 16.5 for the dACC,
the FEF, and the intraparietal cortex. Two main findings
can be highlighted.
First, as described earlier, contrasts related to attention
revealed activity in the frontal-parietal network but not the
dACC, whereas contrasts related to conflict affected the
dACC but not the frontal-parietal network (figure 16.5).
This pattern draws the distinction between the dACC that
is sensitive to cue conflict (e.g., Kerns et al., 2004) and the
frontal-parietal network that is sensitive to attentional control
(e.g., Corbetta et al., 2000; Hopfinger et al., 2000).
The second finding is the pattern of attention and conflict
in the dACC and the frontal-parietal network for trial N
versus trial N + 1. This can be seen in figure 16.5 in the
time courses of the hemodynamic responses.
The key finding is that although cue conflict does not result
in fMRI BOLD signal changes in the frontal-parietal network
for trial N, in response to trial N + 1 (following a high-conflict
cue) this network shows a robust response. This pattern
shows that high (versus low) cue conflict on one trial leads
to increased activity in the frontal-parietal network on the
next trial. This pattern is consistent with that observed in
behavior where target performance was improved on trials
that followed a high-cue-conflict trial (described earlier).
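The sequential (trial N to trial N + 1) logic behind the behavioral comparison can be sketched in a few lines of Python. This is only an illustrative sketch, not the analysis code used in the studies described here: the function name, the dictionary keys, and the toy trial values are all hypothetical and are chosen purely to show how each trial is grouped by the conflict level of the trial that preceded it.

```python
from statistics import mean


def sequential_effects(trials):
    """Summarize performance on each trial by the conflict level of the
    preceding trial (trial N conflict -> trial N + 1 performance).

    `trials` is an ordered list of dicts with hypothetical keys:
    'conflict' ('high' or 'low'), 'rt' (ms), and 'correct' (bool).
    The first trial has no predecessor and is therefore not scored.
    """
    groups = {"high": [], "low": []}
    # Pair every trial with its successor and file the successor
    # under the conflict level of the preceding trial.
    for prev, cur in zip(trials, trials[1:]):
        groups[prev["conflict"]].append(cur)

    summary = {}
    for level, following in groups.items():
        if following:
            summary[level] = {
                "n": len(following),
                "mean_rt": mean(t["rt"] for t in following),
                "accuracy": mean(1.0 if t["correct"] else 0.0 for t in following),
            }
    return summary


# Made-up toy trials for illustration only; these are not data from the study.
toy_trials = [
    {"conflict": "high", "rt": 520, "correct": True},
    {"conflict": "low", "rt": 480, "correct": True},
    {"conflict": "low", "rt": 510, "correct": False},
    {"conflict": "high", "rt": 530, "correct": True},
    {"conflict": "high", "rt": 470, "correct": True},
]

print(sequential_effects(toy_trials))
```

Under the account given in this chapter, one would expect the entry for trials following high-conflict cues to show a lower mean RT and a higher accuracy than the entry for trials following low-conflict cues; the same grouping could in principle be applied to ROI response amplitudes instead of behavioral measures.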
(Pandya, Van Hoesen, & Mesulam, 1981; Selemon & Goldman-Rakic, 1988), critical elements of the frontal-parietal attentional control network. Given this connectivity, and the results obtained in the experiments described in this chapter, one might speculate that the ACC projects directly to structures in the frontal-parietal attention network critical to modulating attentional control.
Prior research in cognitive control has established a relationship between the ACC and the dorsolateral prefrontal cortex (DLPFC) where conflict resulting in ACC activation triggered DLPFC activity that was related to changes in performance (e.g., Kerns et al., 2004). In the studies reported in this chapter, no such DLPFC involvement was measurable, raising the possibility that for attentional conflict, interactions between ACC and the frontal-parietal attention network were not mediated by the DLPFC; this must remain a hypothesis for future investigation.
The present formulation can be considered in relation to other models of cortical control systems. One such model is the recent proposal regarding the interaction of cortical networks in default-mode processing and cognitive control. Dosenbach, Fair, Cohen, Schlaggar, and Petersen (2008) proposed a model in which a frontal-parietal system is involved in top-down control processes that meet moment-to-moment demands, while a cingulate-opercular system (involving ACC, anterior prefrontal cortex, and inferior
lateral prefrontal cortex, as well as subcortical structures) is involved in maintaining a stable set over time during task performance. They noted that “it seems likely that additional controllers might exist, operating at other temporal and/or spatial scales” (p. 103). The findings reviewed in the present chapter emphasize this general notion of interacting cortical networks that act to achieve different specific computational needs during performance. By showing that systems involved in conflict monitoring and error processing interact in dynamic ways with voluntary attentional control networks to modulate selective attention, the present work establishes the interactions of two major cognitive control systems for evaluating, controlling, and improving perception and performance.
Figure 16.6 Diagram of the interactions of the conflict and attention systems.
acknowledgments We are deeply grateful to Cameron S. Carter, Michael H. Buonocore, Barry Giesbrecht, Sean P. Fannon, and Dorothee Heipertz for their collaboration, advice, and assistance. Supported by NIMH R01 MH55714 to GRM and NEI Traineeship T32 EY015387 to BJW.
REFERENCES
Astafiev, S. V., Shulman, G. L., Stanley, C. M., Snyder, A. Z., Van Essen, D. C., & Corbetta, M. (2003). Functional organization of human intraparietal and frontal cortex for attending, looking, and pointing. J. Neurosci., 23(11), 4689–4699.
Baddeley, A. (1996). Exploring the central executive. Q. J. Exp. Psychol. [A], 49A(1), 5–28.
Bisley, J. W., & Goldberg, M. E. (2006). Neural correlates of attention and distractibility in the lateral intraparietal area. J. Neurophysiol., 95, 1696–1717.
Botvinick, M. M., Braver, T. S., Barch, D. M., Carter, C. S., & Cohen, J. D. (2001). Conflict monitoring and cognitive control. Psychol. Rev., 108(3), 624–652.
Botvinick, M., Nystrom, L. E., Fissell, K., Carter, C. S., & Cohen, J. D. (1999). Conflict monitoring versus selection-for-action in anterior cingulate cortex. Nature, 402(6758), 179–181.
Bundesen, C., Habekost, T., & Kyllingsbaek, S. (2005). A neural theory of visual attention: Bridging cognition and neurophysiology. Psychol. Rev., 112(2), 291–328.
Bushnell, M. C., Goldberg, M. E., & Robinson, D. L. (1981). Behavioral enhancement of visual responses in monkey cerebral cortex. I. Modulation in posterior parietal cortex related to selective visual attention. J. Neurophysiol., 46(4), 755–772.
Carter, C. S., Braver, T. S., Barch, D. M., Botvinick, M. M., Noll, D., & Cohen, J. D. (1998). Anterior cingulate cortex, error detection, and the online monitoring of performance. Science, 280(5364), 747–749.
Casey, B. J., Thomas, K. M., Welsh, T. F., Badgaiyan, R. D., Eccard, C. H., Jennings, J. R., et al. (2000). Dissociation of response conflict, attentional selection, and expectancy with functional magnetic resonance imaging. Proc. Natl. Acad. Sci. USA, 97(15), 8728–8733.
Corbetta, M., Kincade, J. M., Ollinger, J. M., McAvoy, M. P., & Shulman, G. L. (2000). Voluntary orienting is dissociated from target detection in human posterior parietal cortex. Nat. Neurosci., 3, 292–297.
Corbetta, M., & Shulman, G. (2002). Control of goal-directed and stimulus-driven attention in the brain. Nat. Rev. Neurosci., 3(3), 201–215.
Corbetta, M., Tansy, A. P., Stanley, C. M., Astafiev, S. V., Snyder, A. Z., & Shulman, G. L. (2005). A functional MRI study of preparatory signals for spatial location and objects. Neuropsychologia, 43(14), 2041–2056.
D’Esposito, M., Detre, J. A., Alsop, D. C., Shin, R. K., Atlas, S., & Grossman, M. (1995). The neural basis of the central executive system of working memory. Nature, 378, 279–281.
Donkers, F., & Van Boxtel, G. (2004). The N2 in go/no-go tasks reflects conflict monitoring not response inhibition. Brain Cogn., 56(2), 165–176.
Dosenbach, N. U., Fair, D. A., Cohen, A. L., Schlaggar, B. L., & Petersen, S. E. (2008). A dual-networks architecture of top-down control. Trends Cogn. Sci., 12(3), 99–105.
Egner, T., & Hirsch, J. (2005). Cognitive control mechanisms resolve conflict through cortical amplification of task-relevant information. Nat. Neurosci., 8(12), 1784–1790.
Giesbrecht, B., Woldorff, M. G., Song, A. W., & Mangun, G. R. (2003). Neural mechanisms of top-down control during spatial and feature attention. NeuroImage, 19(3), 496–512.
Goldberg, M. E., & Bruce, C. J. (1985). Cerebral cortical activity associated with the orientation of visual attention in the rhesus monkey. Vision Res., 25(3), 471–481.
Gratton, G., Coles, M. G., & Donchin, E. (1992). Optimizing the use of information: Strategic control of activation of responses. J. Exp. Psychol. Gen., 121(4), 480–506.
Hopf, J. M., & Mangun, G. R. (2000). Shifting visual attention in space: An electrophysiological analysis using high spatial resolution mapping. Clin. Neurophysiol., 111, 1241–1257.
Hopfinger, J. B., Buonocore, M. H., & Mangun, G. R. (2000). The neural mechanisms of top-down attentional control. Nat. Neurosci., 3, 284–291.
Huerta, M. F., Krubitzer, L. A., & Kaas, J. H. (1987). Frontal eye field as defined by intracortical microstimulation in squirrel monkeys, owl monkeys, and macaque monkeys. II. Cortical connections. J. Comp. Neurol., 265(3), 332–361.
Hung, J., Driver, J., & Walsh, V. (2005). Visual selection and posterior parietal cortex: Effects of repetitive transcranial magnetic stimulation on partial report analyzed by Bundesen’s theory of visual attention. J. Neurosci., 25(42), 9602–9612.
17 A Right Perisylvian Neural Network for Human Spatial Orienting
hans-otto karnath
Center for Neurology, University of Tübingen, Tübingen, Germany
abstract Homologous neural networks seem to exist in the human left and right hemispheres tightly linking cortical regions straddling the sylvian fissure. White matter fiber bundles connect the inferior parietal lobule with the ventrolateral frontal cortex, ventrolateral frontal cortex with superior/middle temporal cortex, and superior/middle temporal cortex with the inferior parietal lobule. It is argued that these perisylvian networks serve different cognitive functions, a representation for language and praxis in the left hemisphere and a representation for processes involved in spatial orienting in the right. The tight perisylvian anatomical connectivity between superior/middle temporal, inferior parietal, and ventrolateral frontal cortices might explain why lesions at these distant cortical sites around the sylvian fissure in the human right hemisphere can lead to the same disturbance of orienting behavior, namely, to spatial neglect.
In recent years it has been shown that functional and structural lateralization of the brain is more widespread among vertebrates than previously believed. Nevertheless, it remains the case that many motor, sensory, and visual, but also cognitive, functions show bihemispheric representations in the human and nonhuman primate. Only a few (so-called higher) cognitive functions have obvious asymmetrical representations. Among them are language, praxis, and spatial orienting. While an elaborate representation for language and praxis has evolved in the human left hemisphere, a neural system involved in spatial orienting is dominantly represented in the right hemisphere. Consequently, locally corresponding damage to one of the two hemispheres leads to different symptoms. While the dominant disorders in neurological patients with left hemisphere involvement are aphasia and apraxia, patients with right hemisphere damage typically show spatial neglect. This term describes a spontaneous deviation of the eyes and the head toward the ipsilesional, right side (Fruhmann-Berger & Karnath, 2005; Fruhmann-Berger, Pross, Ilg, & Karnath, 2006). Patients with such a disorder disregard objects located on the contralesional, left side. When searching for targets, copying, or reading, for example, they concentrate their exploratory movements predominantly on the right side of space (Heilman, Watson, Valenstein, & Damasio, 1983; Behrmann, Watt, Black, & Barton, 1997; Karnath, Niemeier, & Dichgans, 1998). The question thus arises whether the development of these different functions in the human left and right hemispheres corresponds with different anatomical representations. Or is it possible that homologous neural structures serve as correlates for language and praxis in the left and for spatial orientation in the right hemisphere?
Three major cortical areas have been described as neural correlates of spatial neglect in the human right hemisphere. A first study by Heilman and coworkers (1983) revealed the right inferior parietal lobule (IPL) and the temporoparietal junction (TPJ). Subsequent studies reported comparable observations (e.g., Vallar & Perani, 1986; Mort et al., 2003). Lesions located in the right ventrolateral frontal cortex were also observed to correlate with spatial neglect (Vallar & Perani, 1986; Husain & Kennard, 1996; Committeri et al., 2007). Finally, several studies have revealed the right superior temporal cortex and adjacent insula as being critically related to the disorder (Karnath, Ferber, & Himmelbach, 2001; Karnath, Fruhmann-Berger, Küker, & Rorden, 2004; Buxbaum et al., 2004; Corbetta, Kincade, Lewis, Snyder, & Sapir, 2005; Committeri et al., 2007; Sarri, Greenwood, Kalra, & Driver, 2009).
Interestingly, a similar pattern of perisylvian correlates has been observed in the human left hemisphere when stroke patients suffer from aphasia. Early analyses (e.g., Kertesz, Harlock, & Coates, 1979; Poeck, de Bleser, & Graf von Keyserlingk, 1984), as well as more recent studies of cortical lesion localization in neurological patients with disorders in language comprehension and/or speech production (e.g., Kreisler et al., 2000; Dronkers, Wilkins, Van Valin, Redfern, & Jaeger, 2004; Borovsky, Saygin, Bates, & Dronkers, 2007), revealed involvement of the ventrolateral frontal cortex, superior and middle temporal gyri, insula, and IPL. These
findings are supported by electrical mapping of the human cortex during surgery in awake patients as well as functional magnetic resonance imaging (fMRI) in healthy subjects. A recent meta-analysis of 129 fMRI studies on phonological, semantic, and syntactic processing revealed activation in distributed areas predominantly involving left middle and inferior dorsolateral frontal, superior and middle temporal, and inferior parietal cortices (Vigneau et al., 2006). These sites correspond very well with those in which intraoperative cortical stimulation evoked disturbances of language processes, such as anomia, alexia, or speech arrest (e.g., Boatman, 2004; Duffau et al., 2005; Sanai, Mirzadeh, & Berger, 2008).
These perisylvian brain areas in the human left hemisphere do not appear to represent language processes solely. Recent analyses of lesion localization in patients suffering from apraxia suggested that they are also involved in the organization of motor actions (Goldenberg & Karnath, 2006; Goldenberg, Hermsdörfer, Glindeman, Rorden, & Karnath, 2007). Stroke patients with either disturbed pantomime of tool use or with disturbed imitation of finger postures typically showed damage of the left inferior frontal gyrus (IFG) and adjacent portions of the insula, while disturbed imitation of hand postures was associated with posterior lesions affecting the IPL and TPJ. This close anatomical relationship between the representation of praxis on the one hand and language on the other led to the assumption that these left perisylvian areas might represent an observation/execution matching system providing the bridge from “doing” to “communicating” (Rizzolatti & Arbib, 1998; Iacoboni & Wilson, 2006). Its development was seen as a consequence of the fact that before the appearance of speech the precursors of these areas in the monkey were endowed with a mechanism for recognizing actions made by others. This mechanism was seen as the neural prerequisite for the development of interindividual communication and finally of speech (Rizzolatti & Arbib, 1998).
Thus it seems as if very similar anatomical cortical areas straddling the sylvian fissure are involved in representing language and praxis in the human left hemisphere and spatial orienting in the right hemisphere. Recent anatomical studies have revealed that dense white matter connectivity exists specifically between these perisylvian cortical areas. In the following sections, it will be argued that intimately interconnected homologous perisylvian networks have evolved in the human left and right hemispheres serving different cognitive functions, a representation for language and praxis in the left hemisphere and a representation for spatial orienting in the right hemisphere. Further, it will be argued that for these cognitive processes the functioning of the perisylvian cortical areas is critical, not the mere disconnection of their white matter interconnections.
Dense perisylvian white matter connectivity
Beyond traditional axonal-tract-tracing and myelin-staining techniques, the development of diffusion-based imaging (for example, diffusion tensor and diffusion spectrum imaging, DTI/DSI) has opened new opportunities for identifying long-range white matter pathways. By combining the findings from DSI and from histological tract tracing, Schmahmann and colleagues separated ten long, bidirectional association fiber bundles in the monkey brain (Schmahmann & Pandya, 2006; Schmahmann et al., 2007). Anatomical homologies have been described in the human with the aid of diffusion-based imaging in vivo (e.g., Catani, Howard, Pajevic, & Jones, 2002; Catani, Jones, & ffytche, 2005; Catani et al., 2007; Makris et al., 2005, 2007; Mori, Wakana, van Zijl, & Nagae-Poetscher, 2005; Upadhyay, Hallock, Ducros, Kim, & Ronen, 2008) and myelin staining postmortem (Bürgel et al., 2006). In the following, the focus will be on those pathways that connect the perisylvian cortical areas (that is, the superior/middle temporal, inferior parietal, and dorsolateral frontal cortices) where damage has been shown to provoke spatial neglect in the case of right brain lesions and aphasia and/or apraxia after left hemisphere involvement.
In the monkey and human, the superior longitudinal fasciculus (SLF) is the major cortical association fiber pathway linking parietal and frontal cortices. It is subdivided into different, separable components (Petrides & Pandya, 1984; Makris et al., 2005; Schmahmann et al., 2007). One part, the SLF I, is situated dorsal to the perisylvian network area, connecting the superior parietal lobule with dorsal premotor areas. Two further subcomponents connect the IPL with premotor and prefrontal cortices. The SLF II links the IPL and intraparietal sulcus with the posterior and caudal prefrontal cortex, while the SLF III connects the rostral IPL with the ventral part of premotor and prefrontal cortex. A further fiber tract that is separable from these connections stems from the caudal part of the superior/middle temporal gyrus (STG/MTG), arches around the caudal end of the sylvian fissure, and extends to the lateral prefrontal cortex along with the SLF II fibers. This latter fiber tract is termed the arcuate fasciculus (AF) and has been described in humans (Burdach, 1819–26; Dejerine & Dejerine-Klumpke, 1895; Catani et al., 2002, 2005; Makris et al., 2005; Vernooij et al., 2007; Upadhyay et al., 2008) as well as in the monkey (Petrides & Pandya, 1988; Schmahmann et al., 2007). There is some disagreement as to whether this fiber bundle should be regarded as a fourth subdivision of the SLF (SLF IV; Makris et al., 2005; Vernooij et al., 2007), a stand-alone connection adjacent to the SLF (Schmahmann et al., 2007), or only part, namely, the long segment (discussed later), of a three-way AF structure (Catani et al., 2005).
Figure 17.1 Averaged tractography reconstruction for fiber connections between the superior/middle temporal, inferior parietal, and lateral frontal cortices by using a two-region-of-interest approach in (A) the human left hemisphere (Catani, Jones, & ffytche, 2005) and (B) the human right hemisphere (Gharabaghi et al., 2009). A long connection was observed linking superior/middle temporal and lateral frontal cortices (shown in red). Two shorter pathways also were found. The posterior segment running from the superior/middle temporal to the inferior parietal cortex is shown in yellow. The anterior segment running from the inferior parietal to the lateral frontal cortex is shown in green. IPL, inferior parietal lobule; LFC, lateral frontal cortex; STC, superior temporal cortex; MTC, middle temporal cortex. (With modifications from Catani et al., 2005, and from Gharabaghi et al., 2009.) (See color plate 19.)
A fiber bundle situated in close proximity to the SLF II and the AF is the superior occipitofrontal fasciculus (SOF), also termed the (superior) fronto-occipital fasciculus ([S]FOF) by some authors. The SOF forms the medial border of the corticospinal tract and separates it from the lateral ventricles. The fibers run parallel to the dorsolateral margin of the lateral ventricles below the corpus callosum. Some of its fibers intermingle with SLF II and AF fibers. Contrary to what its name suggests, the SOF connects not only occipital but also inferior parietal with frontal lobe areas. In the monkey as well as in humans, this long association bundle bidirectionally extends from the IPL and dorsomedial parastriate occipital cortex to caudal, dorsal, and medial frontal lobe areas (Catani et al., 2002; Bürgel et al., 2006; Makris et al., 2007; Schmahmann et al., 2007).
Two further, ventrally located fiber bundles contribute to the perisylvian network focused on in this chapter, namely, the strong pathway running through the extreme capsule (EmC) and the middle longitudinal fasciculus (MdLF). Both bundles have been described in monkeys (Seltzer & Pandya, 1984; Schmahmann & Pandya, 2006; Schmahmann et al., 2007; Petrides & Pandya, 2007) as well as in humans (Makris & Pandya, 2009; Makris et al., 2009). The EmC is situated between the claustrum and the insular cortex, interconnecting the inferior frontal and orbitofrontal gyri with the midportion of the superior temporal region. It further continues caudally toward the occipital cortex and toward the IPL, flanking here another fiber pathway, namely the MdLF (Makris & Pandya, 2009; Makris et al., 2009). The MdLF is a fiber bundle that runs within the white matter of the superior temporal gyrus extending from the IPL to the temporal pole. Although there is no common agreement yet, it appears as if the EmC corresponds with the bundle termed “inferior occipitofrontal fasciculus (IOF)” [also “inferior frontooccipital fasciculus (IFOF)”] by other authors (Nieuwenhuys, Voogd, & van Huijzen, 1988; Catani et al., 2002; Kier, Staib, Davis, & Bronen, 2004; Wakana, Jiang, Nagae-Poetscher, van Zijl, & Mori, 2004; Bürgel et al., 2006).
For clarification it should be pointed out that Catani and colleagues used a different terminology when they investigated the long perisylvian association fibers of the human left hemisphere (Catani et al., 2002, 2005, 2007). The SLF and AF historically have been regarded as a single fiber bundle in the human (Burdach, 1819–26; Dejerine & Dejerine-Klumpke, 1895). The terms “superior longitudinal fasciculus” and “arcuate fasciculus” thus often were and still are used interchangeably by some authors, including Catani and coworkers. However, despite the different terminology, Catani and colleagues (2005) also found a long and two shorter segments between the superior/middle temporal, inferior parietal, and lateral frontal cortices (figure 17.1A). Tractography reconstruction for a group of 11 healthy subjects revealed a direct connection between the left ventrolateral frontal and the superior/middle temporal cortex, that is, between Broca’s and Wernicke’s language areas. In addition, two shorter pathways were found connecting superior/middle temporal with the inferior parietal
cortex (“posterior segment”) and the inferior parietal with the dorsolateral frontal cortex (“anterior segment”). There is no doubt that the long connection between the superior/middle temporal and inferior frontal cortex represents the fiber bundle that has been termed AF in the work of other groups (Petrides & Pandya, 1988; Makris et al., 2005; Schmahmann et al., 2007; Upadhyay et al., 2008). The “anterior segment” between the inferior parietal and inferior frontal cortex most probably represents the fiber bundle(s) that have been termed SLF II, possibly in combination with the SLF III and/or SOF. The “posterior segment” between the superior temporal and inferior parietal cortex had been assumed to represent the MdLF (Schmahmann et al., 2007). However, the recent work by Makris et al. (2009) in the human argues instead that the MdLF is distinct from and located medial to the SLF-AF fibers.
To analyze the perisylvian connectivity between the superior/middle temporal, inferior parietal, and lateral frontal cortices in the human right hemisphere, Gharabaghi and coworkers (2009) investigated 12 right-handed male subjects without neurological deficits by using the same procedure that Catani and colleagues (2005) applied for the left hemisphere analysis. Figure 17.1B shows the averaged tractography reconstruction obtained from this DTI analysis. It revealed a pattern of fiber connections that largely corresponded to the one demonstrated by Catani and colleagues (2005) in the human left hemisphere (figure 17.1A). While Gharabaghi and coworkers were conducting this analysis, Catani and colleagues published a study (Catani et al., 2007) in which they also had analyzed the perisylvian connectivity in the human right hemisphere. In line with the findings illustrated in figure 17.1B, they found an indirect connection with a posterior segment connecting the superior/middle temporal with the inferior parietal cortex and an anterior segment running from the inferior parietal to the dorsolateral frontal cortex. In contrast, they found the long, direct segment between the superior/middle temporal and the lateral frontal cortices in only about 40% of their individuals, while this segment was present in all subjects (100%) studied by Gharabaghi and colleagues (2009). Likewise, some studies observed largely symmetrical conditions between the human hemispheres for volume, bundle density, and location of the left- and right-sided AF and SLF (Makris et al., 2005; Bürgel et al., 2006; Upadhyay et al., 2008), while discrepant observations have also been reported (Powell et al., 2006; Vernooij et al., 2007; Glasser & Rilling, 2008). The discrepancy between these studies may be attributable to differences in fiber tracking methods, in the choice of the seeding ROIs, and/or in the composition of subject samples. Future studies will have to clarify this issue. However, beyond the discrepant observations regarding the long, dorsally located direct connection via the AF, it is undisputed that the superior/middle temporal, lateral frontal, and inferior parietal cortices show dense direct (via the EmC/IOF) as well as indirect interconnectivity.
To summarize the existing findings from tract-tracing, myelin-staining, and diffusion-based imaging techniques, a dense perisylvian network seems to exist in both hemispheres connecting the inferior parietal lobule with the ventrolateral frontal cortex (via SLF II, SLF III, SOF), ventrolateral frontal cortex with superior/middle temporal cortex (via AF, EmC/IOF), and superior temporal cortex with the inferior parietal lobule (via MdLF, EmC/IOF). Figure 17.2 illustrates these tightly connected perisylvian neural networks.
Figure 17.2 Sketch of the perisylvian neural network linking the inferior parietal lobule with the ventrolateral frontal cortex (via SLF II, SLF III, SOF), ventrolateral frontal cortex with superior/middle temporal cortex and insula (via AF, EmC/IOF), and superior temporal cortex with the inferior parietal lobule (via MdLF, EmC/IOF). SLF II/III, subcomponents II/III of the superior longitudinal fasciculus; SOF, superior occipitofrontal fasciculus; AF, arcuate fasciculus; IOF, inferior occipitofrontal fasciculus; EmC, extreme capsule; MdLF, middle longitudinal fasciculus.
Functional role of the perisylvian network in the human right hemisphere
There is no disagreement that the perisylvian network in the human left hemisphere is involved in language processes (e.g., Frey, Campbell, Pike, & Petrides, 2008; Saur et al., 2008; Catani & Mesulam, 2008; Makris & Pandya, 2009). In contrast, the functional involvement of the right hemisphere perisylvian network is less clear. Catani and colleagues suggested that the perisylvian network in the human right hemisphere might represent, as in the human left hemisphere, a network involved in language functions (Catani et al., 2007, p. 17166). In this chapter, a different view is suggested. The perisylvian pathways between the right IPL, ventrolateral frontal, and superior/middle temporal cortices and insula connect those areas which have repeatedly been associated with spatial neglect in the case of brain damage (Heilman et al., 1983; Vallar & Perani, 1986; Mort et al., 2003; Karnath et al., 2001, 2004; Committeri et al., 2007; Sarri et al., 2009). In contrast, aphasia is only extremely rarely associated with lesions of these right hemisphere perisylvian areas (as rarely as spatial neglect is observed after left hemisphere damage). Thus it is proposed that the perisylvian network in the human right hemisphere represents the anatomical basis for processes involved in spatial orienting and exploration.
Supporting evidence for this hypothesis has been reported from transcranial magnetic stimulation (TMS), electrical mapping of the human cortex during neurosurgery, and fMRI in healthy subjects. Using TMS, Ellison, Schindler, Pattison, and Milner (2004) induced “virtual lesions” at the right STG and right posterior parietal cortex (PPC) in healthy subjects. They observed a specific impairment induced by TMS over the right STG for serial feature search (termed “hard feature search task”). In contrast, TMS over the right PPC resulted in increased reaction times during “hard conjunction search.” Gharabaghi, Fruhmann-Berger, Tatagiba, and Karnath (2006) observed that intraoperative inactivation of the middle portion of the STG in humans leads to disturbed serial visual search. Using the same technique, Thiebaut de Schotten and colleagues (2005) found that inactivating regions in the right IPL or at the caudal and the middle parts of the STG leads to deficits in the perception of line length.
Evidence for the involvement of superior temporal, inferior parietal, and lateral frontal areas in processes of spatial orienting has also been obtained from fMRI experiments in healthy subjects. In a cued spatial-attention task, Hopfinger,
array) that closely resembled the clinical procedure employed to detect spatial neglect in stroke patients. The authors observed significant activation associated with visual exploration located at the TPJ, the midportion of the STG, and the IFG.
Thus observations deriving from different techniques converge to suggest that the densely interconnected perisylvian neural system in the human right hemisphere (figure 17.2) represents the anatomical basis of processes that are involved in spatial orientation, provoking spatial neglect in the case of damage. The tight anatomical connectivity between superior/middle temporal, inferior parietal, and ventrolateral frontal cortices might explain why lesions at these distant cortical sites around the sylvian fissure in the human right hemisphere can lead to the same disturbance of orienting behavior, namely, to spatial neglect.
Spatial neglect: A disconnection syndrome?
Beginning with the seminal work of Dejerine and his wife (Dejerine & Dejerine-Klumpke, 1895) on the human cortical pathways, several authors developed the idea that some neurological conditions might result from the disconnection of one area of the brain from another. Among them, Geschwind (1965) put forward the view that several neuropsychological disorders could best be interpreted as resulting from interruption of specific cortical association pathways. With respect to spatial neglect, Mesulam and Geschwind (1978) suggested that this disorder, among other disorders of attention and emotion, results from disruption of neural connections between limbic structures and neocortex. Mesulam (1981, 1985) further evolved this concept, suggesting that an interconnected network between posterior parietal, frontal, and cingulate cortices as well as the reticular formation is involved in spatial neglect. A disconnection hypothesis has also been put forward by Watson, Heilman, Miller, and King (1974) and Watson, Miller, and Heilman (1978) when observing that spatial neglect can be evoked in the monkey by a lesion in the mesencephalic reticular formation.
More recently, some authors have revived the concept to view spatial neglect as a “disconnection syndrome” (Catani, 2006; Bartolomeo, Thiebaut de Schotten, & Doricchi, 2007; He et al., 2007). Bartolomeo and colleagues proposed that long-lasting signs of spatial neglect result from frontoparietal intrahemispheric and from interhemispheric disconnection. They suggested that “a particular form of disconnection might have greater predictive value than the localization of gray matter lesions concerning the patients’ deficits and dis-
Buonocore, and Mangun (2000) found bilateral activation abilities” (Bartolomeo et al., 2007, p. 2484). Intrahemispheri-
in these cortical areas correlated with covert attentional cally, they related disconnection of the SLF (Thiebaut de
shifts in the horizontal dimension of space. Himmelbach, Schotten et al., 2005, 2008; Bartolomeo et al., 2007) but also
Erb, and Karnath (2006) investigated active visual explora- of the IOF (Urbanski et al., 2008) to spatial neglect. Using
tion in healthy subjects, using a task (visual search in a letter DTI tractography, He and colleagues found damage to the
SLF and AF in five patients with severe spatial neglect but not in five patients with mild cases. Furthermore, the analysis of interregional functional connectivity, based on coherent fluctuations of fMRI signals, suggested that not only anatomically but also functionally disrupted connectivity in dorsal and ventral attention networks might constitute a critical mechanism underlying the pathophysiology of spatial neglect (He et al., 2007).

To investigate the possible impact of damage to white matter association fibers on the genesis of spatial neglect, Karnath, Rorden, and Ticini (2009) analyzed lesion location in a large seven-year sample of 140 right-hemispheric stroke patients. This large number of stroke patients allowed the authors not only to study a representative sample of subjects with spatial neglect, but also to perform a statistical voxelwise lesion-behavior mapping (VLBM) analysis (e.g., Bates et al., 2003; Rorden, Karnath, & Bonilha, 2007) to estimate which brain regions are more frequently compromised in neglect patients relative to patients without neglect. Karnath and coworkers (2009) studied the patients’ white matter connectivity by using a new method that combines a statistical VLBM approach with the histological maps of the human white matter fiber tracts provided by the stereotaxic probabilistic atlas developed by the Jülich group (Amunts & Zilles, 2001; Zilles, Schleicher, Palomero-Gallagher, & Amunts, 2002). In contrast to the reference brain of the Talairach and Tournoux atlas (Talairach & Tournoux, 1988) or the MNI single-subject or group templates (Evans et al., 1992; Collins, Neelin, Peters, & Evans, 1994), the Jülich probabilistic atlas is based on the analysis of the cytoarchitecture in a sample of 10 different human postmortem brains. It thus provides information on the location and intersubject variability of brain structures, illustrating for each voxel of the MNI reference space the relative frequency with which a certain structure was present in 10 normal human brains. Using a modified myelin-staining technique, Bürgel and colleagues (2006) were able to distinguish 10 individual white matter fiber tracts for this atlas at microscopic resolution.

The analysis of the 140 right-hemisphere stroke patients revealed that 7.0% of the right SLF, 8.2% of the IOF, 12.7% of the SOF, and only 0.6% of the uncinate fasciculus were significantly more affected in patients with spatial neglect than in those not showing the disorder (figure 17.3). The authors concluded that damage of right perisylvian white matter connections is a typical finding in patients with spatial neglect. However, the proportion of involvement of each of the fiber bundles was very low. When the authors analyzed how much of the lesion area in neglect patients overlapped with all of the perisylvian white matter connections, they found an overlap between 3.4% and 10.9% (Karnath et al., 2009).

Although the study by Karnath and coworkers (2009) cannot conclusively decide whether or not spatial neglect should best be interpreted as a “disconnection syndrome” (Mesulam & Geschwind, 1978; Watson et al., 1974; Watson, Miller, & Heilman, 1978; Catani, 2006; Bartolomeo et al., 2007; He et al., 2007), one may conclude that their data argue more against than in favor of such a hypothesis. In fact, their analysis revealed that between 89.1% and 96.6% of the lesion area in spatial neglect affected brain structures other than the perisylvian white matter fiber tracts, namely, cortical and subcortical gray matter structures such as the superior temporal, inferior parietal, inferior frontal, and insular cortices, as well as the putamen and caudate nucleus (Karnath et al., 2009). Damage to these gray matter structures in the right hemisphere thus appears to be a strong predictor of spatial neglect.

Another line of evidence arguing against the view of spatial neglect as a white matter disconnection syndrome in the traditional sense comes from the perfusion-weighted imaging (PWI) results obtained in patients with subcortical infarcts. Perfusion-weighted imaging is an MR technique that allows the identification of brain regions that are receiving enough blood supply to remain structurally intact but not enough to function normally. By using this technique, several studies showed that left- or right-sided subcortical lesions, including selective white matter strokes, cause spatial neglect only if the subcortical damage provokes additional malperfusion of cortical gray matter structures in the ipsilesional hemisphere (Demeurisse, Hublet, Paternot, Colson, & Serniclaes, 1997; Hillis et al., 2002, 2005). Without this malfunction of cortical structures, subcortical brain lesions did not provoke disturbances of spatial orienting. Thus it seems that damage to subcortical white matter connectivity alone does not provoke spatial neglect but rather requires additional malfunction of cortical gray matter structures.

Conclusions

Homologous perisylvian neural networks seem to exist in the human left and right hemispheres, composed of tightly connected cortical areas straddling the sylvian fissure (cf. figure 17.2). It is suggested that the neural network consisting of superior/middle temporal, inferior parietal, and ventrolateral frontal cortices in the human right hemisphere represents the anatomical basis for processes involved in spatial orienting. Neurons of these regions provide us with redundant information about the position and motion of our body in space. They seem to play an essential role in adjusting body position relative to external space (Karnath & Dieterich, 2006). Damage to this perisylvian system in the right hemisphere may provoke spatial neglect. In the human left hemisphere, a similar perisylvian network seems to exist but serves different functions, namely, language and praxis. This functional specialization of left and right perisylvian networks is still not observed in the nonhuman
Figure 17.3 Overlap of the statistical VLBM lesion map (the brain territory significantly more affected in 78 patients with spatial neglect than in 62 stroke patients without this disorder) with the probabilistic, cytoarchitectonic maps of the white matter association fiber tracts from the Jülich atlas. The statistical lesion map is illustrated in homogeneous brown color. The color coding of the Jülich atlas from 1 (dark blue, observed in 1 postmortem brain) to 10 (red, overlap in all ten postmortem brains) represents the absolute frequency with which a given fiber tract was present in each voxel of the brain (e.g., yellow color indicates that the fiber tract was present in that voxel in seven out of ten postmortem brains). The pink contour demarcates the area of the fiber tracts affected by the statistical lesion map. (A) Overlap illustrated for perisylvian fiber tracts SLF, superior longitudinal fasciculus; IOF, inferior occipitofrontal fasciculus; and SOF, superior occipitofrontal fasciculus. (B) Overlap illustrated for fiber tracts CT, corticospinal tract; AR, acoustic radiation; and UF, uncinate fascicle. (From Karnath et al., 2009.) (See color plate 20.)
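The two overlap measures discussed for figure 17.3 (the share of a fiber tract covered by the statistical lesion map, and the share of the lesion map that lies on the tract) can be illustrated with a minimal sketch. This is not the analysis pipeline of Karnath et al. (2009); the function name `tract_overlap`, the `min_brains` threshold, and the toy 20-voxel data are hypothetical and serve only to show the arithmetic.

```python
def tract_overlap(lesion_map, tract_counts, min_brains=1):
    """Percent overlap between a binary lesion map and a probabilistic tract map.

    lesion_map   -- sequence of 0/1 flags, one per voxel (hypothetical toy input)
    tract_counts -- number of postmortem brains (0-10) showing the tract per voxel
    min_brains   -- assumed threshold for counting a voxel as part of the tract
    Returns (percent of tract voxels inside the lesion map,
             percent of lesion voxels inside the tract).
    """
    if len(lesion_map) != len(tract_counts):
        raise ValueError("maps must cover the same voxels")
    lesion = [bool(v) for v in lesion_map]
    tract = [c >= min_brains for c in tract_counts]
    n_lesion, n_tract = sum(lesion), sum(tract)
    if n_lesion == 0 or n_tract == 0:
        raise ValueError("lesion map and tract must each contain at least one voxel")
    shared = sum(1 for l, t in zip(lesion, tract) if l and t)
    return 100.0 * shared / n_tract, 100.0 * shared / n_lesion


# Toy one-dimensional "volume" of 20 voxels (illustrative values only).
lesion = [1] * 10 + [0] * 10                    # voxels 0-9 belong to the lesion map
tract = [0] * 9 + [3, 7, 10, 7, 2] + [0] * 6    # tract occupies voxels 9-13

pct_of_tract, pct_of_lesion = tract_overlap(lesion, tract)
print(pct_of_tract)           # 20.0: one of five tract voxels is affected
print(pct_of_lesion)          # 10.0: one of ten lesion voxels lies on the tract
print(100.0 - pct_of_lesion)  # 90.0: remainder of the lesion map lies elsewhere
```

In this toy case a fifth of the tract is affected, yet nine tenths of the lesion map falls outside it. The figures reported in the text have the same structure: an overlap of 3.4% to 10.9% with the perisylvian tracts leaves 89.1% to 96.6% of the lesion area in other structures.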
primate. Here, lesions of this perisylvian system in both hemispheres induce disturbed exploration and orientation toward the respective contralateral side (e.g., Luh, Butter, & Buchtel, 1986; Watson, Valenstein, Day, & Heilman, 1994; Wardak, Olivier, & Duhamel, 2002, 2004). Hence the phylogenetic transition from monkey to human brain seems to be a restriction of a formerly bilateral function represented within right- and left-sided perisylvian networks to the right hemisphere (Karnath et al., 2001). It appears as if this lateralization of spatial orientation to the right hemisphere network parallels the emergence of an elaborate representation for language in the left-sided perisylvian network.

Acknowledgments This work was supported by the Bundesministerium für Bildung und Forschung (BMBF-Verbundprojekt “Räumliche Orientierung” 01GW0641) and the Deutsche Forschungsgemeinschaft (SFB 550-A4). I would like to thank Bianca de Haan and Marc Himmelbach for their discussion and helpful comments on the manuscript.

REFERENCES
Amunts, K., & Zilles, K. (2001). Advances in cytoarchitectonic mapping of the human cerebral cortex. Neuroimaging Clin. N. Am., 11, 151–169.
Bartolomeo, P., Thiebaut de Schotten, M., & Doricchi, F. (2007). Left unilateral neglect as a disconnection syndrome. Cereb. Cortex, 17, 2479–2490.
Bates, E., Wilson, S. M., Saygin, A. P., Dick, F., Sereno, M. I., Knight, R. T., et al. (2003). Voxel-based lesion-symptom mapping. Nat. Neurosci., 6, 448–450.
Behrmann, M., Watt, S., Black, S. E., & Barton, J. J. (1997). Impaired visual search in patients with unilateral neglect: An oculographic analysis. Neuropsychologia, 35, 1445–1458.
Boatman, D. (2004). Cortical bases of speech perception: Evidence from functional lesion studies. Cognition, 92, 47–65.
Borovsky, A., Saygin, A. P., Bates, E., & Dronkers, N. (2007). Lesion correlates of conversational speech production deficits. Neuropsychologia, 45, 2525–2533.
Burdach, K. F. (1819–1826). Vom Baue und Leben des Gehirns. Leipzig: Dyk.
Bürgel, U., Amunts, K., Hoemke, L., Mohlberg, H., Gilsbach, J. M., & Zilles, K. (2006). White matter fiber tracts of the human brain: Three-dimensional mapping at microscopic resolution, topography and intersubject variability. NeuroImage, 29, 1092–1105.
Buxbaum, L. J., Ferraro, M. K., Veramonti, T., Farne, A., Whyte, J., Ladavas, E., et al. (2004). Hemispatial neglect: Subtypes, neuroanatomy, and disability. Neurology, 62, 749–756.
Catani, M. (2006). Diffusion tensor magnetic resonance imaging tractography in cognitive disorders. Curr. Opin. Neurol., 19, 599–606.
Catani, M., Allin, M. P. G., Husain, M., Pugliese, L., Mesulam, M.-M., Murray, R. M., et al. (2007). Symmetries in human brain language pathways correlate with verbal recall. Proc. Natl. Acad. Sci. USA, 104, 17163–17168.
Catani, M., Howard, R. J., Pajevic, S., & Jones, D. K. (2002). Virtual in vivo interactive dissection of white matter fasciculi in the human brain. NeuroImage, 17, 77–94.
Catani, M., Jones, D. K., & Ffytche, D. H. (2005). Perisylvian language networks of the human brain. Ann. Neurol., 57, 8–16.
Catani, M., & Mesulam, M. (2008). The arcuate fasciculus and the disconnection theme in language and aphasia: History and current state. Cortex, 44, 953–961.
Collins, D. L., Neelin, P., Peters, T. M., & Evans, A. C. (1994). Automatic 3D intersubject registration of MR volumetric data in standardized Talairach space. J. Comput. Assist. Tomogr., 18, 192–205.
Committeri, G., Pitzalis, S., Galati, G., Patria, F., Pelle, G., Sabatini, U., et al. (2007). Neural bases of personal and extrapersonal neglect in humans. Brain, 130, 431–441.
Corbetta, M., Kincade, M. J., Lewis, C., Snyder, A. Z., & Sapir, A. (2005). Neural basis and recovery of spatial attention deficits in spatial neglect. Nat. Neurosci., 8, 1603–1610.
Dejerine, J., & Dejerine-Klumpke, A. M. (1895). Anatomie des centres nerveux. Paris: Rueff et Cie.
Demeurisse, G., Hublet, C., Paternot, J., Colson, C., & Serniclaes, W. (1997). Pathogenesis of subcortical visuospatial neglect: A HMPAO SPECT study. Neuropsychologia, 35, 731–735.
Dronkers, N. F., Wilkins, D. P., Van Valin, R. D., Jr., Redfern, B. B., & Jaeger, J. J. (2004). Lesion analysis of the brain areas involved in language comprehension. Cognition, 92, 145–177.
Duffau, H., Gatignol, P., Mandonnet, E., Peruzzi, P., Tzourio-Mazoyer, N., & Capelle, L. (2005). New insights into the anatomo-functional connectivity of the semantic system: A study using cortico-subcortical electrostimulations. Brain, 128, 797–810.
Ellison, A., Schindler, I., Pattison, L. L., & Milner, A. D. (2004). An exploration of the role of the superior temporal gyrus in visual search and spatial perception using TMS. Brain, 127, 2307–2315.
Evans, A. C., Marrett, S., Neelin, P., Collins, L., Worsley, K., Dai, W., et al. (1992). Anatomical mapping of functional activation in stereotactic coordinate space. NeuroImage, 1, 43–53.
Frey, S., Campbell, J. S., Pike, G. B., & Petrides, M. (2008). Dissociating the human language pathways with high angular resolution diffusion fiber tractography. J. Neurosci., 28, 11435–11444.
Fruhmann-Berger, M., & Karnath, H.-O. (2005). Spontaneous eye and head position in patients with spatial neglect. J. Neurol., 252, 1194–1200.
Fruhmann-Berger, M., Pross, R. D., Ilg, U. J., & Karnath, H.-O. (2006). Deviation of eyes and head in acute cerebral stroke. BMC Neurol., 6, 23; corrigendum, 6, 49.
Geschwind, N. (1965). Disconnexion syndromes in animals and man. Brain, 88, 237–294, 585–644.
Gharabaghi, A., Fruhmann-Berger, M., Tatagiba, M., & Karnath, H.-O. (2006). The role of the right superior temporal gyrus in visual search—Insights from intraoperative electrical stimulation. Neuropsychologia, 44, 2578–2581; corrigendum, 45, 465.
Gharabaghi, A., Kunath, F., Erb, M., Saur, R., Heckl, S., Tatagiba, M., et al. (2009). Perisylvian white matter connectivity in the human right hemisphere. BMC Neurosci., 10, 15.
Glasser, M. F., & Rilling, J. K. (2008). DTI tractography of the human brain’s language pathways. Cereb. Cortex, 18, 2471–2482.
Goldenberg, G., Hermsdörfer, J., Glindemann, R., Rorden, C., & Karnath, H.-O. (2007). Pantomime of tool use depends on integrity of left inferior frontal cortex. Cereb. Cortex, 17, 2769–2776.
Goldenberg, G., & Karnath, H.-O. (2006). The neural basis of imitation is body part specific. J. Neurosci., 26, 6282–6287.
He, B. J., Snyder, A. Z., Vincent, J. L., Epstein, A., Shulman, G. L., & Corbetta, M. (2007). Breakdown of functional connectivity in frontoparietal networks underlies behavioral deficits in spatial neglect. Neuron, 53, 905–918.
Heilman, K. M., Watson, R. T., Valenstein, E., & Damasio, A. R. (1983). Localization of lesions in neglect. In A. Kertesz (Ed.), Localization in neuropsychology (pp. 471–492). New York: Academic Press.
Hillis, A. E., Newhart, M., Heidler, J., Barker, P. B., Herskovits, E. H., & Degaonkar, M. (2005). Anatomy of spatial attention: Insights from perfusion imaging and hemispatial neglect in acute stroke. J. Neurosci., 25, 3161–3167.
Hillis, A. E., Wityk, R. J., Barker, P. B., Beauchamp, N. J., Gailloud, P., Murphy, K., et al. (2002). Subcortical aphasia and neglect in acute stroke: The role of cortical hypoperfusion. Brain, 125, 1094–1104.
Himmelbach, M., Erb, M., & Karnath, H.-O. (2006). Exploring the visual world: The neural substrate of spatial orienting. NeuroImage, 32, 1747–1759.
Hopfinger, J. B., Buonocore, M. H., & Mangun, G. R. (2000). The neural mechanisms of top-down attentional control. Nat. Neurosci., 3, 284–291.
Husain, M., & Kennard, C. (1996). Visual neglect associated with frontal lobe infarction. J. Neurol., 243, 652–657.
Iacoboni, M., & Wilson, S. M. (2006). Beyond a single area: Motor control and language within a neural architecture encompassing Broca’s area. Cortex, 42, 503–506.
Karnath, H.-O., & Dieterich, M. (2006). Spatial neglect—a vestibular disorder? Brain, 129, 293–305.
Karnath, H.-O., Ferber, S., & Himmelbach, M. (2001). Spatial awareness is a function of the temporal not the posterior parietal lobe. Nature, 411, 950–953.
Karnath, H.-O., Fruhmann-Berger, M., Küker, W., & Rorden, C. (2004). The anatomy of spatial neglect based on voxelwise statistical analysis: A study of 140 patients. Cereb. Cortex, 14, 1164–1172.
Karnath, H.-O., Niemeier, M., & Dichgans, J. (1998). Space exploration in neglect. Brain, 121, 2357–2367.
Karnath, H.-O., Rorden, C., & Ticini, L. F. (2009). Damage to white matter fiber tracts in acute spatial neglect. Cereb. Cortex, in press.
Kertesz, A., Harlock, W., & Coates, R. (1979). Computer tomographic localization, lesion size, and prognosis in aphasia and nonverbal impairment. Brain Lang., 8, 34–50.
Kier, E. L., Staib, L. H., Davis, L. M., & Bronen, R. A. (2004). MR imaging of the temporal stem: Anatomic dissection tractography of the uncinate fasciculus, inferior occipitofrontal fasciculus, and Meyer’s Loop of the optic radiation. Am. J. Neuroradiol., 25, 677–691.
Kreisler, A., Godefroy, O., Delmaire, C., Debachy, B., Leclercq, M., Pruvo, J.-P., et al. (2000). The anatomy of aphasia revisited. Neurology, 54, 1117–1123.
Luh, K. E., Butter, C. M., & Buchtel, H. A. (1986). Impairments in orienting to visual stimuli in monkeys following unilateral lesions of the superior sulcal polysensory cortex. Neuropsychologia, 24, 461–470.
Makris, N., Kennedy, D. N., McInerney, S., Sorensen, A. G., Wang, R., Caviness, V. S., Jr., et al. (2005). Segmentation of subcomponents within the superior longitudinal fascicle in humans: A quantitative, in vivo, DT-MRI study. Cereb. Cortex, 15, 854–869.
Makris, N., & Pandya, D. N. (2009). The extreme capsule in humans and rethinking of the language circuitry. Brain Struct. Funct., 213, 343–358.
Makris, N., Papadimitriou, G. M., Kaiser, J. R., Sorg, S., Kennedy, D. N., & Pandya, D. N. (2009). Delineation of the middle longitudinal fascicle in humans: A quantitative, in vivo, DT-MRI study. Cereb. Cortex, 19, 777–785.
Makris, N., Papadimitriou, G. M., Sorg, S., Kennedy, D. N., Caviness, V. S., & Pandya, D. N. (2007). The occipitofrontal fascicle in humans: A quantitative, in vivo, DT-MRI study. NeuroImage, 37, 1100–1111.
Mesulam, M.-M. (1981). A cortical network for directed attention and unilateral neglect. Ann. Neurol., 10, 309–325.
Mesulam, M.-M. (1985). Attention, confusional states, and neglect. In M.-M. Mesulam (Ed.), Principles of behavioral neurology (pp. 125–168). Philadelphia: F. A. Davis.
Mesulam, M.-M., & Geschwind, N. (1978). On the possible role of neocortex and its limbic connections in the process of attention and schizophrenia: Clinical cases of inattention in man and experimental anatomy in monkey. J. Psychiatr. Res., 14, 249–259.
Mori, S., Wakana, S., van Zijl, P. C. M., & Nagae-Poetscher, L. M. (2005). MRI atlas of human white matter. Amsterdam: Elsevier.
Mort, D. J., Malhotra, P., Mannan, S. K., Rorden, C., Pambakian, A., Kennard, C., et al. (2003). The anatomy of visual neglect. Brain, 126, 1986–1997.
Nieuwenhuys, R., Voogd, J., & van Huijzen, C. (1988). The human central nervous system. Berlin: Springer.
Petrides, M., & Pandya, D. N. (1984). Projections to the frontal cortex from the posterior parietal region in the rhesus monkey. J. Comp. Neurol., 228, 105–116.
Petrides, M., & Pandya, D. N. (1988). Association fiber pathways to the frontal cortex from the superior temporal region in the rhesus monkey. J. Comp. Neurol., 273, 52–66.
Petrides, M., & Pandya, D. N. (2007). Efferent association pathways from the rostral prefrontal cortex in the macaque monkey. J. Neurosci., 27, 11573–11586.
Poeck, K., de Bleser, R., & Graf von Keyserlingk, D. (1984). Computed tomography localization of standard aphasic syndromes. In F. C. Rose (Ed.), Advances in neurology, Vol. 42: Progress in aphasiology (pp. 71–89). New York: Raven Press.
Powell, H. W. R., Parker, G. J. M., Alexander, D. C., Symms, M. R., Boulby, P. A., Wheeler-Kingshott, C. A. M., et al. (2006). Hemispheric asymmetries in language-related pathways: A combined functional MRI and tractography study. NeuroImage, 32, 388–399.
Rizzolatti, G., & Arbib, M. A. (1998). Language within our grasp. Trends Neurosci., 21, 188–194.
Rorden, C., Karnath, H.-O., & Bonilha, L. (2007). Improving lesion-symptom mapping. J. Cogn. Neurosci., 19, 1081–1088.
Sanai, N., Mirzadeh, Z., & Berger, M. S. (2008). Functional outcome after language mapping for glioma resection. N. Engl. J. Med., 358, 18–27.
Sarri, M., Greenwood, R., Kalra, L., & Driver, J. (2009). Task-related modulation of visual neglect in cancellation tasks. Neuropsychologia, 47, 91–103.
Saur, D., Kreher, B. W., Schnell, S., Kümmerer, D., Kellmeyer, P., Vry, M. S., et al. (2008). Ventral and dorsal pathways for language. Proc. Natl. Acad. Sci. USA, 105, 18035–18040.
Schmahmann, J. D., & Pandya, D. N. (2006). Fiber pathways of the brain. New York: Oxford University Press.
Schmahmann, J. D., Pandya, D. N., Wang, R., Dai, G., D’Arceuil, H. E., de Crespigny, A. J., et al. (2007). Association fiber pathways of the brain: Parallel observations from diffusion spectrum imaging and autoradiography. Brain, 130, 630–653.
Seltzer, B., & Pandya, D. N. (1984). Further observations on parieto-temporal connections in the rhesus monkey. Exp. Brain Res., 55, 301–312.
Talairach, J., & Tournoux, P. (1988). Co-planar stereotaxic atlas of the human brain. Stuttgart: Thieme.
Thiebaut de Schotten, M., Kinkingnéhun, S., Delmaire, C., Lehéricy, S., Duffau, H., Thivard, L., et al. (2008). Visualization of disconnection syndromes in humans. Cortex, 44, 1097–1103.
Thiebaut de Schotten, M., Urbanski, M., Duffau, H., Volle, E., Lévy, R., Dubois, B., et al. (2005). Direct evidence for a parietal-frontal pathway subserving spatial awareness in humans. Science, 309, 2226–2228.
Upadhyay, J., Hallock, K., Ducros, M., Kim, D.-S., & Ronen, I. (2008). Diffusion tensor spectroscopy and imaging of the arcuate fasciculus. NeuroImage, 39, 1–9.
Urbanski, M., Thiebaut de Schotten, M., Rodrigo, S., Catani, M., Oppenheim, C., Touzé, E., et al. (2008). Brain networks of spatial awareness: Evidence from diffusion tensor imaging tractography. J. Neurol. Neurosurg. Psychiatry, 79, 598–601.
Vallar, G., & Perani, D. (1986). The anatomy of unilateral neglect after right-hemisphere stroke lesions: A clinical/CT-scan correlation study in man. Neuropsychologia, 24, 609–622.
Vernooij, M. W., Smits, M., Wielopolski, P. A., Houston, G. C., Krestin, G. P., & van der Lugt, A. (2007). Fiber density asymmetry of the arcuate fasciculus in relation to functional hemispheric language lateralization in both right- and left-handed healthy subjects: A combined fMRI and DTI study. NeuroImage, 35, 1064–1076.
Vigneau, M., Beaucousin, V., Hervé, P. Y., Duffau, H., Crivello, F., Houdé, O., et al. (2006). Meta-analyzing left hemisphere language areas: Phonology, semantics, and sentence processing. NeuroImage, 30, 1414–1432.
Wakana, S., Jiang, H., Nagae-Poetscher, L. M., van Zijl, P. C. M., & Mori, S. (2004). Fiber tract-based atlas of human white matter anatomy. Radiology, 230, 77–87.
Wardak, C., Olivier, E., & Duhamel, J.-R. (2002). Neglect in monkeys: Effect of permanent and reversible lesions. In H.-O. Karnath, A. D. Milner, & G. Vallar (Eds.), The cognitive and neural bases of spatial neglect (pp. 101–118). Oxford, UK: Oxford University Press.
Wardak, C., Olivier, E., & Duhamel, J.-R. (2004). A deficit in covert attention after parietal cortex inactivation in the monkey. Neuron, 42, 501–508.
Watson, R. T., Heilman, K. M., Miller, B. D., & King, F. A. (1974). Neglect after mesencephalic reticular formation lesions. Neurology, 24, 294–298.
Watson, R. T., Miller, B. D., & Heilman, K. M. (1978). Nonsensory neglect. Ann. Neurol., 3, 505–508.
Watson, R. T., Valenstein, E., Day, A., & Heilman, K. M. (1994). Posterior neocortical systems subserving awareness and neglect. Arch. Neurol., 51, 1014–1021.
Zilles, K., Schleicher, A., Palomero-Gallagher, N., & Amunts, K. (2002). Quantitative analysis of cyto- and receptor architecture of the human brain. In J. C. Mazziotta & A. Toga (Eds.), Brain mapping: The methods (pp. 573–602). Amsterdam: Elsevier.
18 Spatial Deficits and Selective Attention

Lynn C. Robertson

Lynn C. Robertson, Veterans Administration Research, Department of Psychology, and Helen Wills Neuroscience Institute, University of California, Berkeley, California

Abstract The focus of this chapter is on spatial deficits that produce a complete or partial loss of spatial awareness of the visual world after damage to the dorsal pathway of the human brain. Not surprisingly, when spatial awareness is deficient, controlling spatial attention is also compromised. Yet even with complete loss of spatial information of the external world, object-based and feature-based processes continue to influence what is seen. Nevertheless, features may be inaccurately bound to form abnormal rates of illusory conjunctions even under free viewing conditions. There also is emerging evidence from priming studies that conjunctions are bound late in processing whereas features are coded early. How the multiple spatial representations in the brain may interact to influence selection is also discussed.

The influential 18th-century philosopher Immanuel Kant claimed that space and time were the two necessary mental concepts supporting all other human experience. A mental representation of space separates sensory experience occurring at the same time into different entities that are spatially segregated yet related to one another, while a mental representation of time separates sequentially presented information into segmented events. Kant himself wrote, “We never can imagine or make a representation to ourselves of the nonexistence of space.”

Although it is very nearly impossible to imagine a world in which space does not exist, there are individuals with damage to certain brain areas who must contend with the loss of spatial perception on a daily basis. These are neurological patients who have suffered unilateral damage to parietal (and/or less often frontal or superior temporal) areas, producing unilateral neglect, and those with bilateral parietal damage, which can produce a complete loss of spatial information beyond a person’s own body (Balint’s syndrome). Studies of such individuals have shown that certain perceptual experiences remain relatively intact, but others are altered or lost altogether. Contrary to Kant’s claims, a mental representation of space is not necessary for all perceptual phenomena, and the exceptions provide insights into the cognitive and neurobiological bases of spatial representations and their interactions with perception and other attention mechanisms.

The loss of perceptual space

I will begin by classifying spatial deficits in behavioral neurology into three general classes: complete (there is no there there), partial (only a portion is there), and scrambled (here when it should be there). Complete, or nearly complete, loss of a mental spatial map can be observed in Balint’s syndrome (Balint, 1909; Holmes & Horax, 1919; Rafal, 1997), partial loss can be observed in unilateral neglect (Heilman, Watson, & Valenstein, 1994; Bartolomeo & Chokron, 2001), and scrambling is seen in integrative agnosia, where stimuli are processed piecemeal, producing a fragmented percept of parts with little overall coherence (Riddoch & Humphreys, 1987a). Although any of these can be observed in more than one modality, the deficits appear most often, or at least are most obvious, in vision.

All three types of spatial loss have been associated with (although not necessarily limited to) posterior damage of the human brain, with complete and partial loss more prevalent after dorsal damage (although see Karnath, Ferber, & Himmelbach, 2001), whereas spatial scrambling is more prevalent after ventral damage. Complete loss occurs after bilateral dorsal damage, and partial loss occurs after unilateral damage. Also, partial loss and scrambling are more likely to occur after right than left hemisphere damage (Heilman et al., 1994; Ivry & Robertson, 1998; Mesulam, 1981). The common denominator for complete and partial loss is damage to the parietal lobe, but the areas involved are different depending on the nature of the spatial deficits. In fact, several researchers have suggested that the critical areas that produce unilateral neglect (a partial loss of space) are centered in the temporal-parietal junction and the inferior parietal lobe (e.g., Heilman et al., 1994; Mort et al., 2003, but see Karnath, chapter 17 in this volume). Conversely, a recent review of this literature has made a compelling argument for unilateral neglect as a disconnection syndrome. When lesions include the white matter tracts that connect posterior regions to the frontal lobe, neglect is more severe and more likely to be
Figure 18.2 Object-based effects (reaction time difference between the two invalid conditions, between minus within) for rectangles perceived as holes and those perceived as objects when the background was split and when it was completed (top). Mean reaction time for each invalid condition (bottom). The effects were significant for all but the completed background/holes condition.
of the screen. The two objects were superimposed and appeared transparent. On each trial either the house or the face moved slightly. While making judgments about the motion, there was more activity in ventral areas of the brain that respond to houses (parahippocampal place area) when the houses moved and more activity in the areas of the brain that respond to faces (fusiform face area) when faces moved. Attentional selection for motion (a feature that drives different areas of the brain than houses or faces) incorporated the object that was moving as well.
If object- and space-based attention utilize different neural systems, complete loss of space with Balint’s syndrome should attract attention to objects, but there should be little or no knowledge of where they are located, and this result is what occurs. Individuals with this problem see only one object at any given moment (known as simultanagnosia), but they can be at chance in locating it. Neither reaching, pointing, nor verbally reporting where the object is located is accurate. Even spatial judgments such as saying whether the object is toward the top or bottom of the screen when they are several inches apart or toward the patient’s own head or feet can be near chance levels. In addition, an object in a visual display will automatically and seemingly randomly attract attention with no control over how long the object remains in view or what object will take its place. The one object that is seen abruptly disappears and is replaced by another object (which need not be in the line of sight and can be large, small, simple, or complex). This new object is then perceived for a time, until another object takes its place. The computations needed to individuate objects or to define their locations are absent, and the objects that attract attention can come from anywhere in the visual scene. Although patients can move their eyes in any given direction on command, they only rarely do so unless instructed by the observer. They suffer from what Balint (1909) called a “pseudo paralysis of gaze.” It is as if the one object they do see fills their entire visual field. Consistently, RM, the Balint’s patient whom we studied for several years in my laboratory, reported that the size of objects did not appear as they should be.
The fact that an object as a whole can be perceived at all when spatial deficits are nearly complete is very strong evidence for an object-based system that is separate from a space-based one. However, when the spatial map that individuates objects is absent, voluntary movement of attention between objects is also a problem. It is as if there is a lineup of objects that compete for object selection when spatial information is gone.
Balint’s Syndrome and Feature-Based Attention
Feature-based attention is also well supported in the perceptual literature. It too is thought to be separate from space-based as well as object-based attention. Indeed, attending to a feature in one location makes it difficult to
that guides it is damaged. Other studies with Balint’s patients have corroborated this conclusion. For instance, utilizing a cue to attend to a given location in anticipation of an upcoming target (Posner, 1980) is all but impossible (L. Robertson & Rafal, 2000), and shifting attention from local to global levels of a display is disrupted as well.
Nevertheless, exogenous spatial orienting appears to be intact, and priming measures have shown that shape at a global level of a hierarchically structured pattern is implicitly represented even when it is not perceived (Egly, Robertson, Rafal, & Grabowecky, 1995; Karnath, Ferber, Rorden, & Driver, 2000). In addition, priming studies have shown that implicit spatial information is present in Balint’s patients (Kim & Robertson, 2001; L. Robertson et al., 1997). For example, RM read the word “up” faster when it was in the upper part of a rectangle than the word “down” in the same location and vice versa, while he was at chance in reporting the word’s location. This result leads to the question of why these spatial maps are not accessible after damage to bilateral occipital-parietal areas.
There is ample evidence from neurobiological studies for the existence of multiple spatial maps throughout the visual cortex. Topographical tuning of visual neural responses in animals and fMRI BOLD responses in humans have been mapped from the fine spatial resolution of V1 to the rough spatial resolution of the parietal lobes, as well as in posterior temporal lobes and the frontal eye fields (Anderson, Batista, Snyder, Buneo, & Cohen, 2000; Colby & Goldberg, 1999; Desimone & Duncan, 1995; Graziano & Gross, 1994; Grill-Spector & Malach, 2004; Laeng, Brennen, & Espeseth, 2002; Silver, Ress, & Heeger, 2005; Wandell, Brewer, & Dougherty, 2005). Gross and Graziano (1995) argued that the parietal lobe functions as a selection hub and noted its strong connections to several other areas of the brain that contain topographical maps. Basically, the suggestion is that the parietal lobe is the gatekeeper in a network of spatial maps that decides which spatial map to access for the task at hand. The loss of parietal functioning would deny access to remaining spatial maps through disconnection.
Another possibility is that parietal functions integrate spatial information from other areas that contain topographical information, which then emerges into what Treisman (1988) called a “master map of locations,” and it is this map that guides voluntary spatial attention (L. Robertson, 2003). It is also this map that allows for the perception of a unified spatial world. In this scenario, the loss of parietal function directly damages the master map and consequently the necessary spatial information for object individuation, feature colocation, perceptual organization, and of course the voluntary control of spatial attention. The hypothesis is that spatial information that feeds into the computation of the master map stays below the level of spatial awareness even in normal perception to attenuate the possibility of spatial confusion. Further research is needed to sort out which of these accounts is more plausible.
Consequences of partial spatial loss and attention
Partial spatial loss is much more frequent than Balint’s syndrome, and as a result more is known about how syndromes such as unilateral neglect affect perception and attention. In addition to the many scientific papers on the subject, there are several books and chapters that discuss the symptoms, diagnosis, rehabilitation, and/or natural course of recovery over time (see DeRenzi, 1982; Driver, Vuilleumier, & Husain, 2004; Karnath, Milner, & Vallar, 2002; Heilman, Watson, & Valenstein, 2003; I. Robertson & Halligan, 1999). Although the neuroanatomical damage that produces unilateral visual neglect is thought to be more anterior than that found in Balint’s syndrome (see L. Robertson, 2004), the two spatial problems can produce similar spatial attention deficits that can affect perception in similar ways, but obviously more severely on the contralesional than ipsilesional side for neglect.
Studies of patients with unilateral visual neglect have shown that feature search, although slower on the neglected than on the unneglected side, remains relatively intact. The difference in search rate for features on the contralesional and ipsilesional sides for patients with neglect is not a result of the fact that a normally parallel search turns into a serial search, since adding distractors on the neglected side does not change response time or accuracy of feature detection (Brooks, Wong, & Robertson, 2005; Esterman, McGlinchey-Berroth, & Milberg, 2000). Conversely, searching for the conjunction of two features (requiring binding and endogenous control of attention) is either not initiated on the neglected side or substantially delayed (often taking a minute or more to begin). Most importantly, the time required to make a decision about the presence or absence of a target on the contralesional side increases as the number of distractors increases (Eglin, Robertson, & Knight, 1989; Esterman et al., 2000; Laeng et al., 2002; Pavlovskaya, Ring, Groswasser, & Hochstein, 2002; Riddoch & Humphreys, 1987b). These findings demonstrate that not all information on the neglected side fails to attract attention. Features that are coded independent of spatial awareness continue to pop out, while conjunctions that require controlled spatial attention do not.
Illusory conjunctions are also more likely on the neglected side of space in cases of unilateral neglect (Cohen & Rafal, 1991), similar to Balint’s syndrome, but of course limited to the contralesional side. The most apparent problem in both syndromes is a deficit in spatial attention, and this fact led to the idea that Balint’s syndrome was the bilateral version of unilateral neglect. However, this seems not to be the case, as there are important differences between the two
syndromes other than their bilateral/unilateral incarnation. For instance, between 50% and 80% of patients with unilateral neglect also show evidence of object-based neglect (figure 18.3), while patients with Balint’s syndrome see nothing but objects, even if only one at a time. Their commonality in disrupting spatial attention does not affect object-based attention in a common way. Also, neglect often disrupts the very concept of space on the neglected side. Patients with neglect act as if that side does not exist, while patients with Balint’s syndrome know there is a space out there; they just cannot see it. This observation leads to the possibility that one syndrome represents direct damage to spatial selection, whereas the other represents the loss of a master spatial map on which selection relies.
Implicit Processing and Neglect
There has been much interest in what gets processed outside the focus of attention, both in normal observers and in patients with attentional disorders. The case of unilateral neglect presents a situation in which the effects of spatial awareness can be studied in the same individual by comparing performance when stimuli are presented on the ipsilesional and contralesional sides of the visual field. It also provides an opportunity to explore the level at which perceptual processing takes place when the very existence of a sensory event in neglected space is absent.
As with studies of Balint’s syndrome, priming methods have shown that a great deal of visual information exists below the level of awareness in the neglected field. For instance, an undetected line drawing of a baseball bat presented on the neglected side speeds the ability to determine whether the letter string “baseball” is a word or not (McGlinchey-Berroth, Milberg, Verfaellie, Alexander, & Kilduff, 1993). The semantics of the undetected object (the baseball bat) primes the word decision response.
Priming from undetected objects in the neglected field can be observed even after a delay of almost an hour between the prime and probe. Vuilleumier, Schwartz, Clarke, Husain, and Driver (2002) showed primes on the neglected side that were complete line drawings of objects followed 50 minutes later by probes that were either new or from the prime list. The probes varied in the number of line segments used to draw each object (from sparse to dense) and were presented where the stimuli were clearly visible. Fragmentation threshold was measured and defined as the number of fragments that were necessary for the patients to identify the probe. Thresholds for primes that were not detected at all were lower for old than for new objects and closer to primes that had been seen during the prime phase (although still significantly higher).
Findings such as these show that undetected objects are processed up to and including semantic knowledge and suggest that visual objects are bound as a whole before attention is engaged. They have been used to argue that attention simply acts to modulate information, boosting these preattentively bound items above some threshold for awareness. However, as the next section will show, this is not always the case. Attention does have a role to play in addition to modulation.
Implicit Processing, Binding, and Attention
The evidence for implicit representations of objects as a whole in the neglected field is consistent with results of studies with normal observers. For instance, using psychophysical measures, Breitmeyer, Ogmen, Ramon, and Chen (2005) varied the time between a shape and mask and examined priming effects for wholes and parts. In one case the shapes were “invisible,” and in the other case they were “visible.” The prime shapes were a square and diamond with one being shown on each trial followed by a mask that was either congruent or incongruent in shape with the prime (figure 18.4). The primes were either wholes (had connected contours) or parts (e.g., corners). The primes were shown for 13 ms followed by a mask 40 or 200 ms later. At 40 ms participants were unable to identify the primes any better than chance, but at 200 ms, accuracy in discriminating the primes was about 95%. This difference in accuracy was
Figure 18.4 Example of stimuli used by Breitmeyer, Ogmen, Ramon, and Chen (2005) to study preattentive binding of parts into
shapes.
about the same whether the primes were wholes or parts. But the most interesting finding was that congruency between the prime and mask influenced reaction times to report whether the mask was a square or a diamond even in the 40-ms condition where the primes were invisible. Whether the primes were wholes or corners, responses to the mask shape in the congruent conditions were faster than those in the incongruent conditions. The results are consistent with an implicit holistic representation of the prime. However, there was a difference between priming by wholes and parts that depended on visibility, with the congruency effects being larger for wholes than parts when the primes were invisible, but the reverse when they were visible. At the preattentive level, shape representations were stronger when the primes were wholes than when they were parts, while the reverse was true when they were visible. These results suggest that more processing was needed in perceptually binding the corners into a square when the primes were visible, possibly because of their weaker representation at the preattentive level.
A more recent study by Tapia, Breitmeyer, and Schooner (in press) used a similar design to investigate effects of target/mask congruency in binding color and shape and found no evidence for preattentive binding in this case. The primes were again square or diamond shapes, but now all had connecting contours, and the “parts” were shape and color. The shapes were either blue or green on each trial, and the mask was congruent with either the color, the shape, both, or neither. When participants were instructed to respond to the color only or shape only, prime visibility did not affect the degree of priming. However, when they responded to the combination of feature and shape (i.e., conjunctions), there was no evidence to support conjunction priming in the invisible condition, whereas priming was evident in the visible condition.
These results are also consistent with other behavioral findings reported by Lavie (1997) examining the effects of focused and distributed attention on feature and conjunction processing using a flanker task. Three colored shapes were arranged horizontally across the screen with a target appearing either in the same location on every trial (focused attention) or in one of the three locations (divided attention). When participants were focused on a location throughout the trial, congruency between the target and flanker features (color or shape) influenced response time, with incongruent features causing more interference than congruent features. However, flankers that contained the combination of the two target features were no more influential than the features alone. Conversely, when attention was divided across the display, both incongruent feature and incongruent conjunction flankers interfered with response time. Together these studies suggest that shapes may be bound preattentively, but that features that are properties of a shape (see Treisman, 1996) such as color require spatial attention to be correctly bound.
Tom Van Vleet and I recently pursued this issue by studying three patients with left neglect resulting from right hemisphere stroke in the middle cerebral artery distribution. In a priming study we presented feature or conjunction displays as primes in the periphery followed by probes presented at fixation. Using a staircase procedure, we first determined how long a search display had to be presented (threshold
presentation time, TPT) to produce a high (75%) or low (25%) probability of target detection for each patient for features on the one hand and conjunctions on the other (figure 18.5). In this way, we equated for detection of features and conjunctions at two levels of difficulty: one in which targets were detected most of the time and another in which they were missed most of the time. We then used the resulting TPTs in a priming stage of the experiment, showing the primes (the same feature and conjunction displays) at the estimated high and low detection TPTs, randomly presented throughout a block of trials. The prime was followed 500 ms later by a single colored shape in the center of the screen, and the patient was asked to respond yes or no as rapidly as possible whether it was a red triangle or not. Priming effects were calculated by subtracting responses to the red triangle when the prime was neutral from when it was either a feature or conjunction target. Priming effects were greater for the conjunction condition when the target was more often visible (high TPT) than when it was more often invisible (low TPT). Conversely, although there was significant priming in the feature condition, it did not differ for high and low TPT (figure 18.6). Follow-up studies demonstrated that neither the differences in the number of red and blue items in the feature and conjunction displays shown in figure 18.5 nor the color differences between the right and left sides of the display could account for these effects.
The evidence from normal observers and patients with unilateral neglect converges to support the conclusion that features coded by specialized neural populations are integrated through spatial attentional control and that parietal functions are critical. Features themselves prime a subsequent response whether or not they are likely to be detected. The studies in normal observers also suggest that there is a difference between preattentive binding of features such as color and shape and preattentive binding of parts into shape. Top-down control is involved in binding depending on what is to be bound. The visual system appears to work independent of spatial awareness when integrating parts into whole shapes but seems to require spatial attention to bind across separated feature maps. It has been suggested that the neural signal involved in binding produces increased gamma band responses in the EEG (Womelsdorf & Fries, chapter 20, this volume). Consistently, Landau, Esterman, Robertson, Bentin, and Prinzmetal (2007) showed that voluntary spatial attention produced greater induced gamma than did automatic spatial orienting. Likewise, increased BOLD activity in the fusiform face area has been observed when a cue predicts the location of an upcoming target face but not when the same cue is unpredictive (Esterman et al., 2008). Voluntary attention seems to increase the perceptual fidelity of the target, while involuntary does not (Prinzmetal, McCool, & Park, 2005).
Feature Integration Theory (FIT) Revisited
According to feature integration theory as it was originally proposed by Treisman and Gelade (1980), the reason spatial attention was engaged in conjunction search but not feature search was the binding requirements in detecting a conjunction target among multiple items with similar features. Some investigators have argued that it is not the binding requirements per se but rather differences in difficulty (some call it saliency) between targets embedded in the two types of search displays. When difficulty is high, more top-down mechanisms that control attention must be
engaged than when difficulty is low, and when more top-down attentional control is needed, parietal activity will be higher. There is general agreement that dorsal frontal-parietal attentional systems are activated under most conditions when top-down attentional control is needed to perform a task.
Figure 18.6 Differences in reaction time to respond to the central target as a function of prime type (respective search prime minus neutral prime conditions) for three different patients.
The related question is whether the typical differences in difficulty between feature and conjunction search are a sufficient explanation or whether something else is required, namely, a binding process. The study of neglect described earlier equated search difficulty at a high and low detection threshold, yet found differences in priming between feature and conjunction displays. There is also fMRI evidence that binding can be separated from search difficulty. Donner and colleagues (2002) presented normal participants with conjunction and feature search displays, with the feature search being either the same as or different from the conjunction search in difficulty. Consistent with a difficulty component for search, they found more parietal and frontal activity for both the hard-feature and conjunction-search tasks than for the easy-search task. Nonetheless, they also found an area of activation that could not be explained by difficulty. Activity at the junction of the posterior inferior parietal sulcus and dorsal occipital lobe was more pronounced with conjunction search than with hard-feature search even though behaviorally they were equally difficult. These findings are consistent with a feature-binding mechanism that is engaged in conjunction search but not feature search. They also support FIT in that parietal activity increases whenever a serial search is initiated, but parietal functions are also involved when feature integration is required. The findings are also consistent with previous results using PET to examine the neurobiology of conjunction and feature search. For instance, Corbetta, Shulman, Miezin, and Petersen (1995) showed search displays and varied the task to look for either a feature (motion or color) or the conjunction of motion and color. They found increased activity in both posterior temporal and superior parietal lobes when participants responded to the conjunction of motion and color but only temporal activation when they responded to only motion or only color.
Conclusions
Even when the representation of external space disappears completely with damage to both parietal lobes, anatomically intact areas outside these regions continue to respond to objects and object parts as well as basic features encoded by specialized neural populations (e.g., color, size, motion). However, the locations of these objects and their surface features may not be accurately bound together in perception. Studies of the interaction between spatial maps, attention, and binding with patients who suffer spatial loss (both complete and partial) have contributed to a better understanding of the neural systems involved in the perception of a normally unified spatial world. They also support a role for this spatial map in guiding spatial attention and in perceiving properly bound features. These studies have also clarified what a patient with spatial loss may and may not
gyrus during on-line storage of spatial memoranda. J. Cogn. Neurosci., 14, 659–671.
McGlinchey-Berroth, R., Milberg, W. P., Verfaellie, M., Alexander, M., & Kilduff, P. T. (1993). Semantic processing in the neglected visual field: Evidence from a lexical decision task. Cogn. Neuropsychol., 10, 79–108.
Mesulam, M.-M. (1981). A cortical network for directed attention and unilateral neglect. Ann. Neurol., 4, 309–325.
Mort, D. J., Malhotra, P., Mannan, S. K., Rorden, C., Pambakian, A., Kennard, C., et al. (2003). The anatomy of visual neglect. Brain, 126, 1986–1997.
O’Craven, K. M., Downing, P. E., & Kanwisher, N. (1999). fMRI evidence for objects as the units of attentional selection. Nature, 401, 584–587.
Pavlovskaya, M., Ring, H., Groswasser, Z., & Hochstein, S. (2002). Searching with unilateral neglect. J. Cogn. Neurosci., 14, 745–756.
Posner, M. I. (1980). Orienting of attention. Q. J. Exp. Psychol., 32, 3–25.
Prinzmetal, S., McCool, C., & Park, S. (2005). Attention: Reaction time and accuracy reveal different mechanisms. J. Exp. Psychol., 134, 73–92.
Rafal, R. (1997). Balint syndrome. In T. E. Feinberg & M. J. Farah (Eds.), Behavioral neurology and neuropsychology. New York: McGraw Hill.
Rafal, R. (2001). Balint’s syndrome. In M. Behrmann (Ed.), Disorders of visual behavior (Vol. 4, pp. 121–141). Amsterdam: Elsevier Science.
Riddoch, M. J., & Humphreys, G. W. (1987a). A case of integrative visual agnosia. Brain, 110, 1431–1462.
Riddoch, M. J., & Humphreys, G. W. (1987b). Perception and action systems in unilateral visual neglect. In M. Jeannerod (Ed.), Neuropsychological and neurophysiological aspects of spatial neglect. Amsterdam: New Holland.
Rizzo, M., & Vecera, S. P. (2002). Psychoanatomical substrates of Balint’s syndrome. J. Neurol. Neurosurg. Psychiatry, 72, 162–178.
Robertson, I. H., & Halligan, P. W. (1999). Spatial neglect: A clinical handbook for diagnosis and treatment. East Sussex, UK: Psychology Press.
Robertson, L. C. (2003). Binding, spatial attention and perceptual awareness. Nat. Rev. Neurosci., 4, 93–102.
Robertson, L. C. (2004). Space, objects, minds and brains. New York: Psychology Press (Francis & Taylor), Essays in Cognitive Science.
Robertson, L. C., & Rafal, R. (2000). Disorders of visual attention. In M. Gazzaniga (Ed.), The New Cognitive Neurosciences. Cambridge, MA: MIT Press.
Robertson, L. C., Treisman, A., Friedman-Hill, S., & Grabowecky, M. (1997). The interaction of spatial and object pathways: Evidence from Balint’s syndrome. J. Cogn. Neurosci., 9, 295–317.
Saenz, M., Buracas, G. T., & Boynton, G. M. (2002). Global effects of feature-based attention in human visual cortex. Nat. Neurosci., 5, 631–632.
Silver, M. A., Ress, D., & Heeger, D. J. (2005). Topographical maps of visual spatial attention in human parietal cortex. J. Neurophysiol., 94, 1358–1371.
Tapia, E., Breitmeyer, B., & Schooner, C. (in press). Role of task-directed attention in nonconscious and conscious response priming by form and color. J. Exp. Psychol. Hum. Percept. Perform.
Treisman, A. (1988). Features and Objects: The Fourteenth Bartlett Memorial Lecture. Q. J. Exp. Psychol. [A], 40, 201–237.
Treisman, A. M. (1996). The binding problem. Curr. Opin. Neurobiol., 6, 171–178.
Treisman, A. M., & Gelade, G. (1980). A feature-integration theory of attention. Cogn. Psych., 12, 97–136.
Treisman, A. M., & Schmidt, H. (1982). Illusory conjunctions in perception of objects. Cogn. Psych., 14, 107–141.
Vuilleumier, P., Schwartz, S., Clarke, K., Husain, M., & Driver, J. (2002). Testing memory for unseen visual stimuli in patients with spatial neglect and extinction. J. Cogn. Neurosci., 14, 875–886.
Wandell, B. A., Brewer, A. A., & Dougherty, R. F. (2005). Visual field map clusters in human cortex. Philos. Trans. R. Soc. London B Biol. Sci., 360, 693–707.
abstract Performance on sensory tasks depends not only on the quality of the sensory signals that are available, but also on the aspects of the sensory signals that the subject attends to. Recordings from individual neurons in trained, behaving monkeys have shown that attention to particular visual stimuli alters the way that those stimuli are represented in cerebral cortex. The primary effect of attention appears to be a gain change, which increases the responses of neurons that represent attended stimuli while decreasing the responses of other neurons. This gain change affects responses to all stimuli proportionately, without affecting the selectivity of neurons or the stimulus that they prefer. This effect alone can explain much of the improvement in behavioral performance that is conferred by attention.
john h. r. maunsell Department of Neurobiology, Harvard Medical School, Howard Hughes Medical Institute, Boston, Massachusetts
One of the most useful features of a personal computer is its ability to multitask. A computer can simultaneously input text for a document, check the spelling and grammar of recently entered words and phrases, copy the text to disk, check for updates to its software, put up reminders about appointments, and do dozens of other important chores. However, in its low-level hardware, the computer is a serial device. All its tasks are accomplished by a central processing unit (or a few central processing units) stepping rapidly through a sequence of instructions. For the most part, parallel processing in a computer is an illusion created by a central processing unit doing one task at a time, but switching between dozens of different tasks so rapidly that the user does not notice. Much of the effort in computer design during the last few decades has been directed at making the hardware of computers more genuinely parallel by adding more central processing units and by delegating minor tasks to peripheral devices.
In contrast to a computer, the neurons and synapses that make up the hardware of the brain process information in a massively parallel way. Sensory signals are collected and analyzed simultaneously by hundreds of millions of neurons in the sensory epithelia, and broad fiber tracts convey the results to more central structures that carry out further parallel computations. The brain uses parallel hardware throughout. Even the simplest behavioral response depends on the concerted activity of thousands of motor neurons.
It seems ironic that while computer engineers strive to make the serial hardware at the heart of a computer emulate parallel behaviors, the brain, with a low-level architecture that seems to be the embodiment of parallel processing, completes tasks in a largely serial way. Unlike a multitasking computer, people and animals generally do one thing during any brief interval. While certain vegetative functions can operate autonomously and in parallel, the higher function of the central nervous system seems to be largely limited to one task at a time.
There are physical constraints on what a single organism can do at one time, but no aspect of body design would prevent a person from, say, doing completely independent tasks with each hand. Nevertheless, the brain does most tasks one at a time. This serial nature of the brain is often described in terms of attention. Attention determines which sensory signals control behavior. We attend to one item or group at a time and shift attention from one subject to another to accomplish our goals. Thus attention is a key player in the brain’s serial processing of tasks, and it limits the rate at which information can be processed. For example, evidence from split-brain patients suggests that visual search goes faster when each cerebral hemisphere employs an independent focus of attention (Luck, Hillyard, Mangun, & Gazzaniga, 1989).
What is attention, and how does it affect the processing of sensory signals in the brain? Some of the most detailed information about the role of attention in controlling sensory processing has come from studies of the visual system. Here we will focus on findings from studies that examine how attention affects some of the low-level hardware that supports behaviors: the responses of individual units of visual cerebral cortex in monkeys.
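The computer analogy in this introduction, in which a single processor does one task at a time but switches among tasks quickly enough to look parallel, can be sketched as a toy round-robin loop. This is only an illustrative sketch: the task names and step counts are hypothetical, and real operating-system schedulers are far more elaborate.

```python
from collections import deque


def task(name, steps):
    """A hypothetical task that yields control after each unit of work."""
    for i in range(1, steps + 1):
        yield f"{name}:{i}"


def run_round_robin(tasks):
    """Run generator tasks one step at a time on a single serial 'processor'."""
    queue = deque(tasks)
    trace = []
    while queue:
        current = queue.popleft()
        try:
            trace.append(next(current))  # do exactly one step of one task
            queue.append(current)        # then send it to the back of the line
        except StopIteration:
            pass                         # task finished; drop it from the queue
    return trace


trace = run_round_robin([task("spellcheck", 2), task("save", 3), task("reminders", 1)])
print(trace)
```

Only one step ever runs at a time, yet the interleaved trace shows every task making progress, which is the sense in which the apparent parallelism is an illusion.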
[Figure 19.1 plot: panels A and B are histograms of the number of neurons against attentional modulation (attended / unattended), with axis ticks at 0.25, 0.5, 1.0, 2.0, and 4.0.]
Figure 19.1 Attentional modulation of neuronal responses is affected by task difficulty. (A) When animals shifted their attention between a stimulus in the receptive field of a V4 neuron and a distant stimulus, the median modulation across all neurons tested was only 7% when the task was easy (detecting an orientation change of 90°). (B) When the same neurons were tested during interleaved trials when animals were doing a much more difficult detection (orientation change of ∼10°), the median modulation was much greater, 24%. (Data from Boudreau, Williford, & Maunsell, 2006.)
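The quantity plotted in figure 19.1 is a ratio of attended to unattended responses, summarized by its median across neurons. The following is a minimal sketch of that summary in Python. The function names and the firing rates are hypothetical, chosen only for illustration, and are not taken from Boudreau, Williford, and Maunsell (2006).

```python
import statistics


def modulation_ratio(attended_rate, unattended_rate):
    """Ratio of the response with attention to the response without it."""
    if unattended_rate <= 0:
        raise ValueError("unattended rate must be positive")
    return attended_rate / unattended_rate


def median_percent_modulation(rate_pairs):
    """Median modulation across neurons, expressed as a percent change."""
    ratios = [modulation_ratio(a, u) for a, u in rate_pairs]
    return (statistics.median(ratios) - 1.0) * 100.0


# Hypothetical (attended, unattended) firing rates in spikes/s for three neurons.
example_pairs = [(12.0, 10.0), (10.0, 10.0), (15.0, 10.0)]
print(median_percent_modulation(example_pairs))  # approximately 20
```

A ratio of 1.0 means no modulation, so a median ratio of 1.07 corresponds to the 7% modulation described for the easy task and 1.24 to the 24% described for the difficult task.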
in which the animal was attending to one location or the other to detect a much smaller change in orientation (∼10°), the locus of attention affected responses much more (median modulation 24%, figure 19.1B).
The amount of attentional modulation of neuronal responses that is observed should be expected to depend on details of the task design. On one hand, because most experiments do not push subjects to the limit of their performance, the magnitude of the attentional modulation of neuronal responses measured in most experiments is almost certainly more modest than it might be. On the other hand, we are rarely pushed to the limits of our abilities in everyday life, so the results reported may be representative of the operating range of attention in typical situations.
How much does attention change the strength of sensory responses?
Because attentional modulation depends on task conditions, there can be no veridical answer to the question of how much attention alters neuronal responses in different parts of visually responsive cerebral cortex. Some general observations are possible, however. First, most studies of attentional modulation of single units describe moderate modulations of rate of firing in all parts of monkey visual cortex, averaging 10–50% (see Maunsell & Cook, 2002). Directing attention toward or away from a stimulus is rarely seen to turn neurons on or off. Functional imaging studies of attentional modulation of neural activity in corresponding regions of human visual cortex have found stronger modulations than those described for single units in monkeys (Kanwisher & Wojciulik, 2000; Pessoa, Kastner, & Ungerleider, 2003). However, EEG and local field potential recordings from human visual cortex typically find moderate effects that are more in keeping with the results from monkey single-unit recording (Hillyard & Anllo-Vento, 1998; Yoshor, Ghose, Bosking, Sun, & Maunsell, 2007), suggesting that the difference may depend more on the indirect measure of neuronal activity used in functional imaging, rather than a species difference. Overall, attention seems to act much more like a moderate filter for sensory representations, rather than a gate, at least in the relatively reduced visual displays that are used in most experiments.
The amount of attentional modulation can depend on stimulus configurations, and some arrangements of visual stimuli consistently produce stronger attentional modulations. Most single-unit studies of attention compare neuronal responses to a given stimulus while attention is directed toward or away from that stimulus. However, shifting attention between two stimuli that are both within a neuron’s receptive field can produce stronger modulations. Neurons in later stages of visual cortex generally have receptive fields that are large enough to hold two stimuli that are sufficiently well separated that a subject can direct attention to one or the other. If one stimulus is a preferred stimulus for the neuron and the other is a nonpreferred stimulus, shifting attention between them can produce two- or threefold changes in neuronal response (Moran & Desimone, 1985; Motter, 1994; Luck, Chelazzi, Hillyard, & Desimone, 1997; Treue & Maunsell, 1999).
Whether this stronger modulation reflects a different mechanism than that engaged when a single stimulus is in the receptive field is an important question that remains to be addressed. To date, no experiment has measured attentional modulations with one and two stimuli in the receptive field under comparable conditions. It remains possible that the greater modulation with two stimuli is only apparent, because measurements made with that configuration compare responses with attention to a preferred stimulus with responses with attention to a nonpreferred stimulus. With a single stimulus in the receptive field, the comparison is instead between responses with attention to a preferred stimulus and with responses to a neutral stimulus (one well outside the receptive field that cannot affect responses directly). Additionally, measurements with two stimuli inside a receptive field require closely spaced stimuli, which make the task more difficult. This extra difficulty may contribute to the stronger attentional modulation in this condition.
More information about the role of attention with cluttered visual displays will fill an important lacuna in our understanding of attentional modulation. It is possible that the relatively sparse displays used in most experiments produce modest neuronal modulation compared with that occurring in natural viewing conditions. For example, it has been reported that the responses of neurons in inferotemporal cortex are virtually all-or-none depending on whether monkeys notice a target in a cluttered scene (Sheinberg & Logothetis, 2001), but comparisons were not made using equivalent retinal stimulation in the two conditions.
While attention typically has modest effects on neuronal rate of firing in laboratory experiments, attention has other effects on neuronal response. Attention can also modulate the amount of synchrony or gamma power in neuronal activity, both in monkey microelectrode recordings (Steinmetz et al., 2000; Fries, Reynolds, Rorie, & Desimone, 2001; Taylor, Mandon, Freiwald, & Kreiter, 2005; Fries, Womelsdorf, Oostenveld, & Desimone, 2008) and human macroelectrode recordings (see Gruber, Müller, Keil, & Elbert, 1999; Müller, Gruber, & Keil, 2000; Müller & Gruber, 2001; Jensen, Kaiser, & Lachaux, 2007; Wyart & Tallon-Baudry, 2008). Changes in gamma-band activity have been correlated with the behavioral signatures of attention (Womelsdorf, Fries, Mitra, & Desimone, 2006). While these changes in gamma oscillations or synchrony are
and thereby shifted the contrast tuning curve toward lower values.
However, the effect these reports described was distinct from that described for orientation or direction tuning curves, where attention scaled responses proportionally. If attention scaled contrast tuning curves proportionally, its greatest effect would have been at the highest contrasts, where responses are strongest. A subsequent study of the effect of attention on contrast response functions has called the earlier conclusions into question. Williford and Maunsell (2006) reexamined the effects of attention on contrast tuning curves in V4 and found they could not distinguish whether effects were strongest at low contrast or high contrast. Because most neurons in V4 do not show strong saturation at high contrast, models describing effects primarily at high and low contrast are both able to fit the data well. As noted in that report, while each of the three studies of the effects of attention on contrast tuning curves favored one or the other model, none ruled out the alternative model as an acceptable description.
Thus it remains possible that attention does not have a special effect on stimuli of low contrast, but instead has a single effect on all neuronal tuning curves: a simple proportional scaling of responses to all stimuli, without change in the breadth of tuning or the preferred stimulus. It should be possible to resolve this question by examining neurons with strongly saturating tuning curves and collecting sufficiently precise data to see whether attention has proportionally greater effects at low contrasts.
How does attentional modulation of neuronal responses improve behavior?
What is accomplished by modulating the strength of neuronal responses? One obvious suggestion is that stronger responses can have a better signal-to-noise ratio. Sensory neurons give variable responses. They produce different numbers of spikes in response to different presentations of the same stimulus. For most neurons throughout visual cerebral cortex, the variance in the number of spikes approximates the mean number of spikes for the response, as is true for a Poisson process (Softky & Koch, 1993; Shadlen & Newsome, 1998). If the signal-to-noise ratio of a response is defined as the mean response (signal) divided by the standard deviation of the response (noise), then signal-to-noise is expected to improve systematically as responses are made stronger, because the standard deviation is the square root of the variance, and it therefore remains proportional to the square root of the mean response.
In a study of the effects of attention on the responses to different orientations in V4 (McAdams & Maunsell, 1999b), neurons were found to respond an average of 30% more strongly when attention was directed toward the stimulus in their receptive field. Calculations showed that this increase in response would improve the median neuron’s smallest discriminable orientation (for a peripheral stimulus) from 26.5° to 20.4°, by virtue of improved signal-to-noise.
This study also considered whether attention might have a more dramatic effect on signal-to-noise by reducing the variance directly, but found no evidence for an effect on variance beyond that expected from changing the strength of the response. Recently, however, Mitchell, Sundberg, & Reynolds (2007) reexamined the effects of attention in V4 by classifying neurons as broad-spiking (putative pyramidal cells) and narrow-spiking (putative inhibitory interneurons) based on the duration of their action potentials. Although attention seemed to have proportional effects on the rate of firing of these two cell types, there was a difference in its effects on the variance of their responses. Attention did not affect the relationship between response rate and response variance for the broad-spiking neurons, but for the narrow-spiking neurons there was additional variance when a stimulus was unattended. The effect was small and appeared only at high rates of firing, and the responses of narrow-spiking neurons to attended stimuli had no less variance than broad-spiking neurons showed for attended or unattended stimuli. It is not clear what benefits arise from increasing the variance of responses of one class of cells to unattended stimuli.
This unexpected effect of attention on the variance of the responses of narrow-spiking neurons leads to a more general question about attention and neuronal signal-to-noise. If attention to a stimulus can produce better signal-to-noise through either stronger responses or less variance, why would the brain ever want to decrease the signal-to-noise of neuronal responses? One explanation is the cost of high rates of firing. Metabolic expense is a considerable factor for the brain, which consumes an inordinate amount of the body’s energy (Attwell & Laughlin, 2001; Lennie, 2003). It may not be practical to maintain the higher rates of firing to achieve higher signal-to-noise. Alternatively, the answer may instead depend on how neuronal signals translate into target detection and false alarms. Most of the higher signal-to-noise that attention achieves is gained through making neurons more sensitive. When the response of a neuron is enhanced by attention, it will give the same response to some nonpreferred stimuli that it would give to a preferred stimulus when its response is not enhanced. Enhanced responses to nonpreferred stimuli could be interpreted as false alarms about the presence of a preferred stimulus. When you search for red stimuli, it may be acceptable, or even adaptive, to have false alarms from red-preferring neurons, but false alarms from red-preferring neurons are more likely to be maladaptive when searching for other colors.
It has been suggested that attention increases the sensitivity of neurons according to how closely their response
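The signal-to-noise argument made in this section can be written out numerically. The sketch below assumes, as the text does, that spike-count variance equals the mean (a Poisson-like response), so that signal-to-noise is the mean divided by its own square root. The function names are hypothetical. This is not the calculation behind the 26.5° and 20.4° discrimination thresholds reported by McAdams and Maunsell (1999b), which depends on tuning-curve details not given here.

```python
import math


def poisson_snr(mean_count):
    """Signal-to-noise ratio: mean / standard deviation, with variance = mean."""
    if mean_count <= 0:
        raise ValueError("mean spike count must be positive")
    return mean_count / math.sqrt(mean_count)


def snr_gain(response_scale):
    """Factor by which SNR changes when the mean response is scaled."""
    if response_scale <= 0:
        raise ValueError("scale must be positive")
    return math.sqrt(response_scale)


# A 30% stronger response under this assumption raises SNR by sqrt(1.3).
print(poisson_snr(16.0))   # 4.0
print(snr_gain(1.3))       # about 1.14
```

Under this assumption signal-to-noise grows only with the square root of the response, which is one way to see why a 30% change in firing rate yields a smaller proportional change in signal-to-noise.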
Hillyard, S. A., & Anllo-Vento, L. (1998). Event-related brain potentials in the study of visual selective attention. Proc. Natl. Acad. Sci. USA, 95, 781–787.
Ikeda, T., & Hikosaka, H. (2003). Reward-dependent gain and bias of visual responses in primate superior colliculus. Neuron, 39, 693–700.
Jensen, O., Kaiser, J., & Lachaux, J. P. (2007). Human gamma-frequency oscillations associated with attention and memory. Trends Neurosci., 30, 317–324.
Kanwisher, N. G., & Wojciulik, E. (2000). Visual attention: Insights from brain imaging. Nat. Rev. Neurosci., 1, 91–100.
Lavie, N. (1995). Perceptual load as a necessary condition for selective attention. J. Exp. Psychol. Hum. Percept. Perform., 21, 451–468.
Lavie, N., & Tsal, Y. (1994). Perceptual load as a major determinant of the locus of selection in visual attention. Percept. Psychophys., 56, 183–197.
Lennie, P. (2003). The cost of cortical computation. Curr. Biol., 13, 493–497.
Luck, S. J., Chelazzi, L., Hillyard, S. A., & Desimone, R. (1997). Neural mechanisms of spatial selective attention in areas V1, V2, and V4 of macaque visual cortex. J. Neurophysiol., 77, 24–42.
Luck, S. J., Hillyard, S. A., Mangun, G. R., & Gazzaniga, M. S. (1989). Independent hemispheric attentional systems mediate visual search in split-brain patients. Nature, 342, 543–545.
Mangun, G. R., & Hillyard, S. A. (1987). The spatial allocation of visual attention as indexed by event-related brain potentials. Hum. Factors, 29, 195–211.
Mangun, G. R., & Hillyard, S. A. (1990). Allocation of visual attention to spatial locations: Tradeoff functions for event-related brain potentials and detection performance. Percept. Psychophys., 47, 532–550.
Martinez-Trujillo, J. C., & Treue, S. (2002). Attentional modulation strength in cortical area MT depends on stimulus contrast. Neuron, 35, 365–370.
Maunsell, J. H. R. (2004). Neuronal representations of cognitive state: Reward or attention? Trends Cogn. Sci., 8, 261–265.
Maunsell, J. H. R., & Cook, E. P. (2002). The role of attention in visual processing. Philos. Trans. R. Soc. Lond. B. Biol. Sci., 357, 1063–1072.
Maunsell, J. H. R., & Treue, S. (2006). Feature-based attention in visual cortex. Trends Neurosci., 29, 317–322.
McAdams, C. J., & Maunsell, J. H. R. (1999a). Effects of attention on orientation-tuning functions of single neurons in macaque cortical area V4. J. Neurosci., 19, 431–441.
McAdams, C. J., & Maunsell, J. H. R. (1999b). Effects of attention on the reliability of individual neurons in monkey visual cortex. Neuron, 23, 765–773.
Mitchell, J. F., Sundberg, K. A., & Reynolds, J. H. (2007). Differential attention-dependent response modulation across cell classes in macaque visual area V4. Neuron, 55, 131–141.
Moran, J., & Desimone, R. (1985). Selective attention gates visual processing in the extrastriate cortex. Science, 229, 782–784.
Motter, B. C. (1994). Neural correlates of attentive selection for color or luminance in extrastriate area V4. J. Neurosci., 14, 2178–2189.
Müller, M. M., & Gruber, T. (2001). Induced gamma-band responses in the human EEG are related to attentional information processing. Visual Cogn., 8, 579–592.
Müller, M. M., Gruber, T., & Keil, A. (2000). Modulation of induced gamma band activity in the human EEG by attention and visual information processing. Int. J. Psychophysiol., 38, 283–299.
Navalpakkam, V., & Itti, L. (2007). Search goal tunes visual features optimally. Neuron, 53, 605–617.
Pessoa, L., Kastner, S., & Ungerleider, L. G. (2003). Neuroimaging studies of attention: from modulation of sensory processing to top-down control. J. Neurosci., 23, 3990–3998.
Platt, M. L., & Glimcher, P. W. (1999). Neural correlates of decision variables in parietal cortex. Nature, 400, 233–238.
Posner, M. I. (1980). Orienting of attention. Q. J. Exp. Psychol., 32, 3–25.
Pouget, A., Deneve, S., & Ducom, J. C. (1999). Narrow versus wide tuning curves: What’s best for a population code? Neural Comput., 11, 85–90.
Reynolds, J. H., Pasternak, T., & Desimone, R. (2000). Attention increases sensitivity of V4 neurons. Neuron, 26, 703–714.
Roesch, M. R., & Olson, C. R. (2003). Impact of expected reward on neuronal activity in prefrontal cortex, frontal and supplementary eye fields and premotor cortex. J. Neurophysiol., 90, 1766–1789.
Shadlen, M. N., & Newsome, W. T. (1998). The variable discharge of cortical neurons: Implications for connectivity, computation, and information coding. J. Neurosci., 18, 3870–3896.
Sheinberg, D. L., & Logothetis, N. K. (2001). Noticing familiar objects in real world scenes: The role of temporal cortical neurons in natural vision. J. Neurosci., 21, 1340–1350.
Softky, W. R., & Koch, C. (1993). The highly irregular firing of cortical cells is consistent with temporal integration of random EPSPs. J. Neurosci., 13, 334–350.
Sparks, D. L. (1999). Conceptual issues related to the role of the superior colliculus in the control of gaze. Curr. Opin. Neurobiol., 9, 698–707.
Spitzer, H., Desimone, R., & Moran, J. (1988). Increased attention enhances both behavioral and neuronal performance. Science, 240, 338–340.
Spitzer, H., & Richmond, B. J. (1991). Task difficulty: Ignoring, attending to, and discriminating a visual stimulus yield progressively more activity in inferior temporal neurons. Exp. Brain Res., 83, 340–348.
Steinmetz, P. N., Roy, A., Fitzgerald, P. J., Hsiao, S. S., Johnson, K. O., & Niebur, E. (2000). Attention modulates synchronized firing in primate somatosensory cortex. Nature, 404, 187–190.
Taylor, K., Mandon, S., Freiwald, W. A., & Kreiter, A. K. (2005). Coherent oscillatory activity in monkey area V4 predicts successful allocation of attention. Cereb. Cortex, 15, 1424–1437.
Tiesinga, P., Fellous, J. M., & Sejnowski, T. J. (2008). Regulation of spike timing in visual cortical circuits. Nat. Rev. Neurosci., 9, 97–107.
Tiesinga, P. H. E., Fellous, J. M., Salinas, E., Jose, J. V., & Sejnowski, T. J. (2004). Inhibitory synchrony as a mechanism for attentional gain modulation. J. Physiol. Paris, 98, 296–314.
Treue, S., & Martinez-Trujillo, J. C. (1999). Feature-based attention influences motion processing gain in macaque visual cortex. Nature, 399, 575–579.
Treue, S., & Maunsell, J. H. R. (1999). Effects of attention on the processing of motion in macaque middle temporal and medial superior temporal visual cortical areas. J. Neurosci., 19, 7591–7602.
20 Selective Attention Through Selective Neuronal Synchronization
Thilo Womelsdorf and Pascal Fries
Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, Nijmegen, The Netherlands
Abstract Selective attention relies on the dynamic restructuring of cortical information flow in order to prioritize neuronal communication from those neuronal groups conveying information about behaviorally relevant information, while reducing the influence from groups encoding irrelevant and distracting information. Electrophysiological evidence suggests that such selective neuronal communication is instantiated and sustained through selective neuronal synchronization of rhythmic gamma-band activity within and between neuronal groups: Attentionally modulated synchronization patterns evolve rapidly, are evident even before sensory inputs arrive, follow closely subjective readiness to process information in time, can be sustained for prolonged time periods, and carry specific information about top-down selected sensory features and motor aspects. These functional implications of selective synchronization patterns are complemented by recent insights about the mechanistic consequences of rhythmic synchronization, showing that selective neuronal interactions are subserved by neuronal synchronization that is selective in space, time, and frequency.
Top-down attention is the key mechanism to restructure cortical information flow in order to prioritize processing of behaviorally relevant over irrelevant and distracting information (Gilbert & Sigman, 2007). The behavioral consequences of attentional restructuring of information flow are manifold. Attended sensory inputs are processed more rapidly and accurately and with higher spatial resolution and sensitivity for fine changes, while nonattended information appears lower in contrast and is sometimes not perceived at all (Carrasco, Ling, & Read, 2004; Simons & Rensink, 2005).
These functional consequences of attention require temporally dynamic and selective changes of neuronal interactions spanning multiple levels of neuronal information processing: Attentional selection (1) modulates interactions among single neurons within cortical microcircuits, (2) modulates the impact of selective local neuronal groups conveying relevant information within functionally specialized brain areas, and (3) controls long-range interactions among neuronal groups from distant brain areas (Maunsell & Treue, 2006; Mitchell, Sundberg, & Reynolds, 2007; Reynolds & Chelazzi, 2004; Womelsdorf & Fries, 2007).
For all these levels of neuronal interactions, converging evidence suggests that the selective modulation of interactions critically relies on selective synchronization. Neuronal synchronization is typically oscillatory in nature; that is, neurons fire and pause together in a common rhythm. When synchronization is rhythmic, it is often addressed as coherence, and we will use these terms interchangeably. This rhythmic synchronization can influence neuronal interactions in several ways: (1) Spikes that are synchronized will have a larger impact on a target neuron than spikes that are not synchronized (Azouz & Gray, 2003; Salinas & Sejnowski, 2001). (2) Local inhibition that is rhythmically synchronized leaves periods without inhibition, while nonsynchronized inhibition will prevent local network activity continuously (Tiesinga, Fellous, Salinas, Jose, & Sejnowski, 2004). (3) Rhythmic synchronization of a local group of neurons will modulate the impact of input to that group, and therefore the impact of rhythmic input will depend on the synchronization between input and target (Womelsdorf et al., 2007).
These mechanisms are at work on all levels of attentional selection: At the level of microcircuits, inhibitory interneuron networks have been shown to impose rhythmic synchronization capable of effectively controlling the gain of the neuronal spiking output (Bartos, Vida, & Jonas, 2007; Tiesinga, Fellous, & Sejnowski, 2008). At the level of local neuronal groups, attention is known to selectively synchronize the responses of those neurons conveying information about the attended feature or location (Womelsdorf & Fries, 2007). And the coherent output from these local neuronal groups has been shown to selectively synchronize over long-range connections with task-relevant neuronal groups in distant brain regions (Buschman & Miller, 2007; Saalmann,
[Figure 20.1 graphic: panel A, schematic of connected neuronal groups; panel B, schematics of “good”-phase and “bad”-phase synchronization, with a plot of mutual interaction strength (0 to 0.4) against phase relation (−pi to pi); panel C, power correlation c(AB) in V4 (0 to 0.08) against frequency (10 to 140 Hz), contrasting trials with “good” and “bad” phase relations, labeled c(AB) for ([1 + 2] − [3 + 4]) / 2 and c(AB) for ([1 + 3] − [2 + 4]) / 2.]
Figure 20.1 Selective synchronization renders neuronal interactions among subsets of neuronal groups effective. (A) Anatomical connectivity (sketched as lines) provides a rich infrastructure for neuronal communication among neuronal groups (circles) throughout the cortex. With selective attention, only a small subset of these connections are rendered effective (solid lines). Interactions among groups conveying irrelevant information (light gray circles) for the task at hand are rendered less effective (dashed lines). (B) Illustration of the hypothesized role of selective synchronization for selective communication among three neuronal groups (circles). Rhythmic activity (local field potential, or LFP, oscillations with spikes in troughs) provides briefly recurring time windows of maximum excitability (LFP troughs), which are either in phase (black and dark gray groups), or in antiphase (black and light gray groups). The plot on the right shows that mutual interactions (upper axis, correlation of the power of the LFP and the neuronal spiking response between neuronal groups) are high during periods of in-phase synchronization and lower otherwise. (C) The trial-by-trial interaction pattern between neuronal groups (A to B and A to C) is predicted by the pattern of synchronization: If AB synchronizes at a good phase, their interaction is strongest, irrespective of whether A synchronizes with C at good or bad phase relations in the same trials. Thus the spatial pattern of mutual interactions can be predicted by the phase of synchronization among rhythmically activated neuronal groups. (Panels in B and C adapted from Womelsdorf et al., 2007.)
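The idea in panel B, that interaction is strongest when the sender's output arrives during the receiver's windows of maximum excitability, can be illustrated with a toy calculation. The sketch below is an assumption-laden illustration, not the analysis of Womelsdorf et al. (2007): the sender's output and the receiver's excitability are both modeled as raised cosines over one oscillation cycle, and the hypothetical function `effective_drive` averages their product for a given phase offset.

```python
import math


def effective_drive(phase_offset, n_samples=1000):
    """Mean product of sender output and receiver excitability over one cycle.

    Both signals are raised cosines between 0 and 1; phase_offset (radians)
    is the lag of the receiver's excitability peak relative to the sender's
    output peak.
    """
    total = 0.0
    for k in range(n_samples):
        theta = 2.0 * math.pi * k / n_samples
        sender_output = 0.5 * (1.0 + math.cos(theta))
        receiver_excitability = 0.5 * (1.0 + math.cos(theta - phase_offset))
        total += sender_output * receiver_excitability
    return total / n_samples


in_phase = effective_drive(0.0)        # peaks coincide
anti_phase = effective_drive(math.pi)  # peaks half a cycle apart
print(in_phase, anti_phase)            # about 0.375 and 0.125
```

In this toy model the average drive equals 0.25 × (1 + 0.5 cos φ) for phase offset φ, so it falls smoothly from its maximum at a “good” (in-phase) relation to its minimum at a “bad” (antiphase) relation, even though the mean output of the sender is unchanged.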
and physiology. The different metrics used for quantifying synchronization are typically normalized for firing rate. Physiologically, there are examples where enhanced firing rates are associated with strongly reduced synchronization, for example, the stimulus-induced alpha-band desynchronization in the superficial layers of monkey V4 (Fries et al., 2008). Neuronal gamma-band synchronization typically emerges when neuronal groups are activated, and therefore it is in most cases associated with increased firing rates. However, firing rates and gamma-band synchronization can also be dissociated from each other, and this pattern can be found primarily when firing rate changes are driven not by changes in bottom-up input (e.g., stimulus changes) but rather by changes in top-down input (e.g., attention or stimulus selection) (Fries, Schröder, Roelfsema, Singer, & Engel, 2002; Womelsdorf et al., 2006).
Synchronization is a neuronal population phenomenon, and it is often very difficult to assess it with recordings from isolated single units. Correspondingly, many studies of neuronal synchronization use recordings of multiunit activity and/or of the local field potential (LFP). The LFP reflects the summed transmembrane currents of neurons within a few hundred micrometers of tissue. Since synchronized currents sum up much more efficiently than unsynchronized currents, the LFP primarily reflects synchronized synaptic activity. Changes in LFP power typically correlate very well with changes in direct measures of neuronal synchronization.
Rhythmic synchronization within a neuronal group not only increases its impact on postsynaptic target neurons in a feedforward manner. It also rhythmically modulates the group’s ability to communicate, such that rhythmic synchronization between two neuronal groups likely subserves their interaction, because rhythmic inhibition within the two
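The statement above that synchronized currents sum more efficiently than unsynchronized ones can be made concrete with a toy phasor calculation. The sketch assumes that each current source is a unit-amplitude sinusoid at a common frequency and that sources differ only in phase; it is not a biophysical model of the LFP, and the function name is hypothetical.

```python
import cmath
import math
import random


def summed_amplitude(phases):
    """Amplitude of the sum of unit-amplitude sinusoids with the given phases."""
    return abs(sum(cmath.exp(1j * p) for p in phases))


n_sources = 400
rng = random.Random(0)  # fixed seed so the illustration is repeatable

synchronized = [0.3] * n_sources
unsynchronized = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_sources)]

print(summed_amplitude(synchronized))    # equals n_sources, up to rounding
print(summed_amplitude(unsynchronized))  # typically on the order of sqrt(n_sources)
```

With identical phases the summed amplitude grows in proportion to the number of sources, whereas with random phases it grows roughly as the square root of that number, so a signal that sums many sources is dominated by whatever subset of them is synchronized.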
target perisomatic regions of principal cells and are thereby capable of determining the impact of synaptic inputs arriving at sites distal to a cell’s soma. Such perisomatic connectivity could therefore critically control the input gain of principal cells across a large population of principal cells (Buzsaki, Kaila, & Raichle, 2007; Cobb, Buhl, Halasy, Paulsen, & Somogyi, 1995; Markram, Wang, & Tsodyks, 1998; Rudolph, Pospischil, Timofeev, & Destexhe, 2007; Tiesinga, Fellous, et al., 2004). As described in the previous paragraph, the inhibitory synaptic influence is inherently rhythmic at high frequencies, carrying stronger gamma-band power than pyramidal cells (Bartos et al., 2007; Hasenstaub et al., 2005).
The prominent role of these high-frequency inputs in shaping the spiking output of principal cells has recently been demonstrated directly in visual cortex of the awake cat. It was shown that the spiking of principal cells is indeed preceded by brief periods of reduced inhibition (Rudolph et al., 2007; see also figure 8 of Hasenstaub et al., 2005). Taken together, these findings suggest that interneurons are the source of rhythmic inhibition onto a local group of neurons synchronizing the discharge of pyramidal cells to the time windows between inhibition.
In the context of selective attention, interneuron networks could be activated by various possible sources. They may be activated by transient and spatially specific neuromodulatory inputs (Lin, Gervasoni, & Nicolelis, 2006; Rodriguez, Kallenbach, Singer, & Munk, 2004). Alternatively, selective attention could target local interneuron networks directly by way of top-down inputs from neurons in upstream areas (Buia & Tiesinga, 2008; Mishra et al., 2006; Tiesinga et al., 2008). In these models, selective synchronization emerges either by depolarizing selective subsets of interneurons (Buia & Tiesinga, 2008; Tiesinga & Sejnowski, 2004) or by biasing the phase of rhythmic activity in a more global inhibitory interneuron pool (Mishra et al., 2006). In either case, rhythmic inhibition controls the spiking responses of groups of excitatory neurons, enhancing the impact of neurons spiking synchronously within the periods of disinhibition, while actively reducing the impact of neurons spiking asynchronously to this rhythm. This suppressive influence on excitatory neurons, which are activated by distracting feedforward input, reflects the critical ingredient for the concept of selective attention through selective synchronization: Attention not only enhances synchronization of already more coherent activity representing attended stimuli, but also actively suppresses the synchronization and impact of groups of neurons receiving strong, albeit distracting, inputs, because they arrive at nonoptimal phase relations to the noninhibited periods in the target group. The computational feasibility of both facilitatory and suppressive aspects and the critical role of the timing of inhibitory circuits have recently received direct support (Börgers & Kopell, 2008).
Despite the prominent computational role of interneuron activity for selective communication, there are only sparse insights into their implications in selective information processing during cognitive task performance. The basic prediction from the preceding models is that interneurons are strongly attentionally modulated. Consistent with this presupposition, a recent study by Mitchell, Sundberg, and Reynolds (2007) reports a clear attentional modulation of putative interneurons in visual area V4 during a selective attention task requiring monkeys to track moving grating stimuli (Mitchell et al.). Putative interneurons showed similar relative increases in firing rate and greater increases in reliability compared to putative pyramidal neurons. However, tests of more refined predictions about the relative modulation of synchronization and the phase relation of spiking responses of inhibitory and excitatory neuron types still need to be conducted (Buia & Tiesinga, 2008).
Selective modulation of synchronization during attentional processing
Direct evidence for the functional significance of selective synchronization within local neuronal groups for attentional selection has been obtained from recordings in macaque visual cortical area V4 (Womelsdorf & Fries, 2007). One consistent result across studies is that spatial attention enhances gamma-band synchronization within neuronal groups that have receptive fields overlapping the attended location (Fries, Reynolds, Rorie, & Desimone, 2001; Taylor et al., 2005; Womelsdorf et al., 2006). The enhanced rhythmic synchronization is strongly evident within the LFP signal, which is a compound signal of activity within a local neuronal group, and is likewise reflected in more precise synchronization of neuronal spiking responses to the LFP. Importantly, the synchronization among the spiking output from neurons coding for the attended location is also enhanced compared to the spiking output of neurons activated by a nonattended distracter stimulus (figure 20.2) (Fries et al., 2008). These attentional effects on spike-to-spike synchronization imply that the postsynaptic targets receive more coherent input from the neuronal groups that convey behaviorally relevant information.
Functional Implications of Selective Gamma-Band Synchronization In addition to the described attentional effect, recent studies demonstrated that the precision of local synchronization in visual area V4 is closely related to task performance, including behavioral accuracy and the time to detect behaviorally relevant stimulus changes (Taylor et al., 2005; Womelsdorf et al., 2006). This conclusion was derived from an error analysis of the pattern of synchronization in area V4 (Taylor et al.). In this study, the spatial focus of attention could be inferred from the pattern of
[Figure 20.2 plot omitted. Panels A and D: relative power; panels B and E: spike-field coherence; panels C and F: spike-spike coherence; all plotted against frequency (Hz). Legend: attention outside the RF, attention inside the RF.]

Figure 20.2 The pattern of attentional modulation of synchronization in macaque visual area V4 before and during sensory stimulation. (A–C) Attentional modulation of relative LFP power (A), spike-to-LFP coherence (B), and spike-to-spike coherence (C) across low and high frequencies during the baseline period of a spatial attention task. Monkeys either attended (dark lines) or ignored (gray lines) the receptive field location of the recorded neuronal groups in blocks of trials. (D–F) Attentional modulation of the neuronal response during stimulation with an attended/ignored moving grating. Same format as in A–C. Horizontal gray bars denote frequencies with significant attentional effects. (Adapted from Fries, Womelsdorf, Oostenveld, & Desimone, 2008.)
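The idea of spikes being more or less precisely locked to a rhythm, as in the spike-to-LFP panels of figure 20.2, can be illustrated with a simpler, related quantity. The following Python sketch is not the coherence analysis of Fries et al. (2008); it computes a phase-locking value (the mean resultant length of spike phases) against an idealized sinusoidal rhythm. The function names, the 50 Hz rhythm, the jitter values, and the labels `tight` and `loose` are assumptions chosen for illustration, not data from the study.

```python
import cmath
import math
import random


def phase_locking_value(spike_times_s, freq_hz):
    """Mean resultant length of spike phases relative to an ideal sinusoid.

    Returns a value between 0 (phases spread uniformly) and 1 (all spikes
    at the same phase of the rhythm).
    """
    if not spike_times_s:
        raise ValueError("need at least one spike")
    # Phase of a perfect oscillation at freq_hz at each spike time.
    unit_vectors = [cmath.exp(2j * math.pi * freq_hz * t) for t in spike_times_s]
    return abs(sum(unit_vectors) / len(unit_vectors))


def simulate_spikes(n_cycles, freq_hz, jitter_s, rng):
    """Hypothetical neuron: one spike per cycle with Gaussian timing jitter."""
    period_s = 1.0 / freq_hz
    return [k * period_s + rng.gauss(0.0, jitter_s) for k in range(n_cycles)]


rng = random.Random(0)
GAMMA_HZ = 50.0  # assumed rhythm frequency, for illustration only
tight = simulate_spikes(500, GAMMA_HZ, 0.002, rng)  # 2 ms jitter (assumed)
loose = simulate_spikes(500, GAMMA_HZ, 0.006, rng)  # 6 ms jitter (assumed)

plv_tight = phase_locking_value(tight, GAMMA_HZ)
plv_loose = phase_locking_value(loose, GAMMA_HZ)
print(f"tight locking: {plv_tight:.2f}, loose locking: {plv_loose:.2f}")
```

In this toy setting, smaller timing jitter yields a larger phase-locking value, which is the sense in which spiking can be "more precisely" synchronized to an ongoing rhythm. Real analyses estimate the phase from the recorded LFP rather than assuming a perfect sinusoid.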
synchronization measured through epidural electrodes. Gamma-band synchronization was not only stronger for correct trials than for miss trials, but additionally, the degree of synchronization predicted whether the monkey was paying attention to the distracter. Thus this study demonstrated that gamma-band synchronization reflects the actual allocation of attention rather than merely the attentional cuing itself. Furthermore, another recent study demonstrated that the precision of stimulus-induced gamma-band synchronization predicts how rapidly a stimulus change can be reported behaviorally. When monkeys were spatially cued to select one of two stimuli in order to detect a color change of the attended stimulus, the speed of change detection could be partly predicted by the strength of gamma-band synchronization shortly before the stimulus change actually occurred (Womelsdorf et al., 2006). Importantly, the reaction times to the stimulus change could not be predicted at times before the stimulus change by overall firing rates, nor by synchronization outside the gamma band. Notably, the correlation of gamma-band synchronization with the speed of change detection showed high spatial selectivity: Neurons activated by an unattended stimulus engaged in lower synchronization when the monkeys were particularly fast in responding to the stimulus change at locations outside their receptive field. This finding rules out a possible influence of globally increased synchronization during states of enhanced alertness and arousal (Herculano-Houzel, Munk, Neuenschwander, & Singer, 1999; Munk, Roelfsema, Konig, Engel, & Singer, 1996; Rodriguez et al., 2004). And it argues for a fine-grained influence of synchronization to modulate the effective transmission of information about the stimulus change to postsynaptic target areas concerned with the planning and execution of responses.

These behavioral correlates of gamma-band synchronization during selective attention tasks are complemented by a variety of correlational results linking enhanced gamma-band synchronization to efficient task performance in various
paradigms involving attentional processing. For example, in memory-related structures the strength of gamma-band synchronization has been linked to the successful encoding and retrieval of information (Montgomery & Buzsaki, 2007; Sederberg et al., 2006; Sederberg, Kahana, Howard, Donner, & Madsen, 2003; Sederberg et al., 2007).

Selective Gamma-Band Coherence Beyond Visual Cortex These results of selective gamma-band synchronization with selective spatial attention are supported by a growing number of converging findings from human EEG and MEG studies (Doesburg, Roggeveen, Kitajo, & Ward, 2008; Fan et al., 2007; Gruber, Müller, Keil, & Elbert, 1999; Landau, Esterman, Robertson, Bentin, & Prinzmetal, 2007; Wyart & Tallon-Baudry, 2008). Importantly, attention modulates gamma-band synchronization beyond sensory visual cortex. It has been reported for auditory cortex (Kaiser, Hertrich, Ackermann, & Lutzenberger, 2006; Tiitinen et al., 1993) and more recently in somatosensory cortex. Spatial attention for tactile discrimination at either the right or left index finger in humans enhanced stimulus-induced gamma-band synchronization in primary somatosensory cortex when measured with MEG (Bauer, Oostenveld, Peeters, & Fries, 2006; Hauck, Lorenz, & Engel, 2007). Similar topographies and dynamics of gamma-band synchronization were shown to correlate with the actual perception of somatosensory induced pain (Gross, Schnitzler, Timmermann, & Ploner, 2007). Importantly, enhanced oscillatory dynamics in the gamma band during tactile perception are not restricted to the somatosensory cortex (Ohara, Crone, Weiss, & Lenz, 2006). In recent intracranial recordings in humans, synchronization was modulated across somatosensory cortex, medial prefrontal, and insular regions when subjects had to direct attention to painful tactile stimulation (Ohara et al.).

Spatially Specific Synchronization Patterns During Preparatory Attentional States The described gamma-band modulation of rhythmic activity is most prominent during activated states. However, attentional top-down control biases neuronal responses in sensory cortices already before sensory inputs impinge on the neuronal network (Fries, Reynolds, et al., 2001; Fries et al., 2008; Luck, Chelazzi, Hillyard, & Desimone, 1997). In many attention studies, the instructional cue period is followed by a temporal delay void of sensory stimulation. During these preparatory periods, top-down signals set the stage for efficient processing of expected stimulus information, rendering local neuronal groups ready to enhance the representation of attended sensory inputs. Intriguingly, the described preparatory bias is evident in selective synchronization patterns in the gamma band and in rhythmic synchronization at lower frequencies.

In macaque visual cortical area V4, neurons synchronized their spiking responses to the LFP in the gamma band more precisely when monkeys expected a target stimulus at the receptive field location of the respective neuronal group (figure 20.2B). This modulation was evident even though rhythmic activity proceeded at far lower levels in the absence of sensory stimulation compared to synchronization strength during high-contrast sensory drive. Lower overall strength, and correspondingly lower signal-to-noise ratio, may account for the lack of significant gamma-band modulation of LFP power or spike-to-spike synchronization during the prestimulus period when compared to attentional modulation during stimulation (figure 20.2).

During preparatory periods, and thus in the absence of strong excitatory drive to the local network, rhythmic activity is dominated by frequencies lower than the gamma band. In the described study from macaque V4, prestimulus periods were characterized by alpha-band peaks of local rhythmic synchronization when monkeys attended away from the receptive field location of the neuronal group. Figure 20.2B,C demonstrates reduced locking of neuronal spiking in the alpha band to the LFP and to spiking output of nearby neurons. This finding is in general agreement with various studies demonstrating reduced alpha-band activity during attentional processing (Bauer et al., 2006; Pesaran, Pezaris, Sahani, Mitra, & Andersen, 2002; Rihs, Michel, & Thut, 2007; Worden, Foxe, Wang, & Simpson, 2000; Wyart & Tallon-Baudry, 2008). Interestingly, human EEG studies extend this finding by showing that the degree of alpha-frequency desynchronization during prestimulus intervals of visuospatial attention tasks indicates how fast a forthcoming target stimulus is processed (Jin, O’Halloran, Plon, Sandman, & Potkin, 2006; Sauseng et al., 2006; Thut, Nietzel, Brandt, & Pascual-Leone, 2006). For example, reaction times to a peripherally cued target stimulus are partially predicted by the lateralization of alpha activity in the one-second period before target appearance (Thut et al.). While this predictive effect was based predominantly on reduced alpha-band responses over the hemisphere processing the attended position, recent studies suggest that alpha-band oscillations are selectively enhanced within local neuronal groups processing distracting information, that is, at unattended locations (Kelly, Lalor, Reilly, & Foxe, 2006; Rihs et al.; Yamagishi et al., 2003). These findings suggest that rhythmic alpha-band synchronization may play an active role in preventing the signaling of stimulus information. According to this hypothesis, attention is thought to up-regulate alpha-band activity of neuronal groups expected to process distracting stimulus information, rather than to down-regulate local alpha-band synchronization for neuronal groups processing attended stimulus features and locations.
[Figure 20.3 plot omitted. Labels recoverable from the original panels A–F include frequency (Hz), time (ms), delta phase (rad), % of trials, gamma amplitude ((AV − AA)/AA), MUA (AV − AA, μV), mean delta phase, and reaction time (ms).]

Figure 20.3 (A) Entrainment of synchronization from delta- to gamma-band frequencies in supragranular layers of the primary visual cortex during an auditory-visual change detection task. (B) Time-frequency spectrograms during attention to the visual (upper panel) and auditory (bottom panel) input stream, aligned to the onset time of the visual stimulus. The task cued monkeys to detect infrequent deviant stimuli in either the auditory (white noise tones) or visual (red light flashes) input stream. Visual (auditory) stimuli were onset at a regular interval of 650 ms ± 150 ms indicated below the time axis (mean stimulus rate, 1.55 Hz). (C) Entrainment of delta-frequency phase in visual cortex measured as the delta (1.55 Hz) phase at the time of visual stimulus onset when attention was directed to the visual (upper panel) and auditory (lower panel) modality across recording sessions. (D, E) Modulation of gamma band amplitude of the LFP (D) and multiunit activity (E) before and at visual stimulus onset. Positive values indicate enhancement with visual versus auditory attention. (F) Reaction times (x-axis) to the visual target stimulus sorted into groups of trials according to the prestimulus delta phase (at 0 ms to stimulus onset) (y-axis). Solid/dashed horizontal lines indicate the group of trials corresponding to maximum/minimum delta amplitudes. (Adapted from Lakatos, Karmos, Mehta, Ulbert, & Schroeder, 2008.)
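The sorting of reaction times by prestimulus delta phase described for panel F can be sketched in a few lines of Python. This is a minimal illustration of phase binning on synthetic data, not a reconstruction of the analysis by Lakatos and colleagues: the function name `mean_rt_by_phase`, the number of bins, and the cosine dependence of reaction time on phase are all assumptions made for the example.

```python
import math
import random


def mean_rt_by_phase(phases_rad, rts_ms, n_bins=6):
    """Mean reaction time in equal-width phase bins covering [-pi, pi).

    Bins with no trials are reported as None.
    """
    width = 2 * math.pi / n_bins
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for phase, rt in zip(phases_rad, rts_ms):
        wrapped = (phase + math.pi) % (2 * math.pi)  # shift to [0, 2*pi)
        idx = min(int(wrapped / width), n_bins - 1)  # guard against rounding
        sums[idx] += rt
        counts[idx] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]


# Synthetic trials (assumption): reaction time varies with the cosine of phase.
rng = random.Random(1)
phases = [rng.uniform(-math.pi, math.pi) for _ in range(3000)]
rts = [350.0 - 30.0 * math.cos(p) + rng.gauss(0.0, 10.0) for p in phases]

binned = mean_rt_by_phase(phases, rts)
for i, rt in enumerate(binned):
    print(f"phase bin {i}: mean RT {rt:.0f} ms")
```

With the assumed cosine dependence, the bins around phase 0 show the shortest mean reaction times and the bins near ±pi the longest. Which phase is actually associated with fast responses is an empirical question that this sketch does not address.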
The described results suggest that top-down information selectively modulates excitability in early sensory cortices through changes in the phase of rhythmic entrainment in these areas. The exact frequency band underlying excitability modulations may extend from the low delta band directly imposed by the stimulus structure in the described study to the theta band around 4–8 Hz. This suggestion may be derived from the time-frequency evolution of LFP power in the theta band and its attentional modulation, shown in figure 20.3A. Intriguingly, similar to the effect of delta phase on the gamma-band response demonstrated directly in the discussed study (figure 20.3B), previous studies have linked the phase of rhythmic activity in the theta band to the strength of high-frequency gamma-band synchronization in rodent hippocampus and over large regions in the human cortex (Canolty et al., 2006; Csicsvari, Jamieson, Wise, & Buzsaki, 2003). An additional hint suggesting a functional relevance of low-frequency phase fluctuations can be found in a recent study demonstrating that neuronal spiking responses in rodent prefrontal cortex phase-lock to theta-band activity in the hippocampus during task epochs requiring spatial decisions in a working memory context (Jones & Wilson, 2005). In macaque visual cortex, the phase of theta-band synchronization has been directly linked to selective maintenance of task-related information (Lee, Simpson, Logothetis, & Rainer, 2005). Taken together, the emerging evidence demonstrates (1) that top-down, task-related information modulates low-frequency rhythmic activity, (2) that the phase of this rhythmic activity can be functionally related to task performance, and (3) that the phase of low-frequency activity shapes the strength of gamma-band synchronization in response to sensory inputs. As such, the pattern of selective synchronization in the gamma band described in the previous paragraphs could be tightly linked to underlying, selective low-frequency activity modulations. Whether both are coupled in an obligatory way, or whether the comodulation may be triggered by specific task demands, will be an interesting subject for future research.

Feature-Selective Modulation of Rhythmic Synchronization The preceding sections discussed evidence for selective neuronal synchronization patterns evolving with space-based attentional selection of sensory inputs. However, in addition to spatial selection, attention frequently proceeds only on top-down information about the behaviorally relevant sensory feature and independent of the exact spatial location at which input impinges on sensory cortices. Such feature-based attention is known to modulate the responses of neurons that are tuned to the attended feature such as a particular motion direction or the color of a visual stimulus (Maunsell & Treue, 2006).

Importantly, a recent study demonstrated that attention to a particular feature selectively synchronizes the responses of neurons tuned to the attended stimulus feature (Bichot, Rossi, & Desimone, 2005). In this study, spiking responses and LFPs were recorded from neuronal groups in macaque visual area V4 while monkeys searched in multistimulus displays for a target stimulus defined either by color, shape, or both. When monkeys searched, for example, for a red stimulus by shifting their gaze across stimuli on the display, the nonfoveal receptive fields of the recorded neurons could either encompass nontarget stimuli (e.g., of blue color) or the (red) target stimulus prior to the time when the monkey detected the target. The authors found that neurons synchronized to the LFP more strongly in response to their preferred stimulus feature when it was the attended search target feature rather than a distracter feature.

Thus attention enhanced synchronization of the responses of the neurons that shared a preference for the attended target feature, irrespective of the spatial location of attention (Bichot et al., 2005). This feature-based modulation was also evident during a conjunction search task involving targets that were defined by two features: When monkeys searched for a target stimulus with a particular orientation and color (e.g., a red horizontal bar), neurons with a preference for one of these features enhanced their neuronal synchronization (Bichot et al.). This enhancement was observed not only in response to the color-shape-defined conjunction target, but also in response to distracters sharing one feature with the target (e.g., red color). This latter finding corresponds well with the behavioral consequences of increased difficulty and search time needed for conjunction-defined targets.

This study shows that feature salience is indexed not only by changes in firing rates as has been shown before (Martinez-Trujillo & Treue, 2004; Treue & Martinez-Trujillo, 1999; Wannig, Rodriguez, & Freiwald, 2007), but also by selectively synchronizing neuronal responses depending on the similarity between neuronal feature preferences and the attended stimulus feature. The mechanisms behind this selective influence of featural top-down information could be based on a similar spatial weighting of interneuron network activity as implicated for spatial selection. Neuronal tuning to many basic sensory features is organized in regularly arranged local maps. Correspondingly, the tuning of groups of neurons measured with the LFP is locally highly selective. Importantly, neuronal stimulus preference is systematically related to the strength of neuronal synchronization in the gamma frequency band. This relationship has been demonstrated for stimulus orientation and spatial frequency (Frien, Eckhorn, Bauer, Woelbern, & Gabriel, 2000; Gray, Engel, Konig, & Singer, 1990; Kayser & König, 2004; Kreiter & Singer, 1996; Siegel & König, 2003), the speed and direction of visual motion (Liu & Newsome, 2006), and the spatial motor intentions and movement directions (Scherberger & Andersen, 2007; Scherberger, Jarvis, &
[Figure 20.4 plot omitted. Labels recoverable from the original include Cue, Reaction Time, Bottom-up and Top-down Search, and Difference; the line plots show coherence against frequency (Hz).]

Figure 20.4 Selective modulation of long-range synchronization between frontal and parietal cortex during visual search. (A) Sketch of two visual search tasks used by Buschman and Miller (2007). A cue instructed monkeys about the orientation and color of a bar that was the later search target in a multistimulus display during a bottom-up search task (both target color and orientation were unique, upper panels) and during a top-down search task (target shared color or orientation with distracting stimuli, bottom panels). Monkeys covertly attended the multistimulus array and made a saccade to the target stimulus position as soon as they found it. (B) The authors measured the coherence of the LFP activity of neuronal groups in frontal cortex (the frontal eye field and dorsolateral prefrontal cortex) and parietal area LIP. The line plots on the right show the coherence (y-axis) for different frequency bands (x-axis) in the bottom-up and top-down tasks, along with the coherence difference across tasks (solid line in inset). The results show that attentional demand modulated long-range frontoparietal coherence at different frequency bands. (Adapted from Buschman & Miller, 2007.)
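Coherence between two recording sites, the quantity plotted in panel B, can be illustrated with a generic estimator. The Python sketch below computes segment-averaged magnitude-squared coherence at a single frequency from a direct Fourier sum. It is a textbook-style estimate under stated assumptions, not the procedure used by Buschman and Miller (2007); the function names, the sampling rate, the 25 Hz shared rhythm, the fixed phase lag, and the noise levels are all hypothetical.

```python
import cmath
import math
import random


def fourier_coefficient(segment, freq_hz, fs_hz):
    """Fourier coefficient of one segment at a single frequency (direct sum)."""
    return sum(
        x * cmath.exp(-2j * math.pi * freq_hz * n / fs_hz)
        for n, x in enumerate(segment)
    )


def coherence(segments_x, segments_y, freq_hz, fs_hz):
    """Magnitude-squared coherence at freq_hz, averaged over paired segments."""
    cross = 0j
    power_x = 0.0
    power_y = 0.0
    for seg_x, seg_y in zip(segments_x, segments_y):
        fx = fourier_coefficient(seg_x, freq_hz, fs_hz)
        fy = fourier_coefficient(seg_y, freq_hz, fs_hz)
        cross += fx * fy.conjugate()
        power_x += abs(fx) ** 2
        power_y += abs(fy) ** 2
    return abs(cross) ** 2 / (power_x * power_y)


def simulate_pair(n_segments, n_samples, fs_hz, shared_hz, rng):
    """Two noisy signals sharing one rhythm with a fixed phase lag (assumed)."""
    xs, ys = [], []
    for _ in range(n_segments):
        start_phase = rng.uniform(0.0, 2 * math.pi)  # differs across segments
        x, y = [], []
        for n in range(n_samples):
            angle = 2 * math.pi * shared_hz * n / fs_hz + start_phase
            x.append(math.sin(angle) + rng.gauss(0.0, 1.0))
            y.append(math.sin(angle + 0.5) + rng.gauss(0.0, 1.0))
        xs.append(x)
        ys.append(y)
    return xs, ys


rng = random.Random(2)
FS_HZ = 1000.0  # assumed sampling rate
site_a, site_b = simulate_pair(40, 200, FS_HZ, 25.0, rng)
coh_beta = coherence(site_a, site_b, 25.0, FS_HZ)
coh_gamma = coherence(site_a, site_b, 60.0, FS_HZ)
print(f"coherence at 25 Hz: {coh_beta:.2f}, at 60 Hz: {coh_gamma:.2f}")
```

Because the two synthetic signals keep a consistent phase relation only at the shared frequency, coherence is high at 25 Hz and close to the small-sample bias elsewhere. Averaging over segments is essential: with a single segment the estimate equals 1 at every frequency regardless of any true coupling.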
(“bottom-up search”) or that is nonsalient because it shares features with distracting stimuli (Buschman & Miller, 2007). In contrast to bottom-up salient targets, the nonsalient target stimuli were detected more slowly, indicating that they require attentive search through the stimuli in the display before they are successfully detected (“top-down search”). Paralleling the difference in behavioral demands, the authors found a selective synchronization pattern among the LFPs in frontal and parietal cortex. While attentive “top-down search” specifically enhanced rhythmic synchronization at 20–35 Hz compared to the “bottom-up” search, the stimulus-driven “bottom-up” search resulted in stronger inter-areal synchronization in the gamma-frequency band (figure 20.4B). The pattern of results is most likely due to relative differences in task demands in both search modes and was unaffected by differences in reaction times. Therefore these findings suggest that inter-areal communication during attentional top-down control is conveyed particularly through rhythmic synchronization in a high beta band, either in addition to or separate from the frequency of rhythmic interactions underlying bottom-up feedforward signaling.

Consistent with a functional role for top-down-mediated long-range neuronal communication, various experimental paradigms demanding attentive processing have shown long-range synchronization in a broad beta band, although mostly at frequencies below 25 Hz. The following provide a few examples of beta-band modulation in recent studies using very different task paradigms: Variations in reaction times and readiness to respond to a sensory-change event induced corresponding fine-grained variations of motor-spinal coherence in the beta band (Schoffelen et al., 2005). Somatosensory and motor cortex synchronize in the beta band during sensorimotor integration (Brovelli et al., 2004). Selective working memory maintenance in a delayed match-to-sample task results in stronger coherence in the beta band between higher visual areas in humans (Tallon-Baudry, Bertrand, & Fischer, 2001) and locally predicts performance in a similar task in the monkey (Tallon-Baudry, Mandon, Freiwald, & Kreiter, 2004). The failure to detect a target stimulus in a rapid stream of stimuli in the attentional blink paradigm is associated with reduced frontoparietal and frontotemporal beta-band synchronization (Gross et al., 2004). And as a last example for a potential functional role of beta-band activity, the perception of coherent objects from fragmented visual scenes goes along with transiently enhanced beta-band synchronization of the LFP among prefrontal, hippocampal, and lateral occipital sites (Sehatpour et al., 2008).

Taken together, these diverse findings consistently suggest that inter-areal synchronization critically subserves neuronal interactions during attentive processing. In the surveyed studies, synchronization in a broadly defined beta band occurred selectively during task epochs requiring effective neuronal integration of information across distributed cortical areas. However, further studies need to elucidate the properties of particular frequency bands and their characteristic recruitment during specific tasks (Kopell, Ermentrout, Whittington, & Traub, 2000).

Concluding remarks

Selective attention describes a central top-down process that restructures neuronal activity patterns to establish a selective representation of behavioral relevance. The surveyed evidence suggests that attention achieves this functional role by selectively synchronizing those neuronal groups conveying task-relevant information. Attentionally modulated synchronization patterns evolve rapidly, are evident even before sensory inputs arrive, follow closely subjective readiness to process information in time, can be sustained for prolonged time periods, and carry specific information about top-down selected sensory features and motor aspects.

In addition to these functional characteristics, insights into the physiological origins of synchronization have begun to shed light on the mechanistic underpinning of selective neuronal interaction patterns at all spatial scales of cortical processing: At the level of single neurons and local microcircuits, studies are deciphering the role of inhibitory interneuron networks, how precise timing information is conveyed and sustained even at high oscillation frequencies, and how rhythmic synchronization among interneurons is actively made robust against external influences (Bartos et al., 2007; Vida et al., 2006). These insights are integrated at the network level in models demonstrating how selective synchronization patterns evolve in a self-organized way (Börgers & Kopell, 2008; Tiesinga et al., 2008).

Acknowledging those basic physiological processes underlying the dynamic generation of selective synchronization seems to be pivotal to further elucidation of the mechanistic working principles of selective attention in the brain.

acknowledgments This work was supported by the Human Frontier Science Program Organization, the Volkswagen Foundation, the European Science Foundation’s European Young Investigator Award program (PF), and the Netherlands Organization for Scientific Research (PF and TW).

REFERENCES
Azouz, R. (2005). Dynamic spatiotemporal synaptic integration in cortical neurons: Neuronal gain, revisited. J. Neurophysiol., 94(4), 2785–2796.
Azouz, R., & Gray, C. M. (2003). Adaptive coincidence detection and dynamic gain control in visual cortical neurons in vivo. Neuron, 37(3), 513–523.
Bartos, M., Vida, I., & Jonas, P. (2007). Synaptic mechanisms of synchronized gamma oscillations in inhibitory interneuron networks. Nat. Rev. Neurosci., 8(1), 45–56.
Jin, Y., O’Halloran, J. P., Plon, L., Sandman, C. A., & Potkin, S. G. (2006). Alpha EEG predicts visual reaction time. Int. J. Neurosci., 116(9), 1035–1044.
Jones, M. W., & Wilson, M. A. (2005). Theta rhythms coordinate hippocampal-prefrontal interactions in a spatial memory task. PLoS Biol., 3(12), e402.
Kaiser, J., Hertrich, I., Ackermann, H., & Lutzenberger, W. (2006). Gamma-band activity over early sensory areas predicts detection of changes in audiovisual speech stimuli. NeuroImage, 30(4), 1376–1382.
Kayser, C., & König, P. (2004). Stimulus locking and feature selectivity prevail in complementary frequency ranges of V1 local field potentials. Eur. J. Neurosci., 19(2), 485–489.
Kelly, S. P., Lalor, E. C., Reilly, R. B., & Foxe, J. J. (2006). Increases in alpha oscillatory power reflect an active retinotopic mechanism for distracter suppression during sustained visuospatial attention. J. Neurophysiol., 95(6), 3844–3851.
Khayat, P. S., Spekreijse, H., & Roelfsema, P. R. (2006). Attention lights up new object representations before the old ones fade away. J. Neurosci., 26(1), 138–142.
Kopell, N., Ermentrout, G. B., Whittington, M. A., & Traub, R. D. (2000). Gamma rhythms and beta rhythms have different synchronization properties. Proc. Natl. Acad. Sci. USA, 97(4), 1867–1872.
Kreiter, A. K., & Singer, W. (1996). Stimulus-dependent synchronization of neuronal responses in the visual cortex of the awake macaque monkey. J. Neurosci., 16(7), 2381–2396.
Lakatos, P., Karmos, G., Mehta, A. D., Ulbert, I., & Schroeder, C. E. (2008). Entrainment of neuronal oscillations as a mechanism of attentional selection. Science, 320(5872), 110–113.
Landau, A. N., Esterman, M., Robertson, L. C., Bentin, S., & Prinzmetal, W. (2007). Different effects of voluntary and involuntary attention on EEG activity in the gamma band. J. Neurosci., 27(44), 11986–11990.
Lee, H., Simpson, G. V., Logothetis, N. K., & Rainer, G. (2005). Phase locking of single neuron activity to theta oscillations during working memory in monkey extrastriate visual cortex. Neuron, 45(1), 147–156.
Lin, S. C., Gervasoni, D., & Nicolelis, M. A. (2006). Fast modulation of prefrontal cortex activity by basal forebrain noncholinergic neuronal ensembles. J. Neurophysiol., 96(6), 3209–3219.
Liu, J., & Newsome, W. T. (2006). Local field potential in cortical area MT: Stimulus tuning and behavioral correlations. J. Neurosci., 26(30), 7779–7790.
Luck, S. J., Chelazzi, L., Hillyard, S. A., & Desimone, R. (1997). Neural mechanisms of spatial selective attention in areas V1, V2, and V4 of macaque visual cortex. J. Neurophysiol., 77(1), 24–42.
Markram, H., Toledo-Rodriguez, M., Wang, Y., Gupta, A., Silberberg, G., & Wu, C. (2004). Interneurons of the neocortical inhibitory system. Nat. Rev. Neurosci., 5(10), 793–807.
Markram, H., Wang, Y., & Tsodyks, M. (1998). Differential signaling via the same axon of neocortical pyramidal neurons. Proc. Natl. Acad. Sci. USA, 95(9), 5323–5328.
Martinez-Trujillo, J. C., & Treue, S. (2004). Feature-based attention increases the selectivity of population responses in primate visual cortex. Curr. Biol., 14(9), 744–751.
Maunsell, J. H., & Treue, S. (2006). Feature-based attention in visual cortex. Trends Neurosci., 29(6), 317–322.
Mishra, J., Fellous, J. M., & Sejnowski, T. J. (2006). Selective attention through phase relationship of excitatory and inhibitory input synchrony in a model cortical neuron. Neural Net., 19(9), 1329–1346.
Mitchell, J. F., Sundberg, K. A., & Reynolds, J. H. (2007). Differential attention-dependent response modulation across cell classes in macaque visual area V4. Neuron, 55(1), 131–141.
Monosov, I. E., Trageser, J. C., & Thompson, K. G. (2008). Measurements of simultaneously recorded spiking activity and local field potentials suggest that spatial selection emerges in the frontal eye field. Neuron, 57(4), 614–625.
Montgomery, S. M., & Buzsaki, G. (2007). Gamma oscillations dynamically couple hippocampal CA3 and CA1 regions during memory task performance. Proc. Natl. Acad. Sci. USA, 104(36), 14495–14500.
Moran, J., & Desimone, R. (1985). Selective attention gates visual processing in the extrastriate cortex. Science, 229(4715), 782–784.
Munk, M. H., Roelfsema, P. R., Konig, P., Engel, A. K., & Singer, W. (1996). Role of reticular activation in the modulation of intracortical synchronization. Science, 272(5259), 271–274.
Ohara, S., Crone, N. E., Weiss, N., & Lenz, F. A. (2006). Analysis of synchrony demonstrates “pain networks” defined by rapidly switching, task-specific, functional connectivity between pain-related cortical structures. Pain, 123(3), 244–253.
Pesaran, B., Pezaris, J. S., Sahani, M., Mitra, P. P., & Andersen, R. A. (2002). Temporal structure in neuronal activity during working memory in macaque parietal cortex. Nat. Neurosci., 5(8), 805–811.
Reynolds, J. H., & Chelazzi, L. (2004). Attentional modulation of visual processing. Annu. Rev. Neurosci., 27, 611–647.
Reynolds, J. H., Chelazzi, L., & Desimone, R. (1999). Competitive mechanisms subserve attention in macaque areas V2 and V4. J. Neurosci., 19(5), 1736–1753.
Riehle, A. (2005). Preparation for action: One of the key functions of motor cortex. In A. Riehle & E. Vaadia (Eds.), Motor cortex in voluntary movements: A distributed system for distributed functions (Vol. 1, pp. 213–240). Boca Raton, FL: CRC Press.
Rihs, T. A., Michel, C. M., & Thut, G. (2007). Mechanisms of selective inhibition in visual spatial attention are indexed by alpha-band EEG synchronization. Eur. J. Neurosci., 25(2), 603–610.
Rodriguez, R., Kallenbach, U., Singer, W., & Munk, M. H. (2004). Short- and long-term effects of cholinergic modulation on gamma oscillations and response synchronization in the visual cortex. J. Neurosci., 24(46), 10369–10378.
Roelfsema, P. R., Engel, A. K., König, P., & Singer, W. (1997). Visuomotor integration is associated with zero timelag synchronization among cortical areas. Nature, 385(6612), 157–161.
Roelfsema, P. R., Tolboom, M., & Khayat, P. S. (2007). Different processing phases for features, figures, and selective attention in the primary visual cortex. Neuron, 56(5), 785–792.
Rudolph, M., Pospischil, M., Timofeev, I., & Destexhe, A. (2007). Inhibition determines membrane potential dynamics and controls action potential generation in awake and sleeping cat cortex. J. Neurosci., 27(20), 5280–5290.
Saalmann, Y. B., Pigarev, I. N., & Vidyasagar, T. R. (2007). Neural mechanisms of visual attention: How top-down feedback highlights relevant locations. Science, 316(5831), 1612–1615.
Salinas, E., & Sejnowski, T. J. (2001). Correlated neuronal activity and the flow of neural information. Nat. Rev. Neurosci., 2(8), 539–550.
Sauseng, P., Klimesch, W., Freunberger, R., Pecherstorfer, T., Hanslmayr, S., & Doppelmayr, M. (2006). Relevance of EEG
IV SENSATION AND PERCEPTION

Chapter 21 Barlow 309
26 Carroll, Yoon, and Williams 383
27 Brainard 395
28 Ringach 409
29 Seidemann, Chen, and Geisler 419
31 Connor, Pasupathy, Brincat, and Yamane 455
32 McKone, Crookes, and Kanwisher 467
33 DeAngelis 483
34 Angelaki, Gu, and DeAngelis 499
36 Simoncelli 525

Introduction
J. Anthony Movshon and Brian A. Wandell
horace barlow Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge, United Kingdom
abstract By the late 1960s, recording from sensory pathways had shown that single neurons can be much more sensitive, selective, and reliable than had previously been recognized. The term grandmother cell started as a fanciful name for a high-level neuron that might enable us to experience complex perceptions and to discriminate among them. The concept included invariance of response for changes in some variables as well as selectivity of response for others, together with the idea that these cells are created by processing at a hierarchy of levels. This chapter first outlines the discoveries that eventually led to the general acceptance that such cells really exist. It then discusses hierarchical processing, the evolution of the cortex, and ideas about the new behavioral faculties that evolved with it. Finally it points to the enormous, unaccounted for number of neurons in the cortex and suggests that this plays a major role in enhancing our ability to exploit symmetry and invariance in our environment.
The term grandmother cell suggests that there are particular neurons in the visual cortex that are activated by the sight of one’s grandmother, and it implies that such neurons play an important part in generating high-level perceptions and discriminating between them. The term was introduced in 1969 by Jerry Lettvin (see Barlow 1995), and in the first part of this chapter, I shall present a brief history of the facts as they have been discovered since then. In the early years, many people thought that it was just a catchy term for an implausible idea that would lead nowhere, but by now, it is clear that neurons fitting this definition really exist and that advances in understanding their neurophysiology will constitute real progress in understanding the brain, particularly the conscious, thinking parts of it.
Even before Lettvin had coined the term, Konorski (1967) had championed the idea that what he called gnostic neurons were the end result of a hierarchical series of transformations along the lines that Hubel and Wiesel (1962, 1965) had suggested from the results of their single-unit recordings in the visual cortex. Although this idea of a hierarchy has not completely crashed, it has had a bumpy ride and never specified a functional goal that the supposed hierarchy might help achieve.
The second part of this chapter briefly considers some objections to the idea of a hierarchy, but we rapidly encounter the fact that the cortex contains vastly more neurons representing each location in the visual field than the retina or LGN, and the need for this vast excess has not been explained by any current computational model. This makes one suspect that the cortex performs computations that are different from those of earlier stages in the visual pathway or other parts of the brain, which adds a new slant to the problem. To help decide what these new computations might be, in the third part I look at the problem from a broader perspective, consulting other academic disciplines, and this leads to the idea that the cortex uses the symmetry and invariance present in its input to generate a more economical, sparser, representation of the environment.
A brief history
Charles Gross deserves at least as much credit as anyone else for the actual experiments on which claims about grandmother cells rest, and in his essay “Genealogy of the Grandmother Cell” (Gross, 2002), he tells how Jerzy Konorski’s neuropsychological studies of visual agnosia, with evidence of the functional hierarchy of neurons in the visual cortex from Hubel and Wiesel, further helped by Jerry Lettvin’s fertile imagination, may have guided his laboratory toward these discoveries. Gross gave the definition shown in box 21.1, and this will be adopted as an initial working definition, since it coincides well with most people’s usage. However, one of the main points of the grandmother cell idea is that the cells’ responses are largely unaffected by changes of grandma’s position, pose, clothing, facial expression, and so on. Invariance has got lost in box 21.1’s definition, but it is at least as important a property as selectivity is.
MIT lies across the Charles River from Hubel and Wiesel’s lab at Harvard Medical School, from which the evidence for a hierarchical functional organization in the cortex had emerged. Hubel and Wiesel (1959) had discovered that neurons in V1 are selective for the orientation of visual stimuli, and I shall make a digression here to illustrate how difficult it was to make that step. This is based partly on my own experience somewhat later, when I was working with Bill Levick on retinal ganglion cells in the rabbit (Barlow, Hill, & Levick, 1964); these fall into many different categories whose high degree of selectivity for different, specific patterns makes them, in some respects, like precortical grandmother cells.
Most people who have recorded from neurons in sensory pathways will have experienced the long periods of intense frustration that occur when you know that your electrode is near a cell, because you detect its action potentials when it fires spontaneously, but you are unable to find the visual stimulus, nicknamed its “trigger-feature,” that reliably excites it. In such cases, one frequently relieves one’s frustration by moving on to another neuron in the hope that one will find its trigger feature more easily, but in the following quotation,
Figure 21.2 Examples of shapes used to stimulate a group TE unit apparently having a very complex trigger feature. The stimuli are
arranged from left to right in order of increasing ability to drive the neuron from none (1) or little (2 and 3) to maximum (6). (From Gross,
Rocha-Miranda, & Bender, 1972.)
increase in the number of neurons involved in vision as the messages are passed from the LGN neurons to the granule cells in layer 4 of the primary visual cortex. The actual figures are interesting. Reading from the chart, this first stage of increase is by a factor of about 55, from 1.4 million to 75 million. If one includes all the 250 million neurons in V1, the factor increases to 180, and for the whole visual cortex (nearly 800 million neurons), the number of neurons is almost 600 times the number of LGN neurons.
That really establishes the point that the cortex has very many more cells at its disposal than do precortical levels, but the numbers in figure 21.4 represent averages over the whole visual field. It is estimated that in the foveal region, the density of neurons in V1 per unit solid angle of the visual field is 10,000 times the density of input fibers from the LGN (Hawken & Parker, 1991). This compares with the average over the whole visual field of 180 times, given above. The difference results from the fact that the cortical representation of the fovea has much more than its fair share of neurons, even after taking account of the overrepresentation of the fovea in the input from the LGN. The number of neurons available for computations on the input from the fovea is truly phenomenal.
Detailed accounting for the numbers shown in the left part of figure 21.4 is complicated by the fact that rods and cones behave differently, and the pattern of convergence and then divergence is very different at the fovea and in the periphery. Many of these matters are not relevant here, but there is one point that makes sense physically, and this is worth pointing out, lest the obvious message conveyed above be overshadowed by these complications.
The actual number of nerve fibers running in parallel from retina to LGN is not far from the number required for there to be one pathway per resolvable element of the optical image falling on the retina or one pathway per cone, provided that one confines one’s attention to the foveal part of the pathway and takes into account that the image in the near periphery is undersampled (i.e., it has fewer nerve fibers serving it than the quality of the image deserves). The approximate agreement in the fovea means that with these cautions, the number of nerve fibers in the optic nerve roughly coincides with the number of degrees of freedom in the copy of the image that the optic nerve passes to the brain. Why, then, does the cortex need up to 10,000 times that number of neurons to perform its computations? The fact that we think we understand the limiting factors of foveal vision makes the huge numbers of neurons that are available in that part of the cortex even more impressive. Is there something about the computations it does that we have missed or paid insufficient attention to?
The next part of this chapter discusses this problem in light of the new behavior that the new part of the mammalian brain, the cerebral cortex, is thought to have brought about. Although this evidence is not very satisfying for a modern neurophysiologist, it is one of the few sources that can give useful hints about these new functions.
The cerebral cortex first appeared in the forebrain of mammals while dinosaurs were the dominant large, terrestrial vertebrates. Before mammals, the vertebrate forebrain had been dominated by its olfactory input, and in fact, olfaction is the only sensory modality that still has a direct input to the cortex; all the others pass through the thalamus. These
yaara yeshurun, hadas lapid, rafi haddad, shani gelstien, anat arzi, lee sela, aharon weisbrod, rehan khan, and noam sobel Department of Neurobiology, Weizmann Institute of Science, Rehovot, Israel
abstract Despite major progress in elucidating the anatomical and molecular foundations of olfaction, the rules underlying the link between the olfactory stimulus and the olfactory percept remain unknown. We argue that this lack is a reflection of visual primacy in human perception and thinking, primacy that has prevented the development of a perception-based approach to studying olfactory coding. With this in mind, in this chapter, we first provide a tutorial on the organization of the mammalian olfactory system and then describe our recent efforts to generate a perception-based olfactory metric. The primary olfactory perceptual axis revealed by this effort was odorant pleasantness. We found that pleasantness is a perceptual representation of the physical axis that best explains the variance in molecular odorant structure. That the most important dimension in olfactory perception should be the best correlate of the most discriminating physicochemical measures suggests that, as with other senses, the olfactory system has evolved to exploit a fundamental regularity in the physical world. In this respect, olfactory pleasantness can be likened to visual color and auditory pitch. Finally, we review our use of this olfactory metric in predicting odor perception in humans and odorant-induced neural activity in the olfactory system of nonhuman animals.
Our lab has a pet cat named Diesel. We found Diesel at the age of about four weeks, when he was suffering from severe feline herpes that had invaded both his eyes. Despite significant veterinary efforts, both his eyes had to be removed, and Diesel has since been a blind cat. That said, any naïve visitor to our lab will not notice Diesel’s blindness. Diesel runs and jumps around the lab following an internal spatial mental map he has constructed, and he negotiates changes in this landscape, such as a chair that has been moved, with surprising speed, thanks to rapid processing of information from his whiskers, which are always a step in front of him when he is in motion. However, the most astonishing aspect of Diesel’s behavioral repertoire is his ability to catch flies in flight. Diesel will identify the sound of a fly seemingly despite any level of background noise, will follow the fly seemingly tracking the fly with his eyes (which are not there), and at the correct moment will leap several feet into the air, catching the fly between his clapped paws (figure 22.1).
Observing this marvelous demonstration of sensory substitution (audition for vision) is a lesson to us in our studies of olfaction. We humans are visual animals, and this has shaped not only how we negotiate the world around us, but also how we think about it. Vision dominates our conscious perception. We intuitively think that information about the outside world that is naturally provided to us through vision is inherently visual information. This is not necessarily true, however, as is so powerfully shown in Diesel’s fly hunting. In other words, all distal senses have evolved to maximize the amount and types of information they can extract from the environment. It follows from this that if we pay careful attention to our sensory perceptions in each domain, we can learn much about how that domain is physically organized in the world around us, and in our brain. This simple truth was elegantly stated by Helmholtz (1878): “Thus, even if in their qualities our sensations are only signs whose specific nature depends completely upon our make-up or organization, they are not to be discarded as empty appearances. They are still signs of something—something existing or something taking place—and given them we can determine the laws of these objects or these events. And that is something of the greatest importance!”
Neurobiologists have internalized this lesson for vision and audition but not for olfaction. In vision, we learned that what appears to us as perceptually similar in color is in fact similar in the physical dimension of wavelength; and in audition, we learned that what appears to us as perceptually similar in pitch is in fact similar in the physical dimension of frequency. We further learned that the physical dimensions of wavelength and frequency are represented at various stages of the nervous system. But what physical dimension is common to similarly smelling odors? And how is this dimension represented in the brain? These questions remain unanswered. In this chapter, we will first provide a basic tutorial on the organization of the mammalian olfactory system and then describe our initial efforts to generate a perception-based approach for probing the neurobiology of olfaction.
Figure 22.5 Airflow visualization using Schlieren imaging in dogs (Settles, 2001) and humans (Porter & Sobel, 2005). Left panel: Before sniffing inward, dogs may also sniff outward in a lateral trajectory distributing particles so that they can be inhaled and smelled. As this image clearly depicts, sensation is an active process, and olfaction is a good case in point. Remaining panels: Imaging of the human nose clearly reveals an asymmetry in airflow into each nostril.
Figure 22.6 Odorant molecules bind to olfactory receptors (R) embedded within the olfactory epithelium. The binding causes the associated G-protein complex to release its two subunits (α and βγ). The α subunit stimulates the integral membrane protein adenylyl cyclase (AC III), which in turn increases the concentration of cAMP. Cyclic nucleotide-gated (CNG) channels open with cAMP, leading to membrane depolarization; if there is sufficient depolarization, an action potential is generated in the sensory axon. The βγ complex released from the G protein stimulates phospholipase C (PLC), leading to higher intracellular inositol triphosphate (IP3) and diacylglycerol (DAG). IP3 opens Ca2+ channels, allowing Ca2+ to enter the neuron. The Ca2+ ions have multiple effector pathways. Ca2+ stimulates Cl− channels, allowing Cl− ions to exit the cell (intracellular Cl− concentration is greater than extracellular concentration). This ion exchange further depolarizes the membrane. Calcium also inhibits the transduction by combining with Ca2+-binding protein (CBP) that closes the CNG channels, ending signal transduction. Signal termination is also mediated by a variety of protein kinases (PKA, GRK, PKC) that phosphorylate the olfactory receptor and by β-arrestin-2 (BARR-2) interacting with the olfactory receptor. Odorant-binding proteins (OBP) in the nasal mucosa may increase odorant solubility and/or receptor-binding affinity, aiding transduction, or may assist in odorant clearance, aiding signal termination. (Image after Buck, 1996.)
Figure 22.8 Temporal development of odor-induced activity. Data from Spors and Grinvald (2002) showing the temporal development of the bulbar response to the odorant ethylbutyrate. The early response is data obtained 150–300 ms following stimulus onset, and the late response is data obtained 300–500 ms following stimulation. The spatial pattern of response is clearly modified over time. (See color plate 26.)
perceptual space. Thus in both explicit and implicit similarity tasks, our derived perceptual space corresponded to subjects’ judgments of similarity. Odorants near each other in our space were perceived as similar, and odorants distant from one another were perceived as dissimilar.
Having validated the space, we set out to identify its principal axis. A first indication of the identity of PC1 lies in the descriptors that flank it, which were SWEET, PERFUMERY, AROMATIC, FLORAL, and LIGHT on one end and SICKENING, PUTRID-FOUL-DECAYED, RANCID, SHARP-PUNGENT-ACID, and SWEATY at the other (figure 22.10A). An intuitive name for an axis spanning these descriptors is perceptual pleasantness (we use the term pleasantness for the sake of simplicity, yet we are referring to the continuum from unpleasant to pleasant, also referred to as perceptual valence or hedonic tone). To test this intuitive label, in a separate experiment, we asked subjects to rank the pleasantness of the odorants, and then compared the difference in pleasantness to distances along PC1. The two measures were strongly correlated (figure 22.10B), indicating that our intuitive label of pleasantness for PC1 was valid.
Our characterization of the first PC as pleasantness is in agreement with previous research (Richardson & Zucco, 1989). Pleasantness is the primary perceptual aspect humans use to discriminate odorants (Schiffman, 1974; Godinot & Sicard, 1995) or combine them into groups (Berglund, Berglund, Engen, & Ekman, 1973; Schiffman, Robinson, & Erickson, 1977). Pleasant and unpleasant odorants are
evaluated at different speeds (Bensafi et al., 2002) and by dissociable neural substrates, as evidenced in both electrophysiological recordings (Kobal, Hummel, & Vantoller, 1992; Alaoui-Ismaili, Rubin, Rada, Dittmar, & Vernet-Maury, 1997; Pause & Krauel, 2000; Masago, Shimomura, Iwanaga, & Katsuura, 2001) and functional neuroimaging studies (Zald & Pardo, 1997; Royet et al., 2000; Gottfried, Deichmann, Winston, & Dolan, 2002; Anderson et al., 2003; Rolls, Kringelbach, & de Araujo, 2003; Grabenhorst, Rolls, Margot, da Silva, & Velazco, 2007). Finally, studies with newborns suggest that at least some aspects of olfactory pleasantness may be innate (Steiner, 1979; Soussignan, Schaal, Marlier, & Jiang, 1997). Thus our findings are consistent with the view that “it is clearly the hedonic meaning of odor that dominates odor perception” (Engen, 1982).
Building a physicochemical molecular descriptor space
Having identified the primary perceptual axis of olfaction, we set out to ask whether any physicochemical dimension may be linked with it. Using structural chemistry software (Dragon), we obtained 1514 physicochemical descriptors for each type of odorant. These descriptors were of many types, for example, atom counts, functional group counts, counts of types of bonds, molecular weights, topological descriptors, and so on. We then used the same PCA procedure to reduce the dimensionality of the physicochemical space. Applying PCA revealed that the effective dimensionality of the space of descriptors was much lower than the apparent dimensionality of 1514. Figure 22.11A shows the percent variance explained by each of the first 10 PCs. The first PC accounted for approximately 32% of the variance, and the first 10 accounted for approximately 70% of the variance. Figure 22.11B shows the five descriptors that anchored the first PC of the space. The full names of these physicochemical descriptors are listed in the legend of figure 22.11.
Characterizing the primary dimensions of the PCA space of the physicochemical descriptors is more challenging than the task for the perceptual space, because of both the set size and the variety of descriptors involved. Nevertheless, giving them a coherent general character is possible and useful for providing a sense of what information they might capture. The first physicochemical PC was weighted at one end by factors that are reasonable proxies for molecular size or weight: the sum of the atomic van der Waals volumes is essentially a crude count of atoms, as is the count of the number of nonhydrogen atoms and the self-returning walk count of order one for nonhydrogen atoms (which is actually identical to a count of the nonhydrogen atoms). The characterization of these descriptors as indices of “weight” is borne out by a very high weighting that “molecular weight” itself has on this side of the first PC.
At the other end of the dimension are a series of topological descriptors that vary with the “extent” of a molecule. In fact, all five of the descriptors are average eigenvectors of distance or adjacency matrices, normalized in slightly different ways: the average eigenvector coefficient sum from the electronegativity-weighted distance matrix, from the Z-weighted distance matrix (Barysz matrix), from the mass-weighted distance matrix, from the distance matrix, and from the adjacency matrix. Each of these measures increases as the denseness of the atomic connections
increases, that is, as the atoms are packed more closely together. In combination, then, these two extremes anchor a dimension that characterizes the amount and distribution of mass within a molecule. Thus the first PC can be thought of as a measure of the “compactness” of a molecule.
Building a model from physical to perceptual space
We used the same procedure to construct two spaces: a perceptual space that was derived from an initially high-dimensional odor descriptor space and a physicochemical space that was derived from an initially high-dimensional molecular descriptor space. Both spaces were constructed by using PCA, which generates ordered sets of orthogonal axes, constructed to maximize the variance they capture in the original feature space. Because the axes are orthogonal by construction, that is, uncorrelated using a linear Pearson correlation statistic, we can compare them independently.
For each of the first four perceptual PCs, we asked whether they were correlated with any of the first seven physicochemical PCs. Strikingly, we noted that the strongest correlation that we observed was between the first perceptual PC and the first physicochemical PC (figure 22.12A). In other words, the single optimal axis for explaining the variance in the physicochemical data was the best predictor of the single optimal axis for explaining the variance in the perceptual data. That the most important dimension in olfactory perception should be the best correlate of the most discriminating physicochemical measures suggests that, as with other senses, the olfactory system has evolved to exploit a fundamental regularity in the physical world.
Having established that the physicochemical space maps onto the perceptual space, we next built linear predictive models through a cross-validation procedure. We then split the Dravnieks data in half, modeled one half, and used this to predict PC1 of the other half. We repeated this 1000 times and obtained a modest but significant prediction of PC1 (odorant pleasantness) based on physicochemical attributes (figure 22.12B). To test the generality of this finding, we predicted the pleasantness of 104 odorants that we had never smelled before and that were not used by Dravnieks or us at any stage. We then obtained these odorants and collected pleasantness estimates from three culturally diverse groups of subjects (Americans in Berkeley, California, in the United States; Israeli Jews in Rehovot in Israel; and rural Israeli Muslim Arabs in the village of Dir El Asad in the Northern Galilee in Israel), each tested with more than 20 odorants. In each case, we obtained a similarly accurate and significant prediction of odorant pleasantness based on odorant structure (figure 22.13).
Figure 22.13 Cross-cultural validation. Twenty-seven odorous molecules not commonly used in olfactory studies and not previously tested by us were presented to three cultural groups of naïve subjects: Americans (23 subjects), Arab Israelis (22 subjects), and Jewish Israelis (20 subjects). In all cases, our predictions of odorant pleasantness were similar and significant.
Using odor space to predict neural activity in the olfactory system
PC1 of physicochemical structure, or compactness, is a single axis. However, it is multidimensional in the sense
that more than 1500 known features contributed to it with known weights. In other words, we can represent each odorant as a single value reflecting its compactness (its PC1 score), or we can represent each odorant as a vector of more than 1500 values (although PC1 was generated with 1514 values, Dragon software will generate up to 1664 values). When using the former approach, the distance between two odorants is the difference in PC1 values. When using the latter approach, one can compute the distance between any two odorants by the square root of the sum of squares of the differences between descriptors.
Given that we have generated an olfactory metric, we can reanalyze previously collected data by reordering the studied odorants using the above described vector-type representation. With this in mind, Rafi Haddad and colleagues in our lab revisited nine previously published data sets as well as one novel data set for which we knew the odorants used but did not know the neural response (Haddad et al., 2008). We found that our novel metric was always better at accounting for neural responses than the specific metric used in each study (e.g., carbon chain length). Moreover, this single metric was applicable across studies that used different olfactory neurons, different model systems, and different neuronal response measurement techniques and odorants varying along different feature types. In other words, our approach enabled us to use odorant structure to predict olfactory perception in human subjects (Khan et al., 2007) and odor-induced neural activity in nonhuman animals (Haddad et al., 2008) (figure 22.14).
Conclusions
Although the neuroanatomy of the olfactory system is well described and the molecular mechanisms of olfactory transduction are well understood, overall coding of olfaction remains a mystery, in the sense previously noted whereby an olfactory percept cannot be predicted from an olfactory stimulus structure. Our recent efforts have, in our view, made a step in this direction. However, this remains an initial step in what is a long path. To reiterate, we first reduced the dimensionality of olfactory perception and joined others (Engen, 1982) in observing that pleasantness is the principal perceptual aspect of olfaction. We next reduced the dimensionality of physicochemical properties, and identified a primary axis of physicochemical structure. We found that 144 molecules were similarly ordered by these two independently obtained principal axes: one for perception and one for physicochemical structure. In other words, when measures useful to chemists with no a priori connection to any particular percepts were analyzed, those physicochemical measures that were best at discriminating a set of molecules were found to be precisely those that were most correlated with the perception of olfactory pleasantness. It is in identification of this privileged link that we add to the work of Schiffman, Amoore, Dravnieks, and others, who together laid the groundwork for this approach between the early 1950s and the late 1970s (Amoore, 1963; Laffort & Dravnieks, 1973; Schiffman, 1974).
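The two ways of measuring the distance between odorants described in the section on predicting neural activity can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code: the three-element descriptor vectors, the PC1 weights, and the function names are hypothetical stand-ins for the roughly 1500 Dragon descriptors and their fitted weights, and the descriptors are assumed to be already centered and scaled.

```python
import math


def pc1_score(descriptors, pc1_weights):
    """Project a descriptor vector onto PC1 (a weighted sum of features)."""
    return sum(d * w for d, w in zip(descriptors, pc1_weights))


def pc1_distance(odorant_a, odorant_b, pc1_weights):
    """Single-value approach: absolute difference between PC1 scores."""
    score_a = pc1_score(odorant_a, pc1_weights)
    score_b = pc1_score(odorant_b, pc1_weights)
    return abs(score_a - score_b)


def vector_distance(odorant_a, odorant_b):
    """Vector approach: square root of the summed squared differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(odorant_a, odorant_b)))


# Hypothetical, pre-normalized descriptor vectors for two odorants.
odorant_a = [1.0, 2.0, 2.0]
odorant_b = [1.0, 5.0, 6.0]
weights = [0.0, 0.6, 0.8]  # made-up PC1 weights with unit length

print(pc1_distance(odorant_a, odorant_b, weights))
print(vector_distance(odorant_a, odorant_b))
```

The single-value distance discards everything orthogonal to PC1, whereas the vector distance uses every descriptor, which is the representation the text says was used to reorder odorants when reanalyzing the neural data sets.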
virginia m. richards Department of Psychology, University of Pennsylvania, Philadelphia, Pennsylvania
gerald kidd, jr. Department of Speech, Language and Hearing Sciences, Boston University, Boston, Massachusetts
abstract The detection of a target sound embedded in a chaotic acoustical environment is a basic yet poorly understood component of auditory perception. This chapter reviews three aspects of such auditory masking, building toward the important problem of the perception of hearing a speech signal in the presence of competing speech sounds. First, the history of psychoacoustic masking experiments and the development of energy-based models to account for the resulting data are described. Then experiments and models of the detection of a tonal signal masked by randomly drawn maskers, an example of informational masking, are described. Informational masking experiments such as these are important because they reveal masking phenomena that are mediated more centrally than the masking associated with traditional masking studies. Finally, the roles of peripheral and central masking for the detection of speech masked by other speech sounds are discussed.
Imagine the following: You and a friend are standing on the corner of 34th and Chestnut Streets in Philadelphia, awaiting a bus to take you to Monks, a local pub known for Belgian-style beers. It is just after 5:00 p.m., and the traffic is heavy. And as it happens, a garbage truck is clearing public trash bins along Chestnut Street. Needless to say, you have to yell to be heard over the din of activity. This is an example of masking.1 To communicate your message acoustically, you must broadcast your signal at a high intensity. Sound pressure waves superimpose, or add, in space, and what enters the ear is the accumulation of the sound in the environment. In enclosed environments—a classroom, for example—superposition includes not just the noise of students, audiovisual equipment, and so on, but also the sounds reflected off of walls, chalkboards, tables, and other surfaces. It is as though we live in an acoustical hall of mirrors. The auditory system has adapted to this din, allowing the segregation of a target sound away from the masking sounds. Efforts to understand this fundamental aspect of auditory perception form the basis of modern research in auditory masking.
In 1958, Tanner suggested that the intuitive sense of masking conveyed in the scenario described above is a vastly incomplete description of auditory masking. Tanner began with Licklider’s (1951) description of masking: “Masking is thus the opposite of analysis; it represents the inability of the auditory mechanism to separate the tonal stimulation into components and discriminate between the presence or absence of one of them” (Tanner, 1958, p. 191; Licklider, 1951, p. 1005). This definition reflects the fact that the auditory system represents sounds tonotopically, such that the different frequencies of impinging sounds are encoded and represented by different populations of neurons. The failure of analysis noted by Licklider, then, essentially equates masking with limitations in the frequency resolution of the auditory system. When the signal and masker share common frequencies, the overlap of signal and masker energy makes the detection of the signal difficult because the shared energy is represented by the same population of neurons. Tanner noted that when the frequency of the tone to be detected is not fixed but chosen at random, the detectability of the signal decreases (Tanner & Norman, 1954). It is unlikely that this reduction in sensitivity reflects a failure of analysis at the auditory periphery. Should this decline in sensitivity be considered a form of masking? Tanner further pondered the detection of a tone of known frequency masked either by another tone or by Gaussian (white) noise. Should masking be defined independently of the properties of the masker? Regardless of whether the frequency of the tone to be detected is uncertain or the characteristics of the masker are varied, the end result is the same: A listener’s ability to detect the signal changes. Should all of these examples be described by using the single term masking?
Tanner’s point might be described in a slightly different way: As the sound stream is processed by the ascending auditory pathway, what information is lost and where in the processing is it lost? With regard to what masking tells us about the auditory system, the question becomes: How is it that biological systems effortlessly detect and follow an ongoing target sound under the pressure of multiple acoustical distracters? While not directly addressed in the current chapter, it is important to appreciate that for many individuals with compromised auditory capabilities, there is a decline in the ability to detect a target sound in the presence of other sounds especially when the listening situation is
Figure 23.1 The amount of masking for a tonal signal masked by a narrowband noise centered at 410 Hz is plotted as a function of
signal frequency. The level of the narrowband noise was 40, 60, and 80 dB sound pressure level. (From Egan & Hake, 1950.)
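As a reading aid for the masker levels quoted in this caption, the following minimal sketch converts between dB sound pressure level and pressure in pascals. It assumes the conventional reference pressure of 20 micropascals; the function names are illustrative and are not taken from the chapter.

```python
import math

# Conventional reference pressure for dB SPL in air (20 micropascals).
P_REF = 20e-6

def spl_to_pressure(level_db):
    """Return the RMS pressure in pascals for a level given in dB SPL."""
    return P_REF * 10 ** (level_db / 20.0)

def pressure_to_spl(pressure_pa):
    """Return the level in dB SPL for an RMS pressure given in pascals."""
    return 20.0 * math.log10(pressure_pa / P_REF)

# The three masker levels named in the caption of figure 23.1.
masker_pressures = {level: spl_to_pressure(level) for level in (40, 60, 80)}
```

Each 20-dB step between the masker levels corresponds to a tenfold increase in sound pressure.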
Figure 23.4 A schematic illustration of the steps in processing the target and/or masker speech into sets of narrow bands. The various stages include (left to right): (1) gradual high-frequency emphasis, (2) filtering the speech into one-third-octave bands, (3) half-wave rectifying the filtered speech, (4) low-pass filtering of the rectified waveforms to extract the envelopes, and (5) multiplying the envelope functions with pure-tone carriers centered in each frequency band. (Adapted from Arbogast, 2003.)
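The five stages listed in this caption can be sketched for a single band as follows. This is only an illustration under stated assumptions: the caption does not give filter designs, so the first-order emphasis coefficient, the second-order band-pass section (with a Q chosen to approximate a one-third-octave bandwidth), the one-pole envelope smoother, and the 50-Hz envelope cutoff are all hypothetical choices, as are the function names.

```python
import math

def pre_emphasis(x, coeff=0.95):
    # Stage 1 (assumed form): y[n] = x[n] - coeff * x[n-1] boosts high frequencies.
    return [x[0]] + [x[n] - coeff * x[n - 1] for n in range(1, len(x))]

def bandpass(x, fs, f0, q):
    # Stage 2 (assumed design): second-order band-pass with unity gain at f0.
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b0, b2 = alpha / a0, -alpha / a0
    a1, a2 = -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0
    y, x1, x2, y1, y2 = [], 0.0, 0.0, 0.0, 0.0
    for xn in x:
        yn = b0 * xn + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1, y2, y1 = x1, xn, y1, yn
        y.append(yn)
    return y

def half_wave_rectify(x):
    # Stage 3: keep positive samples, set negative samples to zero.
    return [v if v > 0.0 else 0.0 for v in x]

def lowpass(x, fs, cutoff):
    # Stage 4 (assumed design): one-pole smoother that extracts the envelope.
    a = math.exp(-2.0 * math.pi * cutoff / fs)
    y, prev = [], 0.0
    for v in x:
        prev = (1.0 - a) * v + a * prev
        y.append(prev)
    return y

def band_envelope_tone(x, fs, f0, env_cutoff=50.0):
    # Q that approximates a one-third-octave bandwidth around f0.
    q = 1.0 / (2 ** (1 / 6) - 2 ** (-1 / 6))
    band = bandpass(pre_emphasis(x), fs, f0, q)
    env = lowpass(half_wave_rectify(band), fs, env_cutoff)
    # Stage 5: multiply the envelope with a pure-tone carrier at the band center.
    return [e * math.sin(2.0 * math.pi * f0 * n / fs) for n, e in enumerate(env)]

def rms(x):
    return math.sqrt(sum(v * v for v in x) / len(x))

# Hypothetical input: a 1000-Hz tone, processed in a matching and a distant band.
fs = 8000
tone = [math.sin(2.0 * math.pi * 1000.0 * n / fs) for n in range(2000)]
on_band = band_envelope_tone(tone, fs, 1000.0)
off_band = band_envelope_tone(tone, fs, 2500.0)
```

With this input, the band centered on the tone carries most of the output energy, while the distant band carries very little. Repeating the per-band step over a set of band centers and summing the results would give the full multi-band stimulus.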
[Plot for figure 23.5; horizontal axis: Frequency (kHz).]
Figure 23.5 Magnitude spectra of target (black) and masker (gray) speech processed as shown in figure 23.4 into mutually exclusive
frequency bands.
target and masker were both played from the same loudspeaker, which was located directly in front of the listener (0° spatial separation). The lower of the two points in each case indicates a threshold that is obtained when the target
[Figure residue: legend entries 0° and 90°; vertical axis: Target to Masker Ratio (dB).]
abstract Many auditory skills improve with practice, indicating malleability of the underlying neural system. Here we consider the effect of training on the human perception of basic sound attributes such as frequency, intensity, and duration. We first compare learning patterns across multiple tasks, in each of which listeners discriminate changes in a different sound attribute. These patterns differ markedly for different tasks and sometimes even for different stimuli within the same task, in terms of both how performance changes over training sessions and how learning generalizes to untrained conditions. The differences suggest that training on different tasks affects different neural processes. We then describe in more detail sets of training experiments on auditory-timing and spatial-hearing skills and make inferences about the underlying neural processes affected by the training. Finally, we speculate about the neural underpinnings of auditory learning. This chapter thus illustrates that the examination of auditory learning can provide unique insights into the human perception and neural processing of sounds.

beverly a. wright and yuxuan zhang Department of Communication Sciences and Disorders and Interdepartmental Neuroscience Program, Northwestern University, Evanston, Illinois

A remarkable and often unrecognized characteristic of human perceptual abilities is that they can be improved with practice. Such perceptual learning indicates that the underlying neural processes are malleable. Investigations of the circumstances that yield this learning and of the patterns with which this learning occurs have both theoretical and practical value. On the theoretical side, this information provides insight into the architecture and plasticity of the neural processes that govern perceptual performance. On the practical side, it can guide the development of more effective and efficient perceptual training regimens to aid individuals with perceptual disorders as well as others who desire enhanced perceptual skills. To date, perceptual learning has been examined primarily in the visual system. Here, we instead describe select aspects of perceptual learning in the auditory system.

To help establish the principles of auditory learning, we and others have focused our investigations on basic auditory skills and simple training regimens. In such cases, during training, listeners are asked to discriminate between small variations in only one attribute of a relatively simple sound, such as to determine which of two tones has a higher frequency or a longer duration. Establishing the learning patterns under these circumstances forms a baseline for interpreting improvements on more complex tasks and with more complex training regimens. Suggesting that this is a reasonable approach, we have recently seen that phenomena we first observed in learning on simple auditory tasks occur on speech-perception tasks as well. Similar reasoning has also guided a large number of investigations in the visual system (Rust & Movshon, 2005).

In this chapter, we begin with a brief review of differences across different trained tasks in the pattern of performance improvement resulting from training and argue that these differences constitute evidence for the involvement of different neural processes in these different cases. We then consider in more detail the learning patterns on auditory temporal and spatial tasks to illustrate how these patterns can be used to make inferences about the particular neural processes that were modified through training. We conclude with a brief discussion of the neural bases of auditory learning itself.

Evidence that different neural processes contribute to learning on different auditory tasks

The behavioral evidence that auditory learning involves different neural processes on different tasks arises primarily from examination of three aspects of the learning patterns: learning on the trained condition, across-task generalization, and across-stimulus generalization. This evidence rests on the basic assumption that the pattern of learning and generalization is determined by the particular neural circuitry that is being modified as well as by the particular type of modification that is occurring (e.g., Hochstein & Ahissar,
wright and zhang: human auditory processing and perceptual learning 353
2002; Karni & Sagi, 1991). Given this assumption, differ-
ences in learning and generalization patterns across different
tasks suggest that training on different tasks induced the
same modifications in different neural circuitry, different
modifications in the same circuitry, or different modifica-
tions in different circuitry. We typically cannot distinguish
among these three types of differences at the behavioral
level. Therefore we use the phrase different neural processes to
refer to all three possibilities.
[Figure residue: panel A with curves labeled Frequency, ILD tone, and Interval; vertical-axis ticks 0.5 to 1.0, horizontal-axis ticks 2 to 10.]

frequency discrimination across multiple daily training sessions showed no systematic improvement in performance within each session (Wright & Sabin, 2007). That is, the improvement appeared to occur between, rather than within, sessions. However, we have seen different within-session patterns for other tasks. For example, although listeners improved across sessions on the detection of a brief tone that was presented immediately before a masking noise (backward masking), their performance tended to worsen from the beginning to the end of each training session (unpublished data). That the within-session performance patterns differ across tasks, despite across-session improvements in all cases, is consistent with the idea that there are at least two distinct stages of perceptual learning. One, acquisition, is the period during which the task is actually practiced. The other, consolidation, is the period during which performance sta-
[Figure 24.3 panels: (A) Frequency Discrimination and (C) Temporal-Interval Discrimination schematics showing Standard and Comparison stimuli, with Frequency plotted against Time; (B, D) Adjusted Learning Curve Slope plotted against Number of Trials per Day (360, 900), with p < 0.01 marked in B and n.s. in D.]
Figure 24.3 Different amounts of daily training required for learning across multiple sessions on frequency (A, B) and temporal-interval (C, D) discrimination. (A) Schematic diagram of the frequency discrimination task. Listeners discriminated between standard (left) and comparison (right) stimuli that differed from each other only in frequency. (B) Rate of across-session improvement on frequency discrimination indicated by the slopes of regression lines fitted, for each listener (symbols), to the daily thresholds versus the log of the training session number. Individual differences were taken into account by adjusting the slopes based on pretraining thresholds (ANCOVA). The box plots indicate the median and quartile values. The slopes did not differ significantly from zero for listeners who practiced 360 trials per day for six days (open triangles; n = 7), indicating no improvement across training sessions. In contrast, the slopes of listeners who practiced 900 trials per day (solid squares; n = 8) differed significantly from zero and were negative (p < 0.01), indicating across-session improvement. (C, D) Same as A and B but for the temporal-interval discrimination task. For this task, the slopes were significantly different from zero and were negative, regardless of whether the listeners practiced 360 (open triangles; n = 6, p < 0.001) or 900 (solid squares; n = 6, p < 0.0001) trials per day for six days, indicating improvement across training sessions in both cases. (Figure adapted from Wright & Sabin, 2007.)
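The slope measure described in this caption, a regression line fitted to one listener's daily thresholds against the log of the session number, can be sketched as below. This is a minimal illustration rather than the authors' analysis: the base-10 logarithm is an assumption (the caption says only "log"), the ANCOVA adjustment for pretraining thresholds is not reproduced, and the threshold values are invented for demonstration.

```python
import math

def learning_curve_slope(daily_thresholds):
    """Least-squares slope of threshold versus log10(session number).

    Sessions are numbered from 1. A negative slope means that thresholds
    fell (performance improved) across training sessions.
    """
    xs = [math.log10(s) for s in range(1, len(daily_thresholds) + 1)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(daily_thresholds) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, daily_thresholds))
    return sxy / sxx

# Hypothetical listeners over six daily sessions (values are made up).
improving = [10.0 - 3.0 * math.log10(s) for s in range(1, 7)]
flat = [5.0] * 6

slope_improving = learning_curve_slope(improving)  # close to -3
slope_flat = learning_curve_slope(flat)            # close to 0
```

In the figure, the question for each group is whether such slopes differ reliably from zero; a per-listener slope near zero corresponds to no across-session improvement.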
different trained tasks is that training on one task rarely leads to performance improvements on other tasks (Wright & Zhang, 2009). The specific assumption here is that learning generalizes from a trained to an untrained task if and only if the practice on the trained task modifies neural processes that also govern performance on the untrained one. Therefore a failure to generalize across tasks suggests that different neural processes are modified through training on those tasks.

There are a number of examples of a lack of across-task generalization on basic auditory skills. Following multiple-session training, learning did not generalize in either direction between frequency and amplitude-modulation rate discrimination (figure 24.4) (Fitzgerald & Wright, 2005; Grimault, Micheyl, Carlyon, Bacon, & Collet, 2003), asynchrony detection and order discrimination at sound onset (Mossbridge, Fitzgerald, O’Connor, & Wright, 2006), or frequency and temporal-interval discrimination (unpublished data). It also did not generalize, in the one direction that was tested, from interaural-level-difference to interaural-time-difference discrimination (Wright & Fitzgerald, 2001), from amplitude-modulation rate to rippled-noise (figure 24.4) (Fitzgerald & Wright, 2005) or temporal-interval (van Wassenhove & Nagarajan, 2007) discrimination, or from amplitude-modulation rate discrimination to amplitude-modulation detection (figure 24.4) (Fitzgerald & Wright, 2005). The lack of across-task generalization has also been observed following a single session of training. For example, a brief period of training on sound-intensity or visual-contrast discrimination did not lead to better performance on frequency discrimination, though the same period of training on frequency discrimination did (Hawkey, Amitay,
et al., 2005; Delhommeau et al., 2005; Demany & Semal,
2002; Irvine et al., 2000) and to an untrained standard inte-
raural-level-difference value for interaural-level-difference
discrimination (Wright & Fitzgerald, 2001) but not to
untrained temporal intervals for temporal-interval discrimi-
nation (Karmarkar & Buonomano, 2003; Wright et al.,
1997). The differences in across-stimulus generalization
pattern for different trained tasks have been used to make
inferences about the tuning characteristics of the neural cir-
cuitry that was modified by training (see below).
[Figure 24.6 panels: A (stimulus schematics); B and D (asynchrony detection); C and E (temporal-order discrimination).]
Figure 24.6 Learning and generalization on auditory asynchrony detection (B, D) and temporal-order discrimination (C, E). (A) Schematic diagrams of the signal and standard stimuli used in the four relative-timing conditions. Each stimulus consisted of two tones. The duration of the higher-frequency tone was fixed at 500 ms. The frequencies of the tones depended on the condition parameters. (B–E) The mean threshold values on a set of relative-timing conditions tested before and after six to eight 720-trial daily training sessions on a single condition: asynchrony-detection (B) or temporal-order-discrimination (C) at sound onset or asynchrony-detection (D) or temporal-order-discrimination (E) at sound offset. In all four cases, trained listeners (squares) improved significantly more than controls (triangles) between the pretraining (open symbols) and posttraining (filled symbols) tests on the trained condition (left column). However, the generalization pattern differed across the trained conditions. In three cases (B, C, E), the learning attributable to the multiple-hour training was specific to the trained condition. In the fourth case (D), the learning generalized to all conditions tested with the trained frequency pair and therefore spread more broadly than learning for the other trained conditions did. The parameters for each condition are marked on the abscissas (n = 6–18 for each group in each condition). Boxes indicate conditions on which trained listeners learned significantly more than controls did (p < 0.05). (Figure adapted from Mossbridge, Fitzgerald, O’Connor, & Wright, 2006; Mossbridge, Scissors, & Wright, 2008.)
the SAM tone (Zhang & Wright, in review). There were three such differences. First, the training-induced learning on ILD discrimination with the SAM tone generalized to untrained SAM tones with different carrier frequencies and modulation rates but not to pure tones, even when those tones had the same frequency as the trained carrier or modulation rate. Thus within the trained stimulus type, learning for the SAM tone generalized across frequency, while that for the pure tone did not. Second, the amount of learning could be predicted on the basis of the starting thresholds for ILD learning with the pure tone but not with the SAM tone. Third, the learning curve was more linear for the SAM tone than for the pure tone. These differences suggest that training affects differentially the processing of ILDs in amplitude-modulated stimuli and pure tones, even at the same frequency.

Finally, we investigated the extent to which the rapid improvement that we observed on ITD discrimination results from learning of the trained stimulus, the lateralization task, or other factors that are collectively classified as the procedure (Ortiz & Wright, 2009). Toward this end, we trained three groups of listeners for a single session, each on a different condition, and tested all of them the next day on a target ITD-discrimination condition. The three trained conditions shared different elements with the target ITD condition, forming a hierarchy of similarity. One group of listeners was trained on a temporal-interval discrimination condition that shared with the target condition only the general, procedural aspects. These listeners had lower thresholds on the target ITD condition than naïve listeners did, suggesting procedure learning. Another group of listeners was trained on an ILD-discrimination condition that shared with the target condition both the procedure and the lateralization task but not the stimulus. The ITD-discrimination thresholds of the ILD-trained listeners were similar to those of the interval-trained listeners, implying that there was little additional improvement that was attributable to task learning. The third group was trained on the target ITD condition itself. These listeners had lower ITD thresholds than the ILD-trained listeners did, suggesting that the additional improvement resulted from stimulus learning. Thus rapid improvements on ITD discrimination appear to result primarily from learning of the procedure and the stimulus, implying that a single session of training can affect at least two types of neural processes.

Neural underpinnings of perceptual learning

Up to this point, we have documented the large variation in learning patterns across auditory tasks, argued that this variation suggests that different neural processes are involved in learning on these tasks, and illustrated how these learning patterns can be used to make inferences about the affected processes. Here we speculate about the actual neural underpinnings of auditory learning on the basis of behavioral and physiological data from the auditory as well as other sensory systems.

We propose that in most cases, for auditory learning to occur on a given condition, a neural process that limits the performance on that condition has to be selected and placed in a modification-prone state (sensitized). Sufficient stimulation of the sensitized process results in modifications that lead to behavioral improvement. We further suggest that the selection and sensitization of the targeted process occurs through top-down influences such as attention or reward. These influences are typically and optimally provided by performance of the target condition rather than simply through the bottom-up stimulation received from stimulus exposures. A role for top-down influences in perceptual learning has been proposed previously for visual learning (Ahissar & Hochstein, 2004; Gilbert & Sigman, 2007; Seitz & Watanabe, 2005). The primary behavioral evidence for this involvement, both here and in other sensory systems, comes from the observations that learning on one task rarely generalizes to other tasks, even when the same stimuli are employed, and from the different learning and generalization patterns for different tasks performed with the same stimuli. The idea that sufficient stimulation of the sensitized process is required to achieve learning comes in part from the demonstration that improvement across days on an auditory task requires a sufficient amount of training per day (Wright & Sabin, 2007). It also echoes a recent proposal, arising from a literature review, that a “learning threshold” must be surpassed, through any of a variety of means, for improvement to occur (Seitz & Dinse, 2007). Note that this proposed requirement for learning provides one means for preserving the necessary balance between stability and plasticity in the nervous system.

We also suggest that the processes that are selected and sensitized during auditory training differ across tasks and can shift over the course of training. These ideas are supported by evidence from neurophysiology and imaging that the neural changes that accompany perceptual learning occur at multiple stages of the nervous system, including primary sensory cortices (Clapp, Kirk, Hamm, Shepherd, & Teyler, 2005; Furmanski, Schluppeck, & Engel, 2004; Li, Piech, & Gilbert, 2008; Pourtois, Rauss, Vuilleumier, & Schwartz, 2008) as well as associative (Law & Gold, 2008) and frontal (Krigolson, Pierce, Holroyd, & Tanaka, 2008) cortices, particularly those involved in attention (Mukai et al., 2007). There are also reports of global reorganization spanning multiple stages of processing (Schiltz, Bodart, Michel, & Crommelinck, 2001; Sigman et al., 2005; Vaina, Belliveau, des Roziers, & Zeffiro, 1998; van Wassenhove & Nagarajan, 2007). These ideas receive further support from evidence that different sites are affected at different time
Fiorentini, A., & Berardi, N. (1980). Perceptual learning specific for orientation and spatial frequency. Nature, 287(5777), 43–44.
Fitzgerald, M. B., & Wright, B. A. (2005). A perceptual learning investigation of the pitch elicited by amplitude-modulated noise. J. Acoust. Soc. Am., 118(6), 3794–3803.
Furmanski, C. S., Schluppeck, D., & Engel, S. A. (2004). Learning strengthens the response of primary visual cortex to simple patterns. Curr. Biol., 14(7), 573–578.
Gilbert, C. D., & Sigman, M. (2007). Brain states: Top-down influences in sensory processing. Neuron, 54(5), 677–696.
Gold, J., Bennett, P. J., & Sekuler, A. B. (1999). Signal but not noise changes with perceptual learning. Nature, 402(6758), 176–178.
Gottselig, J. M., Brandeis, D., Hofer-Tinguely, G., Borbely, A. A., & Achermann, P. (2004). Human central auditory plasticity associated with tone sequence learning. Learn. Mem., 11(2), 162–171.
Grimault, N., Micheyl, C., Carlyon, R. P., Bacon, S. P., & Collet, L. (2003). Learning in discrimination of frequency or modulation rate: Generalization to fundamental frequency discrimination. Hear. Res., 184(1–2), 41–50.
Grimault, N., Micheyl, C., Carlyon, R. P., & Collet, L. (2002). Evidence for two pitch encoding mechanisms using a selective auditory training paradigm. Percept. Psychophys., 64(2), 189–197.
Hawkey, D. J., Amitay, S., & Moore, D. R. (2004). Early and rapid perceptual learning. Nat. Neurosci., 7(10), 1055–1056.
Henning, G. B., & Ashton, J. (1981). The effect of carrier and modulation frequency on lateralization based on interaural phase and interaural group delay. Hear. Res., 4(2), 185–194.
Hochstein, S., & Ahissar, M. (2002). View from the top: Hierarchies and reverse hierarchies in the visual system. Neuron, 36(5), 791–804.
Irvine, D. R., Martin, R. L., Klimkeit, E., & Smith, R. (2000). Specificity of perceptual learning in a frequency discrimination task. J. Acoust. Soc. Am., 108(6), 2964–2968.
Karmarkar, U. R., & Buonomano, D. V. (2003). Temporal specificity of perceptual learning in an auditory discrimination task. Learn. Mem., 10(2), 141–147.
Karni, A., Meyer, G., Rey-Hipolito, C., Jezzard, P., Adams, M. M., Turner, R., et al. (1998). The acquisition of skilled motor performance: Fast and slow experience-driven changes in primary motor cortex. Proc. Natl. Acad. Sci. USA, 95(3), 861–868.
Karni, A., & Sagi, D. (1991). Where practice makes perfect in texture discrimination: Evidence for primary visual cortex plasticity. Proc. Natl. Acad. Sci. USA, 88(11), 4966–4970.
Krigolson, O. E., Pierce, L. J., Holroyd, C. B., & Tanaka, J. W. (2008). Learning to become an expert: Reinforcement learning and the acquisition of perceptual expertise. J. Cogn. Neurosci. [Epub ahead of print. doi: 10.1162/jocn.2009.21128.]
Law, C. T., & Gold, J. I. (2008). Neural correlates of perceptual learning in a sensory-motor, but not a sensory, cortical area. Nat. Neurosci., 11(4), 505–513.
Leek, M. R., & Watson, C. S. (1984). Learning to detect auditory pattern components. J. Acoust. Soc. Am., 76(4), 1037–1044.
Leek, M. R., & Watson, C. S. (1988). Auditory perceptual learning of tonal patterns. Percept. Psychophys., 43(4), 389–394.
Li, W., Piech, V., & Gilbert, C. D. (2008). Learning to link visual contours. Neuron, 57(3), 442–451.
Lu, Z. L., Chu, W., Dosher, B. A., & Lee, S. (2005). Independent perceptual learning in monocular and binocular motion systems. Proc. Natl. Acad. Sci. USA, 102(15), 5624–5629.
Meegan, D. V., Aslin, R. N., & Jacobs, R. A. (2000). Motor timing learned without motor training. Nat. Neurosci., 3(9), 860–862.
Mollon, J. D., & Danilova, M. V. (1996). Three remarks on perceptual learning. Spatial Vis., 10(1), 51–58.
Mossbridge, J. A., Fitzgerald, M. B., O’Connor, E. S., & Wright, B. A. (2006). Perceptual-learning evidence for separate processing of asynchrony and order tasks. J. Neurosci., 26(49), 12708–12716.
Mossbridge, J. A., Scissors, B. N., & Wright, B. A. (2008). Learning and generalization on asynchrony and order tasks at sound offset: Implications for underlying neural circuitry. Learn. Mem., 15(1), 13–20.
Mukai, I., Kim, D., Fukunaga, M., Japee, S., Marrett, S., & Ungerleider, L. G. (2007). Activations in visual and attention-related areas predict and correlate with the degree of perceptual learning. J. Neurosci., 27(42), 11401–11411.
Nagarajan, S. S., Blake, D. T., Wright, B. A., Byl, N., & Merzenich, M. M. (1998). Practice-related improvements in somatosensory interval discrimination are temporally specific but generalize across skin location, hemisphere, and modality. J. Neurosci., 18(4), 1559–1570.
Ortiz, J. A., & Wright, B. A. (2009). Contributions of procedure and stimulus learning to early, rapid perceptual improvements. J. Exp. Psychol. Hum. Percept. Perform., 35(1), 188–194.
Petersen, S. E., van Mier, H., Fiez, J. A., & Raichle, M. E. (1998). The effects of practice on the functional anatomy of task performance. Proc. Natl. Acad. Sci. USA, 95(3), 853–860.
Poggio, T., Fahle, M., & Edelman, S. (1992). Fast perceptual learning in visual hyperacuity. Science, 256(5059), 1018–1021.
Pourtois, G., Rauss, K. S., Vuilleumier, P., & Schwartz, S. (2008). Effects of perceptual learning on primary visual cortex activity in humans. Vision Res., 48(1), 55–62.
Recanzone, G. H., Merzenich, M. M., Jenkins, W. M., Grajski, K. A., & Dinse, H. R. (1992). Topographic reorganization of the hand representation in cortical area 3b owl monkeys trained in a frequency-discrimination task. J. Neurophysiol., 67(5), 1031–1056.
Recanzone, G. H., Schreiner, C. E., & Merzenich, M. M. (1993). Plasticity in the frequency representation of primary auditory cortex following discrimination training in adult owl monkeys. J. Neurosci., 13(1), 87–103.
Robinson, K., & Summerfield, A. Q. (1996). Adult auditory learning and training. Ear Hear., 17(3, Suppl.), 51S–65S.
Roth, D. A., Amir, O., Alaluf, L., Buchsenspanner, S., & Kishon-Rabin, L. (2003). The effect of training on frequency discrimination: Generalization to untrained frequencies and to the untrained ear. J. Basic Clin. Physiol. Pharmacol., 14(2), 137–150.
Rubin, N., Nakayama, K., & Shapley, R. (1997). Abrupt learning and retinal size specificity in illusory-contour perception. Curr. Biol., 7(7), 461–467.
Rust, N. C., & Movshon, J. A. (2005). In praise of artifice. Nat. Neurosci., 8(12), 1647–1650.
Schiltz, C., Bodart, J. M., Michel, C., & Crommelinck, M. (2001). A PET study of human skill learning: Changes in brain activity related to learning an orientation discrimination task. Cortex, 37(2), 243–265.
Schoups, A., Vogels, R., Qian, N., & Orban, G. (2001). Practising orientation identification improves orientation coding in V1 neurons. Nature, 412(6846), 549–553.
Seitz, A. R., & Dinse, H. R. (2007). A common framework for perceptual learning. Curr. Opin. Neurobiol., 17(2), 148–153.
25 Auditory Object Analysis
timothy d. griffiths, sukhbinder kumar, katharina von kriegstein,
tobias overath, klaas e. stephan, and karl j. friston
abstract The question addressed in this chapter is how the auditory system allows us to represent the elements of the acoustic world. The term auditory object is widely used in the literature but in a number of different ways. We consider different aspects of object analysis and the ways in which these can be approached by using experimental techniques such as functional imaging. Functional imaging allows us to map networks for the abstraction of perceived objects and generalization across objects. This fundamental aspect of auditory perception involves high-level cortical mechanisms in the lateral temporal lobe. Systems identification techniques based on Bayesian model selection in individual subjects allow the testing of specific models that explain the activity of the networks that are mapped.

timothy d. griffiths and sukhbinder kumar Institute of Neuroscience, Newcastle University, Newcastle upon Tyne; Wellcome Centre for Imaging Neuroscience, University College, London, United Kingdom
katharina von kriegstein, tobias overath, klaas e. stephan, and karl j. friston Wellcome Centre for Imaging Neuroscience, University College, London, United Kingdom

The concept of auditory object

In the acoustic world, we experience a number of different things that form the natural sound scene. The problem considered here is how the brain abstracts representations of these things, or objects, as a basis for perception. The computation required for this process is formidable, given the richness of our sound experience that is entirely based on two pressure waveforms arriving at the ears. The problem is a key issue for what has become known as auditory scene analysis (Bregman, 1990).

In contrast to the concept of visual objects, the concept of auditory object is controversial for a number of reasons (Griffiths & Warren, 2004). At the level of the stimulus, it is more difficult to examine the sound pressure waveform that enters the cochlea and “see” different objects in the same way that we “see” objects in the visual input to the retina. However, in the auditory system and in the visual system, objects can be understood in terms of the images they produce during the processing of sense data. The idea that objects are mental events that result from the creation of images from sense data goes back to Kant and Berkeley (Russell, 1945). In the visual system, there is good evidence for the creation of images (brain representations corresponding to an object) with two or more spatial dimensions in the form of arrays of neural activity that preserve spatial relationships from the retina to the cortex. In the auditory system, the concept of an image is most often used to refer to a brain representation with dimensions of frequency and time or derivations of these such as spectral ripple density related to frequency (Chi, Ru, & Shamma, 2005) and amplitude modulation (Chi et al., 2005) or forms of autocorrelation (Patterson, 2000) related to time. If we accept the existence of images with a temporal dimension, then the concepts of auditory objects and auditory images can be considered in a way comparable to how the visual system is considered. The idea was first proposed by Kubovy and Van Valkenburg (2001), who suggested that auditory objects can be considered as existence regions within frequency-time space that have borders with the rest of the sound scene.

A second issue about the concept of auditory object analysis (which is also relevant to visual object analysis) is the cognitive level to which it should be extended. Consider the situation in which you hear someone making the vowel sound /a/ at a pitch of 110 Hz and intensity of 75 dB on the left side of the room. That situation requires sensory analysis of the spectrotemporal structure of the sound. It also requires categorical perception to allow the sound to be distinguished from other sounds. Sounds from which it has to be distinguished might be from another class (e.g., a telephone ringing at the same pitch, intensity, and location) or the same class (e.g., another person making the vowel sound /a/ at a different pitch, intensity, or spatial location). We can appreciate that we are listening to the same type of sound if we hear it at 80 Hz or 65 dB or on the right side of the room. We can appreciate that similarity even if we do not speak a relevant language to allow us to recognize or name the vowel. At another level of analysis, the sound must enter a form of echoic memory store (to allow comparison with sounds that might immediately follow it) and might enter an anterograde memory store that allows comparison with sounds heard over days or weeks. At a further level of analysis, we might call the sound a voice, or my voice, the vowel “a,” or (if we have absolute pitch) “A2.” The term object analysis might therefore be applied to (1) the perception of a coherent whole, the essence of
Figure 25.6 Activation due to passive listening to changing resonator scale and sound class in the three types of harmonic sounds shown
in figure 25.2 (von Kriegstein et al., 2007). (See color plate 29.)
Figure 25.7 Activation due to passive listening to changing spectral envelope (Warren et al., 2005). HG, Heschl’s gyrus; PT, planum
temporale; PP, planum polare; STS, superior temporal sulcus. (See color plate 30.)
poral plane (in red) corresponding to whether or not the sounds were associated with pitch. That mapping occurs in lateral Heschl’s gyrus (HG) in a region previously demonstrated to increase activity as a function of pitch salience (Patterson, Uppenkamp, Johnsrude, & Griffiths, 2002; Penagos, Melcher, & Oxenham, 2004). The key contrast in figure 25.7 (in blue) is between changing spectral envelope and fixed spectral envelope in series of objects with continuously varying fine-spectral structure (shown in figure 25.3): an argument can be made that this contrast identifies areas involved in the “abstraction” of spectral envelope relevant to object analysis over and above the analysis of the fine-spectral structure. The contrast shows bilateral activation in the superior temporal plane in the planum temporale (PT), posterior to the pitch mechanisms, and predominantly right-lateralised activation in the STS.
These studies all highlight a critical role in object analysis for temporal lobe areas beyond the primary and secondary cortices in HG in the superior temporal plane. The areas are likely to be involved in the abstraction of object properties beyond the representation of spectrotemporal structure. We consider later how the responsible system for spectral envelope analysis might be determined explicitly by using dynamic causal models of functional auditory architectures.
Univariate analysis of fMRI data: Sequences of objects
Figure 25.8 shows an experiment in which the encoding of sequences of objects was assessed: specifically, the encoding of the fractal-pitch sequences similar to the examples in figure 25.4. The information content of a pitch series was systematically varied by changing the exponent, n, determining a power spectrum with the form $f^{-n}$ from which the pitch series was derived. The experiment was carried out as an explicit search for mechanisms for the encoding of auditory sequences. It was predicted that computationally efficient encoding mechanisms should use less computational resource (measured indirectly by using the BOLD response) for more
redundant sequences containing less information. Such a relationship was demonstrated in two experiments in the PT, bilaterally, but not in the primary and secondary auditory cortices in HG. The work is consistent with the suggestion (Griffiths & Warren, 2002) that the PT represents a “computational hub” responsible for the encoding of acoustic stimuli, and suggests overlapping substrates for the abstraction of object features as in figure 25.7 and the encoding of sequences of objects. In contradistinction, figure 25.9 shows a contrast to demonstrate areas involved in the retrieval of auditory sequences in the second experiment during a one-back task where subjects were required to compare successive pitch sequences. The contrast demonstrates bilateral frontal activity including activity in the frontal operculum which in the right hemisphere is similar to that occurring during working memory tasks for melodic pitch sequences (Zatorre, Evans, & Meyer, 1994). Unlike encoding, the activity associated with retrieval was not affected by the information content of the stimulus. This can be interpreted in terms of the retrieval process requiring a symbolic level of processing that is not yoked to the complexity of the acoustic stimulus in the same way as encoding.
Multivariate analysis of fMRI data
There has been considerable interest in techniques to demonstrate different spatial distributions of BOLD activity in response to sensory stimulation, which can be achieved by the use of multivariate statistical methods. For a description of this approach to visual data, see Haynes and Rees (2006). The technique has the potential resolution to allow fMRI characterization of different responses within the same cortical areas that correspond to the perception of different individual auditory objects. The interpretation of such mappings would be subject to the same issues discussed above in terms of whether spectrotemporal structure or a correlate of the perceived object is represented.
Analysis of categorical processing using fMRI
Categorical response to changes in objects can be assessed by using the technique of repetition suppression that has been developed for the analysis of visual fMRI data (Grill-Spector, Henson, & Martin, 2006). Previous work suggested categorical mechanisms for visual representation based on
BOLD responses to exemplars from the same category that decrease with repeated presentation, regardless of other (category-independent) stimulus changes. The technique allows categorical mechanisms to be sought even when different neuronal ensembles tuned to different categories are located in the same region. Recent visual neurophysiological work (Sawamura, Orban, & Vogels, 2006) demonstrates correlates of the phenomenon at the single-unit level. Models that might explain the phenomenon at the neuronal ensemble level are developed in Grill-Spector et al. (2006). A recent study applied a related approach to mapping of an auditory continuum between two phonemes (Raizada & Poldrack, 2007). That study demonstrated responses that changed across phoneme boundaries in areas beyond the temporal lobe, but the technique could also be applied to shifts between objects at a presemantic level that might be analyzed in temporal lobe areas.
Effective connectivity analysis of fMRI data
The conventional analyses considered in figures 25.5 to 25.9 demonstrate considerable overlap in the networks of activity that are involved in the analysis of objects assessed using different types of stimulus manipulation and in the analysis of sequences of objects. In particular, a key role for the PT is demonstrated in these studies, consistent with the idea that this is an important “computational hub” concerned with auditory encoding. The term hub implies connection to other nodes of analysis and a flow of information: There is a need for the identification of specific systems for object analysis that might use similar nodes in different ways. Specifically, different aspects of object analysis might be subserved by different patterns of connectivity between nodes. In this section, we consider the application of this approach to one aspect of object analysis, spectral envelope analysis, addressing the question of how PT and the other nodes within the right-hemisphere network for spectral envelope analysis are effectively connected.
We use an approach called dynamic causal modeling (DCM) (Friston, Harrison, & Penny, 2003), together with Bayesian model selection (Penny, Stephan, Mechelli, & Friston, 2004), to test different models for auditory object analysis. The approach identifies effective connectivity between areas (the causal influence of activity in one area on the activity in another) and the modulatory effect of task (or any other experimentally controlled manipulation) on effective connectivity. DCM belongs to a family of models of effective connectivity such as structural equation
approaching 0.1 Hz, plausible models of dynamic neural interactions at the millisecond level can still be disambiguated, given the fMRI data. This is because the forward model predicts, given the known experimental inputs, what the BOLD signal should look like at any future time point, including the times when BOLD measurements were taken; the sampling frequency (repetition time) is irrelevant.
Like any model, DCM comprises variables (that may or may not be measurable) and parameters that are estimated from the measurements. The model that is used in DCM has three types of variables: input variables (the same as those used in conventional analyses based on the general linear model, or GLM), encoding the experimental manipulation; output variables that are the regional hemodynamic responses from each of the regions considered in the model; and state variables. State variables describe the “hidden” (unobserved) states of the system and represent the neural activity and biophysical variables (e.g., blood flow) that transform neural activity into a hemodynamic response. DCM uses three different sets of parameters: endogenous parameters that model the baseline connection strengths between the regions in the absence of any external excitation
where $z$ is the state vector (with one state variable per region), $t$ is continuous time, and $u_j$ is the $j$th input (i.e., some experimentally controlled manipulation). This state equation represents the strength of connections between the modeled regions (the endogenous $A$ matrix), the modulation of these connections as a function of experimental manipulations (e.g., changes in task; the modulatory or bilinear $B^{(1)} \ldots B^{(m)}$ matrices), and the strengths of direct inputs (e.g., sensory stimuli, the exogenous $C$ matrix). These parameters correspond to the rate constants of the modeled neurophysiological processes. Combining the neural and hemodynamic model into a joint forward model, DCM uses a Bayesian estimation scheme to determine the posterior density of the parameters. Under Gaussian assumptions, this density can be characterized in terms of its maximum a posteriori estimate and its posterior covariance. The parameters of the neural and hemodynamic model are fitted such that the modeled BOLD signals are as similar as possible to the observed BOLD responses. This allows one to understand and make statistical inferences about regional BOLD responses in terms of the connectivity at the underlying neural level.
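In the standard DCM formulation (Friston, Harrison, & Penny, 2003), the state equation described here has the bilinear form $\dot{z} = \left(A + \sum_{j=1}^{m} u_j B^{(j)}\right) z + C u$. The following is a minimal sketch of that neural state equation only, integrated with a simple Euler scheme for constant inputs. It is not the estimation scheme used in DCM and it omits the hemodynamic model entirely. The function name `simulate_bilinear`, the two-region HG-to-PT layout, and all parameter values are hypothetical choices made for illustration.

```python
def simulate_bilinear(A, B, C, inputs, z0, dt=0.01, steps=2000):
    """Euler-integrate dz/dt = (A + sum_j u_j * B[j]) z + C u for constant inputs u.

    A: n x n endogenous connections, B: list of m matrices (n x n), one per input,
    C: n x m direct input weights. Entry [r][c] is the influence of region c on r.
    """
    n, m = len(z0), len(inputs)
    # Effective coupling under the current inputs: A plus the input-weighted B matrices.
    J = [[A[r][c] + sum(inputs[j] * B[j][r][c] for j in range(m)) for c in range(n)]
         for r in range(n)]
    # Direct (exogenous) drive to each region.
    drive = [sum(C[r][j] * inputs[j] for j in range(m)) for r in range(n)]
    z = list(z0)
    for _ in range(steps):
        dz = [sum(J[r][c] * z[c] for c in range(n)) + drive[r] for r in range(n)]
        z = [z[r] + dt * dz[r] for r in range(n)]
    return z


# Hypothetical two-region model: region 0 ("HG") projects to region 1 ("PT").
A = [[-1.0, 0.0],
     [0.5, -1.0]]                      # self-decay plus a forward HG -> PT connection
B = [[[0.0, 0.0], [0.0, 0.0]],         # input 0 (sound) modulates nothing
     [[0.0, 0.0], [0.5, 0.0]]]         # input 1 (context) strengthens HG -> PT
C = [[1.0, 0.0],
     [0.0, 0.0]]                       # only the sound input drives HG directly

baseline = simulate_bilinear(A, B, C, inputs=[1.0, 0.0], z0=[0.0, 0.0])
modulated = simulate_bilinear(A, B, C, inputs=[1.0, 1.0], z0=[0.0, 0.0])
print([round(v, 3) for v in baseline], [round(v, 3) for v in modulated])
```

With these illustrative values the PT state settles near 0.5 when only the driving input is on and near 1.0 when the modulatory input is also on, which is the sense in which a $B$ matrix expresses a context-dependent change in coupling.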
Figure 25.10 Serial and parallel models for spectral envelope analysis in the right hemisphere. The triangle in the pathway between two regions indicates the modulatory effect of extraction of spectral envelope (Kumar et al., 2007). HG, Heschl’s gyrus; PT, planum temporale; STS, superior temporal sulcus.
Abstract Visual experience is initiated by photons captured in the photoreceptors. The arrangement of these photoreceptors, their topography, limits how we see and sets the stage for how the circuitry in the retina and brain operates. Four different classes of cells are interleaved in the photoreceptor layer of the retina: the rods and three spectral subtypes of cone that form the basis for trichromatic color vision, the long-, middle-, and short-wavelength-sensitive cones (L, M, and S, respectively). A great deal is understood about how the presence of three spectral cone types limits color perception; however, until recently, considerably less had been known about the spatial arrangement of these cone types and how their topography influenced visual experience. Recently, new methods have been developed that enable us to measure the spatial arrangement of cone photoreceptors in the living human eye. These new measurements have produced surprising results that answer some questions about color appearance but raise many others. In this chapter, we review the current understanding of the cone mosaic in normal and defective color vision, emphasizing recent results derived from adaptive optics retinal imaging.
Joseph Carroll, Department of Ophthalmology, Medical College of Wisconsin, Milwaukee, Wisconsin
Geunyoung Yoon and David R. Williams, Center for Visual Science, University of Rochester, Rochester, New York
Photoreceptor mosaic in normal color vision
A wealth of histological data are available describing the overall topography of the human photoreceptor mosaic. The most comprehensive data come from Curcio, Sloan, Kalina, and Hendrickson (1990), who showed that while there are gross topographical features of the mosaic that are common across different retinas, there is also considerable variability. For example, as is shown in figure 26.1, the relative rod:cone density varies dramatically across the retina, and this general feature is well preserved in all human retinas that have been studied to date. However, Curcio and colleagues (1990) found that the peak foveal cone density varied by at least a factor of 3, though since the data were obtained on postmortem tissue, it was not possible to determine whether such differences had any practical impact on visual function. As we move outward from the center of the fovea (where peak cone density occurs), cone density falls off precipitously until the ora serrata, where there is a significant elevation in cone density (R. W. Williams, 1991).
Using adaptive optics to image the foveal cone mosaic, Putnam and colleagues (2005) observed that the location of peak cone density does not correspond to the preferred retinal locus of fixation; thus there remains ambiguity about the difference between the anatomical fovea and the “functional” fovea. Shown in figure 26.2 are data from three subjects, showing the location of peak cone density with respect to the retinal locus of fixation on individual psychophysical trials. While fixation is in general very accurate, as is shown by the tight cluster of fixation points, there is a systematic deviation from the location of peak cone density. If visual acuity is reciprocally related to cone spacing near the fovea (cf. Green, 1970; Marcos & Navarro, 1997), acuity would have declined by an average of 8% for the subjects in figure 26.2 at the center of fixation compared with the anatomic center of the fovea. This is a relatively small loss in acuity that would be difficult to measure, owing to blurring by the eye’s optics, which reduces foveal visual acuity below the cone Nyquist frequency (Marcos & Navarro, 1997). Recent work using adaptive optics to image the cone mosaic of individuals with red-green color vision defects reveals severely disrupted mosaics but normal visual acuity measured with a letter target (Carroll, Neitz, Hofer, Neitz, & Williams, 2004). This further illustrates the insensitivity of standard acuity measures and advocates using interference fringe stimuli that are immune to optical blur to probe the absolute relationship between the cone mosaic and visual acuity.
The human foveal cone mosaic is an efficiently packed mosaic, with the locations of cone centers forming a triangular array. Interleaved within the overall cone mosaic are the three spectral cone submosaics (short-, middle-, and long-wavelength-sensitive; S, M, and L). Since only one spectral type of cone occupies any given location within the cone mosaic, there is an apparent confound as to how the
carroll, yoon, and williams: cone photoreceptor mosaic in normal and defective color vision 383
Figure 26.1 Nonuniform distribution of rods and cones in the
human retina. Plot of photoreceptor density as a function of retinal
eccentricity. The top panels show ex vivo images of the photorecep-
tor mosaic from Curcio, Sloan, Kalina, and Hendrickson (1990).
The leftmost image is from the all-cone fovea; the remaining panels
contain both rods (smaller cells) and cones (larger cells). While rod
density increases dramatically in the peripheral retina, rod diame-
ter remains relatively constant (about 2 μm). Conversely, the cone
photoreceptors increase from about 2 μm in diameter at the fovea
to about 8 μm at about 10 degrees eccentricity, after which point
they remain relatively constant (Samy & Hirsch, 1989). (Modified
from Webvision (http://webvision.med.utah.edu), with permis-
sion.) (See color plate 33.)
visual system is able to reliably extract color information at all spatial locations within an image. As such, the precise arrangement of these spectral subtypes has been of great interest, and we discuss the S, M, and L cone mosaics below.
Figure 26.2 The area of highest cone density is not always used for fixation. Shown are retinal montages of the foveal cone mosaic for three subjects. The black square represents the foveal center of each subject, as defined by the location of peak cone density. The dashed black line is the isodensity contour line representing a 5% increase in cone spacing, and the solid black line is the isodensity contour line representing a 15% increase in cone spacing. Red dots are individual fixation locations. Scale bar is 50 μm. (Reproduced from Putnam et al., 2005, with permission.) (See color plate 34.)
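The reciprocal relationship between acuity and cone spacing invoked for figure 26.2 can be made concrete with a short calculation. The sketch below assumes only that acuity scales as 1 / (cone spacing), as the text supposes. The Nyquist expression for a triangular lattice, 1 / (√3 × spacing), is the textbook sampling limit for rows of cones spaced (√3/2) × spacing apart and is included here as an assumption rather than as a formula taken from the chapter. The function names are hypothetical.

```python
import math


def relative_acuity_loss(spacing_increase_percent):
    """Percent drop in acuity if acuity is proportional to 1 / cone spacing."""
    ratio = 1.0 + spacing_increase_percent / 100.0
    return (1.0 - 1.0 / ratio) * 100.0


def spacing_increase_for_loss(acuity_loss_percent):
    """Inverse relation: percent increase in spacing that gives a stated acuity loss."""
    return (1.0 / (1.0 - acuity_loss_percent / 100.0) - 1.0) * 100.0


def triangular_nyquist(spacing):
    """Assumed Nyquist limit (cycles per unit length) of a triangular cone lattice
    with center-to-center spacing `spacing`."""
    return 1.0 / (math.sqrt(3.0) * spacing)


# The 5% and 15% isodensity contours of figure 26.2, expressed as acuity loss.
for pct in (5, 15):
    print(f"{pct}% wider spacing: {relative_acuity_loss(pct):.1f}% lower acuity")

# The 8% average acuity decline quoted in the text, expressed as a spacing change.
print(f"8% acuity loss: {spacing_increase_for_loss(8):.1f}% wider spacing")
```

Under this assumption, the quoted 8% average decline in acuity corresponds to cone spacing that is a little under 9% wider at the fixation locus than at the point of peak density, which falls between the two contours drawn in the figure.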
S Cone Mosaic: Structural Organization
The S cones can easily be distinguished from L and M cones by morphological and histochemical features (Ahnelt & Kolb, 2000; Cornish, Hendrickson, & Provis, 2004; de Monasterio, Schein, & McCrane, 1981; Szel, Diamantstein, & Rohlich, 1988). They are more cylindrical in shape (Curcio et al., 1991), have distinct neural circuitry (Mariani, 1984) and synaptic structure (Ahnelt & Kolb, 2000), and contain a photopigment that is distinct from that found in the L/M cones (Bowmaker & Dartnall, 1980; Nathans, Thomas, & Hogness, 1986). The S cones are relatively sparse throughout the human retina (averaging about 6–8% of the total cone population), with peak density (usually about 10% of the local cone number) occurring near 1-degree eccentricity (Curcio et al., 1991). An interesting feature of the S cone submosaic is that the very central fovea is lacking S cones (König, 1894; Willmer & Wright, 1945; D. R. Williams, MacLeod, & Hayhoe, 1981a, 1981b). The extent of this S cone free zone is about 20 minutes of arc in diameter, though the size and even existence of this area are variable across individuals. In humans, the S cone mosaic is randomly interleaved among the L/M mosaic near the fovea but becomes regularly arranged at more peripheral locations. A number of questions surrounding the S cone mosaic remain, such as why the arrangement of the mosaic is different in nonhuman primates, what the variability is across subjects, and what molecular mechanisms govern the nonuniform placement of S cones within the human retina.
S Cone Mosaic: Functional Consequences
It has long been believed that the S cone mosaic is sparser than the L and M cone mosaics because the retinal image quality that is available to the S cones is reduced by the eye’s chromatic aberration (Packer & Williams, 2003; D. R. Williams et al., 1981a, 1981b; Yellott, Wandell, & Cornsweet, 1984), although this interpretation is controversial. McLellan, Marcos, Prieto, and Burns (2002) concluded that when monochromatic aberrations are taken into account, chromatic aberration does not, in the average eye, degrade the retinal image quality of the S cones. However, there is theoretical and experimental evidence that the role of
of the M/L cones (Curcio et al., 1991; Hofer, Carroll, Neitz, Neitz, & Williams, 2005; Roorda & Williams, 1999). For a 3-mm pupil, the spatial bandwidth of the 550-nm MTF is about 3.2 times greater than that for the 440-nm MTF, showing that the relationship between optical quality and sampling is not very different for the S cones and the M and L cones.
L and M Cone Mosaic: Structural Organization
There is no antibody that can distinguish L cones from M cones; this is because the photopigments they contain are 96% identical (Nathans, Thomas, & Hogness, 1986). As such, much of what we know about the L/M submosaic has come from indirect measurements of the retina. Recent direct work using adaptive optics and molecular analyses has for the most part confirmed previous results, though it has uncovered surprising levels of variation within this mosaic. Given that they make up about 95% of the total cone population and thus drive the majority of our visual activity, there has been considerable interest in the L and M cones, both in their relative numbers (Carroll, Neitz, & Neitz, 2002; DeVries, 1946; Dobkins, Thiele, & Albright, 2000; Jacobs & Neitz, 1993; Kremers et al., 2000; Pokorny, Smith, & Wesner, 1991; Roorda & Williams, 1999; Rushton & Baker, 1964) and in their topographical arrangement throughout the mosaic (Balding, Sjoberg, Neitz, & Neitz, 1998; Bowmaker, Parry, & Mollon, 2003; Deeb, Diller, Williams, & Dacey, 2000; Hagstrom, Neitz, & Neitz, 1998; Knau, Jägle, & Sharpe, 2001; Mollon & Bowmaker, 1992; Packer, Williams, & Bensinger, 1996; Roorda, Metha, Lennie, & Williams, 2001). For many years, scientists used indirect measures to assess the homogeneity of these ratios across the retina and across observers. With the advent of adaptive optics, it has become possible to examine the L/M mosaic noninvasively and directly. What is now clear is that in humans with normal color vision, there are on average two L cones for every M cone, there is variability in the relative numbers of L and M cones between people, and the ratio of L to M cones is not completely uniform across an individual retina.
While most studies suggest that the ratio of L to M cones is probably constant across the central retina, the evidence for this is somewhat inferential. In fact, direct evidence from adaptive optics and retinal densitometry has shown that in at least one individual, the relative numerosity of L and M cones is not homogeneous across the central retina (Hofer et al., 2005). However, there are other, larger-scale inhomogeneities in the primate L/M mosaic. There is a nasal-temporal asymmetry in the local L-to-M cone ratio in macaque retina, though whether this asymmetry is a prominent feature of human retinae is not clear. Data from mRNA analysis and cone isolating mfERG of human retina reveal that the relative L-to-M cone ratio increases significantly in the peripheral retina, reaching nearly an all L cone mosaic at the edge of the retina (M. Neitz, Balding, McMahon, Sjoberg, & Neitz, 2006). Shown in figure 26.4 is a topographical map of L-to-M mRNA, revealing the dramatic and systematic variation across the retina.
As was mentioned above, numerous indirect studies have suggested that there are on average about twice as many L cones as M in the human retina, with large intersubject variability. Direct information on the numbers and locations of L, M, and S cones obtained with spatially localized retinal densitometry in 10 living human subjects reveals the extent of this variability in L-to-M cone ratio across individuals with normal color vision to be over a 40-fold range (Hofer et al., 2005). Shown in figure 26.5 are pseudo-colored images of the human cone mosaic, showing the remarkable variation in L-to-M cone ratio. Also evident from these images is the fact that the L and M cone submosaics are randomly interleaved.
L and M Cone Mosaic: Functional Consequences
In the face of the dramatic intersubject and intrasubject variation in the L-to-M cone mosaic, questions arise regarding the behavioral consequence of such variability. The fundamental experiment in color vision, color matching with spatially uniform fields, is very sensitive to the spectral absorptance of the cone photopigments but invariant with respect to the local ratio of cone types. Thus both across observers and across visual field position for a single observer, color matches will be preserved despite variations in the ratios of cone types. There are pronounced deficits in L/M color vision in the periphery compared to the central retina (Gordon & Abramov, 1977; Mullen, 1991); however, this has more to do with postreceptoral sampling than with L/M numerosity. Even the variation in L-to-M ratio between subjects has been shown to have little consequence for color vision, despite previous hypotheses (cf. Cicerone, 1987). For example, Miyahara, Pokorny, Smith, Baron, and Baron (1998) showed that in two female carriers of a red/green color vision defect, despite a dramatic skew in their L-to-M cone ratio, their red/green color vision was completely normal. J. Neitz, Carroll, Yamauchi, Neitz, and Williams (2002) used the flicker photometric ERG to probe L-to-M ratio in over 60 individuals and showed that the wavelength of unique yellow (the presumed null point of the red-green system) did not change across subjects, suggesting a postreceptoral normalization mechanism that compensates for any biases in L-to-M cone ratio. The reality is that the human visual system is quite resilient to variation at the retinal level, though this is obviously dependent on the sensitivity of the test used to probe color vision.
The fact that the three cone submosaics each sample the retinal image at a lower rate than the overall mosaic makes the retina susceptible to aliasing at lower spatial frequencies
assimilation, that is qualitatively similar to the reduced spatial bandwidth for chromatic mechanisms assessed at detection threshold. It is possible that assimilation effects occur because of the need to protect against submosaic aliasing, though a quantitative model of the benefits assimilation could provide has yet to be made.
Cone topography in inherited color vision deficiencies
There are a number of instances in which normal color discrimination is impaired, and there is now a detailed understanding of the molecular mechanisms underlying these defects. Nearly all inherited color vision defects have their origin in a disruption of normal cone photopigment expression; either a cone pigment is absent or a mutant pigment is expressed. Until recently, little attention had been given to the residual photoreceptor mosaic of these individuals. Results using adaptive optics retinal imaging have stimulated a reevaluation of the ideas about what the appearance of these mosaics might be. Here, we review the four major types of inherited color deficiencies and discuss what has been revealed about the accompanying cone photoreceptor mosaic.
Tritanopia
Tritan color vision deficiency is an inherited autosomal dominant abnormality of S cone function (Wright, 1952). The disorder is reported to exhibit incomplete penetrance, meaning that individuals with the same underlying mutation manifest different degrees of color vision impairment (Cole, Henry, & Nathan, 1966; Kalmus, 1955; Miyake, Yagasaki, & Ichikawa, 1985; Pokorny, Smith, & Went, 1981; Went & Pronk, 1985). Four different amino acid substitutions in the S cone photopigment have been associated with tritanopia, which is only slightly more rare than autosomal dominant retinitis pigmentosa (adRP) (Gunther, Neitz, & Neitz, 2006; Weitz et al., 1992; Weitz, Went, & Nathans, 1992). Each substitution occurs at an amino acid position that lies in one of the transmembrane alpha helices of the protein and is therefore expected to interfere with folding, processing, or stability of the encoded opsin.
In adRP, rod photopigment mutations are associated with degeneration of the associated photoreceptors. This is due in part to the fact that the photopigment plays such an important structural role in maintaining the integrity of the outer segment, comprising nearly 90% of the protein in the outer segment. To investigate whether there is S cone degeneration in autosomal dominant tritan defects, Baraas and colleagues (2007) examined two related tritan subjects (a 57-year-old male and his 34-year-old daughter) both heterozygous for a novel mutation in their S-opsin gene. The mutation resulted in a substitution of glutamine for arginine at position 283 (R283Q). Given the sensitive microenvironment of the photopigment, the substitution of a polar, neutral amino acid for a positively charged one would be expected to compromise the function of the photopigment. The father manifests as tritanopic on all color vision tests, whereas the daughter made mild tritan errors on only a small subset of color vision tests. Interestingly, the father reports that it has only been in recent years that he has noticed difficulties with discriminating between some colors such as orange-yellow and pink, while the daughter reports never having any color discrimination problems. We used adaptive optics ophthalmoscopy to obtain high-resolution images of the cone mosaic of both individuals combined with retinal densitometry to identify S cones in the mosaic. Surprisingly, while normal S cone density was reported for the daughter (4.9%, or 2224 cones/mm²), no evidence for S cones was observed in the father, though the overall cone density was within normal limits for both individuals. Since S cones normally occupy a small minority of the total cone population and since cone density is so variable across normal individuals (Curcio et al., 1990; Gao & Hollyfield, 1992), it is not surprising that the cone density of the father appeared normal despite the apparent absence of S cones.
One feature of the cone mosaic that can be exploited to study subtle disruptions in the packing geometry of the cone mosaic is the spatial regularity. The spatial regularity of the mosaic can be assessed by using a number of metrics (Cook, 1996; Rodieck, 1991); one of the more intuitive ones is the Voronoi analysis (Curcio & Sloan, 1992; Pum, Ahnelt, & Grasl, 1990). With this analysis, individual cones are represented as points in a two-dimensional plane. For each cell, a Voronoi domain is constructed by defining points in the plane that are closer to that cell than any other cell in the mosaic. The number of sides of the resultant polygon reflects the packing geometry of the local mosaic. In a perfectly regular mosaic with triangular packing, each cell would have a hexagonal Voronoi domain. Shown in figure 26.6A is the Voronoi analysis of a normal human cone mosaic and that of the tritan father mosaic. The polygons are color coded according to the number of sides they have, with green indicating six sides. While the majority are green (indicating a largely triangular mosaic), there are many fractures in the regularity of the mosaic. These disruptions have been hypothesized to correlate with the location of S cones in the mosaic (Pum et al., 1990); however, compelling evidence for this is lacking. Nevertheless, the father’s mosaic was significantly more irregular than normal (figure 26.6C), while the daughter’s mosaic was indistinguishable from normal (figure 26.6B). The disparate S cone mosaics in these two subjects are consistent with their distinct behavioral phenotypes, the increased irregularity in the father’s mosaic likely being a remnant of the degeneration of the S cones in his mosaic.
The work reported by Baraas and colleagues (2007) provides the first anatomical evidence that tritan phenotypes
Figure 26.6 Regularity of the human cone mosaic. Voronoi domain associated with each cone photoreceptor in a patch of retina from (A) a normal trichromat, (B) a 34-year-old female with a mild tritan defect, and (C) a 57-year-old male with a severe tritan defect. The color code indicates the number of sides on each Voronoi polygon (magenta = 4, cyan = 5, green = 6, yellow = 7, red = 8, purple = 9). Large regions of six-sided polygons indicate a regular triangular lattice, whereas other colors mark points of disruptions in the hexagonal packing of the foveal mosaic. Despite the fact that the father and the daughter carried the same heterozygous mutation in their S-opsin genes (predicting a tritan phenotype), the regularity of the father’s mosaic was significantly disrupted, while the daughter’s was indistinguishable from normal. Scale bar is 50 μm. (Reproduced from Baraas et al., 2007, with permission.) (See color plate 37.)
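The side-counting idea behind figure 26.6 can be sketched in a few lines. The following is a minimal illustration, not the analysis used by Baraas and colleagues: it relies on the fact that, for a bounded Voronoi cell in general position, the number of sides equals the number of Delaunay neighbors, and it finds Delaunay triangles by brute force with the empty-circumcircle test. The function names, the lattice size, and the tolerance `eps` are all assumptions made for the example.

```python
import math
from itertools import combinations


def circumcircle(a, b, c):
    """Return (center_x, center_y, radius_squared), or None for collinear points."""
    ax, ay = a
    bx, by = b
    cx, cy = c
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-12:
        return None
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return ux, uy, (ax - ux) ** 2 + (ay - uy) ** 2


def voronoi_side_counts(points, eps=1e-9):
    """Count Delaunay neighbors per point (equals Voronoi sides for bounded cells)."""
    neighbors = {k: set() for k in range(len(points))}
    for i, j, k in combinations(range(len(points)), 3):
        circle = circumcircle(points[i], points[j], points[k])
        if circle is None:
            continue
        ux, uy, r2 = circle
        # Delaunay criterion: no other point lies strictly inside the circumcircle.
        if all((px - ux) ** 2 + (py - uy) ** 2 >= r2 - eps
               for m, (px, py) in enumerate(points) if m not in (i, j, k)):
            neighbors[i].update((j, k))
            neighbors[j].update((i, k))
            neighbors[k].update((i, j))
    return [len(neighbors[k]) for k in range(len(points))]


def triangular_lattice(rows, cols, spacing=1.0):
    """Ideal triangular lattice: odd rows are shifted by half a spacing."""
    return [((c + 0.5 * (r % 2)) * spacing, r * spacing * math.sqrt(3.0) / 2.0)
            for r in range(rows) for c in range(cols)]


rows, cols = 5, 5  # small hypothetical patch, for illustration only
lattice = triangular_lattice(rows, cols)
sides = voronoi_side_counts(lattice)

# Cells away from the patch border; border cells are unbounded and not meaningful.
interior = [r * cols + c for r in range(1, rows - 1) for c in range(1, cols - 1)]
interior_sides = [sides[k] for k in interior]
```

On this ideal lattice every interior cell has six sides, which corresponds to the green regions in the figure's color code; in a measured mosaic, cells with other counts would mark the packing defects. The brute-force search is only practical for small patches.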
associated with S-opsin mutations can be associated with the loss of S cones. This suggests a mechanism in which the mutations produce their effects by reducing the viability of the S cones (similar to the mechanism in adRP), and heterozygotes that express both the normal and mutant S opsins will exhibit trichromatic color vision that can be indistinguishable from normal until the S cones succumb to the toxicity of the mutant opsin.

Red/Green Color Vision Deficiency The most common form of inherited color vision deficiency is one that affects the red-green (L-M cone) system. Among individuals of Western European ancestry, about 7–10% of males have a red-green color vision defect. The incidence in females is much lower (approximately 0.4%) because the defects are inherited as X-linked recessive traits, though approximately 15% of females are carriers of a red-green defect. The general genetic causes of red-green color vision deficiency involve a disruption of the L/M gene array on the X-chromosome. The most common cause is rearrangement of the L/M genes resulting either in the deletion of all but one visual pigment gene or in the production of a gene array in which the first two genes both encode a pigment of the same spectral class (Deeb et al., 1992; Jagla, Jägle, Hayashi, Sharpe, & Deeb, 2002; Nathans, Piantanida, Eddy, Shows, & Hogness, 1986; M. Neitz et al., 2004; Ueyama et al., 2003). The second general cause is the introduction of an inactivating mutation in either the first or second gene in the array. The most prevalent inactivating mutation results in the substitution of arginine for cysteine at position 203 (C203R) in the L/M pigment (Bollinger, Bialozynski, Neitz, & Neitz, 2001; M. Neitz et al., 2004; Winderickx et al., 1992). Cysteine 203 forms an essential disulfide bond and is highly conserved among G-protein-coupled receptors (Sakmar, 2002). This mutation was first observed in blue cone monochromacy (Nathans et al., 1989), where it was shown to directly disrupt photopigment function (Kazmi, Sakmar, & Ostrer, 1997). Mutating the corresponding cysteine residue in human rhodopsin (position 187) causes autosomal dominant retinitis pigmentosa (Richards, Scott, & Sieving, 1995).

The two different causes of red/green color vision defects might be expected to have different retinal phenotypes. It is thought that all photoreceptors that are destined to become L or M cones will express either the first or second gene in the X-chromosome array (Hayashi, Motulsky, & Deeb, 1999). In the case of gene rearrangements, all photoreceptors are expected to express a gene that encodes a functional pigment, though these would all be of the same spectral type. However, in the case of inactivating mutations, a fraction of the photoreceptors will express a pigment that is not functional and, in fact, may be deleterious to the viability of the cell. Recently, it was discovered that there are different retinal phenotypes among red-green color-blind individuals. Carroll, Porter, Neitz, Williams, and Neitz (2005) found that in individuals having either a single-gene array or an array in which the first two genes both encode a pigment of the same spectral class, the cone mosaic is normal in appearance. In contrast, in individuals in whom one of the genes in the array encodes a pigment with an inactivating mutation, dramatic loss of healthy cones is observed, consistent with the hypothesis that cells expressing the mutant pigment degenerated (Carroll et al., 2004). Shown in figure 26.7 are adaptive optics images from individuals with color vision defects caused by photopigment mutations.

Besides a reduction in color discrimination, the disrupted mosaics (sometimes having 60% fewer cones than normal) might also be expected to confer a reduction in spatial vision.
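A rough sense of the scale of the expected spatial loss can be had from sampling theory. The sketch below assumes, purely for illustration, that the surviving cones form a regular triangular lattice (real disrupted mosaics are irregular, so this is only a first-order estimate). For such a lattice with center-to-center spacing s, the density is 2/(√3·s²) and the Nyquist limit is 1/(√3·s), so the limit scales with the square root of density. The density value and function names are hypothetical.

```python
import math


def spacing_from_density(density):
    """Center-to-center spacing of a triangular lattice with `density` points per unit area."""
    return math.sqrt(2.0 / (math.sqrt(3.0) * density))


def triangular_nyquist(spacing):
    """Nyquist limit (cycles per unit length) of a triangular lattice."""
    return 1.0 / (math.sqrt(3.0) * spacing)


normal_density = 100_000.0  # hypothetical cones per mm^2, for illustration only
fraction_remaining = 0.40   # "60% fewer cones than normal"

normal_nyquist = triangular_nyquist(spacing_from_density(normal_density))
reduced_nyquist = triangular_nyquist(
    spacing_from_density(normal_density * fraction_remaining))

# The ratio depends only on the fraction of cones remaining: sqrt(0.40), about 0.63.
ratio = reduced_nyquist / normal_nyquist
```

Under these assumptions a 60% loss of cones lowers the sampling limit to roughly 63% of normal, whatever the starting density.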
carroll, yoon, and williams: cone photoreceptor mosaic in normal and defective color vision 389
further highlights the limiting effect these aberrations can
have on our normal visual activity.
Figure 26.8 The retina of the rod monochromat is highly unusual. (A, C) Images from a 28-year-old normal male, who had been imaged as part of a number of unrelated studies over the course of 3 months. (B, D) Images from a 28-year-old rod monochromat. Images are from 2.5 degrees (A, B) and 4 degrees (C, D) temporal retina. The size and density of the visible cells were typical for rod, not cone, photoreceptors. Scale bar is 20 μm. (Reproduced from Carroll, Choi, & Williams 2008, with permission.)
review) and that cortical signals in rod monochromats are abnormal (Baseler et al., 2002). Thus gene therapies for rod monochromacy would also need to account for the developmental reorganization in the visual cortex, as well as any potential remodeling of the retinal circuitry.

REFERENCES

Ahnelt, P. K., & Kolb, H. (2000). The mammalian photoreceptor mosaic-adaptive design. Prog. Retinal Eye Res., 19(6), 711–777.
Alexander, J. J., Umino, Y., Everhart, D., Chang, B., Min, S. H., Li, Q., et al. (2007). Restoration of cone vision in a mouse model of achromatopsia. Nat. Med., 13(6), 685–687.
Ayyagari, R., Kakuk, L. E., Bingham, E. L., Szczesny, J. J., Kemp, J. A., Toda, Y., et al. (2000). Spectrum of color gene deletions and phenotype in patients with blue cone monochromacy. Hum. Genet., 107, 75–82.
Balding, S. D., Sjoberg, S. A., Neitz, J., & Neitz, M. (1998). Pigment gene expression in protan color vision defects. Vis. Res., 38(21), 3359–3364.
Baraas, R. C., Carroll, J., Gunther, K. L., Chung, M., Williams, D. R., Foster, D. H., et al. (2007). Adaptive optics retinal imaging reveals S-cone dystrophy in tritan color-vision deficiency. J. Opt. Soc. Am. [A], 24(5), 1438–1446.
Barthelmes, C., Sutter, F. K., Kurz-Levin, M. M., Bosch, M. M., Helbig, H., Niemeyer, G., et al. (2006). Qualitative analysis of OCT characteristics in patients with achromatopsia and blue-cone monochromatism. Invest. Ophthalmol. Vis. Sci., 47(3), 1161–1166.
Baseler, H. A., Brewer, A. A., Sharpe, L. T., Morland, A. B., Jägle, H., & Wandell, B. A. (2002). Reorganization of human cortical maps caused by inherited photoreceptor anomalies. Nat. Neurosci., 5, 364–370.
Berson, E. L., Sandberg, M. A., Maguire, A., Bromley, W. C., & Roderick, T. H. (1986). Electroretinograms in carriers of blue cone monochromatism. Am. J. Ophthalmol., 102(2), 254–261.
Bollinger, K., Bialozynski, C., Neitz, J., & Neitz, M. (2001). The importance of deleterious mutations of M pigment genes as a cause of color vision defects. Color Res. Appl., 26, S100–S105.
Bowmaker, J. K., & Dartnall, H. J. A. (1980). Visual pigments of rods and cones in a human retina. J. Physiol. Lond., 298, 501–511.
Bowmaker, J. K., Parry, J. W. L., & Mollon, J. D. (2003). The arrangement of L and M cones in human and a primate retina. In J. D. Mollon, J. Pokorny & K. Knoblauch (Eds.), Normal and defective colour vision (pp. 39–50). New York: Oxford University Press.
Brainard, D. H., Williams, D. R., & Hofer, H. (2008). Trichromatic reconstruction from the interleaved cone mosaic: Bayesian model and the color appearance of small spots. J. Vis., 8(5), 1–23.
Cao, D., & Shevell, S. K. (2005). Chromatic assimilation: Spread light or neural mechanism? Vis. Res., 45(8), 1031–1045.
Carroll, J., Choi, S. S., & Williams, D. R. (2008). In vivo imaging of the photoreceptor mosaic of a rod monochromat. Vis. Res., 48(26), 2564–2568.
Carroll, J., Neitz, M., Hofer, H., Neitz, J., & Williams, D. R. (2004). Functional photoreceptor loss revealed with adaptive optics: An alternate cause for color blindness. Proc. Natl. Acad. Sci. USA, 101(22), 8461–8466.
Carroll, J., Neitz, M., & Neitz, J. (2002). Estimates of L:M cone ratio from ERG flicker photometry and genetics. J. Vis., 2(8), 531–542.
Carroll, J., Porter, J., Neitz, J., Williams, D. R., & Neitz, M. (2005). Adaptive optics imaging reveals effects of human cone opsin gene disruption. Invest. Ophthalmol. Vis. Sci., 46, ARVO E-Abstract 4564.
Cicerone, C. M. (1987). Constraints placed on color vision models by the relative numbers of different cone classes in human fovea centralis. Die Farbe, 34, 59–66.
Cole, B. L., Henry, G. H., & Nathan, J. (1966). Phenotypical variations of tritanopia. Vis. Res., 6, 303–313.
Cook, J. E. (1996). Spatial properties of retinal mosaics: An empirical evaluation of some existing measures. Vis. Neurosci., 13(1), 15–30.
Cornish, E. E., Hendrickson, A. E., & Provis, J. M. (2004). Distribution of short-wavelength-sensitive cones in human fetal and postnatal retina: Early development of spatial order and density profiles. Vis. Res., 44, 2019–2026.
Curcio, C. A., Allen, K. A., Sloan, K. R., Lerea, C. L., Hurley, J. B., Klock, I. B., et al. (1991). Distribution and morphology of human cone photoreceptors stained with anti-blue opsin. J. Comp. Neurol., 312, 610–624.
Curcio, C. A., & Sloan, K. R. (1992). Packing geometry of human cone photoreceptors: Variation with eccentricity and evidence for local anisotropy. Vis. Neurosci., 9, 169–180.
Curcio, C. A., Sloan, K. R., Kalina, R. E., & Hendrickson, A. E. (1990). Human photoreceptor topography. J. Comp. Neurol., 292, 497–523.
de Monasterio, F. M., Schein, S. J., & McCrane, E. P. (1981). Staining of blue-sensitive cones of the macaque retina by a fluorescent dye. Science, 213, 1278–1281.
Deeb, S. S., Diller, L. C., Williams, D. R., & Dacey, D. M. (2000). Interindividual and topographical variation of L:M cone ratios in monkey retinas. J. Opt. Soc. Am. [A], 17(3), 538–544.
Deeb, S. S., Lindsey, D. T., Hibiya, Y., Sanocki, E., Winderickx, J., Teller, D. Y., et al. (1992). Genotype-phenotype relationships in human red/green color-vision defects: Molecular and psychophysical studies. Am. J. Hum. Genet., 51, 687–700.
DeVries, H. L. (1946). Luminosity curve of trichromats. Nature (Lond.), 157, 736–737.
Dobkins, K. R., Thiele, A., & Albright, A. D. (2000). Comparison of red-green equiluminance points in humans and macaques: Evidence for different L:M cone ratios between species. J. Opt. Soc. Am. [A], 17, 545–556.
Falls, H. F., Wolter, R., & Alpern, M. (1965). Typical total monochromasy: A histological and psychophysical study. Arch. Ophthalmol., 74, 610–616.
Galezowski, X. (1868). Du diagnostic des Maladies des Yeux par la Chromatoscopie rétinienne: Précéde d’une Etude sur les Lois physiques et physiologiques des Couleurs. Paris: J.B. Baillière et Fils.
Gao, H., & Hollyfield, J. G. (1992). Aging of the human retina: Differential loss of neurons and retinal pigment epithelial cells. Invest. Ophthalmol. Vis. Sci., 33(1), 1–17.
Glickstein, M., & Heath, G. G. (1975). Receptors in the monochromat eye. Vis. Res., 15, 633–636.
Gordon, J., & Abramov, I. (1977). Color vision in the peripheral retina: II. Hue and saturation. J. Opt. Soc. Am., 67, 202–207.
Green, D. G. (1970). Regional variations in the visual acuity for interference fringes on the retina. J. Physiol. Lond., 207, 351–356.
(Eds.), From pigments to perception: Advances in understanding visual processes (pp. 23–34). New York: Plenum Press.
Pum, D., Ahnelt, P. K., & Grasl, M. (1990). Iso-orientation areas in the foveal cone mosaic. Vis. Neurosci., 5, 511–523.
Putnam, N. M., Hofer, H. J., Doble, N., Chen, L., Carroll, J., & Williams, D. R. (2005). The locus of fixation and the foveal cone mosaic. J. Vis., 5(7), 632–639.
Richards, J. E., Scott, K. M., & Sieving, P. A. (1995). Disruption of conserved rhodopsin disulfide bond by Cys187Tyr mutation causes early and severe autosomal dominant retinitis pigmentosa. Ophthalmology, 102(4), 669–677.
Rodieck, R. W. (1991). The density recovery profile: A method for the analysis of points in the plane applicable to retinal studies. Vis. Neurosci., 6, 95–111.
Roorda, A., Metha, A. B., Lennie, P., & Williams, D. R. (2001). Packing arrangement of the three cone classes in primate retina. Vis. Res., 41, 1291–1306.
Roorda, A., & Williams, D. R. (1999). The arrangement of the three cone classes in the living human eye. Nature (Lond.), 397, 520–522.
Rushton, W. A. H., & Baker, H. D. (1964). Red/green sensitivity in normal vision. Vis. Res., 4, 75–85.
Sakmar, T. P. (2002). Structure of rhodopsin and the superfamily of seven-helical receptors: The same and not the same. Curr. Opin. Cell Biol., 14(2), 189–195.
Samy, C. N., & Hirsch, J. (1989). Comparison of human and monkey retinal photoreceptor sampling mosaics. Vis. Neurosci., 3, 281–285.
Sekiguchi, N., Williams, D. R., & Brainard, D. H. (1993a). Aberration-free measurements of the visibility of isoluminant gratings. J. Opt. Soc. Am. [A], 10, 2105–2117.
Sekiguchi, N., Williams, D. R., & Brainard, D. H. (1993b). Efficiency in detection of isoluminant and isochromatic interference fringes. J. Opt. Soc. Am. [A], 10, 2118–2133.
Szel, A., Diamanstein, T., & Rohlich, P. (1988). Identification of blue-sensitive cones in the mammalian retina by antivisual pigment antibody. J. Comp. Neurol., 273, 593–602.
Ueyama, H., Li, Y.-H., Fu, G.-L., Lertrit, P., Atchaneeyasakul, L., Oda, S., et al. (2003). An A-71C substitution in a green gene at the second position in the red/green visual-pigment gene array is associated with deutan color-vision deficiency. Proc. Natl. Acad. Sci. USA, 100(6), 3357–3362.
Weitz, C. J., Miyake, Y., Shinzato, K., Montag, E., Zrenner, E., Went, L. N., et al. (1992). Human tritanopia associated with two amino acid substitutions in the blue sensitive opsin. Am. J. Hum. Genet., 50, 498–507.
Weitz, C. J., Went, L. N., & Nathans, J. (1992). Human tritanopia associated with a third amino acid substitution in the blue sensitive visual pigment. Am. J. Hum. Genet., 51, 444–446.
Went, L. N., & Pronk, N. (1985). The genetics of tritan disturbances. Hum. Genet., 69, 255–262.
Williams, D. R., & Collier, R. J. (1983). Consequences of spatial sampling by a human photoreceptor mosaic. Science, 221, 385–387.
Williams, D. R., MacLeod, D. I. A., & Hayhoe, M. (1981a). Foveal tritanopia. Vis. Res., 21(9), 1341–1356.
Williams, D. R., MacLeod, D. I. A., & Hayhoe, M. M. (1981b). Punctate sensitivity of the blue-sensitive mechanism. Vis. Res., 21, 1357–1375.
Williams, D. R., Sekiguchi, N., Haake, W., Brainard, D. H., & Packer, O. (1991). The cost of trichromacy for spatial vision. In B. B. Lee & A. Valberg (Eds.), From pigments to perception: Advances in understanding visual processes (pp. 11–22). New York: Plenum Press.
Williams, R. W. (1991). The human retina has a cone-enriched rim. Vis. Neurosci., 6(4), 403–406.
Willmer, E. N., & Wright, W. D. (1945). Colour sensitivity of the fovea centralis. Nature, 156, 119–121.
Winderickx, J., Sanocki, E., Lindsey, D. T., Teller, D. Y., Motulsky, A. G., & Deeb, S. S. (1992). Defective colour vision associated with a missense mutation in the human green visual pigment gene. Nat. Genet., 1, 251–256.
Wright, W. D. (1952). The characteristics of tritanopia. J. Opt. Soc. Am., 42, 509–521.
Yellott, J. I., Jr., Wandell, B., & Cornsweet, T. (1984). The beginnings of visual perception: The retinal image and its initial coding. In I. Darian-Smith (Ed.), Handbook of physiology, Section 1: The nervous system III (pp. 257–316). Bethesda, MD: American Physiological Society.
Abstract Visual perception is difficult because image formation and sensory transduction lose information about the physical scene: Many different scenes lead to the same image data. Understanding how the brain copes with this information loss, so that our percepts provide a useful representation of the world around us, is a central problem in cognitive neuroscience. In the case of color vision, the nature of the information loss is well understood. First, the light reflected to the eye confounds illuminant properties with those of objects. Second, spectral and spatial sampling by the cone photoreceptors further reduces the available information. To provide a stable representation of object color, the brain must compensate by combining the directly available information with assumptions about which scene configurations are likely to occur. This chapter reviews how Bayesian decision theory can model how this happens and discusses two Bayesian models that have been effective in accounting for color appearance.

David H. Brainard, Department of Psychology, University of Pennsylvania, Philadelphia, Pennsylvania

Visual perception is difficult. One pervasive reason for this difficulty is that image formation and sensory transduction lose information about the physical scene, so many different scenes could have caused the same image data. Color vision presents an opportunity to understand how the brain copes with this information loss, because our understanding of the information loss and the scene parameters of perceptual interest is well developed. In this sense, color provides a model system for developing and testing theories that may have more general applicability. Of course, color perception is an important aspect of our perceptual experience, and understanding how it arises is also interesting in its own right. This chapter provides an introduction to Bayesian modeling of human color vision and reviews two lines of work where the approach has been fruitful.

Fundamentals of color vision

The visual system assigns a color to essentially all viewed objects. The information available about an object’s color is carried by the spectrum of the light reflected from it, as illustrated by figure 27.1A. This spectrum, which we refer to as the color signal, is specified by its power at each wavelength. The color signal is given as the wavelength-by-wavelength product of the illuminant power and the object’s surface reflectance function, where the latter specifies the fraction of incident light reflected from the object. Thus the color signal confounds object properties with those of the illuminant. To provide a representation of object reflectance that is stable across changes of illuminant, the visual system must process the color signal to separate the physical effects of illuminant and object surface.

The postreceptoral visual system does not have direct access to the color signal. Rather, this spectrum is encoded by the joint responses of the retinal mosaic of cone photoreceptors. There are three classes of cones, each characterized by a distinct spectral sensitivity (figure 27.1B). These are often referred to as the L, M, and S cones.

Each individual cone codes information about light as a scalar quantity, the rate at which its photopigment is isomerized. This rate confounds the overall intensity of the color signal with its relative spectrum. Thus two physically distinct color signals can produce the same isomerization rate in all three classes of cones (figure 27.1C). Moreover, there is at most one cone at each retinal location. To obtain even trichromatic information about the spectrum of the color signal, the visual system must combine information from cones at different retinal locations; sampling of the image by the retinal mosaic confounds spatial and chromatic image structure.

This brief review illustrates a series of stages in which information about object spectral properties is lost: The color signal confounds object reflectance with the spectral power distribution of the illuminant; the retina as a whole contains only three classes of univariate cones, so the color signal’s full spectrum is represented by at most three numbers; and at each retinal location, there is only one cone. Each of these stages of information loss produces ambiguity about the scene being viewed.

How does the visual system resolve ambiguity to extract a perceptually useful representation of object color? Since the cone responses do not completely determine the reflectance properties of the object, some additional constraints must be imposed. Here, we apply Bayesian analysis as a framework to express these constraints and develop models of color perception.
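Color signal (figure 27.1A): C(λ) = I(λ)S(λ), where I(λ) is the illuminant power and S(λ) is the surface reflectance at wavelength λ.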
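Two of the confounds described in this section can be illustrated with a short numerical sketch. The spectra and the Gaussian cone sensitivity below are hypothetical stand-ins chosen for the example, not measured data. The first part shows that doubling the illuminant while halving the reflectance leaves the color signal unchanged; the second shows univariance for a single cone class, where two monochromatic lights of different wavelength and power yield the same response.

```python
import math

wavelengths = list(range(400, 701, 10))  # nm


def gaussian(lam, peak, width):
    return math.exp(-0.5 * ((lam - peak) / width) ** 2)


# Hypothetical spectra, for illustration only.
illuminant = [1.0 for _ in wavelengths]                                   # I(lambda)
reflectance = [0.2 + 0.6 * gaussian(l, 600, 50.0) for l in wavelengths]  # S(lambda)

# Color signal: C(lambda) = I(lambda) * S(lambda), wavelength by wavelength.
color_signal = [i * s for i, s in zip(illuminant, reflectance)]

# Confound 1: a brighter illuminant on a darker surface gives the same color signal.
color_signal_alt = [(2.0 * i) * (s / 2.0) for i, s in zip(illuminant, reflectance)]


def cone_response(signal, peak, width=40.0):
    """Proxy for isomerization rate: sensitivity-weighted sum of the color signal.
    The Gaussian sensitivity is an assumption, not a real cone fundamental."""
    return sum(c * gaussian(l, peak, width) for c, l in zip(signal, wavelengths))


def monochromatic(lam0, power):
    return [power if l == lam0 else 0.0 for l in wavelengths]


# Confound 2 (univariance): trade wavelength against power for one cone class.
peak_l = 560.0  # hypothetical peak sensitivity
light_a = monochromatic(560, 1.0)
light_b = monochromatic(600, 1.0 / gaussian(600, peak_l, 40.0))

response_a = cone_response(light_a, peak_l)
response_b = cone_response(light_b, peak_l)
```

Here `response_a` and `response_b` agree even though the two lights differ physically. Matching all three cone classes at once (a full metamer) requires solving for a spectrum under three such constraints, which this sketch does not attempt.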
Bayesian principles

Basic ideas Bayesian statistics provides a general formulation that allows image data to be combined with prior assumptions to provide a reasonable estimate of the physical scene configuration. Both the information provided by the data and the prior assumptions are expressed in the common language of probability distributions. These two types of information are then combined via Bayes’ rule to produce a posterior distribution that expresses what is known about the scene. A specific estimate of the scene configuration can then be extracted from the posterior, for example, by taking its mean.

An example serves to illustrate the key ideas. Imagine a toy universe containing only one wavelength of light and one spatial location. An illuminant impinging on a surface is specified by an intensity i, and the object surface reflectance is specified by a single number s. We refer to these as the scene parameters, as fixing their values specifies the physical scene. The color signal here reduces to a single number, c = is (the product of i and s). If an eye with a single photoreceptor images the scene, we can model its response r by the equation r = c + n, where n is additive noise. We then ask, “What do the image data r tell us about the values of the scene parameters?” Within the framework of Bayesian analysis, this is given by the likelihood, written p(r | i, s). For any scene parameters, the likelihood tells us how probable any observed response r is. Figure 27.2A illustrates the likelihood for our example, for the case r = 1. When the image data are held fixed, the likelihood is a function of the scene parameters.

Two features of the likelihood are worth noting. First, some pairs (i, s) lead to higher likelihood than others. This means that the image data provide some information about the scene parameters. Second, the likelihood function has a ridge of equal values along the hyperbola is = 1. The fact that the likelihood is equal along this ridge indicates that the image data provide incomplete information; there are multiple scene configurations that the image data do not distinguish.

Because the image data do not uniquely determine the scene parameters, some other principle must be invoked to resolve the ambiguity. The prescription provided by the Bayesian approach is to specify the statistical properties of the scene parameters. In our toy universe, for example, it might be that not all illuminant intensities occur with equal probability. This fact can be expressed as a prior probability distribution over the scene parameters (figure 27.2B). Here, the prior has a ridge parallel to the s dimension, indicating that all values of s are equally likely. But along the i dimen-
Figure 27.3 Prior over surface reflectance. (A) Three basis functions of a linear model for surfaces. The basis functions were obtained through analysis of the principal components of a collection of 462 measured surface reflectance functions (Newhall, Nickerson, & Judd, 1943; Nickerson, 1957). (B) A measured surface reflectance function (solid curve) from the same data set used to generate the linear model, and its approximation (dashed curve) within the model shown in panel A. (C) Distribution of model weights for the 462 surface reflectance functions in the data set used to generate the linear model. Each panel shows the histogram for one basis function. The solid curves are a normal approximation. (After Brainard & Freeman, 1997; see their figure 4.)
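The toy universe introduced under "Basic ideas" (one illuminant intensity i, one reflectance s, response r = is + n) is small enough to evaluate on a grid. The sketch below is only an illustration of the likelihood, prior, posterior, and posterior-mean steps; the Gaussian noise model, the noise level, the Gaussian prior on i, and the grid ranges are all assumptions made for the example and are not taken from the chapter.

```python
import math


def normal_pdf(x, mean, sd):
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def likelihood(r, i, s, noise_sd=0.1):
    """p(r | i, s) for r = i*s + n, with n assumed Normal(0, noise_sd)."""
    return normal_pdf(r, i * s, noise_sd)


def prior(i, s, i_mean=2.0, i_sd=0.5):
    """Assumed prior: flat in s (a ridge parallel to the s axis), Gaussian in i."""
    return normal_pdf(i, i_mean, i_sd)


grid_i = [0.05 * k for k in range(1, 81)]    # i from 0.05 to 4.0
grid_s = [0.0125 * k for k in range(1, 81)]  # s from 0.0125 to 1.0
r_observed = 1.0

# Bayes' rule on the grid: posterior is proportional to likelihood times prior.
posterior = {(i, s): likelihood(r_observed, i, s) * prior(i, s)
             for i in grid_i for s in grid_s}
total = sum(posterior.values())
posterior = {key: value / total for key, value in posterior.items()}

# One possible point estimate: the posterior mean of each scene parameter.
mean_i = sum(i * p for (i, s), p in posterior.items())
mean_s = sum(s * p for (i, s), p in posterior.items())

# The likelihood alone cannot separate scenes on the ridge i*s = 1.
ridge_a = likelihood(r_observed, 2.0, 0.5)
ridge_b = likelihood(r_observed, 4.0, 0.25)
```

The two ridge values are equal, which is the ambiguity the text describes; multiplying by the prior breaks the tie in favor of illuminant intensities that the prior treats as more probable.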
Figure 27.4 Color constancy performance. (A) Images of 4 of 17 simulated scenes used to compare human performance and a model derived from a Bayesian illuminant estimation algorithm. Each scene has the same spatial structure. In the first three images, from left to right, the illuminant varies. In the rightmost image, the illuminant is the same as that in the second image, but the background surface has been changed so that the light reflected from it matches that reflected from the background surface in the leftmost image. (Reproduced from Brainard et al., 2006, figure 1.) (B) Illuminants, achromatic loci, and model predictions plotted in the CIE u′v′ chromaticity diagram. This is a standard color representation that preserves information about the relative responses of the L, M, and S cones but not about intensity. Large open circles show the scene illuminants, with the color key as indicated beneath the images in panel A. Large solid circles show the achromatic loci measured by observers who adjusted a test patch at the location indicated by the black rectangle in each image. The small open circles show the model’s predictions of the achromatic loci. (Reproduced from figure 7 of Brainard et al., 2006.) (See color plate 38.)
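The statement that the u′v′ diagram discards intensity can be checked directly from the standard CIE 1976 definition, u′ = 4X/(X + 15Y + 3Z) and v′ = 9Y/(X + 15Y + 3Z). The sketch below uses commonly quoted tristimulus values for a D65-like white purely as an example input; the function name is an assumption, and the chapter's own illuminants are not reproduced here.

```python
import math


def xyz_to_uv(x, y, z):
    """CIE 1976 u'v' chromaticity from XYZ tristimulus values."""
    denom = x + 15.0 * y + 3.0 * z
    return 4.0 * x / denom, 9.0 * y / denom


# Example input only: commonly quoted XYZ values for a D65-like white.
white = (95.047, 100.0, 108.883)
dimmer_white = tuple(0.5 * value for value in white)  # same light at half intensity

uv_white = xyz_to_uv(*white)
uv_dimmer = xyz_to_uv(*dimmer_white)
```

Both calls return the same chromaticity because a common scale factor cancels between numerator and denominator, so only the relative spectrum, and hence the relative cone responses, is represented.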
Of importance is that the agreement occurs both for cases in which the observers showed good constancy and cases in which constancy was poor.

Bayes and the cone mosaic

The treatment of color constancy above addresses the information loss caused by the interaction of surfaces and illuminants in the formation of the image and the reduction from a full spectral representation of the color signal to the trichromatic representation provided by the L, M, and S cones. It assumed, however, that the responses of all three cone classes were available at each spatial location. This is intuitively reasonable if the spatial scale of the image of various objects is large in comparison to the spacing between cones. Nonetheless, it would be more satisfying to have a theory of how information from cones is combined across space to provide a representation that is effectively trichromatic at such spatial
Abstract This chapter deals with the question of how receptive fields and cortical maps are wired before the onset of the classical critical period. In particular, we ask how simple-cell receptive fields may arise without an intermediate phase of overlap between ON and OFF subregions and without the need for correlated spontaneous activity in the developing thalamus. We discuss one possible solution, the statistical connectivity hypothesis, which postulates that initial wiring of the cortex is highly constrained by the spatial arrangement of the retinal ganglion cell mosaic and their coverage ratios. We examine a recently confirmed prediction of the theory: that orientation bandwidth must depend on the location of neurons within the orientation map. Finally, the theory is shown to predict the existence of orientation scotomas: At any given retinal location, not all orientations can be represented equally well by neurons in primary visual cortex.

Dario L. Ringach, Department of Neurobiology and Psychology, Jules Stein Eye Institute, David Geffen School of Medicine, University of California, Los Angeles, California

Wiring of receptive fields and functional maps in primary visual cortex

To fully understand the development and the adult organization of primary visual cortex, there are three separate questions that must be addressed. First, we need to discover the mechanisms that are responsible for wiring receptive fields and cortical maps at the earliest stages of development (Albus & Wolf, 1984; Hubel & Wiesel, 1963; Sherk & Stryker, 1976). Second, we need a description of how activity-dependent processes maintain, modify, or refine these initial structures during the critical period (Crair, Gillespie, & Stryker, 1998; Crowley & Katz, 2002; Katz & Crowley, 2002; Miller, Erwin, & Kayser, 1999; Swindale, 1996). Third, we need to understand which features of the resulting receptive fields and cortical maps are vital for normal visual processing and which may arise as an epiphenomenon of developmental processes and wiring constraints (Adams & Horton, 2003; Chklovskii & Koulakov, 2000; Horton & Adams, 2005; Koulakov & Chklovskii, 2001; Purves, Riddle, & Lamantia, 1992; Swindale, 1991). This chapter deals primarily with the early establishment of the circuit. We want to know how receptive fields and cortical maps are wired before the onset of the critical period. The reason this question is a critical piece of the puzzle of V1 development is clarified by a summary of some experimental findings.

In their pioneering studies of visual cortex, Hubel and Wiesel (1963) demonstrated that kittens lacking normal visual experience have cells that are tuned for orientation. They also found that orientation-tuned cells cluster into populations of similar preference, suggesting the presence of an early orientation map in these young animals (Hubel & Wiesel, 1963). These early cortical responses were also found to be heavily dominated by contralateral input (Crair et al., 1998; Fregnac & Imbert, 1978; Movshon & Van Sluyters, 1981).

Recent studies using intrinsic imaging of cortical activity in combination with single-unit electrophysiology have confirmed and refined these classical findings, showing that in kittens, orientation maps and ocular dominance columns are already present by two weeks of age (Crair et al., 1998; Crair, Horton, Antonini, & Stryker, 2001). Furthermore, there are no obvious differences in the development of ocular dominance and orientation maps of normal and binocularly deprived animals up to the third postnatal week, demonstrating that normal visual stimulation is not necessary for the early wiring of receptive fields and maps (Crair et al., 1998, 2001).

An important finding is that receptive fields with segregated ON and OFF subregions, which are characteristic of simple cells, are observed in the thalamo-recipient layers 4 and 6 as soon as the cortex becomes visually responsive (Albus & Wolf, 1984; Blakemore & Van Sluyters, 1975; Braastad & Heggelund, 1985; Hubel & Wiesel, 1963; Sherk & Stryker, 1976). Furthermore, the ratio between the numbers of simple and complex cells in the first weeks of development remains approximately constant and does not differ from that in the adult (Albus & Wolf, 1984; Braastad & Heggelund, 1985).
ringach: wiring of receptive fields and functional maps in primary visual cortex 409
These data are difficult to reconcile with the notion that simple-cell receptive fields develop from a set of heavily overlapping ON/OFF inputs (Linsker, 1986; Miller, 1994; Reid & Alonso, 1995). If this were the case, one would have predicted (1) an initial prevalence of receptive fields with overlapping ON/OFF receptive fields with a progressive spatial segregation of subregions during development and (2) an increase in the ratio of simple to complex cells during this developmental process. Instead, the available data indicate that salient features of the adult cortical organization, including the subregion segregation of simple cells, orientation, and ocular dominance maps, manifest themselves at the earliest stages of cortical development, well before the onset of the critical period.
We are thus faced with the challenge of explaining how receptive fields and cortical maps are wired initially. A couple of hypotheses have been considered so far. One possibility is that the presence of structured spontaneous activity in the developing thalamus could drive the initial thalamocortical wiring (Miller, 1994; Miller et al., 1999). Such correlation-based models predict a specific pattern of activity: Thalamic cell pairs having the same center sign (either ON/ON or OFF/OFF) should be more correlated than cells with different center signs at small distances (on the scale of a subregion width); an opposite pattern, in which same-sign cells are less correlated than opposite sign pairs, should be observed at larger separations. This pattern of spontaneous activity and a synaptic connectivity rule by which “neurons that fire together, wire together” ensure the emergence of segregated ON/OFF subregions (simple-cell receptive fields) from overlapping ON/OFF inputs. To guarantee the periodicity of orientation columns, an additional mechanism that leads nearby cells to develop similar receptive fields and cells at large separations to develop different receptive fields must be invoked.
Recent measurements of spontaneous activity in the developing thalamus, however, have failed to corroborate the predicted pattern of correlations (Ohshiro & Weliky, 2006; Weliky & Katz, 1999). Instead of the predicted Mexican hat profile, one observes a Gaussian falloff of correlation for same-sign receptive fields and zero correlation for different-sign pairs at all distances. Under these conditions, the model fails to develop segregated ON/OFF subregions. These ideas could still be rescued by invoking a more complex “split constraint” that conserves the synaptic strength of ON and OFF center cells separately during development. However, its biological implementation is hard to imagine (Ohshiro & Weliky, 2006).
A second possibility is that molecular cues, involved in axonal guidance/patterning, help to establish the initial cortical architecture (Crowley & Katz, 2000; Katz & Crowley, 2002). If thalamic afferents carrying signals from overlapping ON/OFF-center receptive fields are to be sorted, perhaps this is done by different molecular markers specifying the locations where each input type is allowed to create synaptic contacts on the target neuron, thereby generating nonoverlapping ON/OFF subregions.
Molecular guidance has been shown to be involved in the establishment of a coarse retinotopy and in retinogeniculate laminar segregation (Cang, Kaneko, Yamada, Woods, & Stryker, 2005; Huberman, 2007), yet its role in guiding connectivity at the fine spatial scales required to shape the structure of the subregions in individual receptive fields has never been demonstrated and appears unlikely. For example, it is difficult to conceive how different simple cells, on the same orientation column, could coordinate the expression of markers so that all their receptive fields develop similar orientation preferences.
Molecular patterning has also been proposed as underlying the generation of ocular dominance columns (Crowley & Katz, 1999; Hubener & Bonhoeffer, 1999; Katz & Crowley, 2002). This may be a more appealing possibility, owing to the larger spatial scales involved, but we should consider that functional maps are related in specific ways. In the cat, for example, orientation pinwheels tend to align with the centers of ocular dominance domains (Bartfeld & Grinvald, 1992; Grinvald, Frostig, Siegel, & Bartfeld, 1991), and peaks of low/high spatial frequency domains tend to align with the pinwheel centers (Everson, 1998; Issa, Trepel, & Stryker, 2000). Envisioning how molecular guidance by itself could simultaneously explain the development of cortical maps (retinotopy, ocular dominance, orientation, spatial frequency) and their relationships appears to be a rather difficult task indeed.
Arguably, these considerations weaken the case for spontaneous activity in the developing thalamus and molecular guidance as explanations for the early establishment of the cortical architecture. While it is premature to rule out their involvement altogether, one cannot help but wonder whether there are any other wiring mechanisms that have not been considered.
The proposal that I would like to discuss here was born out of the realization that the common assumption that simple cells develop from a set of overlapping ON/OFF receptive fields is not supported by the available data. Thus asking how simple cells arise from overlapping inputs is not the right question to pursue. The relevant question is how the subregions of simple cells could be wired without going through a developmental phase of substantial ON/OFF overlap.
The answer to this question, I propose, is that receptive fields of LGN afferents are not expected to have a high degree of overlap in the first place (Ringach, 2004, 2007). This assertion is based on the known statistics of retinal ganglion cell mosaics, the degree of overlap of their receptive fields (coverage ratios), and the fact that LGN cells are dominated by a single ganglion cell input (thereby reflecting the
Figure 28.1 Conceptual description of statistical connectivity and some of its consequences. (A) The theory posits that the distribution of ON-center (plus signs) and OFF-center (triangles) retinal ganglion cell receptive fields, along with a moderate coverage ratio (the solid disks indicate 1 standard deviation of the receptive field center), and the isotropic sampling of incoming afferents (dashed circle) are responsible for the establishment of the early cortical architecture. In this example, sampling from the afferents within the indicated area would generate a receptive field with adjacent ON/OFF subregions, as shown to the right. (B) Two consequences of statistical connectivity can be inferred from this simple diagram. First, to obtain receptive fields with substantially different orientation, one must move to a different retinal location (orientation scotomas). Second, there should be a tendency for overlapping simple-cell receptive fields to have the same sign within the overlap area. (C, D) The theory is consistent with the statistics of thalamocortical connectivity. Both the sign rule and the distribution of receptive field overlap are explained by the model.
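The sampling idea in the Figure 28.1 caption can be illustrated with a minimal sketch. It is not the author's model: the uniformly random afferent positions (real ganglion cell mosaics are more regular), the Gaussian sampling window, and all names and parameter values (`cortical_rf`, `sample_sigma`, `center_sigma`) are assumptions chosen only for illustration. A cortical receptive field is built by summing ON (+1) and OFF (-1) afferent centers, each weighted by its distance from the sampling center.

```python
import numpy as np

def gaussian2d(xx, yy, cx, cy, sigma):
    """Isotropic 2-D Gaussian centered at (cx, cy)."""
    return np.exp(-((xx - cx) ** 2 + (yy - cy) ** 2) / (2.0 * sigma ** 2))

def cortical_rf(positions, signs, sample_center, sample_sigma, center_sigma, grid):
    """Sum afferent centers weighted by an isotropic sampling window.

    positions: (n, 2) afferent receptive-field centers (hypothetical units)
    signs:     (n,) +1 for ON-center, -1 for OFF-center afferents
    """
    xx, yy = grid
    rf = np.zeros_like(xx, dtype=float)
    for (px, py), s in zip(positions, signs):
        # Weight falls off with distance from the sampling center (assumed Gaussian).
        d2 = (px - sample_center[0]) ** 2 + (py - sample_center[1]) ** 2
        w = np.exp(-d2 / (2.0 * sample_sigma ** 2))
        rf += s * w * gaussian2d(xx, yy, px, py, center_sigma)
    return rf

# Hypothetical mosaic: random positions and random ON/OFF signs.
rng = np.random.default_rng(0)
n_afferents = 40
positions = rng.uniform(-3.0, 3.0, size=(n_afferents, 2))
signs = rng.choice([-1, 1], size=n_afferents)

axis = np.linspace(-3.0, 3.0, 61)
grid = np.meshgrid(axis, axis)

rf = cortical_rf(positions, signs, (0.0, 0.0), 1.0, 0.5, grid)
# The ON and OFF contributions can be computed separately and add linearly.
on_part = cortical_rf(positions[signs > 0], signs[signs > 0], (0.0, 0.0), 1.0, 0.5, grid)
off_part = cortical_rf(positions[signs < 0], signs[signs < 0], (0.0, 0.0), 1.0, 0.5, grid)
```

Because the sum is linear, the ON and OFF subregions of the resulting field are determined entirely by which afferent type happens to dominate each location within the sampling window, which is the intuition the caption describes.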
ability to repeatedly stimulate the exact same retinal location across experimental sessions. We are currently attempting to perform similar psychophysical experiments while carefully monitoring eye movements.
Another success of statistical connectivity is in explaining the probability and strength of monosynaptic connections from thalamus to cortex. In particular, the model replicates the sign rule of connectivity (Alonso et al., 2001; Reid & Alonso, 1995). This refers to the finding that the probability of a monosynaptic connection is highest when the geniculate receptive field overlaps a simple-cell subregion of the same signature (either ON or OFF), while the probability of “inappropriate” connections between receptive fields of opposite signature is much lower. These data have been interpreted as supporting the existence of precise rules of synaptic connectivity in accordance with the classic wiring scheme for simple cells (Hubel & Wiesel, 1962). However, this interpretation rests on the assumption that the cortex receives afferents from a large number of overlapping ON and OFF geniculate receptive fields, which, as was discussed above, is incorrect.
Statistical connectivity offers an explanation for the apparent precision of thalamocortical wiring (figure 28.1C). In the model, an ON subregion of a simple cell results in the event that an ON-center rather than OFF-center geniculate cell dominates that location of visual space. As a consequence, one would expect a tendency for ON subregions to avoid OFF inputs. The reason is simply that there are no OFF inputs to avoid at that location. A similar analysis is to plot the distribution of the correlation coefficient between the spatial receptive field of thalamic afferents and overlapping cortical receptive fields for cases in which cells were
Figure 28.2 Orientation tuning bandwidth and local map structure. (A) Example of an orientation preference map in macaque visual cortex along with the recovered location of the microelectrode array. (B) Reverse correlation in the orientation domain (Ringach et al., 1997) was used to measure the tuning curves at each electrode site simultaneously. The example here shows the average spike rate triggered to the presentation of each orientation in a rapid stimulus sequence, yielding a preferred orientation, θ0, and tuning width, Δθ. (C) The location of the array (solid dots in panel A) was estimated by finding the optimal translation/rotation parameters for which the preferred orientations as measured via reverse correlation matched those measured optically. The scatterplot illustrates the optimal correlation in one instance. (D) A local homogeneity index was defined to capture the diversity of orientation preferences around each cortical point. The example illustrates two locations, one with a low homogeneity index of 0.1 attained near a pinwheel and one with a high index of 0.6 in an iso-orientation domain. (E) Spatial distribution of the local homogeneity index for the same patch of cortex as the one shown in panel D. (F) Isolation of single units. Only units that could be very well isolated, as is typical of the principal component analysis here, were used in our analyses of tuning bandwidth and local map structure. (See color plate 42.)
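The caption above does not give the formula for the local homogeneity index, so the following is only one plausible way to compute such a quantity, offered as a sketch. It assumes the index is a Gaussian-weighted vector strength of the doubled orientation angles around a cortical point; the function name, the pixel-based `sigma`, and the synthetic maps are all hypothetical. Under this assumed definition the index is close to 1 in a perfectly uniform region and close to 0 at an ideal pinwheel center.

```python
import numpy as np

def local_homogeneity_index(theta_map, row, col, sigma):
    """Gaussian-weighted vector strength of orientation preferences.

    theta_map: 2-D array of preferred orientations in radians (period pi)
    row, col:  pixel at which the index is evaluated
    sigma:     width of the Gaussian neighborhood in pixels (assumed)
    """
    rows, cols = np.indices(theta_map.shape)
    weights = np.exp(-((rows - row) ** 2 + (cols - col) ** 2) / (2.0 * sigma ** 2))
    # Angles are doubled so that orientations 0 and pi count as identical.
    resultant = np.sum(weights * np.exp(2j * theta_map))
    return float(np.abs(resultant) / np.sum(weights))

# Two synthetic maps on a grid symmetric about its center pixel.
half = 20
coords = np.arange(-half, half + 1)
xx, yy = np.meshgrid(coords, coords)

uniform_map = np.full(xx.shape, np.pi / 4)   # iso-orientation domain
pinwheel_map = 0.5 * np.arctan2(yy, xx)      # ideal pinwheel centered on the grid

hi_uniform = local_homogeneity_index(uniform_map, half, half, sigma=5.0)
hi_pinwheel = local_homogeneity_index(pinwheel_map, half, half, sigma=5.0)
```

Real maps fall between these two extremes, consistent with the low index near a pinwheel and the higher index in an iso-orientation domain described in panel D.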
centers (figure 28.2E). Using this method, we found that orientation tuning width and homogeneity index are negatively correlated in both monkeys (r = −0.56, p = 0.00001) and cats (r = −0.56, p = 0.00005) (figure 28.2E), as predicted by the model.
It should be emphasized that statistical connectivity is not the only possible explanation for this trend. One likely contribution to this relationship comes from the fact that the tuning properties of the local environment of a cell are likely to determine the tuning of the intracortical feedback signal and, in turn, the tuning of the cell (Marino et al., 2005; McLaughlin, Shapley, & Shelley, 2003; Schummers, Marino, & Sur, 2002). These two explanations are not mutually exclusive.
Testing statistical connectivity
The status of statistical connectivity as a viable working hypothesis for the early wiring of receptive fields and cortical maps derives from the fact that it is an extremely simple concept that can explain a large set of data, including the structure and emergence of simple receptive fields, the relationship between cortical maps, and the dependence of neuronal selectivity across functional maps.
There are many predictions that remain to be tested. A particularly interesting one is the existence of orientation scotomas. However, there are other ways to test the theory directly. If the theory is correct, given the structure of the RGC mosaic in the contralateral eye, one should be able
Cang, J., Kaneko, M., Yamada, J., Woods, G., Stryker, M. P., & Feldheim, D. A. (2005). Ephrin-as guide the formation of functional maps in the visual cortex. Neuron, 48, 577–589.
Chichilnisky, E. J., & Kalmar, R. S. (2002). Functional asymmetries in ON and OFF ganglion cells of primate retina. J. Neurosci., 22, 2737–2747.
Chklovskii, D. B., & Koulakov, A. A. (2000). A wire length minimization approach to ocular dominance patterns in mammalian visual cortex. Physica A Statist. Mechanics Appl., 284, 318–334.
Cleland, B. G., & Lee, B. B. (1985). A comparison of visual responses of cat lateral geniculate-nucleus neurons with those of ganglion-cells afferent to them. J. Physiol. Lond., 369, 249–268.
Crair, M. C., Gillespie, D. C., & Stryker, M. P. (1998). The role of visual experience in the development of columns in cat visual cortex. Science, 279, 566–570.
Crair, M. C., Horton, J. C., Antonini, A., & Stryker, M. P. (2001). Emergence of ocular dominance columns in cat visual cortex by 2 weeks of age. J. Comp. Neurol., 430, 235–249.
Crowley, J. C., & Katz, L. C. (1999). Development of ocular dominance columns in the absence of retinal input. Nat. Neurosci., 2, 1125–1130.
Crowley, J. C., & Katz, L. C. (2000). Early development of ocular dominance columns. Science, 290, 1321–1324.
Crowley, J. C., & Katz, L. C. (2002). Ocular dominance development revisited. Curr. Opin. Neurobiol., 12, 104–109.
Das, A., & Gilbert, C. D. (1997). Distortions of visuotopic map match orientation singularities in primary visual cortex. Nature, 387, 594–598.
DeAngelis, G. C., Ghose, G. M., Ohzawa, I., & Freeman, R. D. (1999). Functional micro-organization of primary visual cortex: Receptive field analysis of nearby neurons. J. Neurosci., 19, 4046–4064.
Farley, B. J., Yu, H., Jin, D. Z., & Sur, M. (2007). Alteration of visual input results in a coordinated reorganization of multiple visual cortex maps. J. Neurosci., 27, 10299–10310.
Fregnac, Y., & Imbert, M. (1978). Early development of visual cortical-cells in normal and dark-reared kittens: Relationship between orientation selectivity and ocular dominance. J. Physiol. Lond., 278, 27–44.
Gilbert, C. D. (1977). Laminar differences in receptive-field properties of cells in cat primary visual-cortex. J. Physiol. Lond., 268, 391–421.
Grinvald, A., Frostig, R. D., Siegel, R. M., & Bartfeld, E. (1991). High-resolution optical imaging of functional brain architecture in the awake monkey. Proc. Natl. Acad. Sci. USA, 88, 11559–11563.
Grinvald, A., & Hildesheim, R. (2007). VSDI: A new era in functional imaging of cortical dynamics. Nat. Rev. Neurosci., 5, 874–885.
Heggelund, P. (1986). Quantitative studies of the discharge fields of single cells in cat striate cortex. J. Physiol. Lond., 373, 277–292.
Horton, J. C., & Adams, D. L. (2005). The cortical column: A structure without a function. Philos. Trans. R. Soc. Lond. B Biol. Sci., 360, 837–862.
Hubel, D. H., & Wiesel, T. N. (1959). Receptive fields of single neurones in the cat’s striate cortex. J. Physiol. Lond., 148, 574–591.
Hubel, D. H., & Wiesel, T. N. (1962). Receptive fields, binocular interaction and functional architecture in cat’s visual cortex. J. Physiol. Lond., 160, 106–154.
Hubel, D. H., & Wiesel, T. N. (1963). Receptive fields of cells in striate cortex of very young, visually inexperienced kittens. J. Neurophysiol., 26, 994–1002.
Hubener, M., & Bonhoeffer, T. (1999). Eyes wide shut. Nat. Neurosci., 2, 1043–1045.
Hubener, M., Shoham, D., Grinvald, A., & Bonhoeffer, T. (1997). Spatial relationships among three columnar systems in cat area 17. J. Neurosci., 17, 9270–9284.
Huberman, A. D. (2007). Mechanisms of eye-specific visual circuit development. Curr. Opin. Neurobiol., 17, 73–80.
Issa, N. P., Trepel, C., & Stryker, M. P. (2000). Spatial frequency maps in cat visual cortex. J. Neurosci., 20, 8504–8514.
Jin, J. Z., Weng, C., Yeh, C. I., Gordon, J. A., Ruthazer, E. S., Stryker, M. P., et al. (2008). On and off domains of geniculate afferents in cat primary visual cortex. Nat. Neurosci., 11, 88–94.
Katz, L. C., & Crowley, J. C. (2002). Development of cortical circuits: Lessons from ocular dominance columns. Nat. Rev. Neurosci., 3, 34–42.
Koulakov, A. A., & Chklovskii, D. B. (2001). Orientation preference patterns in mammalian visual cortex: A wire length minimization approach. Neuron, 29, 519–527.
Lennie, P., & Movshon, J. A. (2005). Coding of color and form in the geniculostriate visual pathway (invited review). J. Opt. Soc. Am. [A], 22, 2013–2033.
Linsker, R. (1986). From basic network principles to neural architecture: Emergence of spatial-opponent cells. Proc. Natl. Acad. Sci. USA, 83, 7508–7512.
Maldonado, P. E., Godecke, I., Gray, C. M., & Bonhoeffer, T. (1997). Orientation selectivity in pinwheel centers in cat striate cortex. Science, 276, 1551–1555.
Marino, J., Schummers, J., Lyon, D. C., Schwabe, L., Beck, O., Wiesing, P., et al. (2005). Invariant computations in local cortical networks with balanced excitation and inhibition. Nat. Neurosci., 8, 194–201.
McConnell, S. K., & Levay, S. (1984). Segregation of on-center and off-center afferents in mink visual-cortex. Proc. Natl. Acad. Sci. USA, 81, 1590–1593.
McLaughlin, D., Shapley, R., & Shelley, M. (2003). Large-scale modeling of the primary visual cortex: Influence of cortical architecture upon neuronal response. J. Physiol. Paris, 97, 237–252.
Miller, K. D. (1994). A model for the development of simple cell receptive-fields and the ordered arrangement of orientation columns through activity-dependent competition between on- and off-center inputs. J. Neurosci., 14, 409–441.
Miller, K. D., Erwin, E., & Kayser, A. (1999). Is the development of orientation selectivity instructed by activity? J. Neurobiol., 41, 44–57.
Movshon, J. A., & Van Sluyters, R. C. (1981). Visual neural development. Annu. Rev. Psychol., 32, 477–522.
Nauhaus, I., & Ringach, D. L. (2007). Precise alignment of micromachined electrode arrays with V1 functional maps. J. Neurophysiol., 97, 3781–3789.
Ohki, K., Chung, S., Ch’ng, Y. H., Kara, P., & Reid, R. C. (2005). Functional imaging with cellular resolution reveals precise micro-architecture in visual cortex. Nature, 433, 597–603.
Ohki, K., Chung, S. Y., Kara, P., Hubener, M., Bonhoeffer, T., & Reid, R. C. (2006). Highly ordered arrangement of single neurons in orientation pinwheels. Nature, 442, 925–928.
Ohshiro, T., & Weliky, M. (2006). Simple fall-off pattern of correlated neural activity in the developing lateral geniculate nucleus. Nat. Neurosci., 9, 1541–1548.
29 Encoding and Decoding with Neural Populations in the Primate Cortex
Eyal Seidemann, Yuzhi Chen, and Wilson S. Geisler
abstract Environmental stimuli are encoded by large neural populations in sensory cortical areas and subsequently decoded into motor plans by large neural populations in motor cortical areas. Furthermore, large populations of neurons are likely to exhibit new emergent properties that are difficult or impossible to infer from the activity of single neurons, recorded one at a time. Thus, to understand encoding and decoding in the cortex, it is essential to measure and analyze neural population responses, ideally in behaving subjects. In this chapter we review recent progress in experimental techniques for measuring simultaneously the activity of large neural populations; we discuss several important emergent properties of neural population responses; and we describe a Bayesian ideal observer framework that, when applied to simultaneous measurements of neural population responses and behavioral performance, can be used to rigorously explore encoding and decoding strategies at the level of neural populations.
Sensory stimuli are encoded by large populations of neurons in sensory cortical areas and later decoded into motor plans that are implemented by large populations of neurons in motor cortical areas (figure 29.1). The specific encoding and decoding circuits that are implemented in the cortex are undoubtedly shaped by a number of factors, including the natural tasks that the organism performs, the properties of the sensory stimuli and musculature that are relevant to performing those tasks, the biophysical and anatomical properties of neurons, and the available space and metabolic resources.
These considerations raise many fundamental questions. For example, what are the number and identity of neurons in a given cortical area that contribute to a behavioral response? What aspects of the stimulus or behavior are represented in the signals from these neurons, and how are they encoded in the neural response? How are these signals combined over space and time to mediate behavior? Are these encoding and decoding algorithms fixed, or do they depend on the specific requirements of the task? Addressing such questions is one of the key challenges facing systems neuroscience.
Over the past several decades, the primary approach to addressing these questions has been single-neuron electrophysiology in combination with measurement of the stimuli and/or behavioral responses. While much has been learned by using this approach, it will be difficult, if not impossible, to fully understand how neural circuits in the mammalian cortex encode and decode information on the basis of single-unit electrophysiology. The reason is simply that large populations of neurons are likely to exhibit new and fundamentally different emergent properties that might not be evident from recordings of individual neurons, one at a time. Therefore we believe that it is important to shift the focus from single neurons to populations of neurons by directly measuring the properties of neural population responses, ideally in behaving subjects. In addition, to understand the implications of these properties, it is necessary to develop a theoretical framework that would allow a rigorous exploration of possible encoding and decoding strategies at the level of neural populations. In this chapter, we describe recent progress along these two lines of research.
The main focus in this review is on the primary visual cortex (V1) of primates, because this sensory area is arguably the best understood in terms of its anatomy, neuronal response properties, and functional organization (de Valois & de Valois, 1988; Hubel & Wiesel, 1977). However, most of the results and theoretical considerations described below are likely to apply to other cortical areas and species. To illustrate some of the general experimental and theoretical issues that are relevant for understanding population coding and decoding, we discuss some of our measurements with voltage-sensitive dye imaging (VSDI) in V1, because these measurements forced us to think generally about the expected response properties of large populations of neurons. Two goals of this chapter are to stimulate more research on population encoding and decoding and to emphasize the importance of using multiple complementary techniques to measure neural population activity.
Eyal Seidemann, Yuzhi Chen, and Wilson S. Geisler, Department of Psychology and Center for Perceptual Systems, University of Texas, Austin, Texas
seidemann, chen, and geisler: encoding and decoding with neural populations in primate cortex 419
Figure 29.1 Schematic representation of processing stages in perceptual tasks. The solid lines indicate the measurements that are the focus of this review.
We begin this review by describing a computational framework that can be used to explore possible population encoding and decoding mechanisms. We then describe the advantages and disadvantages of some of the experimental tools currently available for monitoring population responses in vivo. We next discuss key properties of population responses in the primate cortex. We end with a general discussion that includes open questions and future research directions.
Theoretical framework for studying population encoding and decoding
The responses of the neural population in a given sensory cortical area contain a certain amount of information that is potentially available to support performance in a given task. This information could be characterized, in principle, by measuring the statistical relationship between relevant environmental stimuli and response properties of the neural population at the given stage. Further, this information can be quantified, in principle, by deriving the ideal Bayesian observer (for the given task) that has complete knowledge of the statistical relationship between environmental stimuli and the population response. An ideal observer is a theoretical device that performs a task optimally given the available input signals, knowledge of the prior probabilities of different possible stimuli, and knowledge of the costs and benefits of the possible stimulus-response outcomes (Geisler, 1989; Green & Swets, 1966). The performance of the ideal Bayesian observer is the appropriate measure of the neural information potentially available for specific tasks, and it is the measure we will use here.1
Not all response properties of a neural population might be relevant for a given task (i.e., would improve the ideal observer’s performance); therefore one important goal is to determine which properties of the population’s responses carry information relevant to a given task and which do not. On the other hand, there may be relevant response properties that the organism does not use to perform a given task because of limitations of the decoding mechanisms. For example, subsequent stages might be unable to select responses from an individual neuron (or arbitrary subset of neurons) in the population or use precise spike-timing information, even if this could lead to improved performance. It is also possible that the organism uses some irrelevant (potentially performance-degrading) properties. Thus a second important goal is to determine which properties of the population responses the organism uses to perform a given task. Finally, a third important goal is to determine how these properties of the population responses are translated by subsequent circuits into behavioral responses. Here we define the actual code for a task as those specific properties of the population responses that the organism does use in performing the task, we define the actual encoder for the task as those specific neural mechanisms that translate the sensory stimulus into the actual code, and we define the actual decoder for the task as the specific subsequent neural mechanisms that translate the actual code into perceptual decisions and motor plans (see figure 29.1).
For any given task and set of neural constraints (e.g., some fixed number of neurons with specified anatomical and biophysical limits), there is an optimal encoder that translates the sensory stimuli into the neural population code that carries the most information relevant to performing the task. Similarly, for any given task and population code, there is an optimal decoder. The concepts and mathematics of ideal observer theory can be used to derive both optimal encoders and decoders. Determining the optimal encoder and decoder can be very useful in the quest to identify the actual encoder and decoder, because the exercise generally leads to a deep understanding of the computational requirements of the task and often provides principled (and sometimes unexpected) hypotheses for the neural mechanisms.
Although much has been learned and much remains to be learned about encoding mechanisms in sensory areas (e.g., Geisler, 2008; Simoncelli & Olshausen, 2001), because of space limitations the focus in this chapter is on evaluating possible decoding mechanisms based on measured neural population responses in sensory areas.
To further illustrate these ideas, consider a thought experiment in which an organism is required to discriminate between two barely discriminable stimuli. Assume that all the sensory information relevant for performance in this task passes through a single cortical area and that, as experimenters, we have precise access to the responses of all neurons in this area, to the stimulus, and to the behavioral response of the subject. In this case, the Bayesian framework outlined above allows us to determine how to perform the task optimally based on these neural signals and to determine the behavioral sensitivity that could be supported by this optimal decoder. Because we have access to all the neural signals that are available for the organism to perform the task, the optimal decoder must do as well as, or better than, the organism.
The ideal Bayesian observer analysis provides several key benefits. Equal performance of the ideal observer and the organism implies that the organism is using all the relevant
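The two-stimulus thought experiment above can be made concrete with a minimal sketch of an ideal observer. It rests on assumptions that the text does not make: neurons are treated as independent Poisson units with known mean spike counts under each stimulus, costs are symmetric, and the function names and rate values are hypothetical. Under those assumptions the optimal decoder compares the log-likelihood ratio of the observed spike counts, plus the log prior odds, against zero.

```python
import numpy as np

def log_likelihood_ratio(counts, rates_a, rates_b):
    """Log of P(counts | A) / P(counts | B) for independent Poisson neurons.

    counts:  observed spike counts, one per neuron
    rates_a: expected counts under stimulus A (assumed known)
    rates_b: expected counts under stimulus B (assumed known)
    """
    counts = np.asarray(counts, dtype=float)
    rates_a = np.asarray(rates_a, dtype=float)
    rates_b = np.asarray(rates_b, dtype=float)
    # The log(count!) terms are identical under A and B and cancel.
    return float(np.sum(counts * np.log(rates_a / rates_b)) - np.sum(rates_a - rates_b))

def ideal_decision(counts, rates_a, rates_b, prior_a=0.5):
    """Choose the stimulus with the higher posterior probability (symmetric costs)."""
    log_prior_odds = np.log(prior_a / (1.0 - prior_a))
    llr = log_likelihood_ratio(counts, rates_a, rates_b)
    return "A" if llr + log_prior_odds > 0 else "B"

# Hypothetical two-neuron population with opposite stimulus preferences.
rates_a = [10.0, 5.0]
rates_b = [5.0, 10.0]
decision_1 = ideal_decision([10, 5], rates_a, rates_b)
decision_2 = ideal_decision([5, 10], rates_a, rates_b)
```

In this toy case the decoder weights each neuron by the log ratio of its expected counts, so a neuron that responds identically to both stimuli receives zero weight. That is one simple instance of a response property that carries no task-relevant information.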
Figure 29.2 Expected spread of activity in the visual cortex in response to a small localized visual stimulus. (A) Cranial window over V1 in the left hemisphere of one monkey. The cortical vasculature is seen through a transparent artificial dura (Arieli, Grinvald, & Slovin, 2002). A typical region of interest of 8 × 8 mm² with its anterior border running along the V1/V2 border is indicated by the black square. (B) Expanded view of the cortical vasculature in the 8 × 8 mm² region of interest. (C) Representation of the lower right visual field with a fixation crosshair in the top left. The shaded wedge region is the approximate portion of the visual field that is represented in the patch in panel B. The mapping from visual space to the cortex is indicated in panels C and B, respectively. The solid, dashed, and gray circles in panel C represent the outline (at ±2σ_st) of Gabor patches with σ_st of 0.05°, 0.25°, and 0.45°, respectively. The corresponding arcs in panel B indicate the expected spread of activity in response to the three stimuli in a narrow strip of cortex along the representation of 2.5° eccentricity (see text for additional details). (D) Expected spread of cortical activity σ_R as a function of stimulus size σ_st. The dashed horizontal line indicates the minimal spread, which corresponds to the average V1 receptive field size (σ_rf of 0.25 degree) multiplied by the CMF at this eccentricity (4 mm/degree). The oblique dashed line shows the expected spread of cortical activity based solely on the CMF.
can be challenging, particularly when firing rates are above from the representation of the fovea toward the periphery
a few hertz. Also, because this is a scanning technique, there (increasing the eccentricity), receptive fields (RFs) of
is an inherent tradeoff between frame rate and field of view. V1 neurons become larger, and the cortical magnification
At high frame rates, the technique is currently limited to factor (CMF), the distance in cortex that corresponds to a
recording several dozen neurons within a fraction of a square given distance in visual space, decreases, both changing
millimeter. Finally, as with VSDI, this technique is currently approximately by a power law (Tootell, Switkes, Silverman,
limited to recording from the superficial cortical layers. & Hamilton, 1988; Van Essen, Newsome, & Maunsell, 1984;
Overall, this is a promising new technique, but more work Yang, Heeger, & Seidemann, 2007). This patch of cortex
is necessary before it will be applicable to alert, behaving represents a wedged-shaped region in visual space (shaded
primates. region in figure 29.2C ), extending approximately from an
These techniques promise new and exciting discoveries in eccentricity of 1.5 degrees to 3.5 degrees (degrees of visual
the coming years. Given their limitations, however, we angle) and representing directions about the visual axis
believe that to address questions of encoding and decoding between 270 and 310 degrees (angular degrees). Here, we
by populations of neurons, it will be necessary to use comple- consider the population response in a narrow vertical strip
mentary techniques and to develop quantitative understand- that is centered on the cortical representation of 2.5 degrees
ing of the relationship between measurements provided by eccentricity, where the CMF is approximately 4 mm/degree.
these and other techniques. RFs of V1 neurons have an envelope that is approximately
a two-dimensional Gaussian, with a space constant srf that is
Properties of population responses in the on average around 0.25 degree at this eccentricity (Nienborg,
primate cortex—mean response Bridge, Parker, & Cumming, 2004; Palmer, Cheng, &
Seidemann, 2007). The dashed circle in figure 29.2C shows
Here, we focus on three key properties of the mean popula- a typical RF with diameter of 1 degree (4 × srf).
tion response—spatial spread, sparseness, and temporal The expected spatial profile of the population response in
dynamics—using the primary visual cortex (V1) as an V1 can be obtained by filtering (convolving) the retinotopic
example. projection of the stimulus to the cortex with the average RF
expressed in millimeters of cortex (under linearity assump-
Spatial Spread of the Population Response Consider tions, which are approximately true for small localized
the spread of activity in a small 8 × 8 mm² patch of cortex on stimuli). Specifically, if the stimulus has a Gaussian envelope
the dorsal portion of macaque V1 (figure 29.2). V1 contains (e.g., Gabor patch), the expected spread of activity in the
a topographic map of visual space, with disproportionate cortex would also be a Gaussian with a space constant $\sigma_{R}$
representation of the center of gaze (fovea). As one moves given by
Figure 29.3 Spatiotemporal properties of V1 population response to a small Gabor target
($\sigma_{st}$ of 0.33 degree) as measured by VSDI in one experiment. (A) Spatial profile of
response amplitude to the Gabor target at 25% contrast. Response amplitude is computed as the
average amplitude in a 200-ms-long temporal interval following target onset. The white ellipse
indicates the contour of a two-dimensional Gaussian fit to the evoked response (at 2 standard
deviations). The black ellipse indicates the 2-standard-deviation contour of the expected region
of spiking activity (see text). The square shows a 1 × 1 mm² region centered at the most sensitive
location. (B) Time courses of average VSDI signals in response to targets at different contrasts
after subtraction of the average response in target-absent trials. (C) Response latency as a
function of response amplitude for 25% (gray) and 7% (black) target contrasts. Time courses were
averaged in regions with similar response amplitude and fitted with a sigmoidal function. To obtain
regions with similar response amplitude, the fitted two-dimensional Gaussian was divided into 10
elliptical annuli containing response amplitudes within 10 quantiles (e.g., the second innermost
annulus contains locations with responses between 80% and 90% of $R_{max}$). Latency is time to
half maximum. Lines show best-fit linear regression.
seidemann, chen, and geisler: encoding and decoding with neural populations in primate cortex 423
Figure 29.4 Interactions between tuning width, number of stimulus dimensions to which neurons
are selective, and sparseness of the population response, based on a hypothetical population of
neurons. (A) Gaussian tuning curve of one neuron across one stimulus dimension. Neurons in the
population are assumed to have the same tuning curve properties across all stimulus dimensions
and to uniformly cover the full stimulus range. (B) Frequency histogram of the relative response
amplitude to a random stimulus assuming that neurons are tuned to one stimulus dimension. (C)
Percentage of the total stimulus-evoked response contributed by neurons in the different bins in
panel B. (D and E) Same as panels B and C but for five independent stimulus dimensions. (F)
Quantitative relationship between baseline and stimulus-evoked response of single neurons and
multiunits measured in macaque V1. The scatterplot shows the equivalent number of selective single
units, $N_S$, that can account for the multiunit response evoked by an optimal Gabor patch versus
the expected total number of single units, $N_T$, that contribute to the baseline multiunit
response. Gray circles represent single units; black circles represent multiple units. The solid
curve is the fit to the observed multiunit data with a saturating function. (See Palmer, Cheng, &
Seidemann, 2007, for additional details.)
This section considers the relationship between neural tuning width, the number of stimulus
dimensions that are represented by the population, and the sparseness of the response within the
population. For simplicity, we ignore the specific details of the tuning properties of V1 neurons
and consider a hypothetical population of neurons tuned uniformly across n stimulus dimensions.
Figure 29.4A shows a Gaussian tuning curve of one neuron across one circular stimulus dimension
(such as orientation) with $2\sigma$ equal to one-sixth of the full range, a value that is
comparable to the average orientation tuning width in V1 (Geisler & Albrecht, 1997). Assume for
the moment that this is the only stimulus dimension along which the neurons in the population are
tuned and that the neurons uniformly cover the full stimulus range. Given the tuning curve, we can
determine the fraction of neurons in the population that are expected to respond at any level of
activity to a random stimulus. Figure 29.4B shows a frequency histogram of the expected proportion
of the population at each of 10 response-level quantiles. The distribution is bimodal, with more
than 50% of the neurons responding at less than 10% of their maximal response ($R_{max}$) but a
significant fraction of neurons responding at more than 90% of $R_{max}$. From this distribution,
we can also determine the percentage of the total stimulus-evoked population response contributed
by neurons at each of the 10 response quantiles (figure 29.4C). In this case, about 30% of the
total evoked response is contributed by the most active neurons, and the percentage of the total
response decreases monotonically with decreasing mean response. Finally, we can compute $w_1$, the
ratio of the average stimulus-evoked response to $R_{max}$. In this case, $w_1$ is about 0.2.
(More generally, this ratio is given by $w_n = w_1^{\,n}$, where n is the number of independent
stimulus dimensions.)

The picture changes dramatically if we consider five stimulus dimensions (figures 29.4D and
29.4E). Now more than 99.9% of the neurons in the population fall in the lowest amplitude
quantile, and the proportion of neurons in the highest quantile is less than $10^{-6}$.
Similarly, when we con-
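The one-dimensional figures quoted in this section can be checked with a short numerical sketch. It assumes a stimulus dimension of unit range, a Gaussian tuning curve with σ equal to 1/12 of the range (so that 2σ is one-sixth), and, for the five-dimensional case, that the joint response is the product of the one-dimensional tuning curves. The function names are illustrative and do not come from the chapter.

```python
import math

def tuning(x, sigma):
    """Gaussian tuning curve, normalized so that the peak response (Rmax) is 1."""
    return math.exp(-x * x / (2.0 * sigma * sigma))

def one_dim_stats(sigma=1.0 / 12.0, n_samples=100_000):
    """Response statistics for one stimulus dimension of unit range.

    x is the offset between a random stimulus and a neuron's preferred value,
    sampled on a uniform grid over [-0.5, 0.5). Wrap-around of the Gaussian
    tails is ignored, which is negligible for sigma = 1/12.
    """
    responses = [tuning((i + 0.5) / n_samples - 0.5, sigma) for i in range(n_samples)]
    mean_ratio = sum(responses) / n_samples                   # w_1: mean response / Rmax
    frac_low = sum(r < 0.1 for r in responses) / n_samples    # below 10% of Rmax
    frac_high = sum(r > 0.9 for r in responses) / n_samples   # above 90% of Rmax
    return mean_ratio, frac_low, frac_high

def frac_above_5d(level, sigma=1.0 / 12.0):
    """Fraction of neurons above level * Rmax with five separable dimensions.

    Assumption: the joint response is a product of five 1-D Gaussians, so the
    response exceeds `level` inside a 5-ball of radius rho around the stimulus.
    """
    rho = sigma * math.sqrt(2.0 * math.log(1.0 / level))
    return (8.0 * math.pi ** 2 / 15.0) * rho ** 5             # volume of a 5-ball

w1, frac_low, frac_high = one_dim_stats()
w5 = w1 ** 5                                                  # w_n = w_1^n with n = 5
frac_high_5d = frac_above_5d(0.9)
print(f"w_1 = {w1:.3f}, below 10%: {frac_low:.3f}, above 90%: {frac_high:.3f}")
print(f"w_5 = {w5:.2e}, above 90% in 5-D: {frac_high_5d:.2e}")
```

Under these assumptions the sketch gives a value of w_1 near 0.2 and a five-dimensional top-quantile fraction below 10^-6, in line with the values stated in the text.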
Additive Versus Multiplicative Noise In single cortical neurons, the variance of the spike count
during a short interval is proportional to the mean (Geisler & Albrecht, 1997; Tolhurst, Movshon,
& Dean, 1983). What is the

Spatial Correlations in the Population Response The magnitude and extent of spatial correlations
in response variability can have a large impact on the improvement in performance that can be
attained by pooling responses over
Figure 29.5 Statistical properties of population response variability as measured by VSDI. A,
Mean (circles) and standard deviation (asterisks) of response amplitude as a function of stimulus
contrast averaged across eight VSDI experiments. Error bars indicate the standard error of the
mean. The mean response as a function of contrast is fitted with a Naka-Rushton function; the
standard deviation as a function of contrast is fitted with linear regression. The slope of the
regression is not significantly different from zero (i.e., stimulus-independent additive noise).
B, Expected correlations between the summed activity in two pools of neurons with uniform pairwise
correlations, as a function of the number of neurons in each pool (see Chen, Geisler, & Seidemann,
2006). The value of the pairwise correlation is indicated near each curve. The dashed vertical
line is the approximate number of neurons contributing to each location. The dashed horizontal
line is the predicted correlation in VSDI for two locations that are 0.25 mm apart. C, Average
correlation between two locations in one VSDI experiment as a function of the separation between
the locations. D, Average temporal correlations between responses in two frames as a function of
their separation in time. Smooth curves are exponential fits.
large populations of neurons (e.g., Abbott & Dayan, 1999; Averbeck, Latham, & Pouget, 2006;
Johnson, 1980; Snippe & Koenderink, 1992; Sompolinsky, Yoon, Kang, & Shamir, 2001). Extracellular
recording studies have measured the correlations in spiking activity between pairs of nearby
cortical neurons (e.g., Bair, Zohary, & Newsome, 2001; Gawne & Richmond, 1993; Lee, Port, Kruse, &
Georgopoulos, 1998; Romo, Hernandez, Zainos, & Salinas, 2003; Zohary, Shadlen, & Newsome, 1994).
These studies report low but highly significant correlations between pairs of neurons that are
recorded from the same electrode. More recent studies with multiple electrodes suggest that these
correlations decay over space but remain significant even at distances of multiple millimeters
(Kohn & Smith, 2005).

Simple theoretical considerations suggest that in large neural populations, average correlations
should be significantly higher than in pairs of single neurons (figure 29.5B). The reason is that
in large pools of neurons, sources of noise that are independent across the pool are averaged out
while leaving the weak correlated noise unaffected; this leads to much higher correlations between
the pooled responses. For example, if we assume a uniform pairwise correlation between neurons in
two pools, pairwise correlations that are undetectable (e.g., $r = 10^{-3}$; solid curve, figure
29.5B) could lead to exceedingly high correlations between the pools for large numbers of neurons.
In other words, given reasonable assumptions about the number of neurons contributing to each
location in VSDI experiments, much higher correlations than are observed for pairs of single
neurons are expected.

As predicted, the correlations in the VSDI signals are very high between nearby locations and
fall off exponentially with space constants that are on the order of 2 mm (figure 29.5C).
Significant correlations can be observed even at distances exceeding 4 mm.

The strong correlations at the level of the pool could contribute to the additive nature of the
variability in population responses. At the level of the pool, the variance is dominated by weak
correlated noise between pairs of neurons, which may be relatively stimulus-independent (but see
Kohn & Smith, 2005).

Temporal Correlations in the Population Response Temporal correlations are an important property
of neural population responses with significant consequences for possible decoding mechanisms.
Figure 29.5D shows the Pearson correlation between the amplitude of the VSDI signals in two frames
as a function of their separation in time (Chen et al., 2008). The correlations are high for short
intervals and fall off exponentially with a time constant of approximately 100 ms. The temporal
correlations are similar in target-present and target-absent trials, consistent with the additive
nature of the variability in VSDI responses. The additive variability and long-lasting temporal
correlations are consistent with findings from VSDI experiments in the visual cortex of
anesthetized cat (Arieli et al., 1996). Significant temporal correlations have been observed in
single-unit recordings from primate visual cortex (Osborne, Bialek, & Lisberger, 2004; Uka &
DeAngelis, 2003).

There are more subtle questions regarding the nature of the spatiotemporal correlations that we
have not discussed here and should be addressed by future research. For example, are there
higher-order correlations at the level of pools of neurons? In the retina, the observed
correlations can be explained remarkably well if we assume only pairwise correlations between
neighboring retinal ganglion cells
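The pooling argument illustrated in figure 29.5B can be sketched numerically. The sketch assumes that every pair of neurons, within a pool and across pools, shares the same correlation r and that all neurons have equal variance. In that case the pool-sum variance is proportional to n(1 + (n - 1)r) and the covariance between pools to n²r. The function name is illustrative, and the formula is a standard consequence of these assumptions rather than a quotation from the chapter.

```python
def pooled_correlation(r, n):
    """Correlation between the summed activity of two pools of n neurons each.

    Assumes a uniform pairwise correlation r between all neurons and equal
    single-neuron variance, so that (in units of that variance)
        Var(pool sum)       = n * (1 + (n - 1) * r)
        Cov(pool A, pool B) = n * n * r
    """
    return n * r / (1.0 + (n - 1) * r)

# A pairwise correlation of 10^-3 is essentially undetectable for a single pair
# but approaches 1 as the pools grow.
for n in (1, 100, 10_000, 1_000_000):
    print(n, round(pooled_correlation(1e-3, n), 3))
```

With r = 10^-3 the pooled correlation is about 0.09 for pools of 100 neurons and about 0.91 for pools of 10,000, which is the qualitative behavior the text describes.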
detecting the target from the monkey’s VSDI signals and to compare its performance with that of
several suboptimal decoders and with the performance of the monkey. For simplicity, we first
describe the optimal strategy for spatial decoding, ignoring the temporal dimension by averaging
the VSDI signals over a short temporal interval. We then

where $w_i$ is the weight given to response $x_i$ from site $i$ (Chen et al., 2006; Duda, Hart, &
Stork, 2001). This pooled response is the decision variable that is used to determine whether the
target is present or absent on a given trial. The optimal set of weights,
$\mathbf{w} = \langle w_1, \ldots, w_n \rangle$, is given by
$$\mathbf{w} = \Sigma^{-1}\mathbf{s} \qquad (2)$$
where $\Sigma^{-1}$ is the inverse of the response covariance matrix
$\Sigma$ and $\mathbf{s}$ is the mean difference in response between the
target-present and target-absent trials (Chen et al., 2006;
Duda et al., 2001).
An equivalent way to obtain the optimal set of weights is
to derive a whitening or decorrelation spatial filter that, when
convolved with the population response, produces variability
that is independent over space. Figure 29.7A shows a one-
dimensional slice through the whitening filter matched to the
properties of the spatial correlations in the VSDI signals
(figure 29.5C ). This filter has a sharp positive peak and a
small negative trough. By applying this filter to the fitted
response profile (figure 29.3A), we obtained the linear weights
used by the optimal spatial decoder (figure 29.7B) (Chen
et al., 2006).
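Equation 2 can be illustrated on a toy one-dimensional strip of recording sites. Everything numerical here is an assumption made for illustration: the 0.25 mm site spacing, the Gaussian signal profile with a 1 mm space constant, the exponential noise correlation with a 2 mm space constant, and the 90/10 split between shared and independent noise. The helper names `solve` and `d_prime` are hypothetical, and the sketch is not the analysis pipeline of Chen et al. (2006).

```python
import math

def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def d_prime(w, s, cov):
    """Detectability of a linear decoder: (w . s) / sqrt(w' cov w)."""
    n = len(w)
    signal = sum(wi * si for wi, si in zip(w, s))
    var = sum(w[i] * cov[i][j] * w[j] for i in range(n) for j in range(n))
    return signal / math.sqrt(var)

# Hypothetical strip of 33 sites, 0.25 mm apart, spanning -4..4 mm.
xs = [0.25 * k - 4.0 for k in range(33)]
s = [math.exp(-x * x / 2.0) for x in xs]      # mean signal difference (assumed Gaussian)
lam, shared = 2.0, 0.9                         # noise space constant (mm), shared fraction
cov = [[shared * math.exp(-abs(xi - xj) / lam) + (1.0 - shared) * (i == j)
        for j, xj in enumerate(xs)] for i, xi in enumerate(xs)]

w_opt = solve(cov, s)                          # equation 2: w = Sigma^-1 s
w_avg = [1.0] * len(xs)                        # equal weight at every site
w_max = [1.0 if i == 16 else 0.0 for i in range(len(xs))]  # peak site only

d_opt, d_avg, d_max = (d_prime(w, s, cov) for w in (w_opt, w_avg, w_max))
print(f"optimal d' = {d_opt:.2f}, average d' = {d_avg:.2f}, peak-site d' = {d_max:.2f}")
```

In this toy setting the optimal weights are positive near the signal peak and negative in the flanks, and equal weighting over the whole strip does worse than reading out the single peak site, because the shared noise is not averaged away.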
The optimal weights contain a central positive region and
a larger negative surround. The reason these weights have
a center-surround structure is that the spatial correlations
fall off more slowly over space than the signal does. Because
variability in the surround, where stimulus-evoked signals are weak or absent, is still highly
correlated with variability in the center, the optimal strategy is to estimate the common noise
from the surround and subtract it from the center.

The detection sensitivity of the optimal decoder can be determined by measuring its performance
in the detection task (Chen et al., 2006). We can also evaluate the detection

Figure 29.6 Visual detection task. Monkeys were required to detect a low-contrast Gabor patch
that appeared at a known location in half of the trials. The target appeared 300 ms after the
dimming of the fixation point. The monkey indicated detection by making a saccadic eye movement
to the target location when it was detected but no later than 600 ms after target onset. The
monkey indicated target absence by maintaining fixation for 1.5 s after fixation point dimming.
Figure 29.7 Optimal spatial pooling of VSDI responses in a detection task. (A) A one-dimensional
cut through a two-dimensional spatial decorrelation (whitening) filter that removes the spatial
correlations in the population responses (figure 29.5C). (B) Optimal weights for pooling the
population responses over space. (C) Average difference in percent correct between the performance
of the optimal and four suboptimal spatial pooling rules and the performance of the monkey in
eight VSDI experiments.
sensitivity of other previously proposed spatial pooling rules; for example, a rule that gives
equal weight to all locations (average rule, analogous to the rule used by Shadlen, Britten,
Newsome, and Movshon (1996)) or a rule that weights each location based on its sensitivity
(weighted d′, analogous to the rule used by Geisler and Albrecht (1997)). Similarly, we can
evaluate pooling rules that consider only a small region such as the location with the peak
average response or the location with the highest sensitivity (maximal d′).

Figure 29.7C shows the average difference in performance of the optimal and four suboptimal
pooling rules from the performance of the monkey in eight VSDI experiments. This figure shows two
surprising results. First, the optimal rule does significantly better than the monkey,
demonstrating the sensitivity of the VSDI technique and showing that there are more signals in V1
than the monkey uses. Second, the two rules that pool over a large area with positive weights
(average and weighted d′) perform significantly worse than the monkey, while the two rules that
consider only a single location perform comparably to the monkey.

The rules that pool over a large area with positive weights perform poorly, owing to the spatial
correlations (figure 29.5C). Because the pool contains both highly sensitive and weakly sensitive
neurons, averaging these together reduces signal without reducing the correlated noise. Thus when
the noise is highly correlated, pooling over a small area may be better than pooling with positive
weights over a larger area, even if the larger area contains signals. The only way to improve
performance beyond the performance of a rule such as maximal d′ is to use negative weights to
cancel some of the noise. Importantly, rules that rely on a single site perform poorly if the site
is significantly smaller than 0.25 × 0.25 mm². With smaller sites, independent noise dominates the
response, leading to reduced performance.

Temporal Decoding of Neural Population Responses Next, consider the optimal Bayesian temporal
decoder for detecting the target from V1 population responses in a reaction time task. Here we
ignore the spatial dimension and simply consider the time course in a 1 × 1 mm² region centered on
the most sensitive location. This optimal decoder evaluates V1 responses and decides, on a
moment-by-moment basis, whether and when sufficient evidence that the target is present has
accumulated.

To optimally decode neural population responses over time, temporal correlations in the
population responses must first be removed. Analogous to space, temporal correlations can be
removed by a decorrelation filter that, when convolved with the responses in single trials,
produces responses that are independent across frames. To be biologically plausible, however, this
filter must be causal; that is, the output of the filter at time t must depend only on the
response up to time t. The whitening filter is shown in figure 29.8A; it has a sharp positive
peak, immediately followed by a smaller and slightly longer-lasting negative peak. Such a filter
could be implemented biologically with rapid excitation followed by time-lagged inhibition. The
whitening operation emphasizes the response onset (and offset) relative to the sustained response
(figure 29.8B). In other words, there is more information per unit time in the initial rising edge
of the response than in the sustained response. This occurs because the response onset contains
high temporal frequencies and most of the power in the correlated noise is in the low temporal
frequencies.

The optimal temporal decoder takes the whitened VSDI signal in single trials and computes the
dynamic posterior probability of each possible stimulus, given the observed responses (Chen et
al., 2008). It then reports “target present” if the posterior probability for target presence
exceeds a fixed criterion (the horizontal line in figure 29.8C) that is selected to maximize
accuracy. The optimal temporal pooling model performed more accurately than the monkey (figure
29.8D). In addition, the “reaction times” of the optimal temporal pooling model (the time at which
the posterior probability for target presence reached the criterion) were much faster, on average,
than the monkey’s reaction times (figure 29.8E). These results indicate that population responses
provide reliable information that could guide behavior even in brief
temporal intervals (∼100 ms). The mean and the variance of both the ideal observer’s and the
monkey’s reaction times increase with decreasing target contrast, but at a faster rate for the
monkey than for the ideal observer (figure 29.8E).

As with spatial pooling, one advantage of deriving the optimal decoder is that it can serve as a
benchmark to which suboptimal models can be compared. We evaluated the performance of a simple
model in which the VSDI responses are summed until a fixed threshold is reached. This model
performs significantly worse than the monkey. We also evaluated an optimally shaped “running
integrator” model that integrates the whitened responses over a window of about 100 ms. Because
most of the information in our task was concentrated in the rising edge of the response and
because the temporal profiles of the response at different contrasts differ only in latency and
amplitude, this running integrator model performed almost as well as the ideal observer.

Note that the ideal observer would have performed much better had the responses been
statistically independent over time, demonstrating that in our task, the detrimental effect of
temporal correlations cannot be entirely overcome by optimal pooling. Finally, note that the
performance of the optimal temporal decoder (and running integrator decoder) can be improved
further by combining signals over space using the optimal spatial pooling rule rather than
averaging the signals in a 1.0-mm² region (Chen et al., 2008).

Discussion

This chapter began with two central claims that are relevant to the goal of understanding encoding
and decoding by neural populations in the mammalian cortex. First, emergent properties in large
neural populations make it essential to augment single-neuron recording with techniques that
measure the responses of large populations of neurons simultaneously. Second, in formulating and
testing hypotheses for population encoding/decoding, it can be highly beneficial to derive and
evaluate optimal encoding and decoding strategies. To illustrate the first claim, we reviewed some
emergent properties that have been observed in V1: widespread
responses even from maximally localized stimuli, highly sparse representations with most of the
population response arising from weakly responding neurons, rapid response dynamics over large
areas of cortex, additive but not multiplicative population noise, and large spatial and temporal
noise correlations. To illustrate the second claim, we described the optimal decoding strategy for
VSDI signals recorded from behaving monkeys in a reaction-time detection task and how this optimal
decoder provides insight into specific questions about neural decoding.

Although progress is being made, the rigorous study of population responses in sensory and motor
areas of the cortex is just beginning; indeed, the results obtained to date raise more questions
than they answer. Next we discuss some of the relevant issues.

Weakly Versus Strongly Responding Neurons An important emergent property of large population
responses in V1 is the dominance of relatively weakly responding neurons in the total population
activity (figure 29.4). An open question is whether the weakly responding neurons are ignored or
used by subsequent decoding mechanisms. It is not uncommon from the perspective of single-neuron
electrophysiology to assume that those neurons that are most sensitive to a stimulus are the ones
that carry most of the information used by the brain, but this need not be the case. For example,
magnocellular neurons in the LGN are much more sensitive to contrast than are parvocellular
neurons, and hence one might expect them to dominate performance in contrast detection, but in
fact, the much more numerous but more weakly responding parvocellular neurons dominate in most
contrast detection tasks (Merigan, Katz, & Maunsell, 1991).

To further illustrate the potential significance of weakly responding neurons, consider the study
of choice probability (trial-by-trial correlations between neural and behavioral variability) at
the single-neuron level. If a large pool of weakly responding neurons were contributing as much to
a subject’s choice as a small pool of strongly responding neurons, recording from a single
strongly responding neuron could easily yield a measurable correlation with behavior, whereas
recording from a single weakly responding neuron could easily yield no measurable correlation.
Obviously, it would be a mistake to interpret the lack of correlation in the weakly responding
neuron as evidence against a major role for the weakly responding neurons in the subject’s choice.

These considerations provide an additional illustration of a central theme of this chapter: that
effects that are very weak at the single-neuron level could have a dominant role at the level of
the pool. Therefore a general conclusion is that one should be cautious when making predictions
based on single-unit measurements.

Sources of Correlated Noise Another important emergent property in V1 is the widespread and large
spatial and temporal correlations in the variability of the population response. These
correlations can have profound consequences for decoding, and they are entirely expected when a
large number of neural inputs, having weakly correlated noise, are summed. Thus, it is important
to identify and characterize the sources responsible for the small correlated noise that is shared
between the neurons in a population. For example, it is possible that weak correlated noise must
always be present, owing to the inevitable sharing of inputs between neurons. If so, then every
time a large amount of convergence is required in a neural circuit, there will be the need for a
decorrelating mechanism that can cancel most of the correlated noise.

Decoding a Neural Population Response with a Neural Population In the description of optimal and
suboptimal candidate decoders, we were not explicit about how they might be implemented. In all
likelihood, the decoding of population responses is implemented with another neural population. In
fact, it is likely that, at every step along a sensorimotor pathway (from sensory encoding, to
decision computation, all the way to the activation of muscle fibers), the stimulus and/or motor
response is represented by the activity of a large neural population, because that is the obvious
way to obtain robust behavior without ever requiring any specific neuron to be as robust as the
behavior.

Decoding Population Responses in Different Cortical Areas Given the similarities in neural
anatomy across the cortex, it is quite possible that in all sensory and motor areas (as in V1),
even the most localized inputs are encoded by population activity that extends over at least
several square millimeters. However, there may also be some substantial differences in the
properties of population responses across areas. For example, it is possible that early sensory
areas contain a more sparse representation than higher sensory areas because they must represent
many stimulus dimensions within the same area. This could have important consequences for the
properties of population responses and hence for subsequent decoding.

Decoding Population Responses in Different Tasks The ideal Bayesian spatial decoder developed for
our detection task (figure 29.7B) can be extended to other detection and discrimination tasks. As
long as the variability is consistent with additive Gaussian noise, the optimal weights are given
by equation 2. Because the variability is additive, the only factor that determines the
task-dependent component of the optimal weights is the difference in the mean response between the
two stimulus conditions, s.
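As a worked illustration of this point, and assuming that the same noise covariance Σ applies in every stimulus condition, equation 2 is linear in s. The symbols s_B, s_C, w_B, w_C, and w_D are labels introduced here for the two detection tasks and the discrimination task of figure 29.9 (panels B, C, and D); they do not appear in the chapter, and s_B and s_C are taken to be the mean responses to each Gabor relative to the blank condition.

```latex
\begin{aligned}
\mathbf{w}_{B} &= \Sigma^{-1}\mathbf{s}_{B}, \qquad
\mathbf{w}_{C} = \Sigma^{-1}\mathbf{s}_{C}, \\
\mathbf{w}_{D} &= \Sigma^{-1}\left(\mathbf{s}_{B}-\mathbf{s}_{C}\right)
               = \mathbf{w}_{B}-\mathbf{w}_{C}.
\end{aligned}
```

Under that assumption, the discrimination weights are the difference of the two detection weight maps, so any response component shared by the two stimuli (such as a common Gaussian envelope) cancels and only the stimulus-specific component shapes the weights.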
Figure 29.9 Optimal linear weights in three different perceptual tasks based on hypothetical
population responses. (A) Detection of a low-contrast square. (B) Detection of a low-contrast
horizontal Gabor patch. (C) Detection of a low-contrast vertical Gabor patch. (D) Discrimination
between the Gabor patches in panels B and C. (A–C) top panel: stimulus; middle panel: hypothetical
response in an 8 × 8 mm² patch of cortex; bottom panel: optimal linear weights obtained by
applying the whitening filter in figure 29.7A to the population responses in the middle panel.
(D) top panel: response to the horizontal Gabor in panel B minus the response to the vertical
Gabor in panel C. The orientation-selective response was modeled as a high-spatial-frequency
activation pattern (2.5 cycles/mm) with amplitude that is 10% of the amplitude of the Gaussian
envelope of the population response. (D) bottom panel: optimal linear weights for discriminating
between the Gabor patches at the two orientations.
Identifying the Actual Neural Code and Decoder The goal of an ideal observer analysis is to
determine how population responses should be pooled over space and time to perform a specific task
optimally. As was discussed before, this approach can, in principle, be used to reject possible
combinations of codes and decoders if their sensitivity falls significantly short of the subject’s
sensitivity. The fact that the ideal observer does better than the monkeys in our detection task
shows that in this task, there is no need to assume an actual code with a finer spatial and/or
temporal resolution than the one provided by VSDI. An important goal for future research is to
determine whether this holds in other tasks. For example, it is possible that in an orientation
discrimination task, an ideal observer using VSDI signals from V1 would perform significantly
worse than the subject, owing to the coarse spatial pooling that is inherent to this technique.

The finding that an ideal observer performs significantly better than the subject does not
necessarily imply that the subject is using a suboptimal decoding strategy. Population responses
could be pooled by using the optimal pooling strategy but be degraded by subsequent sources of
noise. For a more complete discussion of why the monkeys might perform suboptimally in our
detection task, see Chen and colleagues (2006, 2008).

Ultimately, the goal of this line of research is to determine the actual code and actual decoder
used by the observer. We are a long way from being able to address this question even in the
simplest perceptual and motor tasks. Next, we briefly mention two approaches that could be used to
address these questions.

One potential approach is to examine the trial-by-trial covariations between neural and
behavioral responses (choice probability). Previous studies of neural and behavioral performances
near psychophysical threshold demonstrated weak but significant covariation between the activity
of single neurons and behavioral responses (e.g., Britten, Newsome, Shadlen, Celebrini, & Movshon,
1996; Cook & Maunsell, 2002; Palmer et al., 2007; Purushothaman & Bradley, 2005). If such
correlations can be measured at the population level, their nature could provide useful
information regarding the decoding mechanisms used by the subject.

A second approach that could be used to study the actual decoder is to perturb population
responses in specific ways that are designed to distinguish between possible decoding mechanisms
(e.g., Lee, Rohrer, & Sparks, 1988). Combining careful behavioral and neurophysiological
measurements with better techniques for selectively perturbing brain activity at the population
level (e.g., genetic-based techniques: Tan et al., 2006; Zhang et al., 2007) is a promising
direction for testing candidate decoding models.

Conclusions

This chapter reviewed recent progress in understanding neural population coding in the mammalian
cortex. This research area is clearly in its infancy. Making further progress will necessitate
improving existing techniques, as well as developing new techniques, for monitoring and
manipulating neural population responses in behaving subjects. It is unlikely that any single
technique will provide access to the real-time activity of all the neurons in a given cortical
area that could potentially contribute to behavior. Therefore it is important to develop a
quantitative understanding of the relationships between the measurements of neural populations
obtained with different techniques at different spatial scales. Analyzing simultaneous
measurements of population responses and behavioral performance, within a Bayesian ideal-observer
framework, is a powerful approach for addressing fundamental issues of population encoding and
decoding.

acknowledgments We thank W. Bosking, C. Michelson, C. Palmer, and Z. Yang for discussions, and T.
Cakic for technical support. This work was supported by National Eye Institute Grants EY-016454
and EY-016752 to E. Seidemann and EY-02688 to W. S. Geisler and by a Sloan Foundation Fellowship
to E. Seidemann.

NOTE

1. Note that this measure of information is related to, but differs from, traditional measures in
information theory (Cover & Thomas, 2006). For example, Shannon information (mutual information)
is appropriate for characterizing the potential bit rate of information transfer through a noisy
channel when the goal is input reconstruction, but mutual information is not monotonically related
to the trial-by-trial accuracy of an ideal observer in a discrimination or classification task
(Geisler, Albrecht, Salvi, & Saunders, 1991). Fisher information can be monotonically related to
the performance of the ideal observer, although the ideal observer provides the more general
measure.

REFERENCES

Abbott, L. F., & Dayan, P. (1999). The effect of correlated variability on the accuracy of a
population code. Neural Comput., 11(1), 91–101.
Albrecht, D. G., Geisler, W. S., Frazor, R. A., & Crane, A. M. (2002). Visual cortex neurons of
monkeys and cats: Temporal dynamics of the contrast response function. J. Neurophysiol., 88(2),
888–913.
Arieli, A., Grinvald, A., & Slovin, H. (2002). Dural substitute for long-term imaging of cortical
activity in behaving monkeys and its clinical implications. J. Neurosci. Methods, 114(2), 119–133.
Arieli, A., Sterkin, A., Grinvald, A., & Aertsen, A. (1996). Dynamics of ongoing activity:
Explanation of the large variability in evoked cortical responses. Science, 273(5283), 1868–1871.
seidemann, chen, and geisler: encoding and decoding with neural populations in primate cortex 433
Seidemann, E., Arieli, A., Grinvald, A., & Slovin, H. (2002). Dynamics of depolarization and hyperpolarization in the frontal cortex and saccade goal. Science, 295(5556), 862–865.
Shadlen, M. N., Britten, K. H., Newsome, W. T., & Movshon, J. A. (1996). A computational analysis of the relationship between neuronal and behavioral responses to visual motion. J. Neurosci., 16(4), 1486–1510.
Shlens, J., Field, G. D., Gauthier, J. L., Grivich, M. I., Petrusca, D., Sher, A., et al. (2006). The structure of multi-neuron firing patterns in primate retina. J. Neurosci., 26(32), 8254–8266.
Simoncelli, E. P., & Olshausen, B. A. (2001). Natural image statistics and neural representation. Annu. Rev. Neurosci., 24, 1193–1216.
Slovin, H., Arieli, A., Hildesheim, R., & Grinvald, A. (2002). Long-term voltage-sensitive dye imaging reveals cortical dynamics in behaving monkeys. J. Neurophysiol., 88(6), 3421–3438.
Snippe, H. P., & Koenderink, J. J. (1992). Information in channel-coded systems: Correlated receivers. Biol. Cybern., 67(2), 183–190.
Sompolinsky, H., Yoon, H., Kang, K. J., & Shamir, M. (2001). Population coding in neuronal systems with correlated noise. Phys. Rev. E, 64(5), 051904-1–051904-11.
Stosiek, C., Garaschuk, O., Holthoff, K., & Konnerth, A. (2003). In vivo two-photon calcium imaging of neuronal networks. Proc. Natl. Acad. Sci. USA, 100(12), 7319–7324.
Tan, E. M., Yamaguchi, Y., Horwitz, G. D., Gosgnach, S., Lein, E. S., Goulding, M., et al. (2006). Selective and quickly reversible inactivation of mammalian neurons in vivo using the drosophila allatostatin receptor. Neuron, 51(2), 157–170.
Tolhurst, D. J., Movshon, J. A., & Dean, A. F. (1983). The statistical reliability of signals in single neurons in cat and monkey visual-cortex. Vis. Res., 23(8), 775–785.
Tootell, R. B. H., Switkes, E., Silverman, M. S., & Hamilton, S. L. (1988). Functional-anatomy of macaque striate cortex: 2. Retinotopic organization. J. Neurosci., 8(5), 1531–1568.
Uka, T., & DeAngelis, G. C. (2003). Contribution of middle temporal area to coarse depth discrimination: Comparison of neuronal and psychophysical sensitivity. J. Neurosci., 23(8), 3515–3530.
Van Essen, D. C., Newsome, W. T., & Maunsell, J. H. R. (1984). The visual field representation in striate cortex of the macaque monkey: Asymmetries, anisotropies, and individual variability. Vis. Res., 24(5), 429–448.
Yang, Z., Heeger, D. J., & Seidemann, E. (2007). Rapid and precise retinotopic mapping of the visual cortex obtained by voltage sensitive dye imaging in the behaving monkey. J. Neurophysiol., 98(2), 1002–1014.
Zhang, F., Wang, L.-P., Brauner, M., Liewald, J. F., Kay, K., Watzke, N., et al. (2007). Multimodal fast optical interrogation of neural circuitry. Nature, 446(7136), 633–639.
Zohary, E., Shadlen, M. N., & Newsome, W. T. (1994). Correlated neuronal discharge rate and its implications for psychophysical performance. Nature, 370(6485), 140–143.
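[Figure residue, panel C: time courses in panels labeled 1°, 2.4°, 4°, 5.6°, and 12.8°; traces labeled No Square, Square, and Baseline; x-axis: Time (sec), 0 to 12; y-axis ticks 0 to 20. Caption not recovered.]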
Direct evidence for the primacy of boundary representations comes from a recording study in the cat (Hung, Ramsden, & Roe, 2007), in which responses from simultaneously recorded cell pairs were compared, one RF being over a surface boundary and another inside the surface. They observed a border-to-surface shift in the relative timing of spiking activity for both real and illusory (COC) brightness contrast stimuli. Interestingly, the difference between boundary and surface-related signals was observed predominantly in area 17–18 pairs and was weaker in area 17–17 pairs. Further, the reduction of inhibition observed in neurons with RFs in a figure surrounded by dynamic texture during Troxler fading (De Weerd, 2006) suggests that boundary adaptation permits the recorded neurons to become driven by excitatory input from the texture background. This is in agreement with the role of boundary representations in controlling neural spreading activation related to surface perception.
Surface spreading processes are controlled not only by boundary representations, but also by more global aspects of the stimulus. Data from Troxler paradigms suggest that factors that determine figure-ground assignment (Sakaguchi, 2001, 2006; De Weerd et al., 1998; Hamburger, Prior, Sarris, & Spillmann, 2006; Hsieh & Tse, 2006) and statistics of the textures themselves (Hindi Attar, Hamburger, Rosenholtz, Götzl, & Spillmann, 2007; Sakaguchi, 2006) play an important role in determining the outcome of the spread of visual surface features.

Overall, the data suggest that boundary representations control spreading activation, both by initiating and containing spread in normal vision and by permitting new spread when boundaries adapt during stabilized vision.

Empirical Basis of a Computational Model of Surface Filling-in
According to our interpretation of the data, four principles of visual system organization can be used to guide the modeling of perceptual filling-in. First, surface representation is accomplished by separate but interacting boundary and surface mechanisms that are intertwined in V1, V2, and V4. Second, spread of surface properties occurs via lateral connectivity in the surface representation system. Third, recurrent loops within hierarchically organized border and surface processing streams determine the perceptual outcome of spreading activation. Fourth, spread of surface features is contained by inhibitory signals from the boundary system, and adaptation of inhibition is a permissive factor for new spreading activation across surface boundaries.

Rather than focusing on luminance and color stimuli (e.g., Grossberg, 2003a; Neumann, Pessoa, & Hanson, 2001), the present model aims to simulate texture filling-in in a Troxler paradigm, in which a figure is “invaded” by the background following figure boundary adaptation. We will not consider initial spreading events at stimulus onset. We aim to increase understanding of divergent results from fMRI and spiking data in filling-in studies, to study implications of anatomical intertwining of boundary and surface processing streams for the fMRI activity distribution, and to increase insight into the contribution of recurrent loops to surface perception.
$$a_i(t) = (1 - \tau)\, a_i(t - 1) + \tau\, \sigma\big(\mathrm{net}_i(t) + b_i\big)$$
where $w_{ij}$ is the weight from unit j to unit i, $a_i(t)$ is the average spike output of unit i at time t, $\mathrm{net}_i(t)$ is the net input (excitatory minus inhibitory input) for unit i at time t, $b_i$ is a bias term, and $\sigma(x)$ is the logistic (sigmoidal) function. The value $\tau$ ($0 < \tau \le 1$) determines how strongly the activation value (average spiking activity) at the last time point (t − 1) influences the activity at the current time point t.
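The update rule can be illustrated with a minimal Python sketch. It assumes that the leak term and the input term share the single constant τ described in the text, and it treats the net input as given rather than computing it from the weights. The function names and the numerical values are hypothetical and are not taken from the model.

```python
import numpy as np

def logistic(x):
    """Logistic (sigmoidal) function sigma(x)."""
    return 1.0 / (1.0 + np.exp(-x))

def update_activation(a_prev, net, bias, tau):
    """One update step: a(t) = (1 - tau) * a(t-1) + tau * sigma(net(t) + b).

    Assumption: the same constant tau weights both the previous activation
    and the new sigmoidal drive.
    """
    if not 0.0 < tau <= 1.0:
        raise ValueError("tau must lie in (0, 1]")
    return (1.0 - tau) * a_prev + tau * logistic(net + bias)

# Hypothetical example: three units driven by a constant net input.
net = np.array([-2.0, 0.0, 2.0])
bias = np.zeros(3)
a = np.zeros(3)
for _ in range(50):
    a = update_activation(a, net, bias, tau=0.2)

# With tau = 1 the previous activation has no influence at all.
a_instant = update_activation(np.zeros(3), net, bias, tau=1.0)
```

Under a constant net input the activation relaxes toward σ(net + b). A smaller τ makes the unit's activity depend more strongly on its value at the previous time point.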
degrees of eccentricity (corresponding to about 10 mm in V1). This was surprising, because estimates of spatial resolution on the order of 2 mm had been reported around the same time (e.g., Engel et al., 1994; Engel, Glover, & Wandell, 1997). On the basis of our own findings, we have suggested that spatial resolution was much lower (Gaussian point spread of 7 mm at HWHM; De Weerd et al., 1997). The data from De Weerd and colleagues (1997) were obtained by using a 1.5T scanner and surface coil, but in recent years, we have confirmed their observations using a 3T scanner (unpublished data).

To investigate this phenomenon, we conducted a simple simulation study in which we presented either a rectangular figure made of dynamic texture on an “empty” background (Fig-On-Back-Off) or the inverse stimulus with an empty rectangular figure on a dynamic texture background (Fig-Off-Back-On). The figure was varied in size, and the resulting activity levels in the background and figure are shown in figure 30.6 (active voxels in white). Modeled fMRI signal is shown in M-V2, but it is reasonable to expect similar limitations in spatial resolution in other extrastriate areas. Modeled fMRI activity for a small empty figure on a texture background was as high in the figure as in the background representation (figure 30.6A). Only for a large empty square did the fMRI signal reveal a gradual fall-off of activity from the texture toward the center of the empty figure (figure 30.6B). Modeled spiking activity, however, was elevated only in the background representation and did not invade the representation of the empty figure (figures 30.6C and 30.6D). The simulated, passive spread of fMRI may reveal the spread of subthreshold synaptic activity into modeled cortical regions where spiking activity is absent (for discussion, see the third part of the chapter). As expected, there was also significant spread of modeled fMRI signal from a stimulated figure representation into an empty background representation (Fig-On-Back-Off; figures 30.6E and 30.6F) beyond the activity distribution defined by spiking activity (figures 30.6G and 30.6H).

Perceptual filling-in of a gray figure by a dynamic noise texture background
The lateral spreading described in the previous section does not lead to spiking activity in surface-processing units within the representation of a homogenous figure surrounded by texture. Hence the modeled fMRI spread is not a correlate of perceptual filling-in but rather a phenomenon that presents an obstacle to measuring an fMRI correlate of perceptual filling-in. Here, we model filling-in of a figure by surrounding fine-grained dynamic noise. Prior to filling-in, inhibitory influences from the boundary-processing system are thought to contain spiking activity within surface boundaries. Figure 30.7A shows spiking activity during this state in boundary and surface systems in M-V1 and M-V2 (activity in white), during the presentation of a homogenous figure on a dynamic noise background to the M-Retina. Figure 30.7B shows fMRI activity for the same system state in boundary and surface systems of M-V2. The figure is sufficiently large to prevent the inflow of passive fMRI activity from the background to reach the middle of the figure rep-
Figure 30.6 Simulation of spiking activity and fMRI signals when an “empty” square is surrounded by dynamic texture (Fig-Off-Back-On) or vice versa (Fig-On-Back-Off). As indicated by red outlines, stimuli were presented with a small square (5 × 5 rectangle; A, C, E, G) or a large square (9 × 9 rectangle; B, D, F, H). Each panel shows the activity state of the same layer from the surface-processing system of M-V2. The lateral connectivity pattern of each unit within this layer is shown in G. The activity state of a processing unit is indicated by a black-to-white color range corresponding to weak-to-strong activity. The predicted fMRI data in the Fig-Off-Back-On condition show inflow from texture background into the figure representation (A, B), and only for the larger square (B) the interior is spared (shown in dark). Predicted spiking data (C, D) show a perfect representation of the square, without noticeable inflow from the background. The predicted fMRI data in the Fig-On-Back-Off condition show outflow of fMRI activity from the representation of the texture square into the empty background (E, F), while such outflow is unnoticeable for spiking data (G, H). (See color plate 45.)
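The qualitative pattern in figure 30.6 can be reproduced with a simple sketch in which a binary spiking map is blurred by a Gaussian point-spread function. This is only an illustration under stated assumptions: the chapter does not say that the modeled fMRI signal was computed by Gaussian convolution, and the grid size (41 × 41 units), the point spread of 3 units at HWHM, and all function names are hypothetical. Only the 5 × 5 and 9 × 9 figure sizes come from the caption.

```python
import numpy as np

def gaussian_kernel(hwhm, radius):
    """1-D Gaussian weights (summing to 1), specified by half width at half maximum."""
    sigma = hwhm / np.sqrt(2.0 * np.log(2.0))  # convert HWHM to standard deviation
    x = np.arange(-radius, radius + 1)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()

def passive_spread(spiking, hwhm):
    """Blur a 2-D spiking map with a separable Gaussian point-spread function."""
    k = gaussian_kernel(hwhm, radius=int(np.ceil(4 * hwhm)))
    # mode="same" preserves the map size as long as the kernel is shorter than the map
    rows = np.apply_along_axis(np.convolve, 1, spiking, k, mode="same")
    return np.apply_along_axis(np.convolve, 0, rows, k, mode="same")

def fig_off_back_on(grid, size):
    """Active background (1) with a centered, silent square figure (0)."""
    m = np.ones((grid, grid))
    lo = (grid - size) // 2
    m[lo:lo + size, lo:lo + size] = 0.0
    return m

GRID, HWHM_UNITS = 41, 3.0            # hypothetical grid size and point spread
center = GRID // 2
spikes_small = fig_off_back_on(GRID, 5)
spikes_large = fig_off_back_on(GRID, 9)
fmri_small = passive_spread(spikes_small, HWHM_UNITS)
fmri_large = passive_spread(spikes_large, HWHM_UNITS)

print("spiking at figure center:", spikes_small[center, center], spikes_large[center, center])
print("blurred signal at figure center:", fmri_small[center, center], fmri_large[center, center])
```

In this sketch the spiking map is silent at the figure center for both sizes, whereas the blurred signal at the center is clearly higher for the small figure than for the large one. That mirrors the inflow described for panels A and B without implying any spiking inside the figure.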
texture backgrounds (e.g., De Weerd et al., 1998; Hindi Attar et al., 2007). For textures, especially when they are coarse, it can be asked what information is actually spreading during filling-in and how it is computed. In the first part of the chapter, we reviewed evidence suggesting that V4 neurons encode global statistics of texture patches, such as overall brightness, brightness gradients, and texture density (Hanazawa & Komatsu, 2001; Tanabe et al., 2005). Similar statistical operations on multiple elements within a RF have been reported for V2 neurons (Anzai, Peng, & Van Essen, 2007), but unless textures consist of fine elements that are very densely packed, RFs of neurons in V2 might be too small to produce reliable estimates of texture statistics. Cortical areas at a level higher than V4 probably would produce statistical estimates that are insufficiently local to guide spreading processes in lower-order, retinotopic visual areas. Hence in our model, we assume that M-V4 has a special role in estimating global texture statistics and that these statistics are the kind of information that spreads during perceptual filling-in.

The computational processes that lead to spreading activation in figure representations in retinotopic maps are the same as those described for fine-grained textures, but they now involve interactions between boundary- and surface-processing systems within both M-V4 and M-V2 (for details, see http://www.brainvoyager.com/n3d/finn/index.html). In addition, a recurrent loop involving the two M-areas is required to produce spreading activity in either area. Before active neuronal spreading has occurred, inhibition from units in the boundary-processing system of M-V4 is strong enough to prevent lateral spreading within the surface-processing system of M-V4, and the same holds in M-V2. After boundary adaptation, active spreading within M-V4 and M-V2 may occur, but the activity levels in units within the surface-processing systems of M-V4 and M-V2 are codependent. More specifically, lateral subthreshold inputs to units in the M-V4 surface module inside the representation of the homogenous figure must be supplemented with feedforward subthreshold input from units in the M-V2 surface module at corresponding retinotopic locations. Similarly, lateral subthreshold inputs to units in the M-V2 surface module inside the representation of the homogenous figure must be supplemented by feedback subthreshold input from units in the M-V4 surface module at
abstract Object perception is a critical aspect of cognition, and is essential to understanding the world we inhabit. It is also one of the brain’s most remarkable computational abilities, considering the enormously complex, variable mapping between retinal images and physical objects. Retinal images are transformed into mental representations of objects by the ventral pathway of visual cortex. The vast dimensionality of the retinal image is compressed into a compact representation of object part configurations. This explicit representation of configural structure may serve as the basis for recognition, evaluation, and physical interaction with objects.

Charles E. Connor Johns Hopkins University, Baltimore, Maryland
Anitha Pasupathy University of Washington, Seattle, Washington
Scott Brincat Massachusetts Institute of Technology, Boston, Massachusetts
Yukako Yamane Riken Brain Science Institute, Saitama, Japan

We live in a world of objects, and that world is familiar and comprehensible only because we are so good at recognizing and understanding those objects. Object perception is computationally difficult because of the high dimensionality of the retinal input (on the order of 10^6 channels) and the extreme variability in input patterns produced by any given object (depending on position, distance, orientation, lighting, etc.). The brain must transform this complex, variable retinal input into compact, stable representations of useful object information. This transformation is carried out by the ventral pathway of visual cortex (Ungerleider & Mishkin, 1982; Felleman & Van Essen, 1991), which splits off from the rest of the visual hierarchy at the connection between areas V2 and V4. Beyond V4, object information is processed through a posterior-to-anterior series of stages in inferior occipital and temporal cortex.

The first-order question about the ventral pathway transformation is how object information is encoded at each stage. This chapter describes the current understanding of object shape coding in area V4 and inferotemporal cortex (IT) based on neurophysiological studies in macaque monkeys. The second-order and more difficult question is what computational mechanisms underlie the transformations between coding stages. Preliminary analyses of recurrent network mechanisms supporting the V4-to-IT transformation are described here.

Retinal signals must be transformed to support object vision

Two factors make the original retinal representation of objects unsuitable to support object perception. One factor is high dimensionality. The retinal representation is essentially a megapixel spatial map of local contrast that replicates the form of the optical image. A million-dimensional signal cannot be directly accessed by other brain regions to guide behavior and cannot be stored in memory. This high-dimensional pixel map must be transformed into a tractable, explicit code for useful object information. As will be described below, the ventral pathway achieves this by recoding large regions of the pixel map as object boundary fragments characterized by geometric derivatives. Entire objects are represented as spatial configurations of boundary fragments.

The other factor is variability. Any given object produces a potentially infinite range of retinal input patterns. This is due to the continually shifting relationship between the input spatial reference frame of the eye and the signal source reference frame of the object. The variable mapping between eye images and objects makes the retinal representation far too unstable for cognitive access and memory storage. As will be described below, the ventral pathway derives a more stable representation by transforming spatial information from eye coordinates into a reference frame that is at least partially defined by the object itself.

Object boundary fragments are summarized by geometric derivatives

The retinal response patterns produced by natural images are far from random. (Most random pixel patterns look like
connor, pasupathy, brincat, and yamane: neural transformation of object information 455
television snow.) They are dominated by local correlations that are determined by the structure of objects in our world, which in turn reflect the constraints of physics, material properties, biological growth processes, and artifactual construction. Because of these constraints, object boundaries are relatively smooth and continuous on a local level, producing smooth, continuous contrast boundaries in the retinal image. This creates an opportunity for massive compression of the pixel map representation. The highly correlated pixel values along regions of smooth, continuous contrast can be redescribed in terms of contrast boundary derivatives. For example, an image region that contains a long, vertical contrast edge comprises many pixel values but can be redescribed with a single slope or orientation value. The visual system exploits this opportunity in primary visual cortex (V1) with neurons that are tuned for orientation (and spatial frequency) of local contrast regions (Hubel & Wiesel, 1959, 1965, 1968). These tuning functions provide a basis set for representing local orientation. Every point in visual space is represented by V1 neurons with a range of orientation-tuning peaks, and the local hill of activity among these neurons encodes local contrast orientation. Computational studies have demonstrated that this is an optimal scheme for compressing fragments of natural images (Olshausen & Field, 1996; Vinje & Gallant, 2000).

At successive processing stages in visual cortex, further compression is achieved by neurons with progressively larger receptive fields (RFs) that summarize larger image regions. In parafoveal V4, RFs encompass several degrees of visual angle (Gattass, Sousa, & Gross, 1988). On this larger scale, contrast boundaries more frequently undergo orientation changes within the RF and therefore can no longer be effectively summarized with a single first-order derivative. However, owing to the continuity and relative smoothness of natural objects, these larger boundary regions can be summarized with a combination of first- and second-order derivatives. Orientation often changes at a relatively constant rate that can be represented with a single second-order derivative: curvature. Convex or positive curvature (protrusion of the boundary away from the object interior) ranges from near zero (flat, infinite radius) to shallow to sharp. At the limit of sharpness, curvature becomes infinite (zero radius) and is perceived as a discontinuity in orientation (a point or angle). Concave curvature (indentation into the object interior) likewise ranges from shallow to infinite.

Visual cortex exploits this larger-scale structural regularity by explicitly representing curvature and orientation of contrast boundary fragments in area V4. Just as V1 encodes tiny boundary fragments with a basis set of orientation tuning functions, V4 encodes larger boundary fragments with a basis set of tuning functions in the curvature/orientation domain. The tuning domain has been sampled in V4 neural recording experiments with the kind of stimuli plotted in figure 31.1A (Pasupathy & Connor, 1999). Average neural responses of three V4 neurons are plotted in figures 31.1B–31.1D. Darker backgrounds correspond to higher response rates (see grayscale). The neurons in figures 31.1B and 31.1C exemplify tuning in the orientation/curvature domain. The figure 31.1B neuron responds to sharp convex curvature oriented (pointing) toward the left and upper left. The figure 31.1C neuron responds to shallow convex curvature oriented toward the right and lower right. Figure 31.1D exemplifies V4 neurons tuned for orientation at zero curvature, responding to all stimuli with a nearly flat component at the preferred orientation. Curvature/orientation tuning reflects a further compression of object boundary information. Larger boundary fragments that were originally represented by many component orientation signals in V1 can be summarized with a single hill of activity in the V4 curvature/orientation domain.

Objects are represented as spatial configurations of boundary fragments

The boundary fragments encoded by V4 neurons are large enough to constitute a basis set for structural or parts-based shape representation. According to structural shape-coding theories (Biederman, 1987; Marr & Nishihara, 1978; Milner, 1974; Selfridge, 1959; Sutherland, 1968), objects are represented as spatial configurations of common parts. Structural coding schemes have several advantages. First, they have a comparatively low dimensionality; an object that was originally represented by 10^4–10^6 pixels could be redescribed as a configuration of parts numbering in the 10^1–10^2 range. Second, structural coding is highly generative, owing to the combinatorial explosion of part configurations, and therefore has the capacity for representing a virtual infinity of objects with a finite set of neural signals. Third, to the extent to which the spatial reference frame is defined by the object itself, structural representations can be stable across views. Fourth, explicit structural information could be used not only for recognition, but also for physical evaluation and guidance of physical interactions with objects. Structural coding also seems consistent with our linguistic tendency to describe objects as configurations of parts.

These theoretical considerations and the figure 31.1 neurophysiological result suggest that V4 instantiates a structural object representation based on boundary fragments. A critical prediction of this hypothesis is that neurons maintain parts-level selectivity across different global shape contexts. This prediction conflicts with the standard, intuitive notion that ventral pathway neurons are selective for a single range of stimulus shapes, centered on a “best stimulus” that evokes the strongest response and defines the information conveyed by that neuron. According to structural theories, the same neuron should respond strongly to an infinite variety of stimulus shapes as long as they contain the parts-level
Figure 31.1 V4 neural tuning in the orientation/curvature domain. (A) Stimulus set comprising multiple levels of convex, outline, and concave boundary curvature at eight orientations. Spike activity of well-isolated individual neurons was recorded from lower visual field representations in V4 of rhesus macaque monkeys performing a fixation task to stabilize eye position. Stimuli were flashed in the neuron’s RF for 500 ms each in random order. The stimulus was at full illumination within the RF perimeter and gradually faded to the background color outside the RF (only part of the fading is shown in these stimulus icons; the circular boundaries were not part of the stimuli). (B) Average response (across five repetitions) of a V4 neuron to each stimulus is indicated by background color behind each stimulus (black corresponds to 40 spikes per second; see the scale bar at right). This neuron is tuned along both the curvature dimension (for sharp convexity) and the orientation dimension (around 135 degrees, which here means sharp convexities pointing toward the upper left). (C) Average responses of another V4 neuron tuned for shallow curvature oriented toward the right (0 degrees). (D) Average responses of another V4 neuron tuned for orientation and zero curvature, like most lower-level neurons. (From Pasupathy & Connor (1999). Used with permission from Journal of Neurophysiology.)
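The two dimensions of this tuning domain, the direction in which a boundary fragment points and its curvature, can be computed for any sampled contour. The Python sketch below is a generic differential-geometry illustration and not the analysis used in the recording studies. It assumes a closed contour sampled counterclockwise, so that the object interior lies to the left of the direction of travel and convex curvature comes out positive. The function name and the test shape are hypothetical.

```python
import numpy as np

def pointing_direction_and_curvature(x, y):
    """Outward-normal direction (degrees) and signed curvature along a contour.

    Assumes counterclockwise sampling: the outward normal is the tangent
    rotated by -90 degrees, and convex (protruding) curvature is positive.
    """
    dx, dy = np.gradient(x), np.gradient(y)        # first-order derivatives
    ddx, ddy = np.gradient(dx), np.gradient(dy)    # second-order derivatives
    tangent_deg = np.degrees(np.arctan2(dy, dx))
    direction = (tangent_deg - 90.0) % 360.0
    curvature = (dx * ddy - dy * ddx) / (dx ** 2 + dy ** 2) ** 1.5
    return direction, curvature

# Hypothetical test shape: a circle of radius 2, sampled counterclockwise.
t = np.linspace(0.0, 2.0 * np.pi, 400, endpoint=False)
radius = 2.0
x, y = radius * np.cos(t), radius * np.sin(t)
direction, curvature = pointing_direction_and_curvature(x, y)

# Away from the array ends (where np.gradient uses one-sided differences),
# curvature equals 1 / radius and the boundary points radially outward.
print(curvature[200], direction[100])
```

For the circle, every point has the same convex curvature (1/radius), and the pointing direction sweeps through all angles. A fragment with a flat stretch would instead have curvature near zero, and a sharp corner would show a large curvature value concentrated at a single point.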
Figure 31.2 Responses of a V4 neuron to stimuli constructed by factorial combination of boundary fragments. Average responses across five repetitions are indicated by the background color for each stimulus. (The circular background was not part of the display.) This neuron was tuned for boundary fragments consisting of shallow concave curvature facing downward (270 degrees) adjacent to sharp convex curvature facing to the lower left (225 degrees). This tuning for local boundary structure remained consistent across wide variations in global stimulus shape. (From Pasupathy & Connor (2001). Used with permission from Journal of Neurophysiology.)
structure encoded by that neuron. This counterintuitive prediction has been confirmed in experiments exemplified by figure 31.2 (Pasupathy & Connor, 2001). For these experiments, a large set of stimuli was constructed by factorial combination of boundary fragments, so any given fragment appeared in a diverse set of global shapes. Each stimulus was presented entirely within the V4 neuron’s RF, providing a strong test of consistency across global shape, since other parts of the shape could directly influence responses. This neuron responded to shapes that contained broad concave curvature facing downward adjacent to sharp convexities pointing to the lower left (see the stimuli labeled 1 and 2; either feature alone evoked weaker responses, as in stimuli 3–8). As can be seen in figure 31.2, this local structure evoked strong responses across wide variations in global stimulus shape. Thus V4 neurons encode parts-level, not global, shape. Analysis at the neural population level has demonstrated that these boundary fragment signals could support complete structural shape representations (Pasupathy & Connor, 2002).

Configural coding of object structure becomes even more explicit at the next processing stage in posterior IT (PIT). Neurons in PIT are tuned for spatial configurations of multiple V4-like boundary fragments (Brincat & Connor, 2004). The stimulus responses of the figure 31.3 example neuron (figures 31.3A and 31.3B) reflect combined tuning for the boundary fragments diagrammed in the figure 31.3C tuning model. The best-fit model for this cell comprised tuning for two sharp concavities (labeled A and B) oriented toward the lower left and lower right, respectively, and positioned to the left of object center, combined with flat or shallow curvature facing to the right and positioned to the right of object center (labeled C). The model includes an inhibitory term corresponding to the concavity labeled D. Figure 31.3D illustrates how these boundary fragment sensitivities interact to determine neural responses.
[Figure 31.3 panel residue. Response model shown in panel C: $\text{Response} = 6.1A + 5.2B + 0.0C + 35.2\,ABC - 21.4D + 0.2$; histogram axis: Response rate (spikes/s).]

Figure 31.3 PIT neural tuning for boundary fragment configurations. (A) Average responses of a PIT neuron to stimuli constructed by factorial combination of boundary fragments. In this primary test, local curvature values were held constant. (B) Tuning for local curvature, tested with two representative shapes from the primary test. (C) Response model. The boundary fragment tuning dimensions for this model are orientation, curvature (mapped to a scale from −1.0 to 1.0), and XY position relative to stimulus center of mass. The best-fit model comprised three excitatory tuning regions (A–C) and one inhibitory tuning region (D). The equation at the bottom shows that the strongest response factor was the combined presence of all three excitatory boundary fragments. (D) Example stimuli showing the interactive effects of boundary fragments near the tuning peaks. In each case, the left bar in the histogram indicates observed response ± standard error, and the right bar indicates the response predicted by the model. (From Brincat & Connor (2004). Originally published in Nature Neuroscience.)
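The response equation shown in the figure can be evaluated directly to see why the combined presence of all three excitatory fragments dominates. The Python sketch below uses the coefficients printed in the figure. Treating A, B, C, and D as match strengths between 0 (fragment absent) and 1 (fragment at the tuning peak) is an assumption made here for illustration, and the function name is hypothetical.

```python
def pit_model_response(a, b, c, d):
    """Predicted response (spikes/s) from the figure 31.3 equation.

    a, b, c: match to the excitatory tuning regions A, B, C (assumed 0..1)
    d:       match to the inhibitory tuning region D (assumed 0..1)
    """
    linear = 6.1 * a + 5.2 * b + 0.0 * c     # independent contribution of each fragment
    conjunction = 35.2 * a * b * c           # requires A, B, and C together
    return linear + conjunction - 21.4 * d + 0.2

all_three = pit_model_response(1, 1, 1, 0)       # A, B, and C present
two_only = pit_model_response(1, 1, 0, 0)        # C missing
with_inhibitor = pit_model_response(1, 1, 1, 1)  # D also present
print(all_three, two_only, with_inhibitor)
```

Under these assumptions, removing any one excitatory fragment eliminates the multiplicative term, so the predicted response falls to roughly a quarter of its maximum. Adding the inhibitory fragment D lowers the full-configuration response by 21.4 spikes/s.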
This kind of explicit single-neuron signal for configurations of disjoint parts is not envisioned in standard theories. One potential advantage of such signals is further compression of the object representation into a smaller number of more complex components. This would be particularly efficient if PIT neurons emphasize statistically common part configurations. At this scale, encompassing complete shapes, there is no simple geometric constraint on boundary structure, but common part configurations are bound to occur, owing to ecological factors. While this has not been investigated rigorously, it can be observed at an anecdotal level. The responses of the same example neuron to photographic stimuli (figure 31.4) illustrate how tuning for boundary fragment configurations might relate to common shape structures in natural object categories. In this case, the neuron responded to a leftward-facing quadruped (the polar bear), presumably owing to its combined sensitivity to opposed concavities on the left and broad curvature on the right. This is not to say that the neuron by itself signals the presence of quadrupeds. Rather, this neuron efficiently captures a common shape motif that would help to define quadrupeds as well as other object categories that have the same component structure. Other studies in more anterior parts of inferotemporal cortex suggest that responses to natural objects could be explainable in terms of structural components (Fujita, Tanaka, Ito, & Cheng, 1992; Perrett, Rolls, & Caan, 1982; Sigala & Logothetis, 2002; Tanaka, Saito, Fukada, & Moriya, 1991; Tsunoda, Yamane, Nishizaki, & Tanifuji, 2001; Wang, Fujita, & Murayama, 2000).

Single-neuron configuration signals might also enhance the cognitive accessibility of configural structure, supporting our ability to evaluate the physical potential of objects and interact with them in an accurate and intelligent manner. Finally, configural shape signals might also provide a basis for further integration, leading to global shape sensitivity at higher processing stages. This might be especially true for highly familiar or behaviorally relevant object categories that require maximally efficient processing, at the cost of dedicated neurons with extremely narrow selectivity. For generic object representation, representation in terms of component part configurations could be the optimal compromise between flexibility and efficiency.

One critical result from these PIT studies is the demonstration of spatial tuning in an object-centered reference frame. This is a central prediction of structural shape-coding theories (Biederman, 1987; Marr & Nishihara, 1978). Transformation from retinotopic to object-centered coordinates
connor, pasupathy, brincat, and yamane: neural transformation of object information 459
Figure 31.4 Responses of the same PIT neuron as in figure 31.3 to photographic stimuli. In each case, the average response to the photograph is indicated by the color of the background square (see scale bar at right).
achieves stability across changes in object position on the retina. Relative position signals in an object-centered reference frame provide the configural information that is required to distinguish different arrangements of the same parts. PIT neurons exhibit clear tuning for boundary fragment position with respect to object center. This is emphasized for the example neuron in figure 31.5, by showing the average response to stimuli containing the double-concavity configuration to the left of object center (top, strong responses) versus to the right (bottom, minimal responses). Figure 31.3A includes many stimuli with the same configuration in other object-centered positions (e.g., top or bottom) that likewise evoke little or no response. Control experiments (not shown) demonstrate that this tuning is consistent across retinotopic positions (within the PIT RF), so spatial tuning is much more acute in object-centered coordinates than in retinotopic coordinates.
Thus PIT neurons provide the kind of relative position information critical for parts-based structural coding. In addition to being centered on the object, the PIT reference frame probably scales with object size, given that PIT tuning functions (and tuning functions at higher stages in the ventral pathway; Ito, Tamura, Fujita, & Tanaka, 1995) exhibit remarkable consistency across size changes spanning multiple octaves. However, the spatial reference frame does not appear to rotate with the object, since shape tuning is not consistent across rotated versions of the same stimulus. This is exemplified in figure 31.3A, in which substantially rotated versions of the high-response shapes evoke no response whatsoever. Thus the theoretical prediction of a stable reference frame completely defined by the object is only partially confirmed at this level in the ventral pathway. Generalization across object rotation could be achieved in some other way, possibly by learning associations between different object views (Vetter, Hurlbert, & Poggio, 1995; Edelman & Poggio, 1991).

Three-dimensional object shape is represented in terms of surface fragments

The results described above relate to two-dimensional object boundary shape. Objects produce two-dimensional contrast boundaries in the retinal image; therefore it would be reasonable if two-dimensional boundary representation were the primary mode for object vision. In physical reality, however, objects are three-dimensional, and classical theories (Biederman, 1987; Marr & Nishihara, 1978) posit explicit representation of three-dimensional
Figure 31.5 PIT tuning for object-centered position. Average responses of the same example neuron to two subsets of the stimuli in figure 31.3. In the top row, the opposed concavity configuration is positioned to the left of object center, and the average response is high, as shown by the peristimulus-time response histogram at the right. In the bottom row (non-preferred object-relative position), the double concavity configuration is positioned to the right of object center, and the average response is low. Histogram axis: post-stimulus time (ms), 0 to 500.
object structure, based on three-dimensional spatial configurations of three-dimensional parts. This is a particularly strong prediction, given the computational difficulty of extracting three-dimensional structure from two-dimensional retinal images and the higher-dimensional neural coding that is required to represent three-dimensional structure. In contrast, current computational vision models favor direct processing of two-dimensional images with no explicit representation of three-dimensional object structure (Fei-Fei, Fergus, & Perona, 2006; Lowe, 2004; Moghaddam & Pentland, 1997; Murase & Nayar, 1995; Riesenhuber & Poggio, 1999; Turk & Pentland, 1991; Weber, Welling, & Perona, 2000).
The three-dimensional structural coding hypothesis has not been directly tested, owing to the experimental difficulty of exploring the virtually infinite domain of three-dimensional object shape. In a recent attempt to overcome this obstacle, Yamane and colleagues (2008) used an evolutionary morphing strategy to sample three-dimensional object shape (figure 31.6). Neurons in central and anterior IT (CIT/AIT) were studied by first measuring their responses to an initial generation of 50 random three-dimensional shapes (figure 31.6, generation 1). The second stimulus generation included partially morphed descendants of higher-response stimuli from the first generation. This process was iterated across 8–10 generations, producing extensive sampling of stimuli in the high- and intermediate-response range of the cell. High-response stimuli were typically characterized by some shared local shape structure. In this case, the most noticeable shared structure is a ridge near the front of the shape facing out of the image plane.
When sampling was sufficiently complete, the response pattern could be used to constrain a quantitative model of the three-dimensional shape information encoded by the neuron. The best-fit model for this cell (figure 31.7) captured both the forward-facing ridge near the front of the object and the shallow concave dorsal surface behind it that characterized high-response stimuli. The response model was highly nonlinear; predicted (and observed) responses were substantial only for stimuli with both surface fragments.
The result shown here typifies three-dimensional shape tuning observed for a substantial fraction of CIT/AIT cells. These neurons were tuned for three-dimensional spatial configurations of multiple surface fragments defined by their surface curvatures and three-dimensional orientations. These observations support the classic hypothesis that three-dimensional shapes are represented as structural configurations of three-dimensional parts.
In contrast to these findings regarding biological object vision, recent computational systems for object recognition have been most successful with nonstructural processing of two-dimensional image information (Fei-Fei et al., 2006; Lowe, 2004; Moghaddam & Pentland, 1997; Murase & Nayar, 1995; Riesenhuber & Poggio, 1999; Turk & Pentland, 1991; Weber et al., 2000). This makes sense, given the computational expense of inferring and encoding three-dimensional structure. It may be that even in the brain, rapid object recognition depends on two-dimensional processing (Hung, Kreiman, Poggio, & DiCarlo, 2005; Serre, Oliva, & Poggio, 2007) and neural coding of three-dimensional structure instead supports other aspects of object vision requiring detailed structural knowledge. Comparing similar objects within a recognized class, evaluating the functionality and utility of unfamiliar objects, anticipating physical events, and guiding physical interactions with objects are all likely to require detailed knowledge of three-dimensional structure.
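The evolutionary sampling procedure attributed above to Yamane and colleagues (2008) can be illustrated with a minimal Python sketch. It is not the published stimulus-generation algorithm. Shapes are reduced to short parameter vectors, and the function names (`evolve_stimuli`, `toy_neuron`), the morphing rule, the parent fraction, and the half-and-half mix of morphed descendants with fresh random shapes are all assumptions made for illustration. Only the generation size of 50 and the 8 generations come from the text.

```python
import random


def evolve_stimuli(response_fn, n_stimuli=50, n_generations=8, n_params=6,
                   parent_fraction=0.2, morph_sd=0.2, seed=0):
    """Adaptive sampling loop (illustrative assumptions throughout).

    Each generation is scored with response_fn. The next generation mixes
    partially morphed descendants of the higher-response stimuli with
    fresh random stimuli.
    """
    rng = random.Random(seed)

    def random_shape():
        # Hypothetical shape description: n_params values in [-1, 1].
        return [rng.uniform(-1.0, 1.0) for _ in range(n_params)]

    def morph(shape):
        # Partial morph: small Gaussian perturbation, clipped to [-1, 1].
        return [max(-1.0, min(1.0, p + rng.gauss(0.0, morph_sd)))
                for p in shape]

    generation = [random_shape() for _ in range(n_stimuli)]
    history = []  # one list of (response, shape) pairs per generation
    for _ in range(n_generations):
        scored = sorted(((response_fn(s), s) for s in generation),
                        key=lambda pair: pair[0], reverse=True)
        history.append(scored)
        n_parents = max(1, int(parent_fraction * n_stimuli))
        parents = [shape for _, shape in scored[:n_parents]]
        n_children = n_stimuli // 2
        children = [morph(rng.choice(parents)) for _ in range(n_children)]
        fresh = [random_shape() for _ in range(n_stimuli - n_children)]
        generation = children + fresh
    return history


def toy_neuron(shape):
    # Hypothetical stand-in for a recorded cell: responds only when the
    # first two shape parameters are both positive (peak 45 sp/sec).
    return 45.0 * max(0.0, shape[0]) * max(0.0, shape[1])


if __name__ == "__main__":
    history = evolve_stimuli(toy_neuron)
    for i, scored in enumerate(history, start=1):
        mean = sum(r for r, _ in scored) / len(scored)
        print(f"generation {i}: mean response {mean:.1f} sp/sec")
```

Keeping some fresh random shapes in every generation is one simple way to preserve sampling in the low- and intermediate-response range, which a model fit needs as much as it needs the high-response stimuli.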
Figure 31.6 Panels show stimulus generations 1 through 8. Response scale: 0 to 45 sp/sec.
Boundary fragment configurations are derived by recurrent network processing

The results described above suggest that the ventral pathway encodes objects as spatial configurations of boundary fragments. If so, this raises the more difficult question of how such configural information is derived. A direct answer to this question would require comprehensive measurement of neural network activity across multiple processing stages, which is not currently possible. But one indirect approach to inferring underlying network mechanisms is fine-scale temporal analysis of neural responses. If shape information is derived by time-consuming network processes, the evolution of that information across time (following stimulus onset) may be observable in the neural responses. As will be detailed below, the evolution of configural shape signals in PIT is observable in this way.
The figure 31.3 PIT neuron exemplifies supralinear integration of information across boundary fragments. As the equation in figure 31.3C reflects, responses produced by individual fragments were low (factors A, B, and C have coefficients below 7 spikes per second), while responses to fragment combinations were high (factor ABC has a large coefficient of 35 spikes per second). This nonlinear integration provides a selective, explicit signal for the overall configuration. In contrast, more linear summation of fragment information produces ambiguous signals associated with many different fragments or fragment combinations.
Fine-scale temporal analysis shows that PIT responses initially reflect linear summation, with nonlinear information emerging more gradually (Brincat & Connor, 2006). In many cases, this is observable in the responses of individual neurons. The PIT neuron represented in figure 31.8 was sensitive to concave boundary fragments oriented toward the upper left (135 degrees) and concave fragments oriented toward the lower right (315 degrees). The average temporal response profiles (solid gray histograms) for stimuli containing only the 315-degree concavity (top row) or only the 135-degree concavity (middle row) are phasic, confined to a window between 100 and 200 ms following stimulus onset. The response profile for stimuli containing both of these fragments (bottom row) includes an initial phasic spike in the 100- to 200-ms window that closely approximates the sum of the individual fragment responses (represented by the dark gray curve, which shows the predicted linear sum based on a temporal response model). For these stimuli, however, the response persists throughout the entire 500-ms stimulus
Response = 0.4A + 0.0B + 49.0AB + 0.0

Panel axes: maximum curvature versus minimum curvature; angle on XY plane (deg) versus angle on YZ plane (deg); relative X position versus relative Y position; relative Z position versus relative Y position.

Figure 31.7 AIT neural tuning for three-dimensional surface fragment configuration. Each stimulus was defined in terms of its constituent surface fragments. Surface fragments were characterized in terms of their XYZ position relative to object center, three-dimensional orientation, and three-dimensional surface curvature (maximum and minimum cross-sectional curvature). Response patterns were fit with multiple Gaussian tuning functions in the position/orientation/curvature domain. The best-fit model for this cell was based on two Gaussian tuning regions (black and gray circles, describing 1.0 standard deviation boundaries). The gray tuning region defines surface fragments with sharp convex maximum curvature (near 1) and flat minimum curvature (near 0), that is, a ridge. The surface normal orientation points toward the viewer (near 0 on the YZ-plane). The position is toward the front of the object (near 1 on the Z-axis). The surface region on a high response stimulus (at right) corresponding to this Gaussian function is tinted gray. The other Gaussian tuning region (black) defines shallow concave surfaces with normals pointing upward (near 90 degrees on the XY- and YZ-planes) positioned near object center. The response equation at the top indicates low responses to stimuli with only one surface fragment or the other (A or B) but high responses for the combination (AB). (From Yamane et al. (2008). Originally published in Nature Neuroscience.)
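The response equation in figure 31.7 can be made concrete with a short Python sketch. The coefficients (0.4, 0.0, 49.0, 0.0) are taken from the figure. Everything else is an assumption for illustration: fragments are reduced to three dimensions (maximum curvature, minimum curvature, relative Z position), the tuning centers and widths are invented, and each region's activation is taken as the best-matching fragment in the stimulus. The published model uses the full position/orientation/curvature domain.

```python
import math


def gaussian_tuning(fragment, center, sd):
    """Unnormalized Gaussian tuning value in (0, 1] for one surface fragment."""
    sq = sum(((f - c) / s) ** 2 for f, c, s in zip(fragment, center, sd))
    return math.exp(-0.5 * sq)


def region_activation(fragments, center, sd):
    # Assumption: a tuning region is driven by the best-matching fragment.
    return max(gaussian_tuning(f, center, sd) for f in fragments)


def predicted_response(fragments, region_a, region_b,
                       w_a=0.4, w_b=0.0, w_ab=49.0, baseline=0.0):
    """Response = w_a*A + w_b*B + w_ab*A*B + baseline (figure 31.7)."""
    a = region_activation(fragments, *region_a)
    b = region_activation(fragments, *region_b)
    return w_a * a + w_b * b + w_ab * a * b + baseline


# Hypothetical tuning regions: (center, sd) over
# (maximum curvature, minimum curvature, relative Z position).
REGION_A = ((1.0, 0.0, 1.0), (0.3, 0.3, 0.3))   # ridge near the front
REGION_B = ((0.0, -0.3, 0.0), (0.3, 0.3, 0.3))  # shallow concavity, center

ridge = (1.0, 0.0, 1.0)
concavity = (0.0, -0.3, 0.0)

if __name__ == "__main__":
    for name, stim in [("ridge only", [ridge]),
                       ("concavity only", [concavity]),
                       ("both fragments", [ridge, concavity])]:
        r = predicted_response(stim, REGION_A, REGION_B)
        print(f"{name}: {r:.2f} sp/sec")
```

Because nearly all of the weight sits on the product term AB, the sketch yields a substantial response only when the stimulus contains fragments matching both tuning regions, which is the nonlinearity the caption describes.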
Figure 31.8 Time course of linear and nonlinear boundary fragment responses for a PIT neuron. Top row, Average response to stimuli containing a concavity oriented toward 315 degrees. The time course of observed response (gray histogram) and response predicted by a temporal model (black curve) are shown. The gray box indicates the stimulus presentation period. Middle row, Average response to stimuli containing a concavity oriented toward 135 degrees. Bottom row, Average response to stimuli containing both concavities. The total predicted response (black curve), predicted linear response due to individual fragment terms (dark gray curve), and predicted nonlinear response due to the fragment combination term (light gray curve) are shown. Panel columns: example stimuli; predicted and observed responses. Histogram axis: post-stimulus time (ms), 0 to 500. (Adapted from Neuron (Brincat & Connor, 2006) with permission from Elsevier.)
presentation. Thus beyond 200 ms following stimulus onset, this neuron exhibits highly nonlinear integration and conveys an explicit signal for the necklike configuration of two opposed concavities.
This example reflects a general trend in PIT for linear boundary fragment responses to evolve more quickly, peaking around 120 ms after stimulus onset (figure 31.9A; the black curve summarizes linear response strength across a sample of 89 PIT neurons). In contrast, nonlinear response strength evolved more gradually, peaking 180 ms after onset (figure 31.9A, gray curve). These trends were partly due to single-neuron tuning transitions as in figure 31.8 (figure 31.9B, thin curves) and partly due to differential response profiles of consistently linear neurons (figure 31.9B, thick black curve) versus consistently nonlinear neurons (figure 31.9B, thick gray curve). This overall pattern is consistent with a fairly simple neural network model in which neurons vary in the relative strength of V4-like boundary fragment inputs and recurrent inputs (recurrent excitatory inputs from cells with similar configuration tuning, recurrent inhibitory inputs from cells with dissimilar tuning) (Brincat & Connor, 2006; Salinas & Abbott, 1996). Neurons with stronger V4 inputs respond quickly and in a more linear fashion, while neurons with stronger recurrent connectivity respond more slowly and in a more nonlinear fashion. The 60-ms delay for part configuration signals in PIT is consistent with a remarkably similar delay for pattern motion signals in area MT (Movshon, Adelson, Gizzi, & Newsome, 1985; Rodman & Albright, 1989; Pack, Berezovskii, & Born, 2001; Smith, Majaj, & Movshon, 2005). In both cases, it could be that some kind of recurrent network processing is required to generate unambiguous signals based on integration across multiple stimulus components. There are, however, alternative interpretations that cannot be ruled out at this point. For example, selectivity for combined inputs might be produced by a static threshold nonlinearity and could be temporarily masked by transient onset responses.

Summary: Configural representation of object structure

The findings reviewed above suggest at least a partial explanation of how the ventral visual pathway achieves compact, stable representation of useful object information. In area V4, object boundary fragments are summarized in terms of their first- and second-order derivatives (orientation and curvature), greatly reducing the dimensionality of the retinal response pattern. At the next processing stage in PIT, configurations of multiple fragments are represented (further reducing dimensionality) in an object-centered reference frame (producing stability across changes in retinal position). Recent results in more anterior regions of inferotemporal cortex suggest that this configural coding scheme generalizes to three-dimensional shape representation in terms of object surface fragments. Considering the processing time that is required to perfect these configural representations (on the
Figure 31.9 The average time course of linear and nonlinear response components in PIT. (A) Normalized linear (black curve) and nonlinear (gray curve) response strength averaged across temporal models fit to responses of 89 PIT neurons. Dots with corresponding colors indicate the estimated onset and peak (90% maximum) times for linear and nonlinear responses. (B) The same linear and nonlinear response strength averages are partitioned into neurons with consistently linear responses across time (thick black curve), neurons with consistently nonlinear responses (thick gray curve), and linear (thin black curve) and nonlinear (thin gray curve) model components for neurons that transitioned across time. Horizontal axis: post-stimulus time (ms), 0 to 500. (Adapted from Neuron (Brincat & Connor, 2006) with permission from Elsevier.)
order of 200 ms), they might not explain the most rapid human recognition speeds (Thorpe, Fize, & Marlot, 1996). However, these rapid reaction times could be limited to coarse categorization (Rousselet, Mace, & Fabre-Thorpe, 2003), based on nonconfigural parts-level information available after about 100 ms (figure 31.9A, black curve). Finer discrimination based on larger-scale shape configurations rather than diagnostic parts requires longer processing times (Arguin & Saumier, 2000; Wolfe & Bennett, 1997; Ringach & Shapley, 1996), consistent with the delayed emergence of explicit configural signals (figure 31.9A, gray curve). Aside from recognition and discrimination, configural representations could support other important aspects of object vision, including cognitive evaluation of object structure and function as well as guidance of precise physical interactions with complex objects.

REFERENCES

Arguin, M., & Saumier, D. (2000). Conjunction and linear non-separability effects in visual shape encoding. Vis. Res., 40(22), 3099–3115.
Biederman, I. (1987). Recognition-by-components: A theory of human image understanding. Psychol. Rev., 94(2), 115–147.
Brincat, S. L., & Connor, C. E. (2004). Underlying principles of visual shape selectivity in posterior inferotemporal cortex. Nat. Neurosci., 7(8), 880–886.
Brincat, S. L., & Connor, C. E. (2006). Dynamic shape synthesis in posterior inferotemporal cortex. Neuron, 49(1), 17–24.
Edelman, S., & Poggio, T. (1991). Models of object recognition. Curr. Opin. Neurobiol., 1(2), 270–273.
Fei-Fei, L., Fergus, R., & Perona, P. (2006). One-shot learning of object categories. IEEE Trans. Pattern Analysis Machine Intelligence, 28(4), 594–611.
Felleman, D. J., & Van Essen, D. C. (1991). Distributed hierarchical processing in the primate cerebral cortex. Cereb. Cortex, 1(1), 1–47.
Fujita, I., Tanaka, K., Ito, M., & Cheng, K. (1992). Columns for visual features of objects in monkey inferotemporal cortex. Nature, 360(6402), 343–346.
Gattass, R., Sousa, A. P. B., & Gross, C. G. (1988). Visuotopic organization and extent of V3 and V4 of the macaque. J. Neurosci., 8(6), 1831–1845.
Hubel, D. H., & Wiesel, T. N. (1959). Receptive fields of single neurones in the cat’s striate cortex. J. Physiol., 148, 574–591.
Hubel, D. H., & Wiesel, T. N. (1965). Receptive fields and functional architecture in two nonstriate visual areas (18 and 19) of the cat. J. Neurophysiol., 28(2), 229–289.
Hubel, D. H., & Wiesel, T. N. (1968). Receptive fields and functional architecture of monkey striate cortex. J. Physiol., 195(1), 215–243.
Hung, C. P., Kreiman, G., Poggio, T., & DiCarlo, J. J. (2005). Fast readout of object identity from macaque inferior temporal cortex. Science, 310(5749), 863–866.
Ito, M., Tamura, H., Fujita, I., & Tanaka, K. (1995). Size and position invariance of neuronal responses in monkey inferotemporal cortex. J. Neurophysiol., 73(1), 218–226.
Lowe, D. (2004). Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis., 60(2), 91–110.
Marr, D., & Nishihara, H. K. (1978). Representation and recognition of the spatial organization of three-dimensional shapes. Proc. R. Soc. Lond. B Biol. Sci., 200(1140), 269–294.
Milner, P. M. (1974). A model for visual shape recognition. Psychol. Rev., 81(6), 521–535.
Moghaddam, B., & Pentland, A. (1997). Probabilistic visual learning for object representation. IEEE Trans. Pattern Analysis Machine Intelligence, 19(7), 696–710.
Movshon, J. A., Adelson, E. H., Gizzi, M. S., & Newsome, W. T. (1985). The analysis of moving visual patterns. In C. Chagas, R. Gattass, & C. Gross (Eds.), Pattern recognition mechanisms (pp. 117–151). New York: Springer.
Murase, H., & Nayar, S. K. (1995). Visual learning and recognition of 3-D objects from appearance. Int. J. Comput. Vis., 14, 5–24.
Olshausen, B. A., & Field, D. J. (1996). Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature, 381(6583), 607–609.
Pack, C. C., Berezovskii, V. K., & Born, R. T. (2001). Dynamic properties of neurons in cortical area MT in alert and anaesthetized macaque monkeys. Nature, 414(6866), 905–908.
Pasupathy, A., & Connor, C. E. (1999). Responses to contour features in macaque area V4. J. Neurophysiol., 82(5), 2490–2502.
Pasupathy, A., & Connor, C. E. (2001). Shape representation in area V4: Position-specific tuning for boundary conformation. J. Neurophysiol., 86(5), 2505–2519.
Pasupathy, A., & Connor, C. E. (2002). Population coding of shape in area V4. Nat. Neurosci., 5(12), 1332–1338.
Perrett, D. I., Rolls, E. T., & Caan, W. (1982). Visual neurones responsive to faces in the monkey temporal cortex. Exp. Brain Res., 47(3), 329–342.
Riesenhuber, M., & Poggio, T. (1999). Hierarchical models of object recognition in cortex. Nat. Neurosci., 2(11), 1019–1025.
Ringach, D. L., & Shapley, R. (1996). Spatial and temporal properties of illusory contours and amodal boundary completion. Vis. Res., 36(19), 3037–3050.
Rodman, H. R., & Albright, T. D. (1989). Single-unit analysis of pattern-motion selective properties in the middle temporal visual area (MT). Exp. Brain Res., 75(1), 53–64.
Rousselet, G. A., Mace, M. J., & Fabre-Thorpe, M. (2003). Is it an animal? Is it a human face? Fast processing in upright and inverted natural scenes. J. Vis., 3(6), 440–455.
Salinas, E., & Abbott, L. F. (1996). A model of multiplicative neural responses in parietal cortex. Proc. Natl. Acad. Sci. USA, 93(21), 11956–11961.
Selfridge, O. G. (1959). Pandemonium: A paradigm for learning. In The mechanization of thought process. London: H.M. Stationery Office.
Serre, T., Oliva, A., & Poggio, T. (2007). A feed forward architecture accounts for rapid categorization. Proc. Natl. Acad. Sci. USA, 104(15), 6424–6429.
Sigala, N., & Logothetis, N. K. (2002). Visual categorization shapes feature selectivity in the primate temporal cortex. Nature, 415(6869), 318–320.
Smith, M. A., Majaj, N. J., & Movshon, J. A. (2005). Dynamics of motion signaling by neurons in macaque area MT. Nat. Neurosci., 8(2), 220–228.
Sutherland, N. S. (1968). Outlines of a theory of visual pattern recognition in animals and man. Proc. R. Soc. Lond. B Biol. Sci., 171, 297–317.
Tanaka, K., Saito, H., Fukada, Y., & Moriya, M. (1991). Coding visual images of objects in the inferotemporal cortex of the macaque monkey. J. Neurophysiol., 66(1), 170–189.
Thorpe, S., Fize, D., & Marlot, C. (1996). Speed of processing in the human visual system. Nature, 381(6582), 520–522.
Tsunoda, K., Yamane, Y., Nishizaki, M., & Tanifuji, M. (2001). Complex objects are represented in macaque inferotemporal cortex by the combination of feature columns. Nat. Neurosci., 4(8), 832–838.
Turk, M., & Pentland, A. (1991). Eigenfaces for recognition. J. Cogn. Neurosci., 3(1), 71–86.
Ungerleider, L. G., & Mishkin, M. (1982). Two cortical visual systems. In D. G. Ingle, M. A. Goodale, & R. J. Q. Mansfield (Eds.), Analysis of visual behavior (pp. 549–586). Cambridge, MA: MIT Press.
Vetter, T., Hurlbert, A., & Poggio, T. (1995). View-based models of 3D object recognition: Invariance to imaging transformations. Cereb. Cortex, 5(3), 261–269.
Vinje, W. E., & Gallant, J. L. (2000). Sparse coding and decorrelation in primary visual cortex during natural vision. Science, 287(5456), 1273–1276.
Wang, Y., Fujita, I., & Murayama, Y. (2000). Neuronal mechanisms of selectivity for object features revealed by blocking inhibition in inferotemporal cortex. Nat. Neurosci., 3(8), 807–813.
Weber, M., Welling, M., & Perona, P. (2000). Towards automatic discovery of object categories. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (Vol. 2, pp. 101–108).
Wolfe, J. M., & Bennett, S. C. (1997). Preattentive object files: Shapeless bundles of basic features. Vis. Res., 37(1), 25–43.
Yamane, Y., Carlson, E. T., Bowman, K. C., Wang, Z., & Connor, C. E. (2008). A neural code for three-dimensional object shape in macaque inferotemporal cortex. Nat. Neurosci., 11(11), 1243–1244.
abstract Conventional wisdom has long held that face recognition develops very slowly throughout infancy, childhood, and adolescence, with perceptual experience as the primary engine of this development. However, striking new findings from just the last few years have overturned much of this traditional view by demonstrating genetic influences on the face recognition system as well as impressive face discrimination abilities that are present in newborns and in monkeys that were reared without ever seeing a face. Nevertheless, experience does play a role, for example, in narrowing the range of facial subtypes for which discrimination is possible and perhaps in increasing discrimination abilities within that range. Here we first describe the cognitive and neural characteristics of the adult system for face recognition, and then we chart the development of this system over infancy and childhood. This review identifies a fascinating new puzzle to be targeted in future research: All qualitative aspects of adult face recognition measured behaviorally are present very early in development (by 4 years of age; all that have been tested are also present in infancy), yet functional magnetic resonance imaging and event-related potential evidence shows very late maturity of face-selective neural responses (with the fusiform face area increasing substantially in volume between age 7 years and adulthood).

Elinor McKone and Kate Crookes, Department of Psychology, Australian National University, Canberra, Australia
Nancy Kanwisher, McGovern Institute for Brain Research and Department of Brain and Cognitive Science, Massachusetts Institute of Technology, Cambridge, Massachusetts

Introduction

One of the most impressive skills of the human visual system is our ability to identify a specific individual from a brief glance at their face, thus distinguishing that individual from hundreds of other people we know, despite the wide variations in the appearance of each face as it changes in viewpoint, lighting, emotional expression, and hairstyle. Though many mysteries remain, important insights have been gleaned over the last two decades about the cognitive and neural mechanisms that enable humans to recognize faces. Here, we address an even more difficult and fundamental question: How does the machinery of face recognition get wired up during development in the first place?
Our review of the available evidence supports a view of the development of face recognition that is dramatically different from the one suggested by the first studies in the field. Twenty years ago, the standard theory was that core aspects of the ability to discriminate faces were not present until 10 years of age, and their emergence and eventual maturity were determined primarily by experience (Carey & Diamond, 1977; Carey, Diamond, & Woods, 1980). This position has been overturned by recent findings that demonstrate striking abilities even in neonates and by mounting evidence of genetic contributions.
We organize our review by age group. Throughout, we ask how the available data address the following fundamental theoretical questions:
1. What are the inherited genetic contributions to the specification of the adult system for processing facial identity information?
2. What is derived from experience?
3. How exactly do genes and/or experience work separately or together across the course of development to produce the adult system?

The perception of face identity in adulthood

We begin with a characterization of the end state of development: the cognitive and neural basis of the perception of facial identity in adults. Note that this is a major topic in its own right, with much internal theoretical debate. However, to facilitate our present interest in the developmental course of face recognition, we focus on empirical phenomena, especially those that are well established in adults and have subsequently been tested in development.

Core Behavioral Properties of Face Identity Perception in Adult Humans Basic properties of face identification in adults are as follows. Identification is more accurate when faces are upright than when they are inverted (i.e., upside
Development: Infancy
In exploring genetic and experience-based contributions to
face recognition via infancy studies, several interrelated
questions are relevant. First, which abilities, if any, are
present at birth? Visual abilities that are present in neonates
(or in monkeys that have been deprived of all face input)
cannot be derived from experience and therefore provide
the only method of revealing genetic influences in isolation
from any visual learning. Second, if babies are born with a
face representation, is its purpose merely to draw attention
to faces (cf. CONSPEC in Morton & Johnson, 1991) or to
support individuation? Third, how broadly tuned is any such
representation: broad enough to cover any primate face,
specific to own-species faces, or perhaps even to own-race
faces? Finally, which, if any, of the types of effects of experi-
ence in early infancy that are found in other perceptual and
cognitive domains occur for faces: Improvements with
increasing experience? Perceptual narrowing (i.e., destruc-
tion of earlier ability)? Critical periods? Studies of these
topics published within the last few years have dramatically
altered our understanding of infant face recognition.
In a classic result, newborns (median age: 9 minutes) track
an upright “paddle face” (figure 32.2A) further than versions
in which the position of the internal blobs is scrambled or
inverted (Goren, Sarty, & Wu, 1975; Johnson, Dziurawiec,
Ellis, & Morton, 1991). Although it has been suggested that
this preference could arise from general visual biases (e.g.,
for stimuli with more elements in the upper visual field;
Simion, Macchi Cassia, Turati, & Valenza, 2003), prefer-
ence only for the normal contrast polarity of a (Caucasian)
face (Farroni et al., 2005) argues for a level of specificity to
facelike structure. Thus humans are born with some type of
innate preference that, at the very least, attracts infants’
attention to faces. Note that the innate representation supporting face preference could be different from that supporting face individuation in adults (Johnson, 2005); indeed, a finding that neonates track faces in the temporal but not nasal visual field (Simion, Valenza, Umlita, & Dalla Barba, 1998) suggests a subcortical rather than cortical origin.
Figure 32.2 Face perception without experience. (A) Newborn humans (<1 hour old) track the “paddle face” on the left further than the scrambled version (Morton & Johnson, 1991). (B) Newborn humans (<3 days) look longer at the novel than habituated face, indicating recognition of face identity even across view change (Turati et al., 2008). (C) Japanese macaques raised with no exposure to faces can, on first testing, discriminate very subtle differences between individual monkey faces (including differences both in shape and in spacing of internal features) and can also do this for human faces (Sugita, 2008).
Our concern in this chapter is primarily with the development of face individuation ability. This can be measured in infants by looking time measures that assess preference and dishabituation-to-perceived-novelty. A classic finding is that neonates less than 4 days old can discriminate their mother from similar-looking women (Pascalis, de Schonen, Morton, Deruelle, & Fabre-Grenet, 1995; Bushnell, 2001), although
[Figure 32.3 plots. All plots show age in years on the x-axis (A = adult). Y-axis labels: % correct and d' in old-new memory; latency to respond; % 'same' responses to target half-face; latency to name 'familiar' (ms); % correct in 2AFC memory. Condition labels: upright vs. inverted; distinctive vs. typical; aligned vs. unaligned; studied vs. unstudied; spacing changed vs. unaltered. One panel is attributed to Carey, 1981.]
Figure 32.3 Behavioral face recognition effects in the preschooler to adult age range. A basic finding is of overall improvement with age: higher accuracy or lower reaction time. Note that in part C, the left and middle plots show studies in which the researchers deliberately removed this trend by using smaller learning set sizes in younger children. Our major point is that apparent developmental trends in the strength of core effects (size of inversion effect, size of composite effect, ability to represent recently seen faces in implicit memory, etc.) depend on whether and how room to show effects is potentially restricted.
Three studies have used fMRI to scan children age 5 years to adult on face and object tasks, enabling these studies to track the existence and size of face-selective regions of cortex (figure 32.4). (A fourth study will not be discussed here because it used such liberal criteria to define “FFAs” that the regions that were so identified were clearly not face-selective even in adults; see figure 1d–f in that study, Gathers et al., 2004.) Considering qualitative effects, evidence of a face-selective FFA has been found in most children at the youngest ages tested. Although no FFA was revealed in young children by group analyses (in which all subjects are aligned in a common space; 5- to 8-years old: Scherf et al., 2007; 8–10 years old: Aylward et al., 2005), in the two studies reporting individual-subject analyses, Scherf and colleagues found an FFA in 80% of the children in 5- to 8-year-olds (albeit at a very liberal statistical threshold), and Golarai and colleagues (2007) found an FFA in 85% of children in their 7- to 11-year-old group (using a more standard statistical threshold). One study (Passarotti, Smith, DeLano, & Huang, 2007) also reported an inversion effect (a higher response to inverted faces than to upright faces) in the region of the right (but not the left) FFA in children 8–11 years of age (and an effect in the opposite direction in adults). Regarding ERPs, young children (like infants) show both face-selective responses and inversion effects upon these (see figures 32.5 and 32.6; Taylor, Batty, & Itier, 2004). These fMRI and ERP findings in children add to the infant data to confirm that at least some form of face-specific neural machinery is established early.
Figure 32.5 ERPs from right posterior temporal scalp locations in response to face stimuli, separately for each age group. (From Taylor et al., 2004.) (See color plate 48.)
Quantitatively, the neural machinery that is involved in face perception demonstrates substantial changes in face-selective neural responses continuing late into development. In all three fMRI studies, the FFA increases markedly in volume between childhood and adulthood (Aylward et al., 2005; Golarai et al., 2007; Scherf et al., 2007), even though total brain volume does not change substantially after age 5 years. These studies clearly show that the rFFA is still changing late in life, certainly after age 7 and in some studies much later.
Comparing fMRI data across children and adults is fraught with potential pitfalls. Children move more in the scanner and are less able to maintain attention on a task. These or other differences between children and adults could in principle explain the change in volume of the rFFA. However, notably, control areas that are identified in the same scanning sessions do not change with age. For example, object-responsive regions and the scene-selective “parahippocampal place area” in the right hemisphere or rPPA (Epstein & Kanwisher, 1998) did not change in volume from childhood to adulthood (Golarai et al., 2007; Scherf et al., 2007), although somewhat surprisingly, Golarai and colleagues found that the lPPA did increase in volume with age. These findings reassure us that the changes in the rFFA with age are not due to across-the-board changes in the ability to extract good functional data from young children.
Golarai and colleagues (2007) asked how changes in the rFFA relate to changes in behavioral face recognition over development. Right FFA size was correlated (separately in children and adolescents but not in adults) with face recognition memory but not with place or object memory. Conversely, lPPA size was correlated (in all age groups independently) with place memory but not with object or face memory. This double dissociation of behavioral correlations clearly associates the rFFA with changes in face recognition measured behaviorally.
ERP findings are consistent with the evidence from fMRI that the cortical regions that are involved in face recognition continue to change well into the teenage years. Face-related ERPs show gradual changes in scalp distribution, latency, and amplitude into the mid-teen years (figures 32.5 and 32.6). Both the early P1 component and the later N170 component show gradual decreases in latency from age 4 to adulthood. Regarding neural inversion effects, late developmental changes are found with both fMRI and ERP (see figure 32.5), including a reversal of the direction of the inversion effect between children and adults in both methods (Taylor et al., 2004; Passarotti et al., 2007). Future research might best approach this question not just by measuring mean responses to upright versus inverted faces, but also by using identity-specific adaptation to ask when the better discrimination of upright than inverted faces seen in adulthood emerges (Yovel & Kanwisher, 2005; Mazard et al., 2006).
Comparing Development for Behavioral and Neural Measures Taking the findings from the 4-to-adult range together with the infant literature, we can draw the following conclusions. First, the results regarding qualitatively adultlike face processing appear to agree well across behavioral and neural measures; that is, just as all behavioral face recognition effects have been obtained in the youngest age groups tested, face-selective neural machinery as revealed by fMRI, ERPs, NIRs, and single-cell recording has also been found in the youngest children and infants tested. Nonetheless, fMRI data are not available for children younger than 5–8 (pooled together), and the ERP studies in infants and children often go in opposite directions from those in adults. For example, the inversion effect on the N170 switches polarity between childhood and adulthood, as shown in figure 32.6, despite maintaining the same polarity in behavior.
Second, the evidence for quantitative development is less clear. It might be that the improvements with age on behavioral tasks do reflect ongoing development of face perception itself; if so, this could agree neatly with the increasing size of the FFA. As we have noted, however, findings such as those shown in figures 32.3B and 32.3C suggest that behavioral face perception could be fully mature early and that ongoing behavioral improvements with age reflect changes in other, more general, cognitive factors. This view would produce an apparent discrepancy (behavioral maturity arising well before maturity of relevant cortical regions) that would need to be resolved. If this is the case, two ideas
Figure 32.7 For each property of face processing, we indicate for each age group whether that property is qualitatively present ( ), debatable (?), not present (X), or not yet tested (gray). Deprived = monkeys deprived of face input from birth. Note: All references can be found in text except inversion effect on spacing sensitivity aged 6 years to adult is from Mondloch et al. (2002) and adaptation aftereffect aged 9 years to adult is from Pellicano, Jeffery, Burr, & Rhodes (2007).
Age groups (columns): ≤ 3 months; later infancy; 4, 5, 6, 7, 8, 9, 10, and 11 years; adults.
Behavioral Properties (rows): ability to discriminate individual faces; inversion effect on discrimination (looking time or recognition memory); composite-like effect, upright not inverted; composite effect; part-whole effect, upright not inverted; part-in-spacing-altered-whole effect, upright not inverted; sensitivity to spacing changes; inversion effect on spacing sensitivity; perceptual bias to upright in superimposed faces; distinctiveness effects; adaptation aftereffects; attractiveness preference, upright not inverted.
Neural Properties (rows): face-selective cells, macaques; face-selective ERPs; FFA present; some type of inversion effect on neural response.
Perceptual Narrowing (row): looking time discrimination of other race/species faces (X X).
might be worth exploring. It might be that the measured size of the FFA in children is affected by top-down strategic processing that (for some unknown reason) affects faces and not objects. Another possibility is that the FFA might play some role in the long-term storage of individual faces (e.g., it shows repetition priming; Pourtois, Schwartz, Seghier, Lazeyras, & Vuilleumier, 2005; Williams, Berberovic, & Mattingley, 2007) and that the increased size of the FFA could arise simply because people continue to learn faces across life; this idea would have to propose that the number of new faces learned is much greater than the number of new objects.
abstract One of the most impressive capacities of the visual system is the ability to infer the three-dimensional structure of the environment from images formed on the two retinas. Although several areas of visual cortex are involved in computing depth, the precise roles of different areas in three-dimensional vision remain unclear. It is important to establish how neural representations of depth in different brain regions are specialized to perform different tasks. This chapter summarizes studies that establish such links between representation and function in visual area MT. The nature of the representation of binocular disparity in MT is first considered, along with the functional roles of MT in coarse and fine depth discrimination. The recently discovered role of area MT in computing depth from motion parallax is then examined. These findings are compared with those from other visual areas to consider possible functional streams of analysis in three-dimensional vision.
gregory c. deangelis Department of Brain and Cognitive Sciences, Center for Visual Science, University of Rochester, Rochester, New York
We carry out our daily activities in a three-dimensional (3D) environment. Therefore a fundamental task for the visual system is to construct a 3D representation of our surroundings. This is difficult because the image formed on the retina of each eye is a two-dimensional projection of 3D space; hence there is no direct quantitative information about depth in a single retinal image. Rather, the depth structure of the scene must be reconstructed by the brain.
The visual system makes use of a wide variety of cues to estimate depth relationships (Howard & Rogers, 1995, 2002). Broadly speaking, these cues can be placed into two categories that I shall label pictorial cues and geometric cues. Pictorial cues to depth are those that are present in a single snapshot of the scene, including occlusion, perspective, shading, relative size, texture gradients, and blur. Together, these cues can be potent, as is evidenced by the fact that we can infer depth relationships in photographs. However, they generally provide only ordinal depth information or require prior knowledge to provide metric depth information. For example, the size of an object in the retinal image can be used to estimate the distance to that object if one knows the true physical size of the object.
Geometric depth cues are those that arise when a scene is viewed from multiple vantage points. For species with frontally located eyes, the horizontal separation between the two eyes generates systematic differences, known as binocular disparities, between the images projected onto the two retinas (figure 33.1A). Thus images from two simultaneous vantage points are available. Binocular disparity (hereafter referred to as disparity) is known to be sufficient to provide precise depth discrimination in the absence of other depth cues, as demonstrated with random-dot stereograms (Howard & Rogers, 1995; Julesz, 1971; Parker, 2007). Combined with an estimate of viewing distance (the distance from the eye to the point of fixation), disparity can provide quantitative estimates of the location of objects in depth. Another geometric cue, motion parallax, arises because of the translation of the observer, as illustrated in figure 33.1B. As the observer’s head moves from left to right, for example, the vantage point of the left eye changes over time. If the observer’s head moves through one interocular distance, then the image that is projected onto the retina of the left eye will vary over time, the endpoint being the same view that would be seen by the right eye at the beginning of the movement (figure 33.1B). Thus there is a formal geometric similarity between disparity and motion parallax cues, at least when the latter arise because of lateral head movements. This means that motion parallax can provide metric depth information when a subject views the scene with one eye, as long as the eye moves relative to the scene. Not surprisingly, then, humans can make judgments of depth from motion parallax that are similar in precision to judgments based on disparity (Rogers & Graham, 1979, 1982). As we shall see later, the similar geometry of these two cues suggests that they might be processed using the same neural mechanisms.
Where and how are depth cues processed in the brain? For the pictorial depth cues, very little is known about the neural mechanisms that lead to depth percepts; therefore I shall not consider pictorial cues further here. Until recently, very little was also known about the neural basis of depth from motion parallax, and we shall consider the available physiological information in the last section of this chapter. By comparison, a great deal is known about the neural circuits that process disparity cues for depth perception, as has
Figure 33.2 Schematic illustration of random-dot stereogram stimulus and example disparity tuning curves measured with this stimulus. (A) A circular patch of moving dots having variable disparity was presented over the receptive field (circle, not present in the actual display) of an MT neuron. Solid and open dots within the receptive field denote the images seen by the left and right eyes; the separation between each pair of open and solid dots is the binocular disparity. The remainder of the screen was filled with stationary dots presented with zero disparity (gray dots). The monkey was required to maintain fixation on the fixation point during each trial. (B) Disparity tuning curves for seven representative MT neurons. Solid symbols and error bars show the mean response to each disparity ± standard error. The smooth curve through each data set is the best-fitting Gabor function. Neurons are presented (from top to bottom) in order of their preferred disparities, from large Near to large Far. The vertical scale bar corresponds to 100 spikes per second. (Adapted from DeAngelis & Uka, 2003.)
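The caption for figure 33.2 states that each tuning curve was fitted with a Gabor function. The sketch below illustrates what such a curve looks like, assuming the common parameterization of a Gabor as a baseline plus a Gaussian envelope multiplied by a cosine carrier. The parameter names, the example values, the sign convention (negative disparity = Near), and the grid-search helper are all illustrative assumptions; they are not taken from DeAngelis & Uka (2003).

```python
import math


def gabor_tuning(d, baseline, amplitude, center, sigma, freq, phase):
    """Response (spikes/s) of a hypothetical neuron to disparity d (degrees).

    Assumed form: baseline + amplitude * Gaussian envelope * cosine carrier.
    """
    envelope = math.exp(-0.5 * ((d - center) / sigma) ** 2)
    carrier = math.cos(2.0 * math.pi * freq * (d - center) + phase)
    return baseline + amplitude * envelope * carrier


def preferred_disparity(params, lo=-2.0, hi=2.0, step=0.01):
    """Disparity on a regular grid that gives the largest modeled response."""
    n = int(round((hi - lo) / step))
    grid = [lo + i * step for i in range(n + 1)]
    return max(grid, key=lambda d: gabor_tuning(d, **params))


# Hypothetical "Near-preferring" cell: even-symmetric curve centered at -0.5 deg.
near_cell = dict(baseline=20.0, amplitude=60.0, center=-0.5,
                 sigma=0.6, freq=0.4, phase=0.0)

if __name__ == "__main__":
    for d in (-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5):
        print(f"disparity {d:+.1f} deg -> {gabor_tuning(d, **near_cell):6.1f} spikes/s")
    print("preferred disparity:", round(preferred_disparity(near_cell), 2))
```

With a phase of zero the curve is symmetric about its center, so the preferred disparity equals the center parameter; a nonzero phase shifts the peak away from the center and yields the asymmetric (odd-symmetric) tuning shapes that a Gabor can also describe.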
abstract The brain combines different sources of sensory information to optimize perception. Information from different sensory modalities is often seamlessly integrated into a unified percept with improved behavioral performance. Here we summarize the first attempt to understand the neural basis of multisensory cue integration in the context of a behavioral task in which cues are combined according to statistically optimal predictions. We describe multisensory cue integration in the macaque extrastriate visual cortex using a simple heading discrimination task in which monkeys were asked to judge their direction of self-motion using visual (optic flow) and extraretinal (vestibular) cues. Results suggest that rhesus macaques and humans use similar computational principles for combining multiple sensory cues and that these principles can be accounted for by the properties of individual neurons in multisensory cortical areas.
dora e. angelaki and yong gu Department of Anatomy and Neurobiology, Washington University School of Medicine, St. Louis, Missouri
gregory c. deangelis Department of Brain and Cognitive Sciences, Center for Visual Science, University of Rochester, New York
A fundamental aspect of our sensory experience is that information from different modalities is often seamlessly integrated into a unified percept. Examples of multisensory cue integration include a number of well-known sensory illusions, such as the McGurk effect (McGurk & MacDonald, 1976), ventriloquism (Bertelson & Radeau, 1981), and the illusion of self-motion triggered by visual motion, known as vection (Previc, 1992). Combining sensory inputs can improve behavioral performance on a number of tasks, including object recognition (Molholm, Ritter, Javitt, & Foxe, 2004), stimulus detection (Frassinetti, Bolognini, & Ladavas, 2002), and localization (Hairston et al., 2003).
Recently, understanding of multisensory integration has gained momentum, as several psychophysical studies have shown that human observers combine sensory cues according to a statistically optimal weighting scheme derived from Bayesian probability theory (Mamassian, Landy, & Moloney, 2002; Kersten, Mamassian, & Yuille, 2004; Knill & Pouget, 2004). The basic concept is that there exists inherent uncertainty in the information that is available to our senses, as well as in the encoding of that information by our sensory apparatus. Consequently, perceptual judgments should rely on computations involving conditional probability density functions, sensory likelihoods, and prior probability functions that are consistent with the Bayesian framework (Clark & Yuille, 1990; Knill & Pouget, 2004). Assuming Gaussian distributions of the underlying sensory information, independent noise sources, and broad prior distributions relative to the individual cue likelihoods, it is predicted that an optimal estimator (in terms of minimizing the variance of the final estimate) will combine sensory information using a rule that weights the cues according to their reliability (Ernst & Banks, 2002; Knill & Saunders, 2003). As a result, weaker cues would have a lower weighting in the bimodal estimate. In addition, the variance of the bimodal estimate ($\sigma_{bi}^2$, as assessed by psychophysical performance) should be lower than that of the unimodal estimates, $\sigma_1^2$ and $\sigma_2^2$, as given by (see figure 34.1)
$$\sigma_{bi}^2 = \frac{\sigma_1^2 \, \sigma_2^2}{\sigma_1^2 + \sigma_2^2} \qquad (1)$$
These predictions have been tested in human psychophysical experiments using a number of different paradigms (van Beers, Sittig, & Gon, 1999; Ernst & Banks, 2002; Knill & Saunders, 2003; Alais & Burr, 2004; Hillis, Watt, Landy, & Banks, 2004). The basic result is remarkably consistent across studies: When combining multiple sensory cues, humans perform as nearly optimal Bayesian observers. Yet no direct neural correlates of these phenomena have been available, in part owing to the lack of a suitable animal model for combined behavioral and electrophysiological experiments in the context of cue integration. Rather, studies of multisensory integration at the neuronal level have often been performed in either anesthetized or passively fixating animals, and the pioneering studies of Stein and colleagues emphasized nonlinearity (superadditivity) as the hallmark of multisensory integration (Meredith & Stein, 1983, 1986, 1996; Wallace, Wilkinson, & Stein, 1996).
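Equation 1 and the reliability-weighting rule can be illustrated with a short numerical sketch. It assumes the standard inverse-variance form of the weights (each cue weighted by its reliability, 1/σ², normalized to sum to one), which is the usual reading of "weights the cues according to their reliability" under the Gaussian and independence assumptions stated above. The function names and example values are hypothetical and chosen only for illustration.

```python
import math


def combine_cues(sigma1, sigma2):
    """Return (w1, w2, sigma_bi) for two independent Gaussian cues.

    Weights follow the assumed inverse-variance rule; sigma_bi follows
    equation 1: var_bi = var1 * var2 / (var1 + var2).
    """
    if sigma1 <= 0 or sigma2 <= 0:
        raise ValueError("standard deviations must be positive")
    r1, r2 = 1.0 / sigma1 ** 2, 1.0 / sigma2 ** 2  # reliabilities
    w1 = r1 / (r1 + r2)
    w2 = r2 / (r1 + r2)
    var_bi = (sigma1 ** 2 * sigma2 ** 2) / (sigma1 ** 2 + sigma2 ** 2)
    return w1, w2, math.sqrt(var_bi)


def combined_estimate(x1, x2, sigma1, sigma2):
    """Reliability-weighted average of two single-cue estimates."""
    w1, w2, _ = combine_cues(sigma1, sigma2)
    return w1 * x1 + w2 * x2


if __name__ == "__main__":
    # Equally reliable cues: each gets weight 0.5 and sigma shrinks by 1/sqrt(2).
    w1, w2, s_bi = combine_cues(2.0, 2.0)
    print(w1, w2, round(s_bi, 4), f"reduction = {1 - s_bi / 2.0:.1%}")
    # Unequal cues: the more reliable cue dominates the combined estimate.
    print(combine_cues(1.0, 3.0))
    print(combined_estimate(x1=0.0, x2=10.0, sigma1=1.0, sigma2=3.0))
```

When the two cues are equally reliable, the combined standard deviation is the single-cue value divided by √2, a reduction of roughly 29%, which is the largest proportional improvement equation 1 allows. When one cue is much noisier than the other, the combined variance approaches that of the better cue and the benefit of integration becomes hard to detect.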
angelaki, gu, and deangelis: multisensory integration in macaque visual cortex 499
Figure 34.1 Schematic illustration of one of the predictions of optimal cue integration. (A) Probability density functions (sensory likelihoods) corresponding to two cues: cue 1 (solid curve, e.g., vestibular) and cue 2 (dashed curve, e.g., visual). It is predicted that the bimodal probability distribution (gray curve) will be narrower than those for the individual cues (equation 1). This improvement will be largest when the two single cues have the same standard deviation (σ). (B) Expected performance for an ideal observer judging heading on the basis of the probability distributions in panel A. In this case, the threshold in the combined (bimodal) condition is predicted to be lower than both single-cue thresholds. (Modified with permission from Ernst & Banks, 2002.)
Here we summarize the first attempt to understand the neural basis of multisensory cue integration in the context of a behavioral task in which cues are combined according to the statistically optimal predictions. We describe multisensory cue integration in the macaque extrastriate visual cortex using a simple heading discrimination task in which monkeys are asked to judge their direction of self-motion using visual (optic flow) and extraretinal (vestibular) cues (figure 34.2A).
Perception of heading from optic flow and vestibular signals
How do we perceive our direction of self-motion through space? To navigate effectively through a complex three-dimensional environment, we must accurately estimate our own motion relative to objects around us. Self-motion perception is an intriguing problem in sensory integration, requiring the neural combination of visual signals (e.g., optic flow), vestibular signals regarding head motion, and perhaps somatosensory and proprioceptive cues (Dichgans & Brandt, 1974; Hlavacka, Mergner, & Schweigart, 1992; Hlavacka, Mergner, & Bolha, 1996). In particular, patterns of image motion across the retina (optic flow) can provide strong cues to self-motion, as is evidenced by the fact that optic flow alone can elicit the illusion of self-motion. As early as 1875, Ernst Mach described self-motion sensations (i.e., circular and linear vection) induced by visual stimuli. Since then, several other studies have characterized the behavioral observation that large-field optic flow stimulation induces self-motion perception (Brandt, Dichgans, & Koenig, 1973; Berthoz, Pavard, & Young, 1975; Dichgans & Brandt, 1978). Although self-motion perception generally involves the analysis of observer translation and rotation, we shall limit our scope here to translational movements. Thus the central issue we explore is how we compute our direction of heading.
Many visual psychophysical and theoretical studies have shown that optic flow provides powerful cues to heading (Gibson, 1950) and have examined how heading can be computed from optic flow (Warren, 2003). In parallel, independent information about the motion of our head or body in space can arise from the vestibular system. Specifically, vestibular signals provide information about the angular and linear accelerations of the head in space (Angelaki, 2004; Angelaki & Cullen, 2008) and thus provide important cues to self-motion estimation. A role of the vestibular system in the perception of self-motion has long been acknowledged (Guedry, 1974, 1978; Benson, Spencer, & Stott, 1986; Telford, Howard, & Ohmi, 1995).
In one such heading discrimination task, the subject experiences forward motion with a small leftward or rightward component. At the end of each trial, the task requires an eye movement to report whether the subject experienced leftward or rightward motion (figure 34.2A). Both humans (Smith, Bush, & Stone, 2002) and monkeys (Gu, DeAngelis, & Angelaki, 2007) can be quite accurate in discriminating their heading direction in the absence of optic flow, with thresholds that can be as small as 1–3 degrees during motion in darkness. These threshold values during motion in darkness are comparable to (although larger than) those described in visual heading discrimination tasks (Warren & Hannon, 1990; Royden, Banks, & Crowell, 1992; van den Berg & Brenner, 1994; Stone & Perrone, 1997). Vestibular heading thresholds increase more than ten-fold after bilateral labyrinthectomy (figure 34.2B, solid symbols), suggesting that vestibular information is critical for heading discrimination. Although some recovery was seen over the first few days postlesion, thresholds remained elevated when measured 3–6 months following the lesion (Gu et al., 2007). In contrast, labyrinthectomy had a very modest effect on visual
heading thresholds where the animals remained stationary and heading was specified solely by optic flow (figure 34.2B, open symbols).
To test whether macaques, like humans, combine sensory cues according to a statistically optimal, Bayesian-style weighting scheme, an experiment was performed in which heading was specified not only by optic flow alone or inertial motion alone, but also by bimodal stimulation when congruent visual and vestibular cues were presented together. Three stimulus conditions were randomly interleaved within a single block of trials: (1) a vestibular condition, in which heading was defined solely by inertial motion cues by translating the animal on a motion platform; (2) a visual condition, in which heading was defined solely by optic flow provided by a projector that was mounted on the platform; and (3) a combined condition consisting of congruent inertial motion and optic flow cues. Each movement trajectory, either real (vestibular condition) or visually simulated (visual condition), had a duration of 2 s and consisted of a Gaussian velocity profile (for details, see Gu, Watkins, Angelaki, & DeAngelis, 2006).
Average behavior, in the form of psychometric functions, from one of the animals is illustrated in figure 34.2C. Note that the reliability of the individual cues was roughly equated during training by reducing the coherence of the visual motion stimulus such that visual and vestibular thresholds were approximately equal (figure 34.2C, open/solid circles and solid/dashed curves). This balancing of the two cues is crucial, as it affords the maximal opportunity to observe improvement in performance under cue combination (Ernst & Banks, 2002). In the combined condition (figure 34.2C, gray circles and curve), the monkey’s heading threshold was substantially smaller, as evidenced by the steeper slope of the gray curve. If the monkey combined the two cues optimally, as predicted by Bayesian cue integration principles, thresholds should be reduced by approximately 30% under cue combination (equation 1). That bimodal behavioral thresholds are similar to the optimal cue integration predictions is illustrated in figure 34.2D for data from three animals. Thus, like humans, macaques can combine multiple sensory cues nearly optimally to improve perceptual performance.
This demonstration of near-optimal cue integration in the monkey’s behavior provides a unique opportunity to search for the neural basis of Bayesian inference at the level of individual neurons and populations of neurons. In identifying candidate populations of neurons that integrate visual and vestibular signals for self-motion perception, we seek neurons that are tuned for direction of motion in optic flow fields and that carry vestibular signals related to the direction of head motion through space. As will be summarized next, such visual/vestibular convergence occurs in multiple cortical areas. In contrast, responsiveness to optic flow is generally absent in subcortical areas with vestibular-related activities, including the brain stem vestibular and deep cerebellar nuclei (Bryan, Meng, DeAngelis, & Angelaki, 2007) and primate thalamus (Meng, May, Dickman, & Angelaki, 2007). In the following, we first briefly summarize what has been previously known regarding visual/vestibular convergence in the macaque cortex; we then describe in more detail how MSTd neurons respond in the context of the multimodal heading discrimination task.
Responses of primate cortical neurons to optic flow and vestibular stimuli
Optic flow-sensitive neurons have been found in the dorsal portion of the medial superior temporal area (MSTd) (Tanaka et al., 1986; Duffy & Wurtz, 1991, 1995), ventral intraparietal area (VIP) (Schaafsma & Duysens, 1996; Bremmer, Duhamel, Ben Hamed, & Graf, 2002; Bremmer, Klam, Duhamel, Ben Hamed, & Graf, 2002), posterior parietal cortex (7a) (Siegel & Read, 1997), and superior temporal polysensory area (STP) (Anderson & Siegel, 1999). In particular, neurons in MSTd/VIP have large visual receptive fields and are selective for optic flow patterns similar to those seen during self-motion (MSTd: Tanaka et al., 1986; Tanaka, Fukada, & Saito, 1989; Duffy & Wurtz, 1991, 1995; Bradley, Maxwell, Anderson, Banks, & Shenoy, 1996; Lappe, Bremmer, Pekel, Thiele, & Hoffmann, 1996; VIP: Schaafsma & Duysens, 1996; Bremmer, Duhamel, et al., 2002). Importantly, electrical stimulation of MSTd or VIP has been reported to bias heading judgments that are based solely on optic flow (Britten & van Wezel, 1998, 2002; Zhang & Britten, 2003). MSTd/VIP neurons are also selective for motion in darkness, suggesting that they receive vestibular inputs (Duffy, 1998; Bremmer, Kubischek, Pekel, Lappe, & Hoffmann, 1999; Bremmer, Duhamel, et al., 2002; Schlack, Hoffmann, & Bremmer, 2002; Gu et al., 2006; Chen, Henry, DeAngelis, & Angelaki, 2007; Takahashi et al., 2007).
Using a custom-built virtual reality system, the heading selectivity of MSTd (Gu et al., 2006; Takahashi et al., 2007) and VIP (Chen et al., 2007) neurons has recently been quantified in three dimensions. Inertial motion (vestibular) signals were provided by translating a motion platform, and optic flow (visual) signals were provided by a projector that was mounted on the platform and rear-projected images onto a screen in front of the monkey. Approximately 60% of MSTd neurons were significantly tuned for heading under both the visual and vestibular stimulus conditions. These convergent MSTd cells fell into one of two groups: (1) “congruent” neurons, which had similar visual/vestibular preferred directions and thus signaled the same motion direction in three-dimensional space under both unimodal stimulus conditions, and (2) “opposite” neurons, which preferred nearly opposite directions of heading under visual and vestibular stimulus conditions (Gu et al., 2006). The response modulation of MSTd neurons during inertial motion (vestibular condition) was indeed shown to be of labyrinthine origin, as MSTd cells were no longer tuned during inertial motion following bilateral labyrinthectomy (figure 34.3) (Gu et al., 2007; Takahashi et al., 2007).
Notably, responsiveness to both visual (optic flow) and vestibular stimulation is generally not present within more traditionally considered areas of “vestibular cortex” (Fredrickson & Rubin, 1986; Fukushima, 1997; Guldin & Grusser, 1998). Three main cortical areas have been characterized as either exhibiting responses to vestibular stimulation and/or receiving short-latency vestibular signals (trisynaptic through the vestibular nuclei and the thalamus). They are (1) area 2v, located in the transition zone of areas 2, 5, and 7 within the intraparietal sulcus (Fredrickson,
angelaki, gu, and deangelis: multisensory integration in macaque visual cortex 503
Figure 34.4 Heading sensitivity in area MSTd. (A, B) Heading-tuning curves of two example neurons with (A) congruent and (B) opposite visual/vestibular heading preferences. Negative angles correspond to leftward headings; positive numbers illustrate rightward directions. (C, D) Responses of the same neurons to a narrow range of heading stimuli presented while the monkey performed the discrimination task. (E, F) Neurometric functions computed by ROC analysis for the same two neurons. Smooth curves show best-fitting cumulative Gaussian functions. (Modified from Gu et al., 2008.)
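
The caption above mentions neurometric functions computed by ROC analysis and fitted with cumulative Gaussians. The following is a minimal Python sketch of that kind of analysis, not the authors' actual procedure. It assumes that each heading's responses are compared against responses to a straight-ahead (0 degree) reference, and that the standard deviation of the fitted cumulative Gaussian is taken as the neuronal threshold. The simulated neuron, its firing rates, the noise level, and the function names `roc_area` and `fit_cumulative_gaussian` are all hypothetical.

```python
import random
from statistics import NormalDist


def roc_area(test, reference):
    """Fraction of (test, reference) pairs in which the test response is larger.

    Ties count as one half; this equals the area under the ROC curve.
    """
    wins = 0.0
    for t in test:
        for r in reference:
            if t > r:
                wins += 1.0
            elif t == r:
                wins += 0.5
    return wins / (len(test) * len(reference))


def fit_cumulative_gaussian(headings, p_right):
    """Grid-search the mean and SD (deg) of a cumulative Gaussian.

    Returns (mean, sd); the SD is used here as the threshold (an assumption).
    """
    best = None
    for mu_step in range(-50, 51):          # mean from -5.0 to 5.0 deg
        for sd_step in range(1, 301):       # SD from 0.1 to 30.0 deg
            curve = NormalDist(mu_step / 10, sd_step / 10)
            err = sum((curve.cdf(h) - p) ** 2 for h, p in zip(headings, p_right))
            if best is None or err < best[0]:
                best = (err, mu_step / 10, sd_step / 10)
    return best[1], best[2]


# Hypothetical neuron: mean rate rises linearly with heading, Gaussian noise.
random.seed(0)
headings = [-16, -8, -4, -2, -1, 1, 2, 4, 8, 16]   # deg; negative = leftward


def simulate(heading, n_trials=50):
    """Simulated firing rates (spikes/s) for one heading; purely illustrative."""
    return [random.gauss(20.0 + 2.0 * heading, 6.0) for _ in range(n_trials)]


reference = simulate(0)                            # straight-ahead responses
p_right = [roc_area(simulate(h), reference) for h in headings]
bias, threshold = fit_cumulative_gaussian(headings, p_right)
print(f"neuronal threshold ~ {threshold:.1f} deg (bias {bias:.1f} deg)")
```

A steeper neurometric function yields a smaller fitted SD, which is the sense in which a lower threshold indicates a more sensitive neuron. The grid search is used only to keep the sketch dependency-free; a proper fit would typically use maximum likelihood.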
threshold, the steeper is the neurometric function and the more sensitive the neuron is to subtle variations in heading. For the congruent neuron in figure 34.4E, the neuronal threshold was smallest in the combined condition (gray symbols and lines), indicating that the neuron could discriminate smaller variations in heading when both cues were provided. In contrast, for the opposite neuron in figure 34.4F, the reverse was true: the neuron became less sensitive in the presence of both cues (gray symbols and lines).
The effect of visual/vestibular congruency on neuronal sensitivity during bimodal stimulation held across the whole population of neurons. To summarize this dependency, a quantitative index of visual/vestibular congruency (CI) was established that ranged from +1, when visual and vestibular tuning functions have a consistent slope (figure 34.4A), to −1, when they have opposite slopes (figure 34.4B). We then computed, for each neuron, the ratio of the neuronal threshold in the combined condition to the threshold expected if neurons combine cues optimally according to equation 1. A significant correlation was seen between the ratio of combined to predicted thresholds and CI (figure 34.5A), such that neurons with large positive CIs (congruent cells, black circles) had thresholds close to the optimal
physical performance under cue combination. These find-
ings implicate area MSTd in sensory integration for heading
perception and establish an excellent model system for
studying the detailed mechanisms by which neurons combine
different sensory signals and dynamically reweight these
signals to optimize performance as the reliability of cues
varies (Knill & Pouget, 2004). However, because the reli-
ability of the visual and vestibular cues was not varied in
these experiments, it is currently unclear whether monkeys
and MSTd neurons dynamically reweight these cues, as
predicted by statistically optimal cue integration schemes.
While experiments are currently underway to test this very
important prediction of Bayesian cue integration in trained
animals (Fetsch, Angelaki, & DeAngelis, 2007), we next
summarize results from a simpler experiment in which the
reliability of the visual cue was varied during neural record-
ings in a passively fixating animal (Morgan, DeAngelis, &
Angelaki, 2008). This experiment sought to characterize the
mathematical rule by which MSTd neurons combine their
visual and vestibular inputs. Specifically, we asked whether
bimodal responses in MSTd are well fit by weighted linear
sums of unimodal responses or whether a nonlinear combi-
nation rule is required. Moreover, we asked whether the
weights that neurons apply to these cues change with the
relative reliabilities of the two cues.
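
The question of whether bimodal responses are well fit by weighted linear sums of unimodal responses can be illustrated with a small least-squares sketch. This is only an illustration under stated assumptions, not the analysis used in the cited study: it assumes the model bimodal = w_ves * vestibular + w_vis * visual + constant, evaluated over all pairings of vestibular and visual headings, and it uses synthetic cosine tuning curves. The function names, tuning parameters, and the "true" weights are hypothetical.

```python
import math


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_linear_weights(r_ves, r_vis, r_bimodal):
    """Least-squares fit of r_bimodal[i][j] ~ w_ves*r_ves[i] + w_vis*r_vis[j] + c."""
    rows, targets = [], []
    for i, ves in enumerate(r_ves):
        for j, vis in enumerate(r_vis):
            rows.append([ves, vis, 1.0])
            targets.append(r_bimodal[i][j])
    # Normal equations: (A^T A) w = A^T y
    ata = [[sum(r[p] * r[q] for r in rows) for q in range(3)] for p in range(3)]
    aty = [sum(r[p] * y for r, y in zip(rows, targets)) for p in range(3)]
    return tuple(solve_linear_system(ata, aty))


# Synthetic unimodal tuning curves over eight headings (illustrative values only).
angles = [math.radians(a) for a in range(0, 360, 45)]
r_ves = [20.0 + 15.0 * math.cos(a - math.pi / 2) for a in angles]
r_vis = [25.0 + 30.0 * math.cos(a - math.pi / 2) for a in angles]

# Synthetic bimodal responses built from assumed weights, then recovered by the fit.
true_w_ves, true_w_vis, true_c = 0.9, 0.4, 3.0
r_bimodal = [[true_w_ves * ves + true_w_vis * vis + true_c for vis in r_vis]
             for ves in r_ves]
w_ves, w_vis, c = fit_linear_weights(r_ves, r_vis, r_bimodal)
print(f"w_ves={w_ves:.2f}, w_vis={w_vis:.2f}, c={c:.2f}")
```

Repeating such a fit on data collected at different visual reliabilities would show whether the fitted weights change with cue reliability. With real recordings, the residuals of the linear fit would indicate whether a nonlinear combination rule is needed.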
Figure 34.8 Dependence of vestibular and visual weights on visual motion coherence. Vestibular and visual weights for each MSTd neuron were derived from linear fits to bimodal responses like those in figure 34.7. (A, B) Histograms of vestibular and visual weights computed from data at 100% (black) and 50% (gray) coherence. Triangles are plotted at the medians. (C, D) Vestibular and visual weights are plotted as a function of motion coherence. Data points are coded by the significance of unimodal visual tuning (open versus solid circles). (Replotted with permission from Morgan et al., 2008.)
multisensory integration and Bayesian inference in general. Here, we have summarized recent findings of neurons in the extrastriate visual cortex that might mediate visual/vestibular cue integration for heading perception. Although some critical experiments have not yet been conducted, results to date suggest that rhesus macaques and humans use similar computational principles for combining multiple sensory cues and that these principles may be accounted for by the properties of individual neurons in multisensory cortical areas. Multiple questions remain: How distributed are these representations of multisensory integration at the neuronal level? What are the mechanisms by which neurons reweight their inputs according to reliability? Are the responses of sensory cells consistent with the encoding of probability distributions? Finally, how are these sensory signals read out from population responses, and how much of the necessary computations take place in sensory representations versus decision-making networks?
acknowledgments This work was supported by NIH EY017866, EY019087, and DC04260 (to D.E.A.) and NIH EY016178 (to G.C.D.). We thank Michael Morgan, whose Ph.D. thesis provided MSTd data at different visual coherences.
REFERENCES
Alais, D., & Burr, D. (2004). The ventriloquist effect results from near-optimal bimodal integration. Curr. Biol., 14, 257–262.
Anderson, K. C., & Siegel, R. M. (1999). Optic flow selectivity in the anterior superior temporal polysensory area, STPa, of the behaving monkey. J. Neurosci., 19, 2681–2692.
Angelaki, D. E. (2004). Eyes on target: What neurons must do for the vestibuloocular reflex during linear motion. J. Neurophysiol., 92, 20–35.
Angelaki, D. E., & Cullen, K. E. (2008). Vestibular system: The many facets of a multimodal sense. Annu. Rev. Neurosci., 31, 125–150.
Battaglia, P. W., Jacobs, R. A., & Aslin, R. N. (2003). Bayesian integration of visual and auditory signals for spatial localization. J. Opt. Soc. Am. [A], 20, 1391–1397.
Benson, A. J., Spencer, M. B., & Stott, J. R. (1986). Thresholds for the detection of the direction of whole-body, linear movement in the horizontal plane. Aviat. Space Environ. Med., 57, 1088–1096.
ability influences cross-modal bias. J. Cogn. Neurosci., 15, 20–29.
Hillis, J. M., Watt, S. J., Landy, M. S., & Banks, M. S. (2004). Slant from texture and disparity cues: Optimal cue combination. J. Vis., 4, 967–992.
Hlavacka, F., Mergner, T., & Bolha, B. (1996). Human self-motion perception during translatory vestibular and proprioceptive stimulation. Neurosci. Lett., 210, 83–86.
Hlavacka, F., Mergner, T., & Schweigart, G. (1992). Interaction of vestibular and proprioceptive inputs for human self-motion perception. Neurosci. Lett., 138, 161–164.
Kersten, D., Mamassian, P., & Yuille, A. (2004). Object perception as Bayesian inference. Annu. Rev. Psychol., 55, 271–304.
Knill, D. C., & Pouget, A. (2004). The Bayesian brain: The role of uncertainty in neural coding and computation. Trends Neurosci., 27, 712–719.
Knill, D. C., & Saunders, J. A. (2003). Do humans optimally integrate stereo and texture information for judgments of surface slant? Vis. Res., 43, 2539–2558.
Krug, K. (2004). A common neuronal code for perceptual processes in visual cortex? Comparing choice and attentional correlates in V5/MT. Philos. Trans. R. Soc. Lond. B Biol. Sci., 359, 929–941.
Lappe, M., Bremmer, F., Pekel, M., Thiele, A., & Hoffmann, K. P. (1996). Optic flow processing in monkey STS: A theoretical and experimental approach. J. Neurosci., 16, 6265–6285.
Mamassian, P., Landy, M. S., & Maloney, L. T. (2002). Bayesian modelling of visual perception. In R. P. N. Rao, B. A. Olshausen, & M. S. Lewicki (Eds.), Probabilistic models of the brain (pp. 13–36). Cambridge, MA: MIT Press.
McGurk, H., & MacDonald, J. (1976). Hearing lips and seeing voices. Nature, 264, 746–748.
Meng, H., May, P. J., Dickman, J. D., & Angelaki, D. E. (2007). Vestibular signals in primate thalamus: Properties and origins. J. Neurosci., 27, 13590–13602.
Meredith, M. A., & Stein, B. E. (1983). Interactions among converging sensory inputs in the superior colliculus. Science, 221, 389–391.
Meredith, M. A., & Stein, B. E. (1986). Visual, auditory, and somatosensory convergence on cells in superior colliculus results in multisensory integration. J. Neurophysiol., 56, 640–662.
Meredith, M. A., & Stein, B. E. (1996). Spatial determinants of multisensory integration in cat superior colliculus neurons. J. Neurophysiol., 75, 1843–1857.
Molholm, S., Ritter, W., Javitt, D. C., & Foxe, J. J. (2004). Multisensory visual-auditory object recognition in humans: A high-density electrical mapping study. Cereb. Cortex, 14, 452–465.
Morgan, M. L., DeAngelis, G. C., & Angelaki, D. E. (2008). Multisensory integration in macaque visual cortex depends on cue reliability. Neuron, 59(4), 662–673.
Odkvist, L. M., Schwarz, D. W., Fredrickson, J. M., & Hassler, R. (1974). Projection of the vestibular nerve to the area 3a arm field in the squirrel monkey (Saimiri sciureus). Exp. Brain Res., 21, 97–105.
Parker, A. J., & Newsome, W. T. (1998). Sense and the single neuron: Probing the physiology of perception. Annu. Rev. Neurosci., 21, 227–277.
Previc, F. H. (1992). The effects of dynamic visual stimulation on perception and motor control. J. Vestib. Res., 2, 285–295.
Royden, C. S., Banks, M. S., & Crowell, J. A. (1992). The perception of heading during eye movements. Nature, 360, 583–585.
Schaafsma, S. J., & Duysens, J. (1996). Neurons in the ventral intraparietal area of awake macaque monkey closely resemble neurons in the dorsal part of the medial superior temporal area in their responses to optic flow patterns. J. Neurophysiol., 76, 4056–4068.
Schlack, A., Hoffmann, K. P., & Bremmer, F. (2002). Interaction of linear vestibular and visual stimulation in the macaque ventral intraparietal area (VIP). Eur. J. Neurosci., 16, 1877–1886.
Schwarz, D. W., & Fredrickson, J. M. (1971a). Rhesus monkey vestibular cortex: A bimodal primary projection field. Science, 172, 280–281.
Schwarz, D. W., & Fredrickson, J. M. (1971b). Tactile direction sensitivity of area 2 oral neurons in the rhesus monkey cortex. Brain Res., 27, 397–401.
Siegel, R. M., & Read, H. L. (1997). Analysis of optic flow in the monkey parietal area 7a. Cereb. Cortex, 7, 327–346.
Smith, S. T., Bush, G. A., & Stone, L. S. (2002). Amplitude response of human vestibular heading estimation. Soc. Neurosci. [Abstracts], 56, 1.
Stone, L. S., & Perrone, J. A. (1997). Human heading estimation during visually simulated curvilinear motion. Vis. Res., 37, 573–590.
Takahashi, K., Gu, Y., May, P. J., Newlands, S. D., DeAngelis, G. C., & Angelaki, D. E. (2007). Multimodal coding of three-dimensional rotation and translation in area MSTd: Comparison of visual and vestibular selectivity. J. Neurosci., 27, 9742–9756.
Tanaka, K., Fukada, Y., & Saito, H. A. (1989). Underlying mechanisms of the response specificity of expansion/contraction and rotation cells in the dorsal part of the medial superior temporal area of the macaque monkey. J. Neurophysiol., 62, 642–656.
Tanaka, K., Hikosaka, K., Saito, H., Yukie, M., Fukada, Y., & Iwai, E. (1986). Analysis of local and wide-field movements in the superior temporal visual areas of the macaque monkey. J. Neurosci., 6, 134–144.
Telford, L., Howard, I. P., & Ohmi, M. (1995). Heading judgments during active and passive self-motion. Exp. Brain Res., 104, 502–510.
van Beers, R. J., Sittig, A. C., & Gon, J. J. (1999). Integration of proprioceptive and visual position-information: An experimentally supported model. J. Neurophysiol., 81, 1355–1364.
van den Berg, A. V., & Brenner, E. (1994). Why two eyes are better than one for judgements of heading. Nature, 371, 700–702.
Wallace, M. T., Wilkinson, L. K., & Stein, B. E. (1996). Representation and integration of multiple sensory inputs in primate superior colliculus. J. Neurophysiol., 76, 1246–1266.
Warren, W. H. (2003). Optic flow. In L. M. Chalupa & J. S. Werner (Eds.), The visual neurosciences (pp. 1247–1259). Cambridge, MA: MIT Press.
Warren, W. H., Jr., & Hannon, D. J. (1990). Eye movements and optical flow. J. Opt. Soc. Am. [A], 7, 160–169.
Zhang, T., & Britten, K. H. (2003). Microstimulation of area VIP biases heading perception in monkeys. Soc. Neurosci. [Abstracts], 339, 9.
abstract We frequently reposition our gaze by making rapid ballistic eye movements called saccades to position the fovea on objects of interest. While the strategy is highly efficient for the visual system, allowing it to analyze the whole visual field with the high resolution of the fovea, it poses several problems for perception. Saccades cause rapid, large-field motion on the retina, potentially confusable with large-field motion in the external world. They also change the relationship between external space and retinal position, confounding information about visual direction. Much effort has been made in recent years to attempt to understand the effects of saccades on visual function. Electrophysiological, imaging, and psychophysical evidence suggests that saccades trigger two distinct neural processes: a suppression of visual sensitivity, specific to motion analysis, probably mediated by the magnocellular pathway, and a gross perceptual distortion of visual space just before the repositioning of gaze. While our knowledge of how the visual system copes with the potentially damaging effects of continual saccadic eye movements has increased considerably over the past few decades, many interesting avenues of research remain open.
concetta morrone Department of Physiological Sciences, University of Pisa, and Scientific Institute Stella Maris, Pisa, Italy
david burr Department of Psychology, University of Florence, Italy; School of Psychology, University of Western Australia, Perth, Australia
Vision is always clear and stable, despite continual saccadic eye movements, that is, ballistic movements of the eyes that reposition our gaze two to three times a second. Saccades may be made deliberately, but normally they are automatic and pass unnoticed. An observer at a sporting event, someone conversing with a companion, or a person reading a book usually makes many saccades without knowing that they have occurred. Not only does the actual movement of the eyes escape notice; so too does the motion of images as they sweep across the retina and the fact that gaze itself has been repositioned. The world seems to stay put. Comparable image motion that is produced externally, rather than by movements of the observer’s own eyes, has an alarming effect on the observer’s sense of stability. The problem of visual stability is an old one that has fascinated many scientists, including Descartes, von Helmholtz, Mach, and Sherrington, and indeed goes back at least to the 11th-century Persian scholar Alhazen: “For if the eye moves in front of visible objects while they are being contemplated, the form of every one of the objects facing the eye . . . will move on the eyes as the latter moves. But sight has become accustomed to the motion of the objects’ forms on its surface when the objects are stationary, and therefore does not judge the objects to be in motion” (Alhazen, 1083). But only recently have the tools become available to monitor eye movements accurately and to measure their effects qualitatively.
The problem of visual stability can be broadly divided into three separate issues: Why do we not perceive the motion of the retinal image produced as the eye sweeps over the visual field? How do we cope dynamically “on-line” with the continual changes in the retinal image produced by each saccade? How (and where) do we construct a stable spatiotopic representation of the world centered in real-world external coordinates from the successive “snapshots” of each fixation? Although the problem of visual stability is far from solved, tantalizing progress has been made over the last few years, some of which will be highlighted in this chapter.
Saccadic suppression
Part of the general problem of visual stability is why the fast motion of the retinal image generated by the movement of the eyes completely escapes notice. Comparable wide-field motion generated externally is highly visible and somewhat disturbing (Burr, Holt, Johnstone, & Ross, 1982). It has long been suspected that vision is somehow suppressed during saccades (Holt, 1903), but the nature of the suppression has remained elusive. Now it is clear that the suppression is neither a “central anaesthesia” of the visual system (Holt, 1903), nor a “gray-out of the world” due to fast motion (Campbell & Wurtz, 1978; Dodge, 1900; Woodworth, 1906), as this motion is actually visible, extremely so at low spatial frequencies (Burr & Ross, 1982). What happens is that some stimuli are actively suppressed by saccades while others are not. Stimuli of low spatial frequencies are very difficult to detect if flashed just prior to a saccade, while stimuli of high spatial frequencies remain equally visible (Burr et al., 1982; Volkmann, Riggs, White, & Moore, 1978). Equiluminant stimuli (varying in color but not luminance) are not suppressed during saccades and can even be enhanced (Burr,
morrone and burr: visual stability during saccadic eye movements 511
Morrone, & Ross, 1994), implying that the parvocellular pathway, essential for chromatic discrimination, is left unimpaired, while the magnocellular pathway is specifically suppressed.
Saccadic suppression follows a specific and very tight time course, illustrated in figure 35.1A (replotted from Diamond, Ross, & Morrone, 2000). Sensitivity for seeing low-spatial-frequency, luminance-modulated stimuli declines 25 ms before saccadic onset, reaching a minimum at the onset of the saccade, then rapidly recovering to normal levels 50 ms afterward. Does the suppression result from a central nonvisual “corollary discharge” signal (discussed in the next section), or could it result simply from the visual “masking” effects? This would seem unlikely, as great care was taken to ensure a uniform surround. However, the question is important, so to be certain that the saccade itself was essential for the suppression, we simulated saccadic eye movements by viewing the stimulus setup through a mirror that could be rotated at saccadic speeds. When the background was uniform, with minimal visual referents, the simulated saccades had little or no effect on sensitivity (open symbols of figure 35.1A).
But that is not to say that masking does not occur under more natural conditions. When the test stimulus is embedded within a textured screen, simulated saccades do decrease contrast sensitivity (figure 35.1B). Indeed, the maximum suppression is nearly as great as that caused by real saccades and lasts much longer. This suggests that after the saccade, sensitivity is greater than that expected with comparable motion without the saccade, possibly implying a postsaccadic facilitation (consistent with the physiology).
That real saccades cause a different pattern of results from simulated saccades shows that suppression results at least in part from an active, extraretinal signal. Interestingly, the amount of suppression varies with age, being much stronger in adolescent children than in adults (Bruno, Brambati, Perani, & Morrone, 2006), even though in adolescence, motion perception and masking are largely adultlike (Maurer, Lewis, & Mondloch, 2005; Parrish, Giaschi, Boden, & Dougherty, 2005). This indicates that the mechanisms that mediate suppression are still developing at this age. Because the saccadic motor system is also not completely mature during adolescence (Fischer, Biscaldi, & Gezeck, 1997), this is further evidence that the extraretinal signal that is responsible for mediating the saccadic suppression may be linked to the motor system.
Psychophysical studies indicate that saccadic suppression occurs early in the visual system (Burr et al., 1994), at or before the site of contrast masking and before low-level
would otherwise be disturbing, and the rapid return to normal sensitivity after the saccade.
Mittelstädt (1954) with the concept efference copy: The effort of will of making the eye movements (corollary discharge) is subtracted from the retinal signal to cancel the eye movement and stabilize perception. Now we know that retinal motion signal cannot be easily compensated, given the sophisticated analysis performed by motion detectors. However, there is evidence for the existence of a corollary discharge signal that must be instrumental in maintaining visual stability.
Considerable psychophysical evidence exists for a corollary discharge in humans, going back to the 1960s, when Leonard Matin and others reported large transient changes in spatial localization at the time of saccades. When asked to report the position of a target that was flashed during a saccade, subjects mislocalized it, primarily in the direction of the saccade (Honda, 1989; Mateeff, 1978; Matin & Pearce, 1965). The localization error is typically on the order of half the saccadic size. Later, Mateeff and Honda measured the time course and showed that the error starts about 50 ms before the saccadic onset and continues well after fixation is regained. The error before the saccadic onset has been taken as an indication of the existence of a slow and sluggish corollary discharge signal that compensates partly for the eye movement; the internal representation of the position and the actual position of the gaze do not match and errors in the localization of a brief target are generated.
[Figure 35.2 plot axes: (A) apparent position (deg) and (B) number of bars, each plotted against time (ms) from −200 to 200.]
Figure 35.2 Effect of saccades on apparent bar position and number. (A) Perceived position of narrow green bars, briefly flashed on a red background at various times relative to the onset of a saccade from −10 to +10 degrees. The physical position of the bars (shown by the dashed lines) could be −20 degrees (for the triangle symbols), 0 degrees (square symbols), and 20 degrees (round symbols). The effect of the saccades (maximal at saccadic onset) is to shift the apparent position of the bar toward the saccadic target, where the eyes land. For stimuli at 0 or −20 degrees the shift is in the direction of the saccade, but for stimuli at +20 degrees the shift is in the other direction. In all cases, the shift is toward the saccadic landing point. (B) Reported number of bars seen, as a function of presentation time (relative to saccadic onset). A variable number of bars (0, 1, 2, 3, or 4) were presented simultaneously in positions straddling 10 degrees either side of the saccadic target site. The results reported here are for trials in which four bars were presented; but when presented near saccadic onset, the four collapse onto each other, so only one was seen. The other bars were not suppressed, because one bar was always reported as one, and zero bars were reported as zero (no false positives). (Reproduced with permission from Ross et al., 1997.)
We have examined saccadic mislocalization in photopic conditions using equiluminant stimuli (that remain visible during saccades). This approach revealed a bizarre result: At the time of saccades, visual space is not so much shifted in the direction of the saccade but compressed toward the saccadic target (Morrone, Ross, & Burr, 1997; Ross, Morrone, & Burr, 1997) (see figure 35.2A). Objects that are flashed at saccadic onset to a range of positions, from close to fixation to positions well beyond the saccadic target, are all perceived at or near the saccadic target. The effect is primarily parallel to the saccade direction (Ross et al., 1997), although a small compression is also observed in the orthogonal direction (Kaiser & Lappe, 2004). These results are intriguing because they indicate that the process described mathematically by a simple translation of the internal coordinate system is not plausible; perhaps the system cannot perform the transformation of space without additional perceptual costs.
Figure 35.2B shows that saccadic compression is so strong that four bars spread over 20 degrees are perceived as being fused into a single bar. Discrimination of shape (Matsumiya
[Figure 35.3 panels: Vision (A, D), Bimodal (B, E), Audition (C, F). Upper row: perceived position (deg); lower row: precision threshold (deg); both plotted against time (ms).]
Figure 35.3 Illustration of how saccadic mislocalization can result from optimal “Bayesian” fusion. In a two-alternative forced choice, subjects were asked to report whether a perisaccadic test bar that was displayed midway between fixation and saccadic target seemed to be located to the right or left of a presaccadic probe bar. (For full details, see Binda, Bruno, et al., 2007.) Psychometric functions were fitted to these data to give an estimate of perceived position and also of precision of localization. The upper curves show how perceived position varied with time (relative to saccadic onset). Visual stimuli presented on their own (A) showed the characteristic mislocalization, like that of figure 35.1A. Auditory stimuli were not at all affected by the saccade (C). However, when the sound was played contemporaneously with the bar display, the mislocalization of the bar was reduced (B). The lower curves show the localization thresholds. Again, sound was unaffected by saccades, but the precision of visual localization was reduced drastically near saccadic onset. During the bimodal audiovisual presentation, precision improved to the extent of being better than either the visual or auditory unimodal localization precision. Indeed, this performance, both for perceived position and for precision thresholds, was very close to the Bayesian prediction, indicated by the thick gray line. The dotted horizontal lines indicate performance during fixation. (Reproduced with permission from Binda, Bruno, et al., 2007.)
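
The Bayesian prediction referred to in the caption can be sketched with the standard reliability-weighted fusion rule for two independent Gaussian cues. The Python below is a minimal illustration under that assumption, not the computation used in the cited study. The function name `fuse_cues` and all numerical values (a visual estimate displaced by 8 degrees with poor precision near saccadic onset, and an undisplaced auditory estimate) are hypothetical.

```python
import math


def fuse_cues(x_vis, sigma_vis, x_aud, sigma_aud):
    """Reliability-weighted fusion of a visual and an auditory position estimate.

    Each cue is weighted by its relative reliability (inverse variance).
    Returns the fused position and the fused precision threshold (SD).
    """
    var_vis, var_aud = sigma_vis ** 2, sigma_aud ** 2
    w_vis = var_aud / (var_vis + var_aud)      # weight on vision
    w_aud = var_vis / (var_vis + var_aud)      # weight on audition
    x_fused = w_vis * x_vis + w_aud * x_aud
    sigma_fused = math.sqrt(var_vis * var_aud / (var_vis + var_aud))
    return x_fused, sigma_fused


# Hypothetical perisaccadic case: vision is displaced and imprecise,
# audition is accurate and comparatively precise.
x_fused, sigma_fused = fuse_cues(x_vis=8.0, sigma_vis=10.0, x_aud=0.0, sigma_aud=5.0)
print(f"fused position {x_fused:.2f} deg, fused threshold {sigma_fused:.2f} deg")

# With equally reliable cues the fused threshold is 1/sqrt(2) of either cue alone.
_, sigma_equal = fuse_cues(0.0, 4.0, 0.0, 4.0)
print(f"equal-reliability threshold {sigma_equal:.2f} deg")
```

In this toy case the fused position lies much closer to the auditory estimate, so the visual mislocalization is reduced, and the fused threshold is smaller than either unimodal threshold, which matches the qualitative pattern the caption describes.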
1976a, 1976b; Hansen & Skavenski, 1977, 1985). Other studies (e.g., Bridgeman, Lewis, Heit, & Nagle, 1979) also reported that subjects can point accurately to targets that were displaced perisaccadically, even though the subject did not perceive the change in target position. However, a few experiments have failed to replicate the original dissociation between motor accuracy and perceptual error during saccades, reporting localization errors for both tasks (Bockisch & Miller, 1999; Dassonville, Schlag, & Schlag-Rey, 1992, 1995; Honda, 1991; Miller, 1996; Schlag & Schlag-Rey, 1995). Recently, Burr, Morrone, and Ross (2001) and Morrone, Ma-Wyatt, and Ross (2005) reported a clear dissociation between verbal reports and blind pointing for saccadic compression. The plot of figure 35.4 shows that briefly flashed stimuli were perceived clearly in false positions, causing the characteristic compression (solid symbols); but when asked to point blindly at the stimuli, with the screen temporally obscured by a liquid crystal shutter, observers did so veridically (open symbols).
Interestingly, analogous effects have been reported in audition. Although saccadic eye movements do not affect the localization of tones, saccadic head movements do (Leung, Alais, & Carlile, 2008). Sounds are compressed toward the endpoint of the head turn. However, if subjects are asked to point to the apparent sound source (by head turn), the compression disappears, as it does for vision (Burr et al., 2001). However, for visual judgments, introducing clearly visible postsaccadic references under normal lighting conditions causes both verbal report and pointing to show compression. This suggests that vision has access to two maps, one subject to distortion and the other not. The motor map shows no compression except when visual references remain in view for a substantial time after the saccade, indicating that these maps are updated postsaccadically, while for perceptual judgments, the updating occurs before and during the actual saccade. Both maps contribute to determining the weight given to each map. Perhaps the popular distinction between conscious perception and action (Goodale & Milner, 1992; Trevarthen, 1968) is at best an oversimplification.
But where in the brain do these maps reside? Is there any evidence that a dynamically updated spatiotopic map actually exists? Electrophysiological studies have reported several transient perisaccadic phenomena. In the lateral intraparietal cortex (LIP), receptive fields change positional selectivity (Duhamel, Colby, & Goldberg, 1992) just before a monkey makes a saccadic eye movement, anticipating the change in gaze. This is illustrated in figure 35.5A, showing the response of an LIP cell to stimuli flashed to the receptive field position
and what will become the receptive field after the saccade has been made (“future receptive field”). Note that the response in the current receptive field starts to reduce and that in the future receptive field starts to increase, long before the eye has actually moved to reposition the retinal image. This is termed predictive remapping.
This phenomenon occurs not only in LIP, but also in many other visual areas, including the superior colliculus (Walker, Fitzgibbon, & Goldberg, 1995) and area V3 (Nakamura & Colby, 2002), with area V4 showing a somewhat different behavior (Tolias et al., 2001). It has even been suggested that 10% of neurons in primary visual cortex (V1) show dynamic updating of receptive fields (Nakamura & Colby, 2002). The origin of the phenomenon has been studied in the frontal eye field (FEF), and firm evidence demonstrates that it is mediated by a corollary discharge signal, probably originating in the colliculus and mediodorsal thalamus (Sommer & Wurtz, 2002, 2006). Deactivation of the nucleus abolishes the predictive updating of the receptive field. The corollary discharge signal arrives nearly 100 ms before the updating starts in the FEF, indicating the complexity of the reorganization.
Despite these recent efforts, there are several aspects of the remapping phenomenon that remain unclear. For example, between the time that the neuron starts to respond to stimuli in the updated position and when it regains postsaccadically retinotopic specificity, are receptive fields anchored in a transiently craniotopic map? Do the receptive fields undergo changes in size during the remapping? Are the neurons that are susceptible to remapping randomly intermingled with those that do not remap, or is there some specific organization?
Clever psychophysical studies have also demonstrated remapping in humans (Burr & Morrone, 2005; Melcher, 2005, 2007), by studying the spatial selectivity of visual aftereffects. Most aftereffects are spatially selective. But is the selectivity in retinotopic or spatiotopic coordinates? By imposing a saccade between the adaptor and the test, Melcher was able to show that the selectivity was both retinotopic and spatiotopic. The degree to which adaptation was spatiotopic varied with the complexity of the aftereffect. Simple adaptation aftereffects, like contrast (thought to be mediated by primary visual cortex), were primarily retinotopic, while more complex aftereffects (such as faces) were primarily spatiotopic; aftereffects of intermediate complexity, like the tilt aftereffect, were both retinotopic and spatiotopic.
Adaptation techniques (Melcher, 2007) can also be used to reveal the dynamics of the updating, by briefly presenting the test just prior to a saccade (figure 35.5B). Long before the saccade, adaptation is maximal when test and adaptor are presented to the same position, at fixation, with very little adaptation at the position of the saccadic target. However, when the test is presented perisaccadically but before the eyes have moved, the maximum adaptation occurs for tests near the saccadic target, the position that will correspond to the adapted retina after the eyes have moved. The similarity of the time courses of the adaptation and the response of the LIP neuron strongly implies that Melcher’s experiment reveals the psychophysical counterpart of the “predictive remapping.”
At present, it is still uncertain exactly how this transient
morrone and burr: visual stability during saccadic eye movements 517
updating of receptive fields leads to visual stability, but it is clearly important to test the future activity before the direct input will excite the neuron after the saccade. It could bridge the perception between the two fixations, but can this phenomenon explain perisaccadic mislocalizations?
Figure 35.5 Predictive remapping in an LIP cell and human observers. (A) The response of a “remapping” cell of area LIP of the macaque around the time of the saccade to brief stimuli displayed in the “current” (presaccadic) receptive field (open circles) and to stimuli flashed in what will become its receptive field after the saccade is made. The response to stimuli in the current receptive field begins to decrease before the eyes actually move. Around the same time, the response in the “future” position begins to increase, long before the eyes have actually displaced the receptive field. (B) An experiment showing analogous behavior in human psychophysics. Subjects adapted to a tilted grating, then measured the aftereffect to a grating presented in the same (retinal) position (“current,” open circles) or to the position that will correspond to the retinal position of the adaptor after a saccade has been made (“future,” solid squares). Long before the saccade, there is no adaptation in the future field, and there is full adaptation in the current field (normalized to unity). Like the cell firing rate, adaptation effects in the current field begin to reduce, and those in the future field begin to increase, before the eyes have actually moved. Well after the saccade is terminated, the effects do not drop completely to zero, because this position corresponds to the spatiotopic position of the adaptor, and orientation adaptation has a spatiotopic component (Melcher, 2005). (Reproduced with permission from Kusunoki & Goldberg, 2003, and Melcher, 2007.)
The dynamics of the remapping receptive field are also very similar to those of perisaccadic mislocalization (figure 35.2), suggesting that a common mechanism could be driving all these phenomena. However, there are several problems in relating the two sets of data quantitatively. Within a framework of labeled-line theory, a neuron that is placed in a specific anatomical position in a cortical map will, when stimulated, signal the presence of a stimulus at that position. However, if it responds presaccadically to a stimulus falling in the future receptive field (one displaced in the direction of the saccade), it should still signal this stimulus location as being in the normal location, that is, shifted in the direction opposite to the saccade. But the results (figures 35.2 and 35.3) show that the primary result is the perception of a shift in the same direction as the saccade. There are two possible schemas to resolve this apparent contradiction. The first is to consider that the remapped activity of the future receptive field is the neuronal response of the corollary discharge signal, mapped in retinal coordinates (Binda, Bruno, et al., 2007). This activity is present only if a visual stimulus is present; it is active only when important information needs to be updated, which also reduces the complexity of the phenomenon. Within this framework, the addition (fusion) of this activity with the retinotopic activity of visual cortex could generate the shift of apparent positions in the appropriate direction, as we have recently demonstrated for the audiovisual targets (figure 35.3). The other possibility is to consider that the remapped neuronal activity is not referred to the exact time of the stimulus presentation but is read after the saccade is complete, in a form of postdiction (Eagleman & Sejnowski, 2000). This would also imply that perceptual time should be altered by saccades, as indeed it is (see below).
At the time of the saccade, the timing of the neuronal response changes dramatically. In all cells of areas V3A and FEF that remap during saccades, their remapped response is faster than that during fixation (Nakamura & Colby, 2002). Similarly, the latencies of neurons in areas MT and MST are shorter in response to real saccades than to simulated saccades (Price, Ibbotson, Ono, & Mustari, 2005). These effects have psychophysical implications: Saccades cause a compression, and even an inversion, of perceived time (Morrone, Ross, & Burr, 2005). When asked to compare the perceived duration of a temporal interval presented around the time of a saccade with one presented 2 s afterward, subjects judged it much shorter, about half the duration (figure 35.6). Again the time course of this distortion is quite tight and, after taking into account the duration of the stimuli, similar to that of the spatial compression.
Preliminary data (Binda, Burr, & Morrone, 2007) also indicate that the perceived time at saccadic onset, measured using an auditory tone, is delayed about 100 ms, while about 50 ms before the saccade, the latency is reduced by about 20 ms, consistent with the inversion of time data and with
the fact that during the remapping neuronal latency became shorter by about 40 ms, explaining the inversion. In addition, they also indicate that stimuli that are presented at saccadic onset are coincident with stimuli presented soon after saccades, facilitating the interpretation of their position in the postsaccadic coordinate system.
Space and time are generally studied separately and thought of as separate and independent dimensions. However, as we have observed, both space and time undergo severe transient distortions at the time of saccades, as objects become compressed toward the saccadic target (Ross et al., 1997), and perceived temporal durations are severely shrunk (Morrone, Ross, & Burr, 2005). As was discussed above, the relationship between perceptual shifts and receptive field updating is far from clear, and compression of time and of space is even more difficult to understand. Nevertheless, we can advance a few firm properties that might help to explain compression. As the transient changes both in space and in time follow very similar dynamics, they might well be manifestations of a common neural cause, a distortion in the space-time metric (Morrone, Ross, & Burr, 2005). Compression of relative distances in space and time is consistent with a reduction of spatial and temporal sampling. This is also one of the few concepts that would explain the perisaccadic increase in sensitivity for size (Santoro, Burr, & Morrone, 2002) and duration (Morrone, Ross, & Burr, 2005) judgments.
If, together with the undersampling, the receptive field becomes transiently oriented in space-time such that stimuli presented before the saccade and near the fixation are integrated with stimuli presented later at positions far from fixation, we could provide a description of the origin of all perisaccadic phenomena. The rotation in space and time of the neuronal selectivity is a concept that has strong and important analogies to the physical rotation of space and time that occurs in motion at relativistic speeds, discussed in detail elsewhere (Morrone, Ross, & Burr, 2008). Unfortunately, the dynamics of the changes in receptive fields during saccades are not yet well enough described to pursue this idea much further at present.

Transsaccadic integration and craniotopic maps

Our normal experience comes from the information derived from one fixation being transferred to the next, even when a particular object or part of the scene becomes hidden. Theories about transsaccadic integration have abounded over the past decades. Early ideas (e.g., Jonides, Irwin, & Yantis, 1982) assumed a “transsaccadic memory buffer” that accumulated high-precision information from each saccade to construct a detailed representation of the world (like pinning tails on a donkey). These ideas fell out of favor, largely because of the implication that the visual system must construct some form of stable Cartesian theater to be viewed by a homunculus. More recent theories have swung to the opposite extreme, assuming that perceptual stability depends, paradoxically, on the lack of internal representation of the world (O’Regan & Noe, 2001). Observers are largely insensitive to transsaccadic changes in the visual scene, questioning how much detailed visual information can be gleaned by making an eye movement on demand; many have assumed that no visual memory is necessary at all (Findlay & Gilchrist,
2003; McConkie & Zola, 1979; Tatler, 2001). In practice, however, it is still necessary for the brain to know where to look for the information that it needs, since eye movements are not random and are rarely wasted in natural tasks (Land, Mennie, & Rusted, 1999; Najemnik & Geisler, 2003). Thus some information about the layout of the scene and the position of important objects must somehow be represented and accumulated across saccades. There is clear evidence showing that at least three or four objects are transferred successfully across saccades even in the absence of allocentric cues (Prime, Niemeier, & Crawford, 2006).
Recently, Melcher and Morrone (2003) showed that transsaccadic integration occurs for motion signals that are individually below threshold (and hence are not perceived when presented alone). Two periods of coherent horizontal motion (150 ms each) were shown successively, separated by sufficient time to allow for a saccadic eye movement between them. On some blocks of trials, subjects saccaded across the stimulus between the two motion intervals; on others, they maintained fixation above or below the stimulus. Thresholds were similar in the two conditions, showing that the motion signals were temporally integrated across the saccade, but only when the two motion signals were in the same position in space, indicating that the brain must use a mechanism that is anchored to external rather than retinal coordinates. Importantly, the methodology excluded cognitive strategies or verbal recoding, since the motion signals presented before and after the saccade were each well below the conscious detection threshold; only by summating the two signals could motion be correctly discriminated.
Another example of craniotopic mechanisms is the demonstration of spatially specific adaptation of event-time (Burr, Tozzi, & Morrone, 2007), showing that adaptation to a fast-moving (20 Hz) spatially localized grating decreases the apparent duration of gratings that are presented to that part of the visual field (in external space) but not to other spatial locations.
Because of the spatial selectivity of individual neurons, the response of primary and secondary visual cortex forms a map (Morgan, 2003), similar in principle to that imaged on the retinae (except for distortions due to magnification of central vision). This retinotopic representation, which changes completely each time the eyes move, forms the input for all further representations in the brain. So a major question is how this retinotopic representation becomes transformed into the spatiotopic representation that we perceive, anchored in stable real-world coordinates.
Electrophysiological studies have shown that neurons in specific areas of associative visual cortex, including V6 (Galletti, Battaglini, & Fattori, 1993) and VIP (Duhamel, Bremmer, BenHamed, & Graf, 1997), do show the spatiotopic selectivity that we would expect to exist; their tuning is invariant to gaze, unlike areas V1 and V2 (that provide their input). Unfortunately, the exact transformation from retinal to spatiotopic coordinates is not yet fully understood, although the suggestion has been made that Bayesian fusion of the retinal signal with eye position signals is sufficient in principle to generate spatiotopic maps, probably acting via eye position–dependent modulation of the neural response, also referred to as gain fields (Pouget, Deneve, & Duhamel, 2002; Snyder, Grieve, Brotchie, & Andersen, 1998; Zipser & Andersen, 1988).
Functional magnetic resonance imaging has also indicated the existence of spatiotopic coding in human cortex, both in LO (McKyton & Zohary, 2006), an area devoted to the analysis of objects, and in MT+ (d’Avossa et al., 2007; Goossens, Dukelow, Menon, Vilis, & van den Berg, 2006). Using stimuli similar to those used by Melcher and Morrone, our group has reported that the response of a portion of human MT complex varies with gaze position in a way that is consistent with spatiotopic coding. The results are illustrated in figure 35.7. With gaze fixed in the center of the screen, both areas V1 and MT show spatial selectivity, responding only when the stimuli are presented to the contralateral field (figures 35.7A and 35.7B). However, if the stimulus is fixed (in the center) and its retinal projection varied by varying gaze, the results are different. V1 still responds only to the contralateral stimulus, but MT responds to both ipsilateral and contralateral stimuli, equally strongly. Further experiments suggested that MT actually shifts its receptive fields to cause spatiotopic coding.
However, it must be pointed out that this result is currently controversial, and contrary results have been reported. Gardner, Merriam, Movshon, and Heeger (2008) report that under the conditions of their experiment, the response of MT is retinotopic rather than spatiotopic. One interesting difference between the two studies is that in Gardner and colleagues’ experiment (but not in d’Avossa and colleagues’ experiment) attention was directed toward the fovea. We have recently replicated the conditions of their experiment and shown that when attention is withdrawn from the stimulus, the spatiotopic mapping changes to a retinotopic mapping (Crespi et al., 2009). Why attention should be necessary for the remapping is far from clear, but this suggests the operation of normalizing gain control. Fully understanding this mechanism will be an interesting future challenge.
The fact that spatiotopic (or at least craniotopic) coding is more common in the dorsal area might suggest that it could be used for the action system. As was discussed above, the action system seems to update spatial maps much later than the perceptual system does. Perhaps the updating of craniotopic maps takes time but leads to more robust coding of information, explaining the resistance of this system to saccadic mislocalization. The perceptual system, on the other hand, might operate not with a complete map anchored in external coordinates but with ensembles of neurons with
Bockisch, C., & Miller, J. (1999). Different motor systems use similar damped extraretinal eye position information. Vis. Res., 39, 1025–1038.
Bodis-Wollner, I., Bucher, S. F., & Seelos, K. C. (1999). Cortical activation patterns during voluntary blinks and voluntary saccades. Neurology, 53, 1800–1805.
Bridgeman, B., Hendry, D., & Stark, L. (1975). Failure to detect displacement of visual world during saccadic eye movements. Vis. Res., 15, 719–722.
Bridgeman, B., Lewis, S., Heit, G., & Nagle, M. (1979). Relation between cognitive and motor-oriented systems of visual position perception. J. Exp. Psychol. Hum. Percept. Perform., 5(4), 692–700.
Bristow, D., Haynes, J. D., Sylvester, R., Frith, C. D., & Rees, G. (2005). Blinking suppresses the neural response to unchanging retinal stimulation. Curr. Biol., 15(14), 1296–1300.
Bruno, A., Brambati, S. M., Perani, D., & Morrone, M. C. (2006). Development of saccadic suppression in children. J. Neurophysiol., 96(3), 1011–1017.
Burr, D., & Morrone, M. C. (2005). Eye movements: Building a stable world from glance to glance. Curr. Biol., 15(20), R839–R840.
Burr, D., Tozzi, A., & Morrone, M. C. (2007). Neural mechanisms for timing visual events are spatially selective in real-world coordinates. Nat. Neurosci., 10(4), 423–425.
Burr, D. C., Holt, J., Johnstone, J. R., & Ross, J. (1982). Selective depression of motion sensitivity during saccades. J. Physiol., 333, 1–15.
Burr, D. C., Morgan, M. J., & Morrone, M. C. (1999). Saccadic suppression precedes visual motion analysis. Curr. Biol., 9, 1207–1209.
Burr, D. C., & Morrone, M. C. (1996). Temporal impulse response functions for luminance and colour during saccades. Vis. Res., 36, 2069–2078.
Burr, D. C., Morrone, M. C., & Ross, J. (1994). Selective suppression of the magnocellular visual pathway during saccadic eye movements. Nature, 371, 511–513.
Burr, D. C., Morrone, M. C., & Ross, J. (2001). Separate visual representations for perception and action revealed by saccadic eye movements. Curr. Biol., 11(10), 798–802.
Burr, D. C., & Ross, J. (1982). Contrast sensitivity at high velocities. Vis. Res., 23, 3567–3569.
Cai, R. H., Pouget, A., Schlag-Rey, M., & Schlag, J. (1997). Perceived geometrical relationships affected by eye-movement signals. Nature, 386, 601–604.
Campbell, F. W., & Wurtz, R. H. (1978). Saccadic omission: Why we do not see a greyout during a saccadic eye movement. Vis. Res., 18, 1297–1303.
Crespi, S., Biagi, L., Burr, D. C., d’Avossa, G., Tosetti, M., & Morrone, M. C. (2009). Spatial attention modulates the spatiotopicity of human MT complex. Perception, 38 (ECVP Abstract Supplement).
d’Avossa, G., Tosetti, M., Crespi, S., Biagi, L., Burr, D. C., & Morrone, M. C. (2007). Spatiotopic selectivity of BOLD responses to visual motion in human area MT. Nat. Neurosci., 10(2), 249–255.
Dassonville, P., Schlag, J., & Schlag-Rey, M. (1992). Oculomotor localization relies on a damped representation of saccadic eye movement displacement in human and nonhuman primates. Vis. Neurosci., 9, 261–269.
Dassonville, P., Schlag, J., & Schlag-Rey, M. (1995). The use of egocentric and exocentric location cues in saccadic programming. Vis. Res., 35, 2191–2199.
Deubel, H., Schneider, W. X., & Bridgeman, B. (1996). Postsaccadic target blanking prevents saccadic suppression of image displacement. Vis. Res., 36, 985–996.
Deubel, H., Schneider, W. X., & Bridgeman, B. (2002). Transsaccadic memory of position and form. Prog. Brain Res., 140, 165–180.
Diamond, M. R., Ross, J., & Morrone, M. C. (2000). Extraretinal control of saccadic suppression. J. Neurosci., 20, 3442–3448.
Dodge, R. (1900). Visual perception during eye movements. Psychol. Rev., 7, 454–465.
Duhamel, J., Bremmer, F., BenHamed, S., & Graf, W. (1997). Spatial invariance of visual receptive fields in parietal cortex neurons. Nature, 389, 845–848.
Duhamel, J. R., Colby, C. L., & Goldberg, M. E. (1992). The updating of the representation of visual space in parietal cortex by intended eye movements. Science, 255(5040), 90–92.
Eagleman, D. M., & Sejnowski, T. J. (2000). Motion integration and postdiction in visual awareness. Science, 287(5460), 2036–2038.
Findlay, J. M., & Gilchrist, I. D. (2003). Active vision: The psychology of looking and seeing. Oxford, UK: Oxford University Press.
Fischer, B., Biscaldi, M., & Gezeck, S. (1997). On the development of voluntary and reflexive components in human saccade generation. Brain Res., 754(1–2), 285–297.
Galletti, C., Battaglini, P. P., & Fattori, P. (1993). Parietal neurons encoding spatial locations in craniotopic coordinates. Exp. Brain Res., 96, 221–229.
Galletti, C., & Fattori, P. (2003). Neuronal mechanisms for detection of motion in the field of view. Neuropsychologia, 41(13), 1717–1727.
Gardner, J. L., Merriam, E. P., Movshon, J. A., & Heeger, D. J. (2008). Maps of visual space in human occipital cortex are retinotopic, not spatiotopic. J. Neurosci., 28(15), 3988–3999.
Goodale, M. A., & Milner, A. D. (1992). Separate pathways for perception and action. Trends Neurosci., 15, 20–25.
Goossens, J., Dukelow, S. P., Menon, R. S., Vilis, T., & van den Berg, A. V. (2006). Representation of head-centric flow in the human motion complex. J. Neurosci., 26(21), 5616–5627.
Hallett, P. E., & Lightstone, A. D. (1976a). Saccadic eye movements towards stimuli triggered by prior saccades. Vis. Res., 16(1), 99–106.
Hallett, P. E., & Lightstone, D. (1976b). Saccadic eye movements to flashed targets. Vis. Res., 16, 107–114.
Hansen, R. M., & Skavenski, A. A. (1977). Accuracy of eye position information for motor control. Vis. Res., 17(8), 919–926.
Hansen, R. M., & Skavenski, A. A. (1985). Accuracy of spatial locations near the time of saccadic eye movements. Vis. Res., 25, 1077–1082.
Harris, L. R., & Lieberman, L. (1996). Auditory stimulus detection is not suppressed during saccadic eye movements. Perception, 25(8), 999–1004.
Holt, E. B. (1903). Eye movements and central anaesthesia. Psychol. Rev., 4, 3–45.
Honda, H. (1989). Perceptual localization of visual stimuli flashed during saccades. Percept. Psychophys., 46, 162–174.
Honda, H. (1991). The time courses of visual mislocalization and of extra-retinal eye position signals at the time of vertical saccades. Vis. Res., 31, 1915–1921.
Ibbotson, M., Crowder, N., Cloherty, S., Price, N., & Mustari, M. (2008). Saccadic modulation of neural responses: Possible roles in saccadic suppression, enhancement, and time compression. J. Neurosci., 28, 10952–10960.
Sommer, M. A., & Wurtz, R. H. (2002). A pathway in primate brain for internal monitoring of movements. Science, 296(5572), 1480–1482.
Sommer, M. A., & Wurtz, R. H. (2006). Influence of the thalamus on spatial visual processing in frontal cortex. Nature, 444(7117), 374–377.
Sperry, R. W. (1950). Neural basis of the spontaneous optokinetic response produced by visual inversion. J. Comp. Physiol. Psychol., 43, 482–489.
Sylvester, R., Haynes, J. D., & Rees, G. (2005). Saccades differentially modulate human LGN and V1 responses in the presence and absence of visual stimulation. Curr. Biol., 15(1), 37–41.
Tatler, B. W. (2001). Characterising the visual buffer: Real-world evidence for overwriting early in each fixation. Perception, 30(8), 993–1006.
Thiele, A., Henning, P., Kubischik, M., & Hoffmann, K. P. (2002). Neural mechanisms of saccadic suppression. Science, 295(5564), 2460–2462.
Thilo, K. V., Santoro, L., Walsh, V., & Blakemore, C. (2003). The site of saccadic suppression. Nat. Neurosci., 7, 13–14.
Tolias, A. S., Moore, T., Smirnakis, S. M., Tehovnik, E. J., Siapas, A. G., & Schiller, P. H. (2001). Eye movements modulate visual receptive fields of V4 neurons. Neuron, 29(3), 757–767.
Trevarthen, C. B. (1968). Two mechanisms of vision in primates. Psychol. Forsch., 31, 299–348.
Volkmann, F. C., Riggs, L. A., White, K. D., & Moore, R. K. (1978). Contrast sensitivity during saccadic eye movements. Vis. Res., 18, 1193–1199.
von Helmholtz, H. (1866). Handbuch der Physiologischen Optik. (Reprinted in J. P. C. Southall (Ed.), A treatise on physiological optics. New York: Dover, 1963.)
von Holst, E., & Mittelstädt, H. (1954). Das Reafferenzprinzip. Naturwissenschaften, 37, 464–476.
Walker, M. F., Fitzgibbon, J., & Goldberg, M. E. (1995). Neurons of the monkey superior colliculus predict the visual result of impending saccadic eye movements. J. Neurophysiol., 73, 1988–2003.
Woodworth, R. S. (1906). Vision and localization during eye movements. Psychol. Bull., 3, 68–70.
Wurtz, R. H. (2008). Neuronal mechanisms of visual stability. Vis. Res., 48(20), 2070–2089.
Zipser, D., & Andersen, R. A. (1988). A back-propagation programmed network that simulates response properties of a subset of posterior parietal neurons. Nature, 331(6158), 679–684.
Eero P. Simoncelli, Center for Neural Science and Courant Institute of Mathematical Sciences, New York University, New York, New York

abstract A variety of experimental studies suggest that sensory systems are capable of performing estimation or decision tasks at near-optimal levels. In this chapter, I explore the use of optimal estimation in describing sensory computations in the brain. I define what is meant by optimality and provide three quite different methods of obtaining an optimal estimator, each based on different assumptions about the nature of the information that is available to constrain the problem. I then discuss how biological systems might go about computing (and learning to compute) optimal estimates.

The brain is awash in sensory signals. How does it interpret these signals so as to extract meaningful and consistent information about the environment? Many tasks require estimation of environmental parameters, and there is substantial evidence that the system is capable of representing and extracting very precise estimates of these parameters. This is particularly impressive when one considers that the brain is built from a large number of low-energy, unreliable components, whose responses are affected by many extraneous factors (e.g., temperature, hydration, blood glucose and oxygen levels).
The problem of optimal estimation has been well studied in the statistics and engineering communities, in which a plethora of tools have been developed for designing, implementing, calibrating, and testing such systems. In recent years, many of these tools have been used to provide benchmarks or models for biological perception. Specifically, the development of signal detection theory led to widespread use of statistical decision theory as a framework for assessing performance in perceptual experiments. More recently, optimal estimation theory (in particular, Bayesian estimation) has been used as a framework for describing human performance in perceptual tasks.
In this chapter, I will explore the use of optimal estimation in describing sensory computations in the brain. In the first half, I will define what I mean by optimality and will develop three quite different formulations for obtaining an optimal estimator. In the second half, I will ask how biological systems might go about computing optimal estimates. This is not intended as a complete review of this rich multidisciplinary topic, and I apologize in advance to the many authors whose important contributions I have neglected to mention. Instead, my purpose is to clarify and resolve a number of myths and misunderstandings about optimal estimation and to offer a personal perspective on the relationship between these concepts and the design and function of biological sensory systems.

Definition and formulations of optimal estimation

A common problem for systems that must interact with the world (including both biological organisms and human-made devices) is that of obtaining estimates of environmental properties, x, from sensory measurements, m. An estimator is simply a deterministic function, f(m), that maps measurements to values of the variable of interest. If x is a binary variable, then the estimator reduces to a decision function. Generally, the measurements are assumed to be corrupted by noise, which could arise from a number of sources, including the signal itself (e.g., the quantization of light into photons, when one is interested in knowing the light intensity), the transduction mechanism, or variability within the neurons that are transmitting and computing with this information (see Faisal, Selen, & Wolpert, 2008, for a recent review of noise in the nervous system).
Our primary question is: How does an organism select and implement a good estimator or (more optimistically) the best estimator? To address this, we will have to state explicitly what we mean by best. The traditional statistical formulation of the best estimator is the one that minimizes the average value of a predefined loss (cost) function, L(x, f(m)). The loss function specifies the cost of generating an estimated value of f(m) when the true value is x. It is generally assumed to be nonnegative, and equal to zero only when the estimate is equal to the true value.
Regression Formulation. Suppose we wanted to build a machine that could perform optimal estimation of x, given a noisy measurement m.1 We can imagine “training” this
Figure 36.1 Regression formulation of the optimal estimation problem, illustrated for a one-dimensional signal and measurement. (A) The measurement process (also known as the encoding process). We assume a set of data pairs (plotted points), {x_n, m_n}, indexed by n ∈ [1, 2, . . . , N], representing true signal values and associated noisy measurements. The dashed line indicates the average measurement as a function of the true signal value. (B) The estimation (or decoding) process. The estimator f(m) maps measurements back to estimated signal values. The optimal estimator (solid line) does this so as to minimize a specified loss function. Note that this need not be (and is generally not) the inverse of the average measurement function (dashed line). Note also that the optimal estimator will depend on the signal values that are included in the data set, which are summarized by the histogram shown in panel C. (See color plate 49.) [Axes: (A) m against x; (B) f(m) against m; (C) frequency against x.]
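The binned regression estimator of figure 36.1B can be sketched numerically. The Python fragment below is an illustrative sketch only, not the procedure used to produce the figure. It assumes a Gaussian-distributed signal, additive Gaussian measurement noise, a squared-error loss, and 20 equal-width bins; all of these are hypothetical choices, as is the helper name fit_binned_estimator. Under squared error, the loss-minimizing constant estimate within a bin of m is the mean of the x values whose measurements fall in that bin.

```python
import numpy as np

def squared_loss(x, x_hat):
    """L(x, f(m)): nonnegative, and zero only when the estimate equals the true value."""
    return (x - x_hat) ** 2

def fit_binned_estimator(x, m, n_bins=20):
    """Fit f(m) by binning the training pairs on m (hypothetical helper).

    Within each bin the estimate is the mean of the x values in that bin,
    which minimizes the average squared loss among constant estimates.
    """
    edges = np.linspace(m.min(), m.max(), n_bins + 1)

    def bin_index(values):
        # Map each measurement to a bin 0..n_bins-1, clamping out-of-range values.
        return np.clip(np.digitize(values, edges) - 1, 0, n_bins - 1)

    idx = bin_index(m)
    estimates = np.full(n_bins, x.mean())  # fallback value for empty bins
    for b in range(n_bins):
        in_bin = idx == b
        if in_bin.any():
            estimates[b] = x[in_bin].mean()

    def f(m_new):
        return estimates[bin_index(np.asarray(m_new))]

    return f

# Assumed training data: x ~ N(2, 1), m = x + N(0, 1) noise.
rng = np.random.default_rng(0)
N = 5000
x = rng.normal(2.0, 1.0, size=N)
m = x + rng.normal(0.0, 1.0, size=N)

f_opt = fit_binned_estimator(x, m)
binned_loss = squared_loss(x, f_opt(m)).mean()   # (1/N) * sum_n L(x_n, f(m_n))
identity_loss = squared_loss(x, m).mean()        # loss of the naive choice f(m) = m
print(f"binned estimator: {binned_loss:.3f}, f(m) = m: {identity_loss:.3f}")
```

With these assumed settings, the fitted estimator pulls estimates toward the region where the training values of x are concentrated, so its average loss on the training pairs should come out lower than that of simply reporting the measurement. Changing the distribution of x in the training set changes the fitted f, which is the dependence on the data histogram noted in the caption.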
machine by showing it many signal-measurement pairs, {x_n, m_n}. Typically, we imagine that each measurement arises from its associated true value through some sort of noisy transformation. Figure 36.1A illustrates such a set of training data.
An estimator, f, attempts to invert the measurement process, mapping measurements m back to signal values x (figure 36.1B). This mapping is deterministic: Each measurement leads to a unique estimate. But if we hold the signal value fixed and make a set of estimates (each arising from a different measurement), these estimates will fluctuate because of the variability in the underlying measurements. The optimal estimator is the one that minimizes the average loss over these examples:

$$ f_{\mathrm{opt}} = \arg\min_{f} \frac{1}{N} \sum_{n} L\bigl(x_n, f(m_n)\bigr) $$

We will refer to this as a “regression” estimator; a special case is the linear regression solution, which arises when L is the squared error and f is restricted to be linear. An example optimal estimator is indicated by the solid line in figure 36.1B. Note that this transformation is not the same as the inverse of the transformation to average measurements (i.e., the inverse of the dashed line shown in figure 36.1A).
Of course, the precision with which we can constrain the function f depends on how much data we have. Loosely speaking, the usual approach is to restrict f to be sufficiently simple (e.g., smooth, or defined by a small number of parameters) that the available data will constrain it properly. For example, the estimator shown in figure 36.1B was computed by binning the data (as a function of m) and computing the best estimate value for each bin. More formally, we might specify a restricted set of possible functions (denoted F) from which the solution will be selected.2 Finally, note that the solution we obtain will depend on the distribution of data. If the set of training examples includes many x values clustered in a particular region of the space, then the average loss will contain many terms from that region, and the optimization process will thus attempt to reduce the estimation errors there, typically at the expense of larger errors elsewhere. This suggests that the training examples should be selected to represent the distribution of values that might be encountered in the environment.
The regression formulation is appealing because it is simple and intuitive. Its primary limitation is that it requires supervised training. That is, obtaining an optimal estimator
Supervised learning for estimation and classification problems has been well studied. A standard example is the problem of learning an input-output relationship with a simplified network of artificial neurons, for which the optimal solution may be obtained by backpropagation (essentially, a form of stochastic gradient descent on the objective function). But this requires large amounts of data, especially when learning multidimensional functions.
From the biological/behavioral perspective, a fully supervised training paradigm also seems implausible. Although most organisms absorb an enormous amount of sensory data during their lifetimes, the information that they receive regarding “correct” answers would seem to be relatively sparse. For example, consider the problem of estimating the distance to a nearby object on the basis of visual input. We can compare our estimate to the one that is obtained by reaching out and touching the object. But the amount of this kind of feedback we receive seems vastly insufficient to train the enormous cascade of neurons that are involved in estimating distances from visual input. Similarly, optimization through natural selection (with surviving organisms passing preferred solutions to their offspring genetically) seems implausible, both because of the time required and because genetic material seems unlikely to contain sufficient information to encode even a fraction of the detailed connectivity of those neurons.
Instead, it seems that evolution has endowed the brain with powerful capabilities for unsupervised learning (based on noisy measurements alone) and that this is used to supplement and bolster the supervised learning that may be used in the relatively infrequent cases for which the correct answers are known. Unsupervised learning is a heavily studied topic in machine learning (e.g., Hinton & Sejnowski, 1999), and methods have been developed for learning patterns in data, mostly for purposes of optimal coding or clustering/categorization. Perhaps less well known is the fact that optimal estimators may also be written in unsupervised form. To explain this, I will turn first to a probabilistic formulation of the problem.
Probabilistic (Bayesian) Formulation When we describe optimality in terms of minimizing an objective function over a training data set, we usually have in mind that this set is representative of future data we will encounter. This notion may be formalized by describing both the training and future data as samples randomly drawn from a common probability distribution. The law of large numbers tells us that as the number of data pairs grows, the original regression objective function will converge to the expected value (mean) of the loss function, integrated over all possible combinations of x and m:
$$f_{\mathrm{opt}} = \arg\min_{f} \iint P(x, m)\, L\big(x, f(m)\big)\, dx\, dm$$
Unlike the regression formulation, which is written directly in terms of data, the probabilistic formulation is written in terms of a continuous probability density. Since this formulation effectively results from assuming infinite amounts of data, the smoothness constraint that was necessary for selecting an estimator in the regression case is now optional.
The probabilistic objective function may be simplified by using the definition of conditional probability to rewrite the joint density as a product of the marginal density of m and the conditional density of x given m (known as the posterior distribution):
$$f_{\mathrm{opt}} = \arg\min_{f} \int P(m) \int P(x \mid m)\, L\big(x, f(m)\big)\, dx\, dm$$
If the estimator is unrestricted, then we may ignore the outer integral and optimize the estimator separately for each measurement value:
$$f_{\mathrm{opt}}(m) = \arg\min_{f} \int P(x \mid m)\, L\big(x, f(m)\big)\, dx$$
That is, for each measurement, the best estimate is the one that minimizes the expected value of the loss function over the posterior distribution for that measurement.
Finally, the posterior distribution may be rewritten in terms of densities that are more naturally associated with the process from which the data arise. Specifically, we can describe the measurement noise using a conditional probability P(m|x). This measurement density expresses the probability of m for each value x of the signal. If we think of it the other way around, holding the measurement fixed and reading off a function of the signal, this is known as a likelihood function. Now Bayes’ rule can be used to express the posterior in terms of the measurement density and the prior distribution P(x), which expresses the probability of occurrence of value x in the world:
$$f_{\mathrm{opt}}(m) = \arg\min_{f} \int \frac{P(m \mid x)\, P(x)}{P(m)}\, L\big(x, f(m)\big)\, dx \qquad (1)$$
An example of the Bayesian solution, based on the same distributions that were used to generate the data in figure 36.1, is illustrated in figure 36.2. To provide some intuition, it is worth mentioning several well-known special cases.
Quadratic error (least squares) solution The most common case used in the engineering community is the least squares loss function, L(x, f(m)) = (x − f(m))^2. In this case, the optimal estimate (which can be derived by differentiating the objective function and setting the result equal to zero) is simply the mean of the posterior:
$$f_{\mathrm{LS}}(m) = \int x\, P(x \mid m)\, dx$$
It is worth mentioning that in the special case of a jointly Gaussian probability density over signal and measurement,
[Figure 36.2: panel A (m versus x), panel B (f(m) versus m), panel C (p(x) versus x).]
Figure 36.2 Bayesian formulation of the optimal estimation problem. (A) The measurement density, P(m|x), shown as a grayscale image, where intensity indicates log probability. The dashed line indicates the mean of the density as a function of x. (B) The posterior density, P(x|m). The solid line indicates the mean of the density, and the dashed line indicates the (inverted) mean of the measurement density in panel A. (C) The prior density, P(x). (See color plate 50.)
this solution turns out to be a linear function of the measurement (the solution is the same as that of our next example).
Linear estimator, quadratic error Now consider what happens when the estimator is restricted to be a linear function of the measurement. The linear least squares solution is
$$f_{\mathrm{LLS}}(m) = \frac{\sigma_{xm}}{\sigma_{mm}}\, m$$
The linear solution relies only on the cross-correlation between signal and measurement and between the measurement and itself, and not on full knowledge of the posterior density. This result extends naturally to multidimensional inputs or outputs.
Maximum probability solution Suppose that the loss function penalizes all errors equally, with only the correct answer incurring no penalty. Then the solution is the maximum of the posterior density, known as the maximum a posteriori (MAP) estimator:
$$f_{\mathrm{MAP}}(m) = \arg\max_{x} P(x \mid m)$$
In summary, the probabilistic formulation expresses the estimation problem in terms of four natural ingredients:
• The prior, P(x), which represents the probability of encountering different signal values in the world
• The measurement density, P(m|x), which represents the (probabilistic) relationship between the signal and measurement
• The loss function, L(x, f(m)), which represents the cost of making errors
• The family of functions F from which the estimator is to be chosen. (This ingredient might not be required for the Bayesian solution, which effectively operates under conditions of infinite data.)
Note that although the regression solution of the previous section was developed directly from pairs of input-output data, it is also implicitly relying on these same ingredients. Specifically, it is effectively based on the joint probability density of signal and measurement, which is equivalent to the product of the prior and the likelihood. And as was stated in the previous section, it also requires the specification of a loss function and a family of functions from which the solution is to be drawn.
It is worth emphasizing the most obvious implication of this ingredient list, since it is often misunderstood. Optimality is not a fixed universal property of an estimator but one that depends on each of these defining ingredients; statements about optimality that do not fully specify the ingredients are therefore relying on hidden assumptions. For example, many authors assume that optimality implies that
Optimal estimation in the brain
In this section, we ask how the optimal estimation formulations developed in the previous section can be used in modeling biological sensory systems and how these models can be tested experimentally. These questions can be addressed at many levels, and in this short chapter, I will not attempt to provide a complete overview. Rather, I will describe a few published results and try to explain what I see as some of the more important challenges that we currently face in this endeavor.
The concept that sensory perception arises through the fusion of incoming sensor measurements with one’s prior experience is often attributed to Hermann von Helmholtz (1925). Although his descriptions are qualitative and do not mention noise or loss functions, they do capture the essence of the Bayesian formulation described in the previous section. This interpretation of perception seems to have lain dormant from von Helmholtz’s day until the 1950s, when E. T. Jaynes, a statistically minded physicist, submitted an article to IRE Transactions on Information Theory, in which he proposed that Bayesian estimation might be used as a framework for modeling sensory transformations (Jaynes, 1957). The journal rejected the article (on the grounds that it was too speculative), and the concept appears to have lain dormant for another 30 years! In the interim, perceptual psychologists began using signal detection theory as a framework for analyzing psychophysical data (Green & Swets, 1966) and for providing an upper bound on performance. This methodology often does not include explicit loss functions and rarely includes a prior, but the formalization nevertheless represents an important step toward the optimal estimation framework.
Perceptual Bayesianism In the 1980s and 1990s, there was a dramatic revival of the Bayesian methodology across many fields, and perceptual science was one of these. A variety of experiments have aimed to test optimality of human estimation judgments by comparing performance to an “ideal observer” model (e.g., Barlow, 1980; Geisler, 1989; Kersten, 1990; Knill, Field, & Kersten, 1990). A number of reviews document the activity to date (Knill & Richards, 1996; Maloney, 2002; Mamassian, Landy, & Maloney, 2002; Kersten, Mamassian, & Yuille, 2004; Körding, 2007), and this endeavor has been expanded by recent activity in “neuroeconomics,” a cross-disciplinary enterprise that aims to characterize decision-making and more general behavioral processes with respect to prior probabilities and reward contingencies (Glimcher, Camerer, Poldrack, & Fehr, 2008).
What does it mean to say that a human subject is performing optimally? As I have emphasized in the first part of this chapter, the definition of the word optimal requires specification of a set of ingredients: the measurement probability, the prior, the loss function, and (in some cases) a family of estimators. Specifying these ingredients for a human observer performing a particular task is often difficult or impossible. For example, specifying the measurement probability requires knowledge of how the signal of interest is represented within the brain (including a specification of the noise). In some experiments, investigators have incorporated noise into the stimulus, which can provide insights into the properties of internal noise (and thus the measurement probability) (e.g., Pelli & Farell, 1999; Körding & Wolpert, 2004). The specification of an appropriate family of estimators should be determined by the set of computations that can potentially be performed by neurons, but we currently lack a detailed description of this set.
The loss function can pose more substantial difficulties. Subjects may differ inherently in the way they behave in an experimental situation (e.g., consider personality traits such as risk aversion versus thrill-seeking). Even in cases in which the investigator attempts to control for this by building a loss function directly into the design of the experiment (for example, by paying/penalizing subjects for correct/incorrect answers), one does not know a priori whether or how the subject will learn and internalize these costs, what type
by each neuron). Or subsequent stages could operate on the posterior information, postponing the explicit determination of an estimate until it is needed. In either case, the prior in this model can be adjusted by changing the gain on each of the neurons.
The explicit representation of uncertainty, through the breadth and shape of population responses, and the possibility of linear readout rules are conceptually appealing features of this framework. But a detailed model of this form needs to address the inconsistency of directly representing probability values with neural responses that are noisy (e.g., Sahani & Dayan, 2003). In addition, the responses of many visual neurons do not seem consistent with direct representation of posterior probability. For example, the shape of orientation tuning curves in area V1 neurons is preserved under changes in stimulus contrast. But lowering the stimu-
The first term is a sum of the observed spike counts, weighted by the log tuning curve value of each neuron (Zhang et al., 1998; Jazayeri & Movshon, 2006). The second term is the sum of the tuning curves, and the third is the (negative) log of the prior. Much of the previous work on population coding has focused on the special case of orientation representation in V1 neurons or selectivity for motion direction in MT neurons, and in these cases the prior over orientation is typically assumed to be constant, as is the sum of the tuning curves (e.g., Zemel et al., 1998; Jazayeri & Movshon, 2006). These assumptions allow one to ignore the last two terms, and the resulting log-posterior objective function reduces to a simple weighted sum of spike counts, consistent with earlier proposals for linear readout (Bialek et al., 1991; Anderson & van Essen, 1994; Rieke et al., 1997). Note that later stages of processing (i.e., the estimator) presumably
42 todorov
43 rizzolatti, fogassi, and gallese
44 grafton, aziz-zadeh, and ivry
Introduction
scott t. grafton and emilio bizzi
abstract A broad variety of motor plans and concomitant control actions can be expressed by the superposition of force fields representing the mechanical effects of motor synergies. This principle of superposition may lead to the execution of motor plans by controlling the nonlinear dynamics of the body in the presence of redundant muscles and degrees of freedom.
emilio bizzi McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, Massachusetts
ferdinando a. mussa-ivaldi Department of Physiology, Northwestern University Medical School, Chicago, Illinois
From movement planning to execution
A critical issue in the generation of motor behavior concerns the hierarchical organization of movement planning and movement execution. This concept is derived from engineering notions of modular control by which the problem of movement is decomposed into subproblems that can be addressed separately. One of the great complications of the movement control problem is that any given goal can be reached by a multiplicity of means. If the goal is to move the hand from point A to point B, a variety of paths can be chosen; a variety of trajectories in joint space can be utilized to realize the path. As an example, consider the simple task of reaching for a glass of water on a table. To reach for it, the brain must generate a temporal sequence of activations of the arm muscles. The pattern of neural impulses that controls the contraction of each muscle can be thought of as “coordinates” in an abstract geometrical space (Holdefer & Miller, 2002). In this space, the goal of reaching the glass can be represented as a point whose coordinates are the muscle activations needed to perform the appropriate reach. What happens to these motor coordinates, and to the goal of reaching the glass in the space of muscle activities, if our body moves to another position, such as from standing to sitting? To reach the glass, the arm must now move in a different way, and the muscles must be driven by different commands. As a consequence, the coordinates of the goal of reaching the glass in the space of muscle activities are changed. This is what mathematicians call a coordinate transformation, a computation that can be of quite considerable complexity, given the many different ways in which one may have to activate the muscles to reach the same point in space.
A basic concept in many fields of science, including systems-level neuroscience, is the concept of a coordinate system (Bishop & Goldberg, 1980). A coordinate system is a system of numbers that, taken together, identify the location of a point in space. The space could be the ordinary three-dimensional space in which we move, or it could be an abstract space with a larger or even infinite number of dimensions that may be placed in correspondence with a physical system. For example, the posture of a marionette may be represented by specifying each of its joint angles on a separate axis. Thus, the joint angles provide a coordinate system for the marionette. The state of a biological system, such as the human arm, is described within the nervous system by the collection of neural activities that constitute incoming sensory signals and outgoing motor commands. Although there are several possible coordinate systems (actually an infinite number) to describe different sensory and motor signals, these coordinate systems fall quite naturally into three classes: neuromuscular coordinates, joint coordinates, and endpoint coordinates.
Endpoint Coordinates Endpoint coordinates are appropriate for describing the goal of an action and the interaction with the environment. These coordinates may capture the highly regular properties of reaching behavior when the location of the hand is rendered in Cartesian space (Morasso, 1981; Soechting & Lacquaniti, 1981). Morasso (1981) instructed human subjects to point with one hand to different visual targets that were randomly activated (figure 37.1). His analysis of the movements showed two kinematic invariances: (1) The hand trajectories were approximately straight segments, and (2) the speed profile or tangential velocity of the hand for different movements always appeared to have a bell-shaped configuration, as the time needed to accelerate the hand was approximately equal to the time needed to bring it back to rest. Because these simple and invariant features were detected at different shoulder and elbow angles, these results suggest that planning by the central nervous system (CNS) takes place in terms of hand
relatively small, and the timing between W3 and W4 is reversed. Swimming (third column), in contrast, is dominated by W2 and to a lesser extent by W5.
The examples illustrated by figure 37.3 demonstrate two important points: (1) that the same synergies are found in different behaviors and (2) that different behaviors may be constructed by combining the same synergies with different timing and amplitude.
The examples illustrated in figures 37.2 and 37.3 address the important question of whether the synergies extracted by a computational procedure have biological standing. A compelling criterion is the presence of the same synergy, with its own internal temporal structure, in different behaviors, as illustrated in figure 37.3. Additional evidence supporting the idea that synergies are indeed functional units was shown by Giszter and Kargo (2000). They showed examples of deletions as well as of additions during leg motions in the frog. Kargo and Giszter (2000) showed that the spinalized frog is able to produce corrective movements in response to unexpected perturbations. They placed an obstacle in the path of the leg and showed that when the leg hit the obstacle, the added synergy was one of the set of six previously identified synergies.
Other investigators have generated corroborative evidence for modular organization in cats (Lemay, Galagan, Hogan, & Bizzi, 2001; Ting & Macpherson, 2005; Krouchev, Kalaska, & Drew, 2006; Torres-Oviedo et al., 2006) and the turtle (Stein, Oguztoreli, & Capaday, 1986; Stein, McCullough, & Currie, 1998). In addition, results from the study of the muscle patterns during reaching in humans (d’Avella et al., 2006) suggest that this is a general strategy used by all vertebrates for simplifying the control of limb movements.
A clear-cut example of a recombination of synergies is from locomotion with the different limb central pattern generators (CPGs). Each CPG can operate independently, but the four limb CPGs can also be combined in different patterns as in a walk, a trot, or a gallop. On the basis of extensive indirect evidence, Grillner (1981, 1985) suggested that each limb CPG can be further subdivided into unit CPGs that control synergist muscles acting at each joint. It has also been proposed that these different unit CPGs or synergies can be the independent target for the supraspinal commands used to design different volitional movements involving a limited set of joints (Grillner, 1985, 2006; Grillner & Zangger, 1979).
In conclusion, the evidence provided by studies from different laboratories and in different species indicates that combining muscle synergies is a strategy that the CNS utilizes for the construction of movements in vertebrates.
Physiological basis of muscle synergies: Modularity in the frog spinal motor system
With microstimulation of the spinal interneuronal regions, Bizzi, Mussa-Ivaldi, and Giszter (1991), Giszter, Mussa-Ivaldi, and Bizzi (1993), and Tresch and colleagues (1999) have provided evidence for a modular organization of the frog’s and rat’s spinal cord. These experiments found that only a few distinct types of motor outputs could be evoked by either electrical (Bizzi et al., 1991) or NMDA stimulation (Saltiel et al., 2001). Importantly, when stimulation was applied simultaneously to two different sites in the spinal cord, each of which when stimulated produced a different motor output, the resulting motor output was a simple linear combination of the separate motor outputs (Mussa-Ivaldi, Giszter, & Bizzi, 1994; Lemay et al., 2001). In subsequent experiments, Tresch and colleagues (1999) showed that the motor response evoked from cutaneous stimulation of a particular site on the hindlimb resulted from the weighted combination of a few muscle synergies. When Tresch and colleagues (1999) compared the distinct muscle synergies derived from cutaneous stimulation with the patterns of muscle activation evoked by microstimulation of the frog spinal cord, they found that the two sets of EMG responses were very similar to one another. In addition, the synergies evoked by NMDA were found by Saltiel and colleagues (2001) to be qualitatively similar to those described by Tresch and colleagues (1999). Taken together, these experiments
Figure 37.4 Coordinate transformations for planning and control of movement in a redundant limb. (A) Kinematic and force transformations for the human arm between muscle coordinates, joint coordinates, and endpoint coordinates. Arrows indicate the directions in which the transformations are well posed. Abbreviations: l, muscle lengths; q, joint angles; r, hand position; F, hand force; Q, joint torque; f, muscle force. M represents the transformation from joint angles to muscle lengths and L the transformation from joint angles to hand position. ∂M and ∂L are the respective Jacobian matrices. (B) The vertical arrows map motion variables (position and velocity) onto force variables. They represent force fields. On the left is the force field generated by the muscles. On the right is the force field in endpoint coordinates that represents a desired behavior. Both the endpoint field and the muscle field have a well-defined image in joint coordinates, and the implementation of a desired behavior can be represented as a problem of approximation.
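The transformations in panel A can be made concrete with a small numerical sketch. The Python code below is an illustration under stated assumptions, not the authors' model: it uses a hypothetical two-joint planar arm with made-up link lengths (`LINK1`, `LINK2`), takes the map from joint angles q to hand position r to be the arm's forward kinematics, computes its Jacobian, and uses the transpose of that Jacobian to map a hand force F to joint torques Q. These are the two directions that the caption describes as well posed.

```python
import numpy as np

# Hypothetical link lengths in metres (assumed for illustration only).
LINK1, LINK2 = 0.30, 0.25


def hand_position(q):
    """Forward kinematics: joint angles (shoulder, elbow) -> hand position."""
    q1, q2 = q
    return np.array([
        LINK1 * np.cos(q1) + LINK2 * np.cos(q1 + q2),
        LINK1 * np.sin(q1) + LINK2 * np.sin(q1 + q2),
    ])


def jacobian(q):
    """2x2 Jacobian of hand_position with respect to the joint angles."""
    q1, q2 = q
    s1, c1 = np.sin(q1), np.cos(q1)
    s12, c12 = np.sin(q1 + q2), np.cos(q1 + q2)
    return np.array([
        [-LINK1 * s1 - LINK2 * s12, -LINK2 * s12],
        [LINK1 * c1 + LINK2 * c12, LINK2 * c12],
    ])


def joint_torque(q, hand_force):
    """Map a hand force F to joint torques Q via the Jacobian transpose."""
    return jacobian(q).T @ np.asarray(hand_force, dtype=float)


q = np.array([0.5, 1.0])        # example posture in radians
F = np.array([0.0, 2.0])        # example hand force in newtons
print(hand_position(q), joint_torque(q, F))
```

This two-joint planar arm is not itself redundant. With more joints than endpoint coordinates the Jacobian becomes non-square and the inverse maps (hand position to joint angles, joint torque to hand force) are no longer unique, which is why the sketch implements only the forward map and the force-to-torque map.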
abstract What are the functions of the basal ganglia and cerebellum? It is now clear that output of the basal ganglia and cerebellum targets motor, premotor, prefrontal, posterior parietal, and inferotemporal areas of cortex. These connections provide the basal ganglia and cerebellum with the anatomical substrate to influence not only the control of movement, but also many aspects of cognitive behavior like planning, working memory, sequential behavior, visuospatial perception, and attention. Similarly, abnormal activity in specific basal ganglia and cerebellar loops with the cerebral cortex may contribute to a variety of neuropsychiatric disorders, such as schizophrenia, autism, attention-deficit/hyperactivity disorder, and obsessive-compulsive disorder. Thus, defining the cortical targets of the basal ganglia and cerebellum provides important insights into their diverse motor and nonmotor function.
richard p. dum Center for the Neural Basis of Cognition, Systems Neuroscience Institute and the Department of Neurobiology, University of Pittsburgh, Pittsburgh, Pennsylvania
peter l. strick Veterans Affairs Medical Center; Center for the Neural Basis of Cognition, Systems Neuroscience Institute and the Department of Neurobiology, University of Pittsburgh, Pittsburgh, Pennsylvania
What are the functions of the basal ganglia and the cerebellum? Numerous reports describe the motor deficits associated with damage to these subcortical structures. As a consequence, concepts about basal ganglia and cerebellar function have focused primarily on their contributions to the generation and control of movement. We have used an anatomical approach to examine the macro-organization of basal ganglia and cerebellar connections with the cerebral cortex. In this chapter, we focus on one critical question: Which cortical areas are the target of the outputs from the basal ganglia and the cerebellum? The answers to this question lead to some novel and important insights about basal ganglia and cerebellar function.
Classically, the macro-organization of basal ganglia and cerebellar circuitry is described using a relatively simple hierarchical model. The “input layer” of basal ganglia processing is represented by the striatum (caudate, putamen, and ventral striatum). The functionally analogous level in cerebellar circuits is represented by specific pontine nuclei that send “mossy fiber” inputs to cerebellar cortex. A major source of afferents to the input layers of both circuits originates from widespread regions of the cerebral cortex, including motor, sensory, posterior parietal, prefrontal, cingulate, orbitofrontal, and temporal cortical areas. The “output layer” of basal ganglia processing is represented by the internal segment of the globus pallidus (GPi), the pars reticulata of the substantia nigra (SNpr), and the ventral pallidum. The comparable structures for cerebellar processing are the three deep cerebellar nuclei: dentate, interpositus, and fastigial. Neurons in the output layers of both circuits send their axons to the thalamus and, by this route, project back upon the cortex. Thus, a major structural feature of basal ganglia and cerebellar circuits is that they form loops with the cerebral cortex (e.g., Kemp & Powell, 1971; Allen & Tsukahara, 1974; Brooks & Thach, 1981). These loops were believed to function largely in the domain of motor control. Indeed, basal ganglia and cerebellar efferents were thought to terminate in a common region of the ventrolateral thalamus that projected largely to the primary motor cortex (M1). Thus these circuits were viewed as a neural substrate for enabling information from a diverse set of cortical areas to influence motor output at the level of M1. This view has been supported by the obvious motor symptoms that can result from basal ganglia and cerebellar dysfunction (for references and reviews, see Brooks & Thach, 1981; DeLong & Georgopoulos, 1981; Bhatia & Marsden, 1994).
Over the past 20 years, an accumulation of information about basal ganglia and cerebellar anatomy has led a number of investigators to challenge this view (e.g., Schell & Strick, 1984; Alexander, DeLong, & Strick, 1986; Goldman-Rakic & Selemon, 1990). It is now clear that basal ganglia and cerebellar efferents terminate in different subdivisions of the ventrolateral thalamus (for a review, see Percheron, François, Talbi, Yelnik, & Fénelon, 1996), which, in turn, project to a myriad of cortical areas. Thus the outputs from the basal ganglia and cerebellum influence more widespread regions of the cerebral cortex than was previously recognized.
On the basis of these and other anatomical results, Alexander and colleagues (1986) proposed that the basal ganglia participate in at least five separate loops with the cerebral cortex. These loops were based in part on their cortical target from the output layer of processing and were designated the skeletomotor, oculomotor, dorsolateral
dum and strick: basal ganglia and cerebellar circuits with the cerebral cortex 553
prefrontal, lateral orbitofrontal, and anterior cingulate circuits. According to this scheme, the output of the basal ganglia has the potential to influence not only the control of movement, but also higher-order cognitive and limbic functions that are subserved by prefrontal, orbitofrontal, and anterior cingulate cortex.
Similarly, Leiner, Leiner, and Dow (1986, 1991, 1993) suggested that cerebellar output is directed to prefrontal as well as motor areas of the cerebral cortex. They noted that in the course of hominid evolution, the lateral output nucleus of the cerebellum (the dentate) undergoes a marked expansion that parallels the expansion of cerebral cortex in the frontal lobe. They argued that the increase in the size of the dentate is accompanied by an increase in the extent of the cortical areas in the frontal lobe that are influenced by dentate output. As a consequence, Leiner and colleagues proposed that cerebellar function in humans has expanded to include involvement in certain language and cognitive tasks.
Attempts to test these proposals and map cerebellar and basal ganglia projections to the cerebral cortex have been
Primary motor cortex
Our first experiments used retrograde transneuronal transport of HSV1 to examine the organization of basal ganglia and cerebellar outputs to M1 (figure 38.1) (Hoover & Strick, 1993, 1999). We injected virus into physiologically identified portions of M1 (i.e., regions where face, arm, or leg movements were evoked by intracortical stimulation with currents < 25 μA). Then we set the survival time to allow transneu-
[Figure 38.1 (image not reproduced). Surviving labels: Pre-SMA, SMA, M1, PMv, FEF; body-part tags arm, leg, face; areas 9m, 9l, 46, 7b, 8, 12; sulci CgS, CS, IPS, PS, LuS, AS.]
Figure 38.2 Origin of pallidal projections to M1, PMv, SMA, area 46, and area 9. Representative coronal sections through the GPi of animals that received virus injections into different cortical areas (see figure 38.1). The dots indicate the positions of neurons labeled by retrograde transneuronal transport of virus. The maps display labeled neurons found on two or three adjacent sections whose approximate anterior-posterior location is indicated at the bottom of each section outline. Abbreviations: GPe, external segment of globus pallidus; GPi, internal segment of the globus pallidus; o, outer portion of the internal segment of globus pallidus; i, inner portion of the internal segment of globus pallidus. (Adapted from Middleton & Strick, 1996b.)
neurons in the middle of the GPi rostrocaudally. Within this region, neurons labeled after injections into the SMA, M1, or PMv formed separate clusters in a dorsal-to-ventral arrangement (figures 38.2 and 38.3). These observations indicate that pallidal output is not confined to M1 but projects via the thalamus to multiple premotor areas in the frontal lobe (see also Jinnai, Nambu, Tanibuch, & Yoshida, 1993; Inase & Tanji, 1995; Sakai, Inase, & Tanji, 1999). Furthermore, the arm representation of each motor area receives input from a topographically distinct set of GPi neurons. We have proposed that this arrangement creates distinct “output channels” in the sensorimotor portion of GPi (Hoover & Strick, 1993; Akkal et al., 2007).
We found a similar topographic organization of output neurons in the dentate. Injections of virus into the arm representations of M1, PMv, and SMA labeled clusters of neurons in the middle of the dentate rostrocaudally (figures 38.4 and 38.5) (Middleton & Strick, 1997; Akkal et al., 2007). However, the “hotspot” of each cluster appeared to be centered in a slightly different region of the dentate. The hotspots for the different motor areas are shown on a single unfolded map of the dentate (figure 38.5). This diagram emphasizes two important observations. First, the arm representations of the PMv and SMA are the target of output from the dentate. Second, the output channels to the different arm representations are clustered together in a common region of the dorsal dentate. This observation raises the possibility that the dorsal dentate contains a single integrated map of the body in which the maps for output channels to different cortical areas are in register within the nucleus. In any event, the dentate, like the GPi, contains distinct output channels that innervate different cortical motor areas.
Figure 38.3 Summary map of the basal ganglia output channels. The outer and inner segments of the GPi are shown as separate unfolded maps (for details of unfolding, see Akkal et al., 2007). This map provides a planar view of the rostrocaudal and dorsoventral location of output channels in each segment of the GPi. The cortical target of each output channel is placed at the site of the peak labeling following retrograde transneuronal transport of virus from that cortical area. Note that the GPi can be divided into “motor” and “nonmotor” domains based on the grouping of the cortical targets of its output channels. Abbreviations: D, dorsal; C, caudal; see also figures 38.1 and 38.2. (Adapted from Akkal et al., 2007.)
Figure 38.4 Origin of cerebellar projections to M1, PMv, area 46, and area 9. Representative coronal sections through the dentate and interpositus nuclei of animals that received virus injections into different cortical areas (see figure 38.1). Conventions are according to figure 38.2. Abbreviations: D, dorsal; DN, dentate nucleus; IP, interpositus nucleus; M, medial. (Adapted from Middleton & Strick, 1996b.)
Figure 38.5 Summary map of dentate output channels. The dentate is displayed as an unfolded map (for details of unfolding, see Dum & Strick, 2003). The cortical target of each output channel is placed at the site of the peak labeling following retrograde transneuronal transport of virus from that cortical area. Note that the dentate can be divided into “motor” and “nonmotor” domains based on the grouping of the cortical targets of its output channels. Abbreviations as in figure 38.1. (Adapted from Dum & Strick, 2003; Akkal et al., 2007.)
Figure 38.6 The original skeletomotor circuit proposed by Alexander, DeLong, and Strick (1986) and our revised scheme. Asterisks indicate loops whose existence is suspected but not specifically tested using virus transport. Cortical abbreviations: CMAd, dorsal cingulate motor area; CMAr, rostral cingulate motor area; CMAv, ventral cingulate motor area; M1, primary motor cortex; PMd, dorsal premotor area; PMv, ventral premotor area; SMA, supplementary motor area. Basal ganglia abbreviations: GPi, internal segment of globus pallidus; PUT, putamen; SNr, substantia nigra pars reticulata; cl, caudolateral; mid, middle; vl, ventrolateral. Thalamic abbreviations: VApc, nucleus ventralis anterior, parvocellular portion; VLcc, nucleus ventralis lateralis pars caudalis, caudal division; VLcr, nucleus ventralis lateralis pars caudalis, rostral division; VLm, nucleus ventralis lateralis pars medialis; VLo, nucleus ventralis lateralis pars oralis. (Adapted from Middleton & Strick, 2000.)
Figure 38.7 Origin of nigral projections to the FEF, area TE, area 12, area 9m, and area 9l. Coronal sections indicating the location of labeled neurons in the caudal and rostral regions of the SNpr following virus injections into the different cortical areas (see figure 38.1). Abbreviations: CC, crus cerebri; pc, pars compacta; pr, pars reticulata. (Adapted from Middleton & Strick, 1996b.)
major subcortical nuclei: SNpr, the superior colliculus (SC), and the deep cerebellar nuclei. To test this proposal, we injected virus into physiologically identified portions of the FEF (i.e., regions where eye movements were evoked by intracortical stimulation with currents < 50 μA). Neurons labeled by retrograde transneuronal transport were found in lateral portions of SNpr, the optic and intermediate gray layers of the SC, and ventrally in the caudal third of the dentate nucleus. Within the dentate, labeled neurons were confined to its posterior pole, where some neurons exhibit activity correlated with saccadic eye movements (van Kan, Houk, & Gibson, 1993). Within the basal ganglia, FEF injections labeled neurons in a posterior and lateral region of SNpr (figure 38.7, FEF) where neurons also display changes in activity related to saccadic eye movements (Hikosaka & Wurtz, 1983a, 1983b). Overall, the regions of the basal ganglia and cerebellum that were labeled after FEF injections of virus were strikingly different from those labeled after injections into any of the skeletomotor areas of the frontal lobe. Thus the output channels in the basal ganglia and cerebellum that are concerned with oculomotor function are distinct from those concerned with skeletomotor function.

Prefrontal cortex

It is clear from the studies reviewed above that the output nuclei of the basal ganglia and cerebellum have well-organized projections to skeletomotor and oculomotor areas
of cortex. Nevertheless, substantial portions of these output nuclei do not project to cortical motor areas. This observation raises the possibility that the remaining portions of these output nuclei target nonmotor areas of cortex. Because of prior suggestions that the basal ganglia and cerebellum influence some of the cognitive operations that are normally thought to be subserved by the frontal lobe (Alexander et al., 1986; Leiner et al., 1986, 1991, 1993), we used virus tracing to test whether basal ganglia and cerebellar projections to prefrontal cortex provide an anatomical substrate for this influence.
Our experiments focused on subfields within areas 9, 12, and 46 of the prefrontal cortex (Middleton & Strick, 1994, 2001, 2002). Each of these areas appears to be involved in aspects of “working memory” and is thought to guide behavior based on transiently stored information rather than immediate external cues (for reviews, see Passingham, 1993; Goldman-Rakic, 1996; Fuster, 1997). Virus injections into area 9, 12, or 46 labeled many neurons in the output nuclei of the basal ganglia (figures 38.2, 38.3, and 38.7). Injections into area 12 labeled neurons in a localized portion of SNpr. In contrast, injections into area 46 labeled neurons largely in GPi. Area 9 injections labeled neurons in both the SNpr and GPi. The topographic nature of basal ganglia projections to prefrontal cortex is further emphasized by the finding that different regions within the rostral SNpr project to medial and lateral portions of area 9 (figure 38.7, areas 9m and 9l). In all cases, the locations of the neurons labeled in the GPi and SNpr after injections into prefrontal areas of cortex are different from the locations of neurons labeled after injections into motor areas of cortex.
Virus injections into areas 9 and 46 (but not area 12) labeled neurons in ventral regions of the dentate nucleus (figures 38.4 and 38.5). The neurons that were labeled after injections into area 9 were found largely medial and caudal to those labeled by injections into area 46. The ventral regions of the dentate that project to these nonmotor areas in the frontal lobe clearly differ from the more dorsal regions of this nucleus that innervate motor areas of the cortex (figures 38.4 and 38.5). Thus, both the basal ganglia and the cerebellum project via the thalamus to multiple areas of prefrontal cortex. Moreover, the output channels in the basal ganglia and cerebellum that influence prefrontal areas of cortex are separate from those that influence motor areas of cortex. This observation suggests that GPi and the dentate can be divided into motor and nonmotor domains (figures 38.3 and 38.5) (Dum & Strick, 2003; Akkal et al., 2007).
Although the presupplementary motor area (PreSMA) has traditionally been included with the motor areas of the frontal lobe, a number of recent observations emphasize the nonmotor nature of this cortical area (for a review, see Picard & Strick, 2001). For example, unlike the cortical motor areas, the PreSMA does not project directly to M1 or to the spinal cord. Instead, the PreSMA is densely interconnected with regions of prefrontal cortex. We used virus tracing to test whether basal ganglia and cerebellar projections to the PreSMA originate from the motor or nonmotor domains of the GPi and the dentate (Akkal et al., 2007). We found that the output channel in the GPi that projects to the PreSMA is located dorsally in a rostral portion of the nucleus (figure 38.3). The output channel in the dentate that projects to the PreSMA is located in a ventral part of the nucleus (figure 38.5). Thus the output channels to the PreSMA in both the GPi and the dentate are adjacent to output channels that project to regions of prefrontal cortex rather than near output channels to the cortical motor areas (figures 38.3 and 38.5). These observations provide further support for the proposal that the PreSMA is more similar to regions of prefrontal cortex than it is to a cortical motor area (Picard & Strick, 2001; Akkal et al., 2007).

Posterior parietal cortex

Areas 5 and 7 in the posterior parietal cortex are known to project to the input stage of basal ganglia and cerebellar processing (e.g., Kemp & Powell, 1971; Glickstein, May, & Mercier, 1985; Cavada & Goldman-Rakic, 1991; Yeterian & Pandya, 1993; Schmahmann & Pandya, 1997). These connections led us to ask whether the posterior parietal cortex is a target of basal ganglia and cerebellar output (Clower, West, Lynch, & Strick, 2001; Clower, Dum, & Strick, 2005). Our results demonstrate that a portion of area 7b in the intraparietal sulcus is the target of output from the dentate nucleus, whereas a portion of area 7b on the cortical surface is the target of output from the SNpr as well as from the dentate nucleus. These results clearly indicate that the sphere of influence of basal ganglia and cerebellar output extends to include portions of the posterior parietal cortex. Space limitations do not allow us to describe the full implications of basal ganglia and cerebellar projections to posterior parietal cortex. Instead, we will highlight two specific proposals about these circuits. We have suggested that the cerebellar projection to the posterior parietal cortex may provide signals that contribute to the sensory recalibration that occurs during some adaptation paradigms (Clower et al., 2001). On the other hand, we have suggested (Clower et al., 2005) that abnormal signals in the basal ganglia projection to the posterior parietal cortex may contribute to the visuospatial deficits that are observed in some patients with basal ganglia lesions (Karnath, Himmelbach, & Rorden, 2002).

Inferotemporal cortex

In general, each of the cortical areas found to receive input from the basal ganglia or cerebellum is known to send pro-
Functional implications

Clearly, the outputs from the basal ganglia and cerebellum gain access to more widespread and diverse areas of cortex than was previously imagined. To date, our studies have shown that the output nuclei of the basal ganglia and cerebellum project (via the thalamus) to skeletomotor, oculomotor, prefrontal, and posterior parietal areas of cortex. In addition, a portion of SNpr projects to inferotemporal cortex. Thus, the anatomical substrate exists for the basal ganglia and cerebellum to influence higher-order aspects of cognition such as planning, working memory, sequential behavior, visuospatial perception, and attention as well as skeletomotor and oculomotor function. As a consequence, a sizable component of basal ganglia and cerebellar output operates outside of the domain of motor control.
Some support for this conclusion comes from recent analyses of the consequences of cerebellar pathology in human subjects. In addition to the classical motor deficits, there is considerable evidence that cerebellar damage can lead to deficits in the performance of cognitive tasks that require rule-based learning, judgment of temporal intervals, visuospatial analysis, shifting attention between sensory modalities, and working memory and planning (see reviews by Leiner et al., 1986, 1991, 1993; Botez, Botez, Elie, & Attig, 1989; Ivry & Keele, 1989; Fiez, Petersen, Cheney, & Raichle, 1992; Akshoomoff & Courchesne, 1992; Grafman et al., 1992; Schmahmann, 1991, 1997; Schmahmann & Sherman, 1998). Many of these deficits reflect functions that are normally thought to be subserved by areas of prefrontal cortex. On the basis of our results, one interpretation of the origin of these deficits is that they result from an interruption of input to prefrontal cortex from the cerebellum. A study by Fiez and colleagues (1992) provides some support for this interpretation. They described a patient, designated RC1, who had circumscribed damage to the lateral portion of his right cerebellar cortex. This patient exhibited few classical signs of cerebellar damage but was impaired on the performance of specific types of rule-based language and memory tasks. The deficits appeared on tasks that in normal subjects activate lateral portions of the cerebellar hemispheres and areas 9 and 46 (Petersen, Fox, Posner, Mintun, & Raichle, 1988; Raichle et al., 1994; Fiez et al., 1996). Our anatomical studies suggest that the portions of the cerebellum that are damaged in RC1 are part of the cerebellar loop with the prefrontal cortex (Kelly & Strick, 2003). Thus the cognitive deficits in RC1 may have been a consequence of interrupting this circuit.
In general, we found that basal ganglia and cerebellar projections to a cortical area originate from a localized cluster of neurons that we have termed an output channel. The output channels to different cortical areas display a surprising degree of topographic organization. For example, the output channels that influence dorsomedial regions of prefrontal cortex are located largely in the GPi, whereas the output channels that influence ventrolateral regions of prefrontal cortex are located largely in the SNpr (Middleton & Strick, 2002; Akkal et al., 2007). Both sets of output channels are separate from those that influence skeletomotor and oculomotor areas of cortex. Output channels within the dentate are as topographically organized as those in the basal ganglia, if not more so (Dum & Strick, 2003; Akkal et al., 2007).
Evidence for a segregation of function in the human GPi comes from the observation that the cognitive and motor effects of pallidotomies, performed to ameliorate the symptoms of Parkinson’s disease, depend significantly on the location of the lesion (Lombardi et al., 2000). Lesions located in the most anteromedial region of the GPi, the likely origin of output channels to prefrontal cortex, produced the greatest degree of cognitive impairment. In contrast, lesions in the intermediate region of the GPi, the likely origin of output channels to motor areas of cortex, led to maximal effects on motor performance but produced little effect on cognition. Thus the human GPi appears to have spatially separate motor and cognitive output channels.
To date, we have identified the output channels in the basal ganglia and cerebellum to skeletomotor, oculomotor, prefrontal, and some posterior parietal areas of cortex. All together, these output channels occupy approximately 70% of the volume of these subcortical nuclei. This means that the cortical targets for approximately 30% of the output from the basal ganglia and cerebellum remain to be identified. The architecture of basal ganglia and cerebellar loops with the cerebral cortex allows us to make some predictions about the identity of these targets. Cingulate, orbital frontal, and medial posterior parietal cortex are known to be major sources of input to the basal ganglia and cerebellum. Our results suggest that cortical areas that project to the input stage of basal ganglia and cerebellar processing are the targets of the output stage of processing in these circuits. If this proposal is correct, then the remaining 30% of the basal ganglia and cerebellar output is directed at cingulate, orbital frontal, and medial posterior parietal areas of cortex. This prediction will be tested in future experiments.
The new insights gained from virus tracing have important implications for hypotheses about basal ganglia and cerebellar contributions to normal and abnormal behavior. Detailed discussions of this issue have been presented in our recent papers (Middleton & Strick, 1996a, 2001, 2002; Clower et al., 2001, 2005; Akkal et al., 2007); therefore only some examples will be presented here. It is known that abnormal activity in basal ganglia and cerebellar loops with motor areas of cortex results in striking disorders of movement. Likewise, abnormal activity in basal ganglia and cerebellar loops with nonmotor areas of the cerebral
Gross, C. G. (1972). Visual functions of inferotemporal cortex. In R. Jung (Ed.), Handbook of sensory physiology (pp. 451–482). Berlin: Springer-Verlag.
Hikosaka, O., & Wurtz, R. H. (1983a). Visual and oculomotor functions of monkey substantia nigra pars reticulata: I. Relation of visual and auditory responses to saccades. J. Neurophysiol., 49, 1230–1253.
Hikosaka, O., & Wurtz, R. H. (1983b). Visual and oculomotor functions of monkey substantia nigra pars reticulata: III. Memory-contingent visual and saccade responses. J. Neurophysiol., 49, 1268–1284.
Hoover, J. E., & Strick, P. L. (1993). Multiple output channels in the basal ganglia. Science, 259, 819–821.
Hoover, J. E., & Strick, P. L. (1999). The organization of cerebello- and pallido-thalamic projections to primary motor cortex: An investigation employing retrograde transneuronal transport of herpes simplex virus type 1. J. Neurosci., 19, 1446–1463.
Inase, M., & Tanji, J. (1995). Thalamic distribution of projection neurons to the primary motor cortex relative to afferent terminal fields from the globus pallidus in the macaque monkey. J. Comp. Neurol., 353, 415–426.
Ivry, R. B., & Keele, S. W. (1989). Timing functions of the cerebellum. J. Cogn. Neurosci., 1, 136–152.
Jinnai, K., Nambu, A., Tanibuch, I., & Yoshida, S. (1993). Cerebello- and pallido-thalamic pathways to areas 6 and 4 in the monkey. Stereotactic Funct. Neurosurg., 60, 70–79.
Karnath, H. O., Himmelbach, M., & Rorden, C. (2002). The subcortical anatomy of human spatial neglect: Putamen, caudate nucleus and pulvinar. Brain, 125, 350–360.
Kelly, R. M., & Strick, P. L. (2000). Rabies as a transneuronal tracer of circuits in the central nervous system. J. Neurosci. Methods, 103, 63–71.
Kelly, R. M., & Strick, P. L. (2003). Cerebellar loops with motor cortex and prefrontal cortex of a nonhuman primate. J. Neurosci., 23, 8432–8444.
Kelly, R. M., & Strick, P. L. (2004). Macro-architecture of basal ganglia loops with the cerebral cortex: Use of rabies virus to reveal multisynaptic circuits. Progr. Brain Res., 143, 449–459.
Kemp, J. M., & Powell, T. P. S. (1971). The connexions of the striatum and globus pallidus: Synthesis and speculation. Phil. Trans. R. Soc. Lond. B Biol. Sci., 262, 441–457.
Larsell, O. (1970). The comparative anatomy and histology of the cerebellum from monotremes through apes. Minneapolis: University of Minnesota Press.
Leiner, H. C., Leiner, A. L., & Dow, R. S. (1986). Does the cerebellum contribute to mental skills? Behav. Neurosci., 100, 443–454.
Leiner, H. C., Leiner, A. L., & Dow, R. S. (1991). The human cerebro-cerebellar system: Its computing, cognitive, and language skills. Behav. Brain Res., 44, 113–128.
Leiner, H. C., Leiner, A. L., & Dow, R. S. (1993). Cognitive and language functions of the human cerebellum. Trends Neurosci., 16, 444–447.
Lichter, D. G., & Cummings, J. L. (2000). Frontal-subcortical circuits in psychiatry and neurology. New York: Guilford.
Lombardi, W. J., Gross, R. E., Trepanier, L. L., Lang, A. E., Lozano, A. M., & Saint-Cyr, J. A. (2000). Relationship of lesion location to cognitive outcome following microelectrode-guided pallidotomy for Parkinson’s disease: Support for the existence of cognitive circuits in the human pallidum. Brain, 123, 746–758.
Lynch, J. C., Hoover, J. E., & Strick, P. L. (1994). Input to the primate frontal eye field from the substantia nigra, superior colliculus, and dentate nucleus demonstrated by transneuronal transport. Exp. Brain Res., 100, 181–186.
Middleton, F. A., & Strick, P. L. (1994). Anatomical evidence for cerebellar and basal ganglia involvement in higher cognitive function. Science, 266, 458–461.
Middleton, F. A., & Strick, P. L. (1996a). The temporal lobe is a target of output from the basal ganglia. Proc. Natl. Acad. Sci. USA, 93, 8683–8687.
Middleton, F. A., & Strick, P. L. (1996b). New concepts regarding the organization of basal ganglia and cerebellar output. In M. Ito & Y. Miyashita (Eds.), Integrative and molecular approach to brain function (pp. 253–271). New York: Elsevier Science.
Middleton, F. A., & Strick, P. L. (1997). Cerebellar output channels: Substrates for the control of motor and cognitive function. In J. Schmahmann (Ed.), The cerebellum and cognition (vol. 41, pp. 61–82). San Diego: Academic.
Middleton, F. A., & Strick, P. L. (2000). Basal ganglia and cerebellar loops: Motor and cognitive circuits. Brain Res. Brain Res. Rev., 31, 236–250.
Middleton, F. A., & Strick, P. L. (2001). Cerebellar projections to the prefrontal cortex of the primate. J. Neurosci., 21, 700–712.
Middleton, F. A., & Strick, P. L. (2002). Basal ganglia “projections” to the prefrontal cortex. Cereb. Cortex, 12, 926–935.
Miyashita, Y. (1993). Inferior temporal cortex: Where visual perception meets memory. Annu. Rev. Neurosci., 16, 245–263.
Mushiake, H., & Strick, P. L. (1993). Preferential activity of dentate neurons during limb movements. J. Neurophysiol., 70, 2660–2664.
Mushiake, H., & Strick, P. L. (1995). Pallidal neuron activity during sequential arm movements. J. Neurophysiol., 74, 2754–2758.
Passingham, R. (1993). The frontal lobes and voluntary action. Oxford, UK: Oxford University Press.
Percheron, G., François, C., Talbi, B., Yelnik, J., & Fénelon, G. (1996). The primate motor thalamus. Brain Res. Rev., 22, 93–181.
Petersen, S. E., Fox, P. T., Posner, M. I., Mintun, M., & Raichle, M. E. (1988). Positron emission tomographic studies of the cortical anatomy of single-word processing. Nature, 331, 585–589.
Picard, N., & Strick, P. L. (2001). Imaging the premotor areas. Curr. Opin. Neurobiol., 11, 663–672.
Raichle, M. E., Fiez, J. A., Videen, T. O., MacLeod, A. M., Pardo, J. V., Fox, P. T., & Petersen, S. E. (1994). Practice-related changes in human brain functional anatomy during nonmotor learning. Cereb. Cortex, 4, 8–26.
Rapoport, J. L., & Wise, S. P. (1988). Obsessive-compulsive disorder: Evidence for basal ganglia dysfunction. Psychopharm. Bull., 24, 380–384.
Saint-Cyr, J. A., Ungerleider, L. G., & Desimone, R. (1990). Organization of visual cortical inputs to the striatum and subsequent outputs to the pallido-nigral complex in the monkey. J. Comp. Neurol., 298, 129–156.
Sakai, S. T., Inase, M., & Tanji, J. (1999). Pallidal and cerebellar inputs to thalamocortical neurons projecting to the supplementary motor area in Macaca fuscata: A triple-labeling light microscopic study. Anat. Embryol. (Berl.), 199, 9–19.
Schell, G. R., & Strick, P. L. (1984). The origin of thalamic inputs to the arcuate premotor and supplementary motor areas. J. Neurosci., 4, 539–560.
Schmahmann, J. D. (1991). An emerging concept: The cerebellar contribution to higher function. Arch. Neurol., 48, 1178–1187.
39 The Basal Ganglia and Cognition
ann m. graybiel and jonathan w. mink
ann m. graybiel Department of Brain and Cognitive Sciences and the McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, Massachusetts
jonathan w. mink Departments of Neurology, Neurobiology and Anatomy, Brain and Cognitive Sciences, and Pediatrics, University of Rochester School of Medicine and Dentistry, Rochester, New York

abstract Clinical evidence, experimental studies in animals, and anatomical findings suggest that the basal ganglia act to influence not only motor behavior but also cognitive functions. We discuss the functions of the basal ganglia in relation to four categories: (1) movement release and inhibition, (2) response selection, (3) attention and assignment of salience, and (4) learning and adaptive control of behavior. In establishing these functions, striatal output neurons lead into different output pathways: the direct, indirect, hyperdirect, and striosomal pathways. Divergence of cortical inputs to the striatum and reconvergence of these motor and cognitive signals in cortico-basal ganglia pathways is seen as essential in remapping forebrain representations of action and intrastriatal networks in the binding process. We propose that a crucial feature of this remapping is a learning-related recoding of sequential motor and cognitive action representations so that they can be expressed as units. This chunking function of the striatum and associated cortico-basal ganglia loops may be a key mechanism operative across each of the functional categories of behavioral control attributed to the basal ganglia.

The basal ganglia make up a group of interconnected subcortical nuclei that are organized into circuits involved in the control of behavior. The basal ganglia have long been recognized as important for motor control, because prominent movement disorders such as Parkinson’s disease result from basal ganglia dysfunction. However, it is now widely recognized that the basal ganglia are parts of cortico-thalamo-basal ganglia loops, and that they function not only in sensorimotor control, but also in a wide range of cognitive processes ranging from attention to emotion, from response release and inhibition to response selection, and from on-line control to a primary function in learning and memory. Accordingly, the basal ganglia have now been implicated in an equally broad range of clinical disorders, ranging from the classical extrapyramidal disorders (Parkinson’s disease, Huntington’s disease, and dystonia) to neuropsychiatric disorders including obsessive-compulsive disorder, Tourette syndrome, attention-deficit disorder, and even schizophrenia. The basal ganglia themselves contain highly organized neuronal circuitry, but they do not function in isolation. The basal ganglia are intimately connected with the cerebral cortex and are perhaps best viewed as parts of cortico-basal ganglia circuits.
How could this system have such broad functions? A clue that the basal ganglia might contribute to cognitive processing is that the basal ganglia attain a very large size in the human brain. But an even more telling clue is that a large part of the outflow of the basal ganglia in primates is directed via the thalamus toward executive areas of the frontal cortex—areas that are themselves associated with attention, planning, volitional decision, and selection among potential responses to external or internal cues (Fuster, 1997; Paus, 2001). Yet more evidence comes from brain imaging studies of subjects engaged in cognitive tasks (Klein, Zatorre, Milner, Meyer, & Evans, 1994; Grafton, Hazeltine, & Ivry, 1995; Braver et al., 1997; Rao et al., 1997; Desmond, Gabrieli, & Glover, 1998; Poldrack, Prabhakaran, Seger, & Gabrieli, 1999; Peigneux et al., 2000; Poldrack & Gabrieli, 2001; Small, Zatorre, Daghler, Evans, & Jones-Gotman, 2001; van den Heuvel et al., 2003; Cools, Ivry, & Esposito, 2006; Chang, Crottaz-Herbette, & Menon, 2007; Cools, Gibbs, Miyakawa, Jagust, & D’Esposito, 2008; Dahlin, Neely, Larson, Backman, & Nyberg, 2008; McNab & Klingberg, 2008) and from findings in patients with brain dysfunction due to disease or injury (Mendez, Adams, & Lewandowski, 1989; Bhatia & Marsden, 1994; Sawamoto et al., 2007). Experiments on animals have also generated working hypotheses about the neurobiology underlying basal ganglia function (Oberg & Divac, 1979; Graybiel, 1995, 1998, 2005, 2008; Miyashita, Hikosa, & Kato, 1995; Bergman et al., 1998; Hikosaka et al., 1999; Jog, Kubota, Connolly, Hillegart, & Graybiel, 1999; Brainard & Doupe, 2000; Mink, 2001; Packard & Knowlton, 2002; Barnes, Kubota, Hu, Jin, & Graybiel, 2005; Apicella, 2007). Together, these findings have brought the basal ganglia to the forefront of work on how the brain engages in interactions with the sensory and internal environment to form structured predictions about the world and, on this basis, to make and execute action plans (figures 39.1 and 39.2).

Perspectives from anatomy

The basal ganglia receive a massive input from the neocortex. Most of these cortical inputs are directed toward
the striatum (caudate nucleus and putamen). The range of these corticostriatal inputs is impressive. They come not only from primary and higher-order sensory areas and from motor and premotor areas, but also from the large areas of association cortex in the parietal, temporal, medial, and frontal regions (Webster, Bachevalier, & Ungerleider, 1993; Eblen & Graybiel, 1995; Yeterian & Pandya, 1998; Ferry, Ongur, An, & Price, 2000; Leichnetz, 2001; Haber, Kim, Mailly, & Calzavara, 2006; Calzavara, Mailly, & Haber, 2007). There are other very large inputs to the striatum from thalamic nuclei, especially the intralaminar nuclei (Parent, Mackey, & De Bellefeuille, 1983; Ragsdale & Graybiel, 1991; Haber & McFarland, 2001). Further inputs come to other nuclei in basal ganglia circuits, as we will see below. If one includes, as should be done, the ventral striatum/ventral pallidum in the basal ganglia, cortical inputs to the system also come from the hippocampal formation and amygdala (Groenewegen, Wright, & Uylings, 1997; Fudge, Kunishio, Walsh, Richard, & Haber, 2002). When we add inputs from neuromodulatory systems, including the dopamine-containing nigrostriatal tract and serotonergic inputs, and inputs from the neocortex and elsewhere to other nuclei of the basal ganglia, the inputs to the basal ganglia system as a whole are rich and diverse and by no means restricted to one functional domain.

It is also important to keep in mind that different regions within each nucleus of the basal ganglia are probably as different from one another functionally as different parts of the neocortex are from one another. When we think of behavior-related functions of the neocortex, we naturally think of the functions of individual cortical areas, for example, the middle temporal area for visual motion, parietal areas for reach and grasp, or prefrontal areas for working memory. We do not know as much about the regionally specialized subdivisions of the basal ganglia, but it is clear that there are different “families” of cortico-basal ganglia circuits related to motor, associative, and limbic functions (Graybiel, 1984; Alexander, DeLong, & Strick, 1986). The behavioral evidence leading to this idea is important: Lesion studies have shown that localized lesions of the striatum produce symptoms similar to those induced by lesions in the cortical areas projecting strongly to the particular parts of the damaged striatum (Divac, Rosvold, & Szwarcbart, 1967; Goldman & Rosvold, 1972). Somehow, the functional domain of specific basal ganglia circuits seems to relate to the function of their cortical input sources.

Figure 39.1 Diagrams illustrating the postulated functions of the basal ganglia in relation to central pattern generators for eliciting goal-directed behavior (A) and movement (B). Planner circuits of the forebrain are influenced by motivation-related inputs (A) and sensory-motor stimuli (B). Abbreviations: 5-HT, serotonin; NE, norepinephrine. (Adapted from Graybiel, 1997.)

Figure 39.2 Schematic diagram illustrating potential influences of the basal ganglia not only on motor pattern generators but also on cognitive pattern generators. (Adapted from Graybiel, 1997.)
Figure 39.4 Highly schematic diagrams of the direct and indirect pathways identified in basal ganglia circuitry. A and B separate out these two pathways to emphasize their proposed “release” and “inhibit” functions. The diagram in C puts them together to show the balance between them that is thought to underlie normal behavioral control. The hyperdirect and striosomal pathways are not shown here (see figures 39.3, 39.5, and 39.6).

Figure 39.5 Schematic diagram of functional organization of the basal ganglia output. Excitatory projections are indicated with open arrows; inhibitory projections are indicated with filled arrows. Relative magnitude of activity is represented by line thickness. (Modified from Mink, 2001.)

Figure 39.6 (A) Schematic diagram of the hyperdirect cortico-subthalamo-pallidal, the direct cortico-striato-pallidal, and the indirect cortico-striato-GPe-subthalamo-GPi pathways. White and black arrows represent excitatory glutamatergic (glu) and inhibitory GABAergic (GABA) projections, respectively. Abbreviations: GPe, external segment of the globus pallidus; GPi, internal segment of the globus pallidus; SNr, substantia nigra, pars reticulata; STN, subthalamic nucleus; Str, striatum; Th, thalamus. (B) Schematic diagram depicting the hypothesized activity change over time (t) in the thalamocortical projection (Th/Cx) following the sequential inputs through the hyperdirect cortico-subthalamo-pallidal (middle) and direct cortico-striato-pallidal (bottom) pathways. (Modified from Nambu et al., 2002.)

important. First, the cortical input to the STN comes only from the frontal cortex, whereas the input to the striatum arises from all or nearly all areas of the cerebral cortex.
Second, the output from the STN is excitatory, whereas the output from the striatum is inhibitory. Third, the excitatory route through the STN is faster than the inhibitory route through the striatum (Nambu et al., 2000). Finally, the STN projection to the GPi is divergent, and the striatal projection is more focused (Parent & Hazrati, 1993). Thus, the two disynaptic pathways from cerebral cortex to the basal ganglia output nuclei, the GPi and SNpr, provide fast, widespread, divergent excitation through the STN, and slower, focused inhibition through the striatum. Because the outputs of the GPi and the SNpr are thought to be inhibitory (but see potential evidence for the contrary reviewed in Graybiel, 2005), this arrangement would result in focused facilitation and surround inhibition of basal ganglia thalamocortical targets.

In this scheme, the tonically active inhibitory output of the basal ganglia acts as a “brake” on motor control circuits of the cerebral cortex and brain stem. When a movement is initiated by a particular motor pattern generator, basal ganglia output neurons projecting to competing generators increase their firing rate, thereby increasing inhibition and applying a “brake” on these generators. Other basal ganglia output neurons projecting to the generators that are involved in the desired movement decrease their discharge, thereby removing tonic inhibition and releasing the “brake” from the desired motor patterns. Thus, the intended movement is enabled, and competing movements are prevented from interfering with the desired one.

The anatomical arrangement of STN and striatal inputs to the GPi and SNpr forms the basis for a functional center-surround organization as shown in figure 39.5. When a voluntary movement is initiated by cortical mechanisms, a separate signal is sent to the STN, exciting it. The STN projects in a widespread pattern and excites the GPi. The increased GPi activity produces inhibition of thalamocortical motor mechanisms. In parallel to these pathways through the STN, signals are sent from all areas of the cerebral cortex to the striatum. The cortical inputs are transformed by integrative circuitry in the striatum to a focused, context-dependent output that inhibits specific neurons in the GPi. The inhibitory striatal input to the GPi is slower than the excitatory STN input, but it is more powerful. The resulting focally decreased activity in the GPi selectively disinhibits the desired thalamocortical motor circuits. Indirect pathways from the striatum to the GPi (striatum → external pallidum (GPe) → GPi and striatum → GPe → STN → GPi) result in further focusing of the output. The net result of basal ganglia activity during a voluntary movement is the inhibition (“braking”) of competing motor patterns and focused facilitation (releasing the “brake”) from the selected voluntary movement pattern generators.

This scheme provides a framework for understanding both the pathophysiology of parkinsonism and involuntary movements (Young, Albin, & Penney, 1989; Mink, 1996, 2003; Goldberg et al., 2002). Different involuntary movement disorders such as parkinsonism, chorea, dystonia, and
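The center-surround scheme in this section (widespread excitation of the GPi through the STN, focused and stronger inhibition through the striatum, and inhibitory GPi output onto thalamocortical targets) can be caricatured in a few lines of Python. This is a toy sketch, not a model from the chapter: the function names, the number of channels, and every numerical value are arbitrary assumptions chosen only to make the sign of each effect visible.

```python
# Toy caricature of center-surround selection. All parameter values are
# arbitrary assumptions chosen for illustration, not measured quantities.

def gpi_output(n_channels, selected, tonic=1.0, stn_drive=0.5,
               striatal_inhibition=1.2):
    """Return GPi activity per channel after one motor pattern is selected."""
    activity = []
    for channel in range(n_channels):
        level = tonic + stn_drive          # widespread STN excitation hits every channel
        if channel == selected:
            level -= striatal_inhibition   # focused striatal inhibition hits one channel
        activity.append(max(level, 0.0))   # firing rates cannot go negative
    return activity


def thalamic_drive(gpi_activity, ceiling=2.0):
    """GPi output is inhibitory: more GPi activity means less thalamocortical drive."""
    return [max(ceiling - level, 0.0) for level in gpi_activity]


gpi = gpi_output(n_channels=5, selected=2)
thalamus = thalamic_drive(gpi)
resting = thalamic_drive([1.0])[0]   # drive when GPi fires at its tonic rate only

for channel, drive in enumerate(thalamus):
    label = "released" if drive > resting else "braked"
    print(f"channel {channel}: thalamic drive {drive:.2f} ({label})")
```

With these assumed numbers the selected channel ends up above its resting drive while every competing channel falls below it, which is the "release one brake, tighten the others" pattern the text describes.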
Figure 39.7 (A) Divergent-reconvergent processing of signals through cortico-basal ganglia pathways. Divergence of cortical input to modules (matrisomes A, B, C) occurs at the level of the striatum. In the globus pallidus (GP), information is reconverged, resulting in the remapping of the cortical output. The network is modulated by dopamine-containing inputs from the substantia nigra (SN). (Modified from Graybiel et al., 1994.) (B) Mixture-of-experts learning network model, with a stochastic one-out-of-n selector. (Modified from Jacobs et al., 1991.) Note the similarity of the models in A and B.
corticostriatal information flow (A. D. Smith & Bolam, 1990; Bolam, Hanley, Booth, & Bevan, 2000). In addition, dopamine is critically involved in long-term potentiation and depression in the striatum (Reynolds, Hyland, & Wickens, 2001; Wise, 2004; Calabresi, Piconi, Tozzi, & Di Filippo, 2007; Tang, Pawlak, Prokopenko, & West, 2007).

Many movement disorders resulting from diseases that affect the basal ganglia can be understood as disorders of response selection and inhibition. These include disorders characterized by paucity of movement, such as Parkinson’s disease (Mink, 1996; Goldberg et al., 2002), and disorders characterized by excessive involuntary movements, such as chorea, dystonia, or tics (Mink, 1996, 2003; Sato et al., 2008). Notably, the treatment of Parkinson’s disease by STN deep brain stimulation can improve both response selection and inhibition (Nieuwenhuis, Yeung, van den Wildenberg, & Ridderinkhof, 2003).

In parallel with these movement disorders, neuropsychiatric disorders can also result from impaired response selection or inhibition. Inappropriate facilitation or impaired inhibition may lead to the cognitive and motor intrusions, inflexibility, repetitiveness, and overt cognitive and motor stereotypic responses that occur in OC-spectrum disorders (Graybiel & Rauch, 2000; Leckman, 2002; Graybiel, 2008). Functional imaging studies of individuals with OC-spectrum disorders indicate increased activity in the caudate nucleus combined with increased activity in the cingulate and orbitofrontal cortices (see Rauch et al., 2001). Moreover, symptom provocation in OCD patients further increases the activity of these regions (Breiter et al., 1996), and the increased activation can be lessened by treatment of the symptoms (Baxter et al., 1992; Schwartz, Stoessel, Baxter, Martin, & Phelps, 1996; Lazaro et al., 2008). This dynamic modulation supports the idea that the basal ganglia and anterior cingulate/orbitofrontal cortical regions are important in response selection and attentional shifting.

Selecting which action to perform is critical for normal behavior. But when particular actions (or thoughts) are selected over and over again, the repetitiveness can signal the occurrence of syndromes such as OC-spectrum disorders or other disorders in which behavioral stereotypies occur. There is some evidence that the repetitiveness of action selection can be controlled independently of which action is selected. That is, different actions can be selected but, when selected, each is repetitively selected. In both rodents and primates, a specific modular pattern of neuronal activation in the striatum is highly predictive of the stereotypies induced by psychomotor stimulants: Activity in striosomes is greater than activity in the surrounding matrix regardless of which particular actions are being repeated, that is, regardless of which have been selected (figure 39.8) (see Canales & Graybiel, 2000; Saka, Goodrich, Harlan, Madras, & Graybiel, 2004). This is interesting, because anatomical work in the primate suggests that striosomes receive differentially strong input from parts of the anterior cingulate and orbitofrontal cortex. In the human, as was noted above, these are cortical regions that are abnormal in OCD patients and in addictive states (for a review, see Graybiel, 2008). Modular patterns of striatal activation have also been invoked to account for focal tics and repetitive actions in Tourette syndrome (Mink, 2001). In this case, overactivity in particular modules (matrisomes) is thought to be involved in the “selection” of the repeated behavior (figure 39.7a). Thus both striosomes and matrisomes could contribute to disorders of action selection and behavioral switching and could be important for the normal discharge of these complex functions.

Impulse-control disorders may relate to impaired response inhibition. This has become an area of substantial
Figure 39.9 The responses of tonically active neurons (TANs) of the macaque monkey striatum in response to conditioned stimuli (CS: clicks or light-emitting diodes) in a simple behavioral conditioning paradigm in which the monkey receives liquid rewards following delivery of the CS. The neurons acquire responses to the cues associated with the rewards (see pauses in activity). The responses of six representative TANs recorded at the illustrated sites (black dots or squares) are shown in raster plots and spike histograms (spikes/s, aligned to the CS), and the anteroposterior (AP) sites at which they were recorded are shown in diagrams. Note the widespread, coherent appearance of the response, suggesting that these interneurons might serve as a temporal binding mechanism across cortico-basal ganglia loops. (Adapted from Graybiel et al., 1994.)
Figure 39.11 Event-related ensemble activity of neurons in the dorsolateral striatum of rats recorded during the acquisition and performance of an auditory conditional turning task in a T-maze. Perievent histograms (Hz, ±1 s around each event) displayed around the T-maze show examples of the activities of single striatal neurons in relation to start, tone, turn, and goal events. Plots below the maze (percent by learning stage) illustrate the reorganization of task-related activity patterns of the striatal neurons that occurs during the acquisition of the task. The behavioral criterion for acquisition was at stage 3. Color plots at bottom illustrate schematically the gradual changes in the response profiles of the striatal neurons during the course of behavioral learning, from early to late. (Modified from Jog et al., 1999; Graybiel and Kubota, 2003.) (See color plate 52.)
abstract We review some of the impairments in motor control, motor learning, and higher-order motor control in patients with lesions of the cerebellum, parietal cortex, and basal ganglia. We attempt to explain some of these impairments in terms of computational ideas such as state estimation, optimization, prediction, cost, and reward. We suggest that a function of the cerebellum is system identification: to build internal models that predict sensory outcome of motor commands and correct motor commands through internal feedback. A function of the parietal cortex is state estimation: to integrate the predicted proprioceptive and visual outcomes with sensory feedback to form a belief about how the commands affected the states of the body and the environment. A function of basal ganglia is related to optimal control: learning costs and rewards associated with sensory states and estimating the “cost-to-go” during execution of a motor task.

Over the last 25 years, a large body of experimental and theoretical work has been directed toward understanding the computational basis of motor control, particularly visually guided reaching. Roboticists and engineers largely initiated this work, with the aim of deriving from first principles some of the strikingly stereotypical features of movements observed in people and other primates. That is, they aimed to understand why we move the way that we do. The theories began to explain why in reaching to pick up a cup or in moving the eyes to look at an object, there was such consistency in the detailed trajectory of the hand and the eyes. In many ways, the approach was reminiscent of physics and its earliest attempts to explain regularity in motion of celestial objects except that the regularity was in our movements, and the search was for theories that explained our behavior. Here, we will summarize these theories and then link them to experimental findings in healthy subjects and in patients with neurological disease.

reza shadmehr Laboratory for Computational Motor Control, Department of Biomedical Engineering, Johns Hopkins School of Medicine, Baltimore, Maryland
john w. krakauer The Motor Performance Laboratory, Department of Neurology, Columbia University College of Physicians and Surgeons, New York, New York

The computational problem of motor control

In 1954, Fitts published a short paper in which he reported that there were regularities in people’s movements (Fitts, 1954). He asked volunteers to move a pen from one “goal region” to another as fast and accurately as they could. He found that the movement durations grew logarithmically as a function of the distance between the goals (figure 40.1). This relationship was modulated by two factors. One factor was the size of the goal region. As the goal region became smaller, movements slowed down. A second factor was the mass of the pen. People slowed their movements when they moved a heavier pen. To explain these results, consider that the target box was surrounded by two penalty regions, so it seems rational to aim for the center of the target box. What if the penalty region was only on one side? Now one should aim for a point farther away from the penalty region and not at the center of the target box (Trommershauser, Gepshtein, Maloney, Landy, & Banks, 2005). This is because movements have variability, and one will maximize reward (in terms of sum of hits and misses) if one takes into account this variability. This variability explains the speed of movements in Fitts’s experiment and the sensitivity to pen weight: Rapid movements are more variable than slow movements, so one should slow down if there is a need to be accurate. Moving heavier objects tends to increase movement variability, again requiring a reduced speed to maintain accuracy. Therefore in planning our movements, our brain takes into account movement variability because variability affects accuracy, which in turn affects our ability to acquire reward.

Harris and Wolpert (1998) began formalizing these ideas by linking variability and movement planning. They noted that larger motor commands required larger neural activity, which in turn produced larger variability owing to a noise process that grew with the mean of the signal. Therefore, motor commands carried an accuracy cost because the larger the command, the larger the standard deviation of the noise that rides on top of the force produced by the muscles (Jones, Hamilton, & Wolpert, 2002). Noise makes movements inaccurate.

In a sense, the theory restated the purpose of movements using language of mathematics: Be as fast as possible,
The computational problem in reaching

Let us use the well-studied reach adaptation paradigm to formulate the problem in the framework outlined in figure 40.2. What are the costs and rewards of a reaching task? Suppose that we are instructed to hold a tool and move it so that a cursor displayed on a monitor arrives at a target. If we accomplish this in a specific time period, we are provided a monetary reward, or juice, or perhaps a “target explosion.” We can sense the position of the cursor $y_v$ and the target $r$ via vision and position of our arm $y_p$ via proprioception. Through experience in the task, we learn that the objective is to minimize the quantity $(y_v^{(t)} - r)^T (y_v^{(t)} - r)$ at time $t = N$ after the reach starts (e.g., this is the time that the movement is rewarded if the cursor is in the target). Superscript $T$ is the transpose operator. To denote the fact that this cost is zero except for time $N$, we write it as

$$\sum_{t=1}^{N} \left(y_v^{(t)} - r\right)^T Q^{(t)} \left(y_v^{(t)} - r\right)$$

where the matrix $Q$ is a measure of our cost at each time step (which may be zero except at time $N$). That is, matrix $Q$ specifies how important it may be for us to put the cursor
abstract During on-line control of movement, the posterior parietal cortex (PPC) serves as a functional bridge between sensory and motor areas in the brain. One of the sensorimotor functions of this area appears to be prediction of the state of the arm during movement. Because sensory information is substantially delayed, it has been proposed that the brain makes use of an internal forward model that integrates both sensory and motor feedback signals to estimate current and upcoming positions and motions of the limb during reaching. These predicted states are more useful for rapid on-line control than are delayed sensory signals. The first part of this chapter focuses on investigations of on-line control mechanisms in PPC. The results of these studies indicate that one of the functions of PPC is to serve as a forward model. The second section highlights research that aims to read out forward state estimates from PPC neurons and harness them for direct control of neural prostheses.

grant h. mulliken Computation and Neural Systems, California Institute of Technology, Pasadena, California
richard a. andersen Computation and Neural Systems, Division of Biology, California Institute of Technology, Pasadena, California

A growing body of clinical and psychophysical evidence supports the theory that the brain makes use of an internal model during control of movement: a sensorimotor representation of the interaction of one’s self with the physical world (Jordan, 1995; Kawato, Furukawa, & Suzuki, 1987). Two primary types of internal models for sensorimotor control have been proposed: the forward model and the inverse model. A forward model (i.e., forward output model) predicts the sensory consequences of a movement (Jordan & Rumelhart, 1992; Miall & Wolpert, 1996; Wolpert, Ghahramani, & Jordan, 1995). That is, it mimics the behavior of a motor system by predicting the expected, upcoming state of an end effector (e.g., sensory feedback of one’s own limb) using knowledge of the characteristic dynamics of the system as well as stored copies of recently issued motor commands. Conversely, an inverse model encodes the motor commands necessary to produce a desired outcome (Atkeson, 1989). That is, an inverse model estimates the set of procedures (e.g., motor commands) that will cause a particular state of the motor system to occur. While inverse models likely play an important role in sensorimotor control, they will not be discussed further in this chapter; instead, we will place emphasis on the forward model and, in particular, the role of the posterior parietal cortex (PPC) in forward state estimation for motor planning and control.

Movement intention and anticipation in PPC

PPC is a critical node for bridging sensory and motor representations in the brain. PPC associates multiple sensory modalities (visual, which is the dominant sensory input to PPC, as well as somatosensory and auditory) and transforms these inputs into a representation that is useful for guiding actions to objects in the external world (Andersen & Buneo, 2002). Evidence from lesion studies indicates that damage to PPC results in an inability to link the sensory requirements of a task with the appropriate motor behavior necessary to complete it. For example, parietal lesion patients can have difficulty planning skilled movements, a condition known as apraxia (Geschwind & Damasio, 1985). Impairments from apraxia can range from an inability to properly perform an instructed or desired arm movement to an inability to coordinate a specific sequence of movements to accomplish an end goal.

Numerous neurophysiological studies in monkeys have shed light on the neural correlates of reach planning in PPC. Monkeys have served as a successful model for studying sensorimotor representations in humans since the two species engage in a variety of similar sensorimotor behaviors. Moreover, functional magnetic resonance imaging (fMRI) studies have provided evidence that PPC’s functional role is similar in both monkeys and humans (Connolly, Andersen, & Goodale, 2003; DeSouza et al., 2000; Pellijeff, Bonilha, Morgan, McKenzie, & Jackson, 2006; Rushworth, Paus, & Sipila, 2001). When trained monkeys plan a reach to an illuminated target, the firing rates of neurons in the medial bank of the intraparietal sulcus (MIP) generally reflect a combination of both sensory and motor parameters
Figure 41.1 Flow diagram illustrating sensorimotor integration for reach planning and on-line control. Items in rounded boxes denote pertinent sensorimotor variables; computational processes are contained in rectangular boxes. Prior to a reach, an intended trajectory is formulated as a function of both the initial state of the arm and the desired endpoint, the target location. An inverse model is used to determine a set of motor plans that will result in the desired trajectory. Motor plans are then issued (e.g., by primary motor cortex, M1) and subsequently executed by muscles acting upon the physical environment (i.e., biomechanical plant hexagon). Following movement onset, the state of the arm is continuously monitored and corrected, if necessary, to ensure successful completion of the reach. Critical to rapid on-line correction of movement is the forward model, which generates an anticipatory, a priori estimate of the next state of the arm, $\hat{x}_k^-$, as a function of the previous state and efference copy. Intermittent sensory feedback is used to refine the a priori estimate of the forward dynamics model (observer). The a posteriori current state estimate, $\hat{x}_k$, can then be evaluated to make corrections to subsequent motor commands. (After Desmurget & Grafton, 2000.)
Since the output of the forward model reflects a best guess of the next state of the arm, errors due to various sources of noise will inevitably accumulate over time for this estimate. Therefore it is likely that sensory observations, which arrive at later times, are also continually integrated by the brain to update and refine the estimate of the forward model (Miall & Wolpert, 1996) (figure 41.1). A system that estimates the state of a movement by combining the output of a forward model with sensory feedback about the state is generally referred to as an observer (Goodwin & Sin, 1984). For linear systems in which the noise is additive and Gaussian, the optimal (i.e., in the mean squared error sense) observer is known as a Kalman filter (Kalman, 1960). Wolpert and colleagues first applied the Kalman filter to model how subjects estimate the sensorimotor state of the hand during goal-directed reaches. They showed that a Kalman filter could accurately account for subjects’ estimates of the perceived end location of their hand while making arm movements in the dark (Wolpert et al., 1995). Therefore the Kalman filter can serve as a useful theoretical model for studying sensorimotor state estimation in the brain.

Two linear stochastic equations govern the basic operation of the Kalman filter:

$$y_k = H_k x_k + v_k \quad \text{(state observation model)} \qquad (2)$$

where $x_k$ is the time-varying state of the arm at time step $k$ and is modeled as a linear function of the previous state, $x_{k-1}$, and the control term, $u_{k-1}$. The control term is considered to be a known motor command, which is likely specified by frontal motor areas (e.g., primary motor or premotor cortex) and then fed back to sensorimotor circuits performing state estimation. For instance, the motor command at each time step might be determined by using an optimization procedure that minimizes a cost function associated with carrying out a particular trajectory (Todorov, 2006). Here, $y_k$ is a sensory measurement (visual and proprioceptive) made at time step $k$. (Note that sensory feedback is in fact a delayed representation of the state of the arm.)

To estimate the state of the arm at each time step $k$, the output of the forward model, $\hat{x}_k^-$ (i.e., the a priori estimate), is linearly combined with the difference between the output of the observation model (i.e., the predicted sensory measurement) and the actual sensory measurement. This discrepancy, the “sensory innovation,” is then optimally scaled by the Kalman gain, $K_k$, to produce an a posteriori estimate of the state of the arm:
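The predict-and-correct cycle described in this passage can be sketched for a one-dimensional state. This is a minimal textbook-style illustration, not the authors' model: the scalar coefficients `a`, `b`, and `h`, the noise variances `process_var` and `sensor_var`, and the function name `kalman_step` are all assumptions introduced here, and the sketch ignores the sensory delay noted in the text. In this assumed notation the a priori estimate is $\hat{x}_k^- = a\,\hat{x}_{k-1} + b\,u_{k-1}$ and the a posteriori estimate is $\hat{x}_k = \hat{x}_k^- + K_k\,(y_k - h\,\hat{x}_k^-)$.

```python
# Scalar Kalman filter step (standard textbook form). Coefficient names and
# default values are assumptions of this sketch.

def kalman_step(x_prev, p_prev, u_prev, y,
                a=1.0, b=1.0, h=1.0, process_var=0.0, sensor_var=1.0):
    """Advance the state estimate by one time step.

    x_prev, p_prev: previous a posteriori estimate and its variance
    u_prev:         known motor command (efference copy)
    y:              current sensory measurement
    """
    # Forward model: a priori estimate from previous state and motor command.
    x_prior = a * x_prev + b * u_prev
    p_prior = a * p_prev * a + process_var

    # Sensory innovation: actual measurement minus predicted measurement.
    innovation = y - h * x_prior

    # Kalman gain weighs the innovation by relative uncertainty.
    gain = p_prior * h / (h * p_prior * h + sensor_var)

    # A posteriori estimate and its reduced variance.
    x_post = x_prior + gain * innovation
    p_post = (1.0 - gain * h) * p_prior
    return x_post, p_post, gain


x_est, p_est, k_gain = kalman_step(x_prev=0.0, p_prev=1.0, u_prev=1.0, y=2.0)
print(x_est, p_est, k_gain)
```

When the prediction and the measurement are equally uncertain, as in this call, the gain is one half and the corrected estimate lands midway between what the forward model predicted and what the senses reported.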
Figure 41.2 Experimental design and representative neuron. (A) Example center-out trajectory showing the goal angle and movement angle, and their respective origins of reference. Large and medium-sized circles represent the target and fixation point, respectively. Dots denote cursor position sampled at 15-ms intervals along the trajectory. (B) Example trajectories for obstacle task. The dashed circle depicts the starting location of the target and is not visible once the target has been jumped to the periphery. The large gray circles represent the visual obstacle. (C) Movement angle space-time tuning function (STTF). The contour plot shows the average firing rate (Hz) of a cell that occurred for different movement angles (radians) measured over a range of lag times (−120 ms ≤ τ ≤ 120 ms) relative to the firing rate. (D) Movement angle temporal encoding function (TEF) and corresponding goal angle TEF, where mutual information between firing rate and movement angle is plotted as a function of lag time (ms). The firing rate contained the most information about the movement angle at an optimal lag time of 0 ms. The dashed lines denote surrogate TEFs, for both movement (black-dashed) and goal (gray-dashed) angles, that were derived from surrogate spike trains and actual angles. (Reprinted with permission from Mulliken, Musallam, & Andersen, 2008a.)
Neurons that are significantly tuned for goal angle persistently encode the static direction to the target, independent of the changing state of the cursor. These cells were consistent with previous reports of target-sensitive tuning in area 5 (Ashe & Georgopoulos, 1994). Therefore, the intended goal of the trajectory is maintained in PPC during on-line control of movement. PPC neurons that are tuned for movement angle encode dynamic information about the time-varying state of the cursor. Figure 41.3A shows TEFs for the movement angle population. The histogram in figure 41.3B summarizes the distribution of OLTs for the movement angle population, which was centered at 0 ± 90 ms and 30 ± 90 ms, for the center-out and obstacle tasks, respectively (median ± interquartile range (IQR)). These plots show that movement angle neurons contained a temporal distribution of information about the state of the ongoing movement; some neurons best represented states in the near future (positive-lag time), some best represented states in the recent past (negative-lag time), and many peaked around the current state (zero-lag time).

It is helpful to interpret the OLT results in the context of the observer framework. Passive sensory feedback (e.g., y in equation 2) would require at least 30–90 ms (proprioceptive-visual) to reach PPC, consistent with some of the negative OLTs (≤−30 ms) observed here (Decety et al., 1994; Flanders & Cordo, 1989; Miall & Wolpert, 1996; Petersen et al., 1998; Raiguel et al., 1999). Conversely, if PPC neurons were responsible for generating outgoing motor commands (u in equation 1), subsequent stages of processing and execution of the movement would require at least 90–100 ms to produce the corresponding cursor motion (Miall & Wolpert, 1996). For instance, similar analyses for velocity have been performed in the primary motor cortex and report average OLTs of approximately 90–100 ms (Ashe & Georgopoulos, 1994; Paninski et al., 2004). Therefore, it is unlikely that PPC is primarily driving motor
[Figure 41.4, panels A-D. Recoverable labels: (A) movement angle (radians), lag time (ms); (B) s.d. of Δ PD (radians), time relative to OLT (ms), with center-out neural and center-out behavior curves; (C) fractional energy (%), singular value; (D) number of movement angle cells, FE in 1st singular value (%); legends in C and D: center-out, obstacle.]
Figure 41.4 Curvature and separability of STTFs. (A) Example of STTF containing slight curvature. The θ_pd of this cell (dashed line) changed smoothly but slightly as a function of lag time. (B) Standard deviation of the population's distribution of θ_pd changes (sd_θ), plotted as a function of time relative to the OLT. For both center-out and obstacle tasks, the population sd_θ (neural, solid lines) was significantly less than the sd_θ for the actual movement angle (behavior, dashed lines) over the same time range. (C) Population summary of fractional energy (FE) accounted for by each singular vector in the singular value decomposition (SVD) analysis. The majority of energy in movement angle STTFs was captured by the first singular vectors for the center-out and obstacle tasks, respectively. (D) Population histogram showing distribution of FE of the first singular value for all movement angle cells. (Reprinted with permission from Mulliken, Musallam, & Andersen, 2008a.)
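The separability analysis summarized in panels C and D can be sketched with a singular value decomposition: if an STTF is the outer product of a temporal profile and an angular tuning curve, nearly all of its energy falls on the first singular value. The sketch below assumes that fractional energy is defined as each squared singular value divided by the sum of squared singular values; that definition, the synthetic tuning curves, and all names are illustrative assumptions rather than the authors' code.

```python
import numpy as np

def fractional_energy(sttf):
    """Fraction of total squared singular-value energy carried by each component."""
    s = np.linalg.svd(np.asarray(sttf, dtype=float), compute_uv=False)
    energy = s ** 2
    return energy / energy.sum()

# Hypothetical space-time tuning functions on a lag x angle grid.
lags = np.arange(-120, 121, 30)                         # lag times in ms
angles = np.linspace(0, 2 * np.pi, 8, endpoint=False)   # movement angles in radians
temporal = np.exp(-(lags / 60.0) ** 2)                  # temporal profile peaking at 0 ms
tuning = 1.0 + np.cos(angles - np.pi)                   # cosine tuning, preferred direction pi

# Separable STTF: the preferred direction is the same at every lag.
separable = np.outer(temporal, tuning)

# Curved STTF: the preferred direction drifts linearly with lag.
shift = 0.01 * lags
curved = temporal[:, None] * (1.0 + np.cos(angles[None, :] - np.pi - shift[:, None]))

fe_sep = fractional_energy(separable)
fe_curved = fractional_energy(curved)
```

In this toy setting the first component of `fe_sep` is essentially 1, while `fe_curved` spreads some energy onto later components, which is the sense in which curvature reduces separability.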
Figure 41.5 A neural prosthesis using PPC for trajectory control. A spinal cord injury can render communication (afferent and efferent) between somatosensory and motor areas of cortex and the limbs useless. However, the integrity of the "vision for action" pathway may still be largely intact, which includes PPC. Decoding algorithms are designed to optimally estimate the state of the effector from the measurement of neural activity from PPC ensembles.
abstract The computational problems solved by the sensory and motor systems appear very different: One has to do with inferring the state of the world given sensory data, the other with generating motor commands appropriate for given task goals. However, recent mathematical developments summarized in this chapter show that these two problems are in many ways related. Therefore, information processing in the sensory and motor systems may be more similar than was previously thought, not only in terms of computations, but also in terms of algorithms and neural representations. Here, we explore these similarities and clarify some differences between the two systems.
emanuel todorov Department of Cognitive Science, University of California San Diego, San Diego, California
Similarity between inference and control: An intuitive introduction
Consider a control problem in which we want to achieve a certain goal at some point in time in the future (say, grasp a coffee cup within 1 s). To achieve this goal, the motor system has to generate a sequence of muscle activations that result in joint torques that act on the musculoskeletal plant in such a way that the fingers end up curled around the cup. Actually, the motor system does not have to compute the entire sequence of muscle activations in advance. All it has to compute are the muscle activations right now, given the current state of the world (including the body) and some description of what the goal is. If the system is capable of performing this computation, then it will generate the resulting muscle activations, the clock will advance to the next point in time, and the computation will be repeated.
How can this control problem be interpreted as an inference problem? Instead of aiming for a goal in the future, imagine that the future is now and the goal has been achieved. More precisely, shift the time axis by 1 s and create a fictive sensory measurement corresponding to the hand grasping the cup. The inference problem is now as follows: Given that the fingers are around the cup and that the world was at a certain state 1 s ago, infer the muscle activations that caused the observed state transition. As in the control problem, all that needs to be inferred are the muscle activations at a single point in time (1 s ago); if this can be done, then the clock will advance (to say 0.99 s ago), and the computation will be repeated.
The above inference problem does not have a unique solution, because there are many sequences of muscle activations that could have caused the state transition we are trying to explain. Even at the final time, the arm could be in many postures that all correspond to a successful grasp; thus the fictive measurement is incomplete. The same ill-posedness is present in the control problem and is known as motor redundancy (Bernstein, 1967). Inference problems do not normally involve this kind of redundancy. Indeed, the inference here is rather unusual: There is a period of time (1 s in our example) when there are no sensory measurements, and the only available measurement at the end of the movement is incomplete. We could consider a different control problem that corresponds to a more usual inference problem involving complete sensory measurements. That control problem is one in which we are given a detailed goal state at each point in time, that is, a reference trajectory for all musculoskeletal degrees of freedom, and have to generate muscle activations so as to force the plant to track this trajectory. When the latter control problem is mapped into an inference problem, the sequence of detailed goal states turns into a sequence of complete sensory measurements, thus eliminating redundancy. It is important to realize, however, that trajectory tracking represents only a small fraction of ecologically relevant behaviors (Todorov & Jordan, 2002). Thus the natural control problem (which involves a large amount of redundancy) corresponds to an unnatural inference problem (in which sensory data are very sparse) and vice versa. Inference is easier if complete sensory measurements are available at all times; similarly, control is easier if detailed goal states are specified at all times.
This reasoning suggests that control is a harder problem than inference, at least in the temporal domain. Indeed, inference in the absence of measurements is called prediction (except that here it is performed backward in time), and prediction tends to be hard. On the other hand, redundancy makes it possible to be sloppy most of the time and still achieve the goal. This is because, even if the initial part of the movement somehow goes wrong, there is time later in
$$p(u \mid y) \propto p(y \mid u)\,p(u) \propto \exp\!\left(-\tfrac{1}{2}\lVert t^{*} - Mu\rVert^{2} - \tfrac{1}{2}\,r\,\lVert u\rVert^{2}\right) \tag{2}$$
Thus the posterior probability in the inference problem (equation 2) coincides with the exponential of the negative cost in the control problem (equation 1); in particular, the most probable muscle activations coincide with the optimal muscle activations.
This completes our example of duality in isometric tasks. Although it is a simple example that does not involve state variables changing over time, it nevertheless illustrates a key idea that is used extensively later. The idea is that costs and probabilities are related by an exponential transformation. This is to be expected; costs add while probabilities multiply, and it is the exponential transformation that turns sums into products. The same transformation shows up in other fields as well. In statistical mechanics, for example, the energy of a given state and the probability of finding the system in that state at thermal equilibrium are related by the Gibbs distribution, which is the exponential of the negative energy.
We are now ready to develop a general form of duality between optimal control and optimal/Bayesian inference over time. To this end, we will first review the concepts of optimality in sensory and motor processing and note the similarities and differences between the two formalisms. This analysis will then indicate how the control problem should be phrased so as to become mathematically equivalent to Bayesian inference.
Optimality in sensory and motor processing
While all aspects of neural function have evolved to produce behavior that is beneficial to the organism, the evolutionary pressures on real-time sensory and motor processing may have been particularly strong and direct because of the crucial role that such processing plays in getting food to the mouth, escaping predators, and generally keeping the organism alive. It is, then, not surprising that the underlying neural mechanisms perform about as well as any
the state of the world before observing the measurement. The likelihood (which formalizes the generative model) is the probability of measurement y being generated when the world is in state x. The posterior summarizes everything that we know after the measurement is taken into account. If there are multiple independent measurements, the right-hand side of equation 3 contains the product of the corresponding likelihoods. The latter setting is used in models of cue integration, in which subjects are presented with two (often incompatible) sensory cues and asked to estimate some property of the world. Such experiments have provided the simplest and perhaps most compelling evidence that perception relies on Bayesian inference (e.g., Ernst & Banks, 2002). The probability distributions that are used in these studies are typically Gaussian.
Unlike the static nature of many cue integration experiments, sensory processing in the real world takes place in time and requires integration of measurements obtained at different points in time. This is called recursive estimation or filtering. The basic update scheme that is applied at each point in time has the predictor-corrector form:
$$p(x) \propto l(y \mid x) \sum_{x_{\mathrm{prev}}} d(x \mid x_{\mathrm{prev}})\, p(x_{\mathrm{prev}}) \tag{4}$$
Here, p(x) is the posterior at the current state, p(x_prev) is the posterior at the previous state (which we have already computed at the previous time step), l(y|x) is the likelihood function, and d(x|x_prev) is the stochastic one-step dynamics of the world. In estimating the state of the body, the dynamics d will also depend on the control signal that is available to the sensory system in the form of an efference copy. The product of d and p, which is being summed over, is the joint probability of x and x_prev. The sum marginalizes out x_prev and yields a prediction (or prior) over x. In this way, the posterior at one point in time is used to compute the prior at the next point in time. The multiplication by the likelihood l is the sensory-based correction discussed earlier.
A number of experimental findings support the notion of Bayesian inference over time (Wolpert, Ghahramani, & Jordan, 1995; Kording & Wolpert, 2004; Saunders & Knill,
2004). These studies typically use arm movements, not so much for the purpose of studying the motor system but as a continuous readout of perception. Such studies demonstrate that subjects take into account multiple sources of information over time (visual and proprioceptive, along with internal predictions) and rely on that information to guide movements. As in cue integration, the probability distributions that are assumed here are typically Gaussian. When the dynamics are linear and all noise is Gaussian, the posterior is also Gaussian and can be computed by using the Kalman filter.
There is a graphical representation of Bayesian inference problems (figure 42.1A) that is known as a graphical model or a belief network (dynamic belief network when time is involved). This representation is very popular in statistics and machine learning (Pearl, 1988). Belief networks help us to understand the mathematical models intuitively and will also be useful later in clarifying the relationship between estimation and control. To avoid confusion, keep in mind that unlike neural networks, the nodes in belief networks do not correspond to neurons, and the arrows do not correspond to synaptic connections. Instead, the nodes correspond to collections of random variables, whose probabilities are presumably represented by populations of neurons. Strictly speaking, the arrows encode conditional probabilities, but in reality they often correspond to the causal relations in the world, as illustrated in figure 42.1A. We show only part of the network containing the states of the world at two consecutive points in time as well as the corresponding sensory measurements/inputs. Solid gray circles denote variables whose values are observed and that therefore contribute a likelihood function. Open circles denote variables whose values are to be inferred. The forward arrows encode the stochastic dynamics of the world, that is, the one-step transition probability d. The downward arrows encode how sensory measurements are generated as a function of world states. This generative model may incorporate a model of optics in vision or a model of acoustics in audition plus a model of sensory transduction in the corresponding modality. One can think of perception as a computational process that inverts the generative model in a probabilistic sense (this idea goes back to Helmholtz).
Optimality has also been applied in motor control, perhaps even more extensively than in perception. This may be because, apart from its general appeal as an organizing principle, optimality appears to be the right way to resolve redundancy. There is a wealth of experimental data (for reviews, see Todorov, 2004; Kording & Wolpert, 2006) suggesting that the motor system generates actions that maximize task performance or utility. Optimal control models have accounted in parsimonious ways for numerous features of motor behavior on the levels of kinematics, dynamics, and muscle activity. There are two general approaches: open loop control and closed loop control. Open loop control precomputes the entire sequence of motor commands from now until the goal is achieved, while closed loop control (or feedback control) computes only the current motor command given the current state estimate and then uses information about the next state to compute the next command. Since movements are under continuous sensory guidance, the latter type of model corresponds more closely to what the brain does. Although optimal feedback controllers are harder to construct, we now have efficient algorithms and fast computers that enable us to explore such models.
Here is how optimal feedback control works in a nutshell: Define an instantaneous cost that accumulates over time and yields a cumulative cost. The instantaneous cost is usually a sum of a control cost r(u), which encourages energetic efficiency, and a state cost q(x), which encourages accuracy or, more generally, getting to desirable states and avoiding
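The contrast between open loop and closed loop control drawn in this section can be made concrete with a toy simulation. The sketch below is only an illustration under strong assumptions: a hypothetical scalar plant x_{t+1} = x_t + u_t + noise, a fully observed state, and a simple feedback rule that spreads the remaining error evenly over the remaining steps. That rule is not an optimal feedback controller in the chapter's sense, and all names and parameter values are made up for the example.

```python
import numpy as np

def simulate(goal=1.0, steps=20, noise_sd=0.05, trials=2000, seed=0):
    """Compare final-position error of open-loop and closed-loop control.

    Both controllers face the same additive motor noise on every trial.
    Returns (rms_error_open_loop, rms_error_closed_loop).
    """
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, noise_sd, size=(trials, steps))

    # Open loop: the whole command sequence is fixed in advance.
    x_open = np.zeros(trials)
    u_plan = goal / steps
    for t in range(steps):
        x_open = x_open + u_plan + noise[:, t]

    # Closed loop: each command is recomputed from the current state.
    x_closed = np.zeros(trials)
    for t in range(steps):
        u = (goal - x_closed) / (steps - t)   # spread remaining error over remaining steps
        x_closed = x_closed + u + noise[:, t]

    def rms(x):
        return float(np.sqrt(np.mean((x - goal) ** 2)))

    return rms(x_open), rms(x_closed)
```

In this setup the open-loop error accumulates the noise from every step, whereas the closed-loop controller corrects earlier perturbations and is left mainly with the noise from the final step, which is one way to see why feedback helps when movements are noisy.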
The fact that the synergy corresponds to a period of time and not a single point in time yields temporal abstraction. The fact that the synergy corresponds to only some aspects of the state of the plant and not the entire state (e.g., it does not specify all the joint angles but only the fingertip position) yields spatial abstraction. Different forms of spatial and temporal abstraction have played an important role in designing automatic controllers for complex tasks, suggesting that the brain may also rely on such tools.
Thus intermediate representations in both sensory and motor systems can be thought of as being part of hierarchical generative models. One might ask, however, what is the point of having such representations when generative models can be built without them. For example, given the full state of the arm, we can directly compute where the fingertips are, without the help of motor synergies. Similarly, we can directly compute the retinal image resulting from a given configuration of three-dimensional objects and light sources, without relying on sensory features (this is what computer graphics does). Indeed, intermediate representations not only are unnecessary to build generative models, but may even complicate the construction of such models. However, the goal of both the sensory and motor systems is not so much to build generative models but rather to invert them. The inversion is the harder problem and is also the problem that has to be solved in real time. Intermediate representations are likely to facilitate this inversion, by providing various forms of abstraction and enabling the inference algorithm to construct the final answer in manageable pieces. Thus intermediate representations may exist not for the sake of representation but because they facilitate the computation.
One might also ask where intermediate representations come from. In sensory systems, it has been shown that unsupervised learning applied to collections of natural sensory inputs can recover the features that are observed experimentally. The most notable examples come from the visual system (Olshausen & Field, 1996), although the approach has also been applied successfully to the auditory system (Lewicki, 2002). Unsupervised learning looks for statistical regularities in high-dimensional data. Traditional unsupervised learning methods such as principal components analysis reduce the dimensionality of the data. In contrast, the forms of unsupervised learning that are thought to be used by sensory systems tend to increase dimensionality, that is, they form overcomplete (and sparse) representations. This might seem counterproductive; however, it resonates well with recent computational approaches in which increasing dimensionality simplifies computation. Support vector machines and kernel methods in general are based on this idea (Scholkopf & Smola, 2001). Liquid state machines in neuroscience have the same flavor (Maass, Natschlager, & Markram, 2002).
Unsupervised learning has also been applied in motor control to extract candidate synergies (D'Avella, Saltiel, & Bizzi, 2003; Santello, Flanders, & Soechting, 1998). However, the situation here is qualitatively different. While in sensory systems, unsupervised learning is applied to sensory data that are available to the brain during learning/development, in motor systems, it is applied to movement data that are available to the brain only after it has mastered the motor task. If we agree that appropriate synergies must exist before successful movements can be generated in a given task, then the brain cannot learn those synergies from successful move-
abstract In this chapter we provide evidence that the cortical motor system is involved in action and intention understanding. In the first part of the chapter, we show that at the core of the cortical motor system, formed by ventral premotor and inferior parietal cortex, there are vocabularies of motor acts, such as grasping, holding, and breaking. Neurons that form these vocabularies code the goal of motor acts independent of how the goal is achieved. Many of these motor neurons also respond to the observation of the same motor acts they motorically code (mirror neurons). In the second part, we show that mirror neurons are involved in both the understanding of motor acts done by others and the understanding of the intention behind the acts. In the last part of the chapter we show that the mirror system in humans also plays a role in action and intention understanding. We conclude by presenting data suggesting that some of the deficits present in the autistic syndrome could be caused by an impairment of the mirror system.
giacomo rizzolatti and vittorio gallese Dipartimento di Neuroscienze, Università di Parma, Parma, Italy
leonardo fogassi Dipartimento di Psicologia and Dipartimento di Neuroscienze, Università di Parma, Parma, Italy
Traditionally, it has been assumed that understanding actions done by others, and even more so their intentions, occurs by applying a kind of reasoning not much different from that used to solve a logical problem. According to this view, when witnessing the actions of others, we process the actions with our sensory system; this information is then elaborated by some sophisticated cognitive apparatus and compared with other similar, previously stored data. At the end of this process, we know what others are doing and why.
Such a complex cognitive operation likely occurs in many situations, for example, when the behavior of the observed person is difficult to interpret (Brass, Schmitt, Spengler, & Gergely, 2007). Yet the simplicity and lack of effort with which we usually understand what others are doing suggest an alternative solution. The actions done by others, after being processed in the observer's visual system, are directly mapped on his or her motor representations without any need of cognitive mediation.
Strong evidence in favor of the existence of a direct mechanism of understanding others' actions by matching them on the observer's own motor system came from the discovery of mirror neurons (MNs), a class of visuomotor neurons that discharge both when a monkey performs goal-related motor acts (e.g., grasping) and when it observes or hears another individual (monkey or human) doing similar acts. Neurons with these properties are found in the rostral sector of the ventral premotor cortex (area F5) and in a sector of the posterior parietal cortex (essentially corresponding to area PFG) that is anatomically connected with area F5. Thus, premotor and parietal MNs form a cortical mirror neuron system that translates sensory information about biological actions into a motor format.
There is evidence that in addition to the parietofrontal mirror neuron system, there are other mirror systems, at least in humans. One, most likely present also in monkeys, is involved in translating observed emotions into a visceromotor pattern that expresses the same emotions (see Gallese, Keysers, & Rizzolatti, 2004). In addition, humans are endowed with a mirror system for phonemes (Fadiga, Craighero, Buccino, & Rizzolatti, 2002) and one for coding non-goal-directed movements (Fadiga, Fogassi, Pavesi, & Rizzolatti, 1995).
In the present chapter, we will focus on the parietofrontal mirror system for actions. We will first review the anatomical and functional properties of the mirror system in monkeys and humans and address the issue of how action is represented within the primate cortical motor system. We will then discuss a neurophysiological model of how actions and the intentions that promote them are understood. Finally, we will discuss some implications of this model for our understanding of autism.
Figure 43.1 Lateral view of the monkey brain showing the parcellation of the motor and the posterior parietal cortex. The areas located within the arcuate and the intraparietal sulcus are shown in an unfolded view of these sulci in the left and right parts of the figure, respectively. For the nomenclature and definition, see Rizzolatti, Luppino, and Matelli (1998), Nelissen et al. (2005), and Gregoriou et al. (2006). Abbreviations: AI, inferior arcuate sulcus; AS, superior arcuate sulcus; C, central sulcus; FEF, frontal eye fields; IP, intraparietal sulcus; IO, inferior occipital sulcus; L, lateral fissure; Lu, lunate sulcus; P, principal sulcus; STS, superior temporal sulcus. (See color plate 54.)
represented are grasping, holding, manipulating, and tearing. Unlike another category of visuomotor neurons that are present in area F5 ("canonical neurons") (Murata et al., 1997; Raos et al., 2006), they do not fire in response to simple presentation of objects, including food. The observation of intransitive motor acts, including mimed motor acts, is also ineffective.
Mirror neurons show a close relationship between their visual and motor responses. Using as a classification criterion the congruence between the executed and observed motor acts that are effective in triggering them, mirror neurons have been subdivided into two broad classes: strictly congruent and broadly congruent neurons (Gallese, Fadiga, Fogassi, & Rizzolatti, 1996). They are defined as strictly congruent when
the observed and executed effective motor acts are identical in terms of goal (e.g., grasping) and in terms of the way in which that goal is achieved (e.g., precision grip). In contrast, mirror neurons are defined as broadly congruent when there is a similarity, but not identity, between the observed and executed effective motor acts. Among the different types of broadly congruent neurons, the most common consists of neurons that become active during the execution of a specific motor act made by the monkey (e.g., grasping, holding, or manipulating) but visually respond to more than one motor act (e.g., manipulation and grasping).
In the first studies on mirror neurons, it was reported that these neurons do not discharge during the observation of goal-directed actions done by using tools (Gallese et al., 1996; Rizzolatti, Fadiga, Gallese, & Fogassi, 1996). Subsequently, however, it was shown that following a relatively long period during which monkeys observed the experimenters performing actions using tools, some mirror neurons respond, although weakly, also to this type of action (Rizzolatti & Arbib, 1998). More recently, Ferrari, Rozzi, and Fogassi (2005) reported that in a specific ventral sector of F5, there are neurons that discharge very vigorously to the observation of tool use (e.g., a stick or a pair of pliers). It is not clear whether these neurons, like those previously observed, derived this property because of prolonged action observation.
The most widely accepted hypothesis on the functional role of mirror neurons is that they play a role in understanding the goal of the observed motor acts (Rizzolatti et al., 2000). The proposed mechanism is the following: individuals know the outcome of their motor acts. Thus, when the mirror neurons of an observing individual, which code a
given motor act (e.g., grasping), discharge in response to the observation of that motor act (grasping) done by another individual, the observer understands its goal, because that discharge corresponds to the one that occurs when the observer wants to achieve the same goal.
To provide evidence in favor of the view that mirror neurons play a role in understanding motor acts done by others, neurons' responses were investigated when the monkeys could comprehend the goal of a motor act without actually seeing it. If mirror neurons truly mediate understanding, their activity should reflect the meaning of the motor act rather than its visual features. Two series of experiments were carried out for this purpose.
The first series tested whether mirror neurons could recognize motor acts merely from their sounds (Kohler et al., 2002). The activity of mirror neurons was recorded while a monkey was observing a motor act, such as ripping a piece of paper or breaking a peanut shell, that is normally accompanied by a distinctive sound. Then the monkey was presented with the sound alone. It was found that many mirror neurons that had responded to visual observation of acts accompanied by sounds also responded to the sound alone. These neurons were named "audiovisual" mirror neurons.
In the second series of experiments, the researchers hypothesized that if mirror neurons are involved in understanding a motor act, they should also discharge when the monkey does not actually see the motor act but has sufficient clues to create a mental representation of it. Therefore, F5 mirror neurons were tested in two conditions. In one, the monkey was shown a fully visible motor act directed toward an object ("full vision" condition). In the other, the monkey saw the same act but with its final critical part hidden ("hidden" condition) (Umiltà et al., 2001). The results showed that more than half of F5 mirror neurons also discharged in the hidden condition. An example is shown in figure 43.5.
These experiments strongly support the notion that the activity of mirror neurons underpins the understanding of motor acts. Even when the motor act comprehension is possible on a nonvisual basis, such as via sound or nonlinguistic
that appeared at random locations. The experiment consisted of two phases: active movement and observation. In the active movement phase, the monkey controlled the cursor, while in the observation phase, the monkey observed the replayed movements generated in the active phase. The observation phase had three conditions. In the first, both the cursor and the targets were visible; in the second, the monkey saw only the replayed targets; in the third, the monkey saw only the moving cursor but not the targets. The results showed that passive observation of the task elicited a neural discharge similar to that found during task execution. The observation of the cursor without targets or of the targets without cursor gave either no responses or responses that were weaker than those found during the observation of both cursor and targets. The authors concluded that the most likely explanation of their findings is that the observation of the movements elicited a covert generation of a motor command.
Mirror system in humans
Anatomy of the Mirror System A large number of brain imaging studies showed that parietal and frontal areas that activate during the execution of motor acts are also active when an individual observes similar motor acts done by others (see Rizzolatti & Craighero, 2004). Most of these studies concerned observation of object-directed grasping movements. The regions that are activated in these studies form the human grasping mirror system. The two main nodes of this system are the inferior parietal lobule (IPL) and the ventral premotor cortex (PMv) plus the caudal part of the inferior frontal gyrus (IFG), roughly corresponding to its pars opercularis. The localization of the human grasping mirror system corresponds to that of the homologous mirror system in the monkey (figure 43.7).
Several experiments addressed the issue of how observed motor acts performed by different effectors are organized in
the human mirror system by presenting videos of motor acts performed with leg, hand, and mouth (Buccino et al., 2001; Sakreida, Schubotz, Wolfensteller, & von Cramon, 2005; Shmuelof & Zohary, 2006; Wheaton, Carpenter, Mizelle, & Forrester, 2008) or using point-light displays of biological motion of different body parts (Saygin, Wilson, Hagler, Bates, & Sereno, 2004; Ulloa & Pineda, 2007).
As far as the premotor cortex is concerned, the results showed that the observed leg motor acts are represented more dorsally in the ventral premotor cortex (PMv) extending across the superior frontal sulcus into the dorsal premotor cortex (PMd), and the hand motor acts are represented in an intermediate position in PMv, while the mouth motor acts are represented ventrally, extending into the IFG. There was considerable overlap between adjacent representations.
While the goal of the observed motor acts in these studies was achieved mostly by distal movements, a recent study investigated the organization of reaching movements, that is, the transport phase of the hand to a particular location in space, eliminating the contribution of grasping movements (Filimon, Nelson, Hagler, & Sereno, 2007). It was found that in both observation and execution, the sector of premotor cortex that was activated was located in the cortex of the superior frontal gyrus (SFG), that is, in PMd. Thus, it appears that observation of motor acts focused on the distal part of the effector activates PMv, while when the focus is on the proximal part, activation mostly concerns PMd.
The activation pattern in the parietal lobe is rather complex. The observation of goal-directed motor acts in which the focus was on distal movements showed activation of the rostral part of the cortex inside and around the intraparietal sulcus, extending into the convexity of IPL for mouth motor acts, activation of the caudal part of the same cortex but extending into the superior parietal lobule for the leg motor acts, and activation of an intermediate sector for the hand motor acts (Buccino et al., 2001). In the experiment (Filimon et al., 2007) in which the focus was on observation of the transport phase (reaching movement), the activation was located more dorsally, in the superior parietal lobule extending toward the dorsal bank of the IPS.
In a recent study (Jastorff, Rizzolatti, & Orban, 2007), video clips showing four distal motor acts (grasping, dragging, dropping, and pushing), each performed by using three different effectors (foot, hand, and mouth), were presented to volunteers. The results showed that while in PMv, the activations determined by the observed motor acts were clustered according to the effector used, independently of their positive (grasping and dragging) or negative (dropping and pushing) behavioral valence, in the parietal cortex, the organization followed another principle: The observed motor acts were found to be clustered according to their valence, regardless of whether they were done with the mouth, hand, or foot. The most activated region corresponded to putative human AIP, extending ventrally to the inferior parietal lobule and dorsally to the superior parietal lobule. Motor acts with negative valence were represented dorsally, while those with positive valence ventrally. It can be hypothesized that this parietal organization, by generalizing the motor act valence across effectors, allows a unified understanding of the observed behavior.
In addition to an organization based on the valence of the motor act, the parietal lobe activation also showed a coarse effector-based organization. The strongest activations for foot motor acts were located dorsally, and those for mouth
Figure 43.8 Differential activation of a mouth-opening muscle during execution and observation of two actions in typically developing and autistic children. (A) Schematic representation of the two actions executed and observed by the two groups of subjects. Upper part: The individual reaches for and grasps a piece of food located on a touch-sensitive plate, brings it to the mouth, and eats it. Lower part: The individual reaches for and grasps a piece of paper located on the same plate and puts it into a container placed on the shoulder. (B1) Left: Time course of the EMG activity of the mylohyoid muscle during execution of grasping for eating (red) and grasping for placing (blue). Vertical bars indicate the standard error. The curves are aligned (dashed vertical line) with the moment in which the object is lifted from the touch-sensitive plate. Right: Mean EMG activity of the same muscle in three epochs of the two actions. Vertical bars indicate 95% confidence intervals. (B2) Left: Time course of the EMG activity of the mylohyoid muscle during observation of grasping for eating (red) and grasping for placing (blue). Other conventions as in B1. Right: Mean EMG activity of the same muscle in three epochs of the two observed actions. Other conventions as in B1. (Modified from Cattaneo et al., 2008.) (See color plate 59.)
Abstract. Hierarchy is a central concept for understanding how complex goal-oriented behaviors are organized. In this chapter we present recent functional and behavioral evidence that supports the existence of a control hierarchy in the human brain for organizing complex motor behavior. It is proposed that the functional hierarchy is not based on strict anatomical connectivity within the motor system. Instead, there are multiple motor planning circuits, each of which can serve a supraordinate role, and this role can be readily interchanged to achieve a much wider range of task outcomes. This can be observed at the level of hand-object interactions, bimanual control, and the integration of semantics into action planning.
Scott T. Grafton, UCSB Brain Imaging Center, Department of Psychology, University of California Santa Barbara, Santa Barbara, California
L. Aziz-Zadeh, Brain and Creativity Institute and The Department of Occupational Therapy, University of Southern California, Los Angeles, California
R. B. Ivry, Department of Psychology, University of California Berkeley, Berkeley, California
Historical perspective: The hierarchy of serial behavior
Within the field of motor control, the problem of how people accomplish complicated tasks has historically been intertwined with the concept of what constitutes a motor program. This in turn depends on solving the problem of serial order, first articulated by Lashley (1951). He sought to understand how the nervous system organized sequential motor elements to achieve a desired motor goal. Bernstein (1996) elaborated on the serial order problem by emphasizing that the control system was flexible, designed to produce actions that were constrained by task demands rather than fixed action patterns. This shift in emphasis brought to the forefront the concept of a goal. We move to accomplish goals.
These early theories were critical in minimizing the role of the simple chaining of reflexes, proposing instead the existence of a motor plan as an alternative. But they also introduced fundamental questions that continue to challenge the field: What is a motor plan? Is it composed of discrete representational elements? Can associative mechanisms that underlie reflex chains be used to solve more complex problems, including those associated with task planning? Do all plans require goals? How does a task get organized?
One possible mechanism for planning goal-directed sequences of actions is based on hierarchical control. The argument for hierarchical control was elaborated within a cognitive science framework by Keele and colleagues (1990) during the 1980s. Using a set of model tasks such as handwriting and the serial reaction time task, researchers sought to describe the structure of control hierarchies by identifying consistent patterns of variation in the time required to initiate successive components of an action, as well as through studies of motor transfer. These studies showed that many aspects of control reflected constraints that were related to the abstract nature of action representations, separable from the musculoskeletal system. A fundamental distinction derived from this perspective can be made between abstract plans and their implementation, the basic components of a hierarchy.
Another form of hierarchical control within the implementation process itself became evident through studies of naturalistic grasping. Kinematic analysis showed that the transport phase of an arm movement has an exquisite interdependency with processes involved in shaping the hand to grasp an object (Jeannerod, 1984, 1986). The velocity of the transport phase is subordinate to the grasp requirements, with timing that is tightly coupled to maximal hand aperture. At a more abstract level, prior experience and task goals can also influence grasping (Rosenbaum, Meulenbroek, & Vaughan, 2001; Rosenbaum, Vaughan, Barnes, & Jorgensen, 1992). For example, the way in which an object is grasped is constrained by task demands. Given a fixed starting position, the adopted grasping posture will depend on how the tool is to be moved (defined by the center of mass) and used (defined by the tool’s functional properties), as well as the comfort of the end-state posture. In this case the selected grip is subordinate to the desired goal for using the tool as well as biomechanical constraints.
Computational models of motor planning have also exploited hierarchical features in action representation. In a model of hierarchical behavior proposed by Cooper and Shallice (2006), a logical tree structure of discrete behaviors is developed to organize an action sequence. To make a cup
grafton, aziz-zadeh, and ivry: hierarchies and the representation of action 641
Figure 44.1 Task hierarchy for making a cup of tea. (A) An explicit whole-part schema for performing a task with three levels of
complexity. (B) A schema of the same task after the different components have been compiled into sequential units.
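As a rough illustration of the whole-part schema described in this caption, the following Python sketch encodes a task as a tree of subtasks and then flattens it into a single ordered sequence, loosely analogous to the compiled form in panel B. It is a minimal sketch under stated assumptions: the `Task` class, its methods, and every step name are hypothetical and are not taken from the figure or from any published model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Task:
    """A node in a whole-part task schema (hypothetical illustration)."""
    name: str
    subtasks: List["Task"] = field(default_factory=list)

    def depth(self) -> int:
        """Number of levels in the hierarchy rooted at this node."""
        if not self.subtasks:
            return 1
        return 1 + max(child.depth() for child in self.subtasks)

    def flatten(self) -> List[str]:
        """Return the primitive acts (leaves) in left-to-right order."""
        if not self.subtasks:
            return [self.name]
        sequence: List[str] = []
        for child in self.subtasks:
            sequence.extend(child.flatten())
        return sequence


# Assumed example content: these step names are invented for illustration.
make_tea = Task("make tea", [
    Task("prepare water", [Task("fill kettle"), Task("boil water")]),
    Task("steep tea", [Task("put tea bag in cup"), Task("pour water")]),
    Task("add extras", [Task("add milk"), Task("add sugar")]),
])

print(make_tea.depth())    # number of levels in the explicit schema
print(make_tea.flatten())  # the same task as one ordered sequence of acts
```

In the tree form the ordering constraints live in the structure (a subtask cannot start before its parent is selected), whereas the flattened list keeps only the serial order, which is the contrast the two panels draw.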
of coffee, the act of adding sugar is distinct from the act of adding milk, and each must be scheduled after the coffee has been brewed. This scheduling occurs within a large multilayered, interactive network, with the top level of the hierarchy providing constraint in terms of its specification of task goal (see figure 44.1A).
In such models, the notion of a hierarchy is explicit, with the layered representation defined as a task schema. In an alternative approach, action planning could be goal-independent, with the hierarchy arising as an emergent property of sequential transitions between different components. For example, Botvinick (2008) has shown that a simple recurrent network based on an action layer, a perception layer, and an intermediate layer can learn fairly complex motor actions without the need for top-down task structuring with respect to a goal. Furthermore, sequencing and the formation of motor programs can lead to compilation of complex acts into a smaller set of tasks, as shown in figure 44.1B. The evaluation of these computational models has primarily relied on behavioral studies that involve dependent variables such as variation in planning time and errors of substitution.
Methods of cognitive neuroscience can provide additional means for addressing these issues. The present chapter seeks to incorporate these other forms of evidence to reconsider how the notion of hierarchy and action goals may be conceptualized to aid our understanding of how movement is achieved. To assess these questions, we review a wide range of actions, spanning tool use, bimanual coordination, and, finally, how language influences motor planning and control.
Anatomic versus representational hierarchy
The concept of an action hierarchy has not been limited to the psychological and computational realms; neuroscientists have also been highly influenced by hierarchical notions in theorizing about the organization of the nervous system. Efforts to map different levels of motor planning into distinct neural substrates were motivated in large part by the belief that there existed a strict anatomical hierarchy, at least within lower levels of the nervous system. As one ascends from muscle activity to peripheral nerves and then into spinal cord and ultimately to motor cortex, there is increasing abstraction in the type of information represented (d’Avella & Bizzi, 2005; Giszter, Mussa-Ivaldi, & Bizzi, 1993). It is only natural to assume that there is a continuation of this control hierarchy into premotor and ultimately prefrontal and parietal areas. An example is the sensorimotor hierarchy first proposed by Fuster (Fuster, 1995) and recently implemented by Botvinick (2007) (figure 44.2A). In this model, only the primary motor cortex influences the environment. As a task becomes more complex, increasing reliance is placed on premotor heteromodal sensory circuits and ultimately prefrontal-polymodal sensory circuits. In an extreme version of this model, there is a linear gradient between task complexity and posterior to anterior prefrontal cortex (Badre & D’Esposito, 2007; Botvinick, 2007, 2008).
Early brain imaging studies of action planning were often interpreted as being consistent with an anatomical hierarchical framework (Roland, Larsen, Lassen, & Skinhoj, 1980;
Figure 44.2 Examples of anatomic networks. (A) In this model, based on Fuster (1995), sensorimotor loops represent information of increasing abstraction or complexity. Only the motor cortex controls interactions with the environment, and there is strict segregation between unimodal and polymodal sensory areas. (B) In this multiple stream model, there are at least two parietal-prefrontal-premotor streams engaged for goal-oriented behavior. A dorsal-dorsal stream (upper arrow) is used for learning arbitrary sensorimotor transformations and reaching. A ventral-dorsal stream (middle arrow) is used for object centered actions. A third stream, positioned between inferior parietal lobule and inferior frontal gyrus (lower arrow), has been hypothesized for representing complex actions and tool use. In this model, these circuits operate in tandem, with no fixed hierarchical arrangement.
Roland, Skinhøj, Lassen, & Larsen, 1980). One of these studies used positron emission tomography to measure blood flow and compared activation patterns during real versus imagined movement. Whereas the supplementary motor area (SMA) was active during both real and imagined movement, motor cortex was only weakly activated during imagined movements. This dissociation was interpreted as showing that the SMA provided a more abstract representation, one that supplied the plan of the action, and motor cortex activation was primarily limited to the actual implementation of that plan. From this view, the SMA has sometimes been referred to as a “supramotor” area.
However, four arguments suggest that caution is required in attempting to identify a direct correspondence between a well-defined anatomical hierarchy and a functional hierarchy. First, many of the descending pathways to the spinal cord originate outside the primary motor cortex, including rich projections from premotor and parietal cortex, as well as the extrapyramidal brain stem pathways (Dum & Strick, 1991, 1996). The diversity of these cortical and subcortical projections underscores the ability of these areas to directly influence movement. In addition to their direct influences on motor commands, these areas likely play a role in establishing the context of an action and the coordination of movement commands with current information about the state of the actor.
A second argument against the presence of a strict anatomical hierarchy within motor regions of the cortex is that most premotor areas have direct inputs onto motor cortex, and no premotor area appears to have a dominant role over another (Dum & Strick, 2005). It has become clear that there are multiple body representations within premotor cortex, and the anatomy fails to indicate some sort of hierarchical structure across these subregions.
The third argument is based on recent evidence that even the lowest levels of this presumed cortical hierarchy are capable of organizing extremely complex serial behavior (Lu & Ashe, 2005; Matsuzaka, Picard, & Strick, 2007). Recent recordings from the primary motor cortex in nonhuman primates demonstrate sequence-specific responses that are tied to the action rather than particular muscles. Similar evidence for learning-dependent changes within motor cortex for complex serial actions has been observed in humans (Grafton, Hazeltine, & Ivry, 1998; Grafton, Salidis, & Willingham, 2001; Karni et al., 1998).
Finally, models that emphasize a strict anatomical hierarchy for motor planning run the risk of requiring a command and control structure with a “decider” at the top; the problem of the homunculus resurfaces in such models. This type of architecture seems difficult to reconcile with the effortless nature with which we perform many of our everyday actions. These are planned unconsciously and adjusted on-line at an extremely rapid rate (Desmurget & Grafton, 2000).
An alternative conceptualization of functional anatomy is motivated by the existence of multiple interactive loops across prefrontal-parietal cortex. For example, the concept of two visual streams for object identification versus action pragmatics has been extended, as is shown graphically in figure 44.2B (Goodale, Milner, Jakobson, & Carey, 1991). There is now extensive anatomical and functional evidence to support at least two and possibly three processing streams within the classic “dorsal” stream related to object-centered action, tool use, and reaching (Johnson & Grafton, 2003; Rizzolatti & Luppino, 2001; Rizzolatti & Matelli, 2003). In addition, there is little anatomical evidence to segregate polymodal sensory from unimodal association cortex as originally proposed by Fuster. Within each parietal-premotor-prefrontal pathway, all forms of sensory information are integrated.
The preceding arguments suggest that insight into the hierarchical nature of action planning and goal representation will not be defined by the existence of an anatomical hierarchy. Instead, an anatomical organization with multiple parallel parietal-prefrontal and premotor pathways supports a multitude of relative hierarchies that can be flexibly recruited as a function of task demands, experience, and context. In this framework, there are dissociable functional anatomic substrates, but these are not constrained by a fixed hierarchy. This shifts the focus of inquiry to understanding representational hierarchies that are highly flexible and goal based.
Goal representation and the on-line control of grasp
Grasping studies traditionally focus on the interplay between grip formation and limb transport to understand the representational organization of these two task components (Haggard & Wing, 1997; Jeannerod, 1997; Jeannerod, Arbib, Rizzolatti, & Sakata, 1995). For grasping, the object itself defines the task goal. The problem then becomes one of sensorimotor transformation, in which object features are decoded to generate hand configurations that are optimally shaped to match the object geometry. Object knowledge involves both physical properties (texture, mass, center of gravity) and utility (how the parts of an object such as a handle and action surfaces are used to accomplish particular goals). Through experience and cumulative knowledge, a library of possible hand-object affordances and utility is constructed.
Neuropsychological and neuroimaging studies have provided evidence of two primary pathways in posterior cortex: the classic “how” and “what” visual streams (Culham et al., 2003; Goodale et al., 1994). Processing distinctions within these pathways can be viewed as supporting pragmatic and conceptual representations for action. Within the dorsal “how” pathway, there is clear evidence that the anterior intraparietal sulcus (aIPS) in humans (area AIP in nonhuman primates) is critical for sensorimotor transformations that relate the visual and/or haptic features of an object to a desired hand shape, with limb transport, body stabilization, and eye movements playing subordinate roles.
This framing in terms of a sensorimotor transformation provides a starting point to understand a functional hierarchy, but it is missing a critical piece: how the goal is related to the sensorimotor transformations. We grasp objects to achieve goals (e.g., pick up a nut) or to solve problems (e.g., use a tool to open the nut). How does an area such as aIPS integrate the low-level details required to control grip aperture with information about an object that includes high-level features that may be defined functionally and in a way that varies with context? This shifts the problem from one of sensorimotor transformation to one of sensorimotor integration (Wolpert, Ghahramani, & Jordan, 1995; Wolpert, Goodbody, & Husain, 1998).
There are at least two solutions to this problem. One is that areas such as aIPS are akin to low-level visual areas and pass information off to higher cortical areas. This is a classic functional-anatomical hierarchy of ascending representational complexity. For example, the goal level of the action might be represented in ventral premotor cortex, an area that is richly connected to aIPS. In this framework, premotor areas would make the ultimate control decisions. An alternative view is that of an inverted or flexible hierarchy. This emphasizes that information about the task goal can have a direct influence on processing within areas such as aIPS. In this scheme, computations within aIPS use this goal information to constrain sensorimotor integration needed to relate motor commands with object information and an internal representation of the body in relationship to the object.
A series of transcranial magnetic stimulation (TMS) studies targeting aIPS during grasping suggest that the latter perspective is more appropriate. In this work, subjects were required to reach and grasp a small 1 × 1 × 5 cm rectangular wooden block located on a computer-controlled torque motor (Tunik, Frey, & Grafton, 2005). The orientation of the block could be changed in less than 30 ms. The subject was required to start with the right hand on a button and, when ready, use the index finger and thumb to grasp the object, aligning these fingers on the vertical axis. To assess on-line updating, the initiation of the grasping movement would trigger the motor to spin the object 90 or 180 degrees. In this manner, the object’s orientation always changed on every trial. However, the planned grasping action would have to be updated when the object rotated 90 degrees, requiring a larger aperture to grasp the 5-cm width. In contrast, when the object rotated 180 degrees, no adjustment was required, since the object’s vertical axis remained at 1 cm diameter.
Single pulses of TMS were delivered to aIPS, timed to the start of the hand movement. The TMS pulses disrupted the subjects’ ability to modify their grip aperture on trials in which the grasp had to be updated but had no effect on trials in which the original grip could be used (figure 44.3). In control conditions in which the TMS pulses were applied to other brain regions (e.g., caudal or mid intraparietal cortex), or 400 ms after movement onset, no behavioral effects were observed.
Figure 44.3 Effect of a single pulse of TMS during object grasping. The figure plots finger aperture measured when subjects grasp an object that has increased in size, as shown in the insert of the hand. When TMS is applied to the anterior intraparietal sulcus (arrow on brain, insert) at movement onset, there is a delay in the formation of the required grip aperture (lower curve in plot) compared to the no-TMS, control condition (upper curve in plot). (Adapted from Tunik et al., 2005.)
If reaching and grasping are anatomically dissociable, TMS of aIPS should affect only the control of grip aperture. To test this hypothesis, the same subjects were tested in a critical second experiment. Instead of always grasping the block along the vertical axis, they were now told to always grip the narrow (1-cm) axis. With these instructions, the task goal required that they update the orientation of the forearm and wrist, but the grip aperture remained fixed. In this condition, when TMS pulses were applied to aIPS at movement onset, the participants were unable to rotate the arm appropriately to match the new orientation of the object.
Taken together, the results argue against the hypothesis that the TMS is interfering with the adjustment of a particular set of muscles or an elemental process such as grip aperture. A more parsimonious explanation is that this region of the parietal cortex is involved in using information concerning the task constraints to generate a desired hand-object interaction. As such, the TMS appears to disrupt the representation of the goal itself rather than some “downstream” operation that controls some component of that goal.
Note that in the two studies reviewed above, the disruptive effects of TMS were observed on trials in which an action plan had to be updated rapidly. Perhaps aIPS is important for this updating process and not needed when the action has already been planned. To address this, shutter goggles were used to control when visual information about the object was available to the subjects (Rice, Tunik, & Grafton, 2006). The goggles provided a brief 200-ms view of the object but were closed just before reach onset. Thus the subjects were required to reach and grasp the object without vision of the hand or of the object. Critically, TMS to aIPS interfered with grasp kinematics when the pulses were delivered at movement onset but not when they were delivered during the viewing period.
This result argues strongly against the hypothesis that aIPS is essential for planning the requisite sensorimotor transformation solely on the basis of visual information. If
Goal representation in bimanual coordination
Studies of bimanual movements have provided another framework for examining how the selection and control of movement are constrained by action goals. Much of this work has involved rhythmic movements, evaluating changes in pattern stability when performers are asked to adopt a range of phase relationships between the two hands (see Schöner & Kelso, 1988). A large body of evidence demonstrates that certain patterns are more stable than others; people are much more adept in adopting antiphase and in-phase patterns of motion than in adopting patterns in which the two hands must adopt more complex phasing patterns. These constraints have been formally described by models in which the limbs are conceptualized as coupled oscillators. An alternative, process-oriented perspective focuses on the manner in which the task goals are represented. In the event-timing model of Spencer, Semjen, Yang, and Ivry (2007), the task is represented as a series of salient temporal events such as the point of contact during finger tapping or maximum flexion or extension during movements performed without such haptic cues. In this model, stability is constrained by the complexity of the temporal representation. Thus, antiphase and in-phase patterns are more stable because the representation of the temporal goals for these patterns is simpler than for more complex phase relationships (Semjen & Ivry, 2001). Moreover, the phase transitions observed from antiphase to in-phase movement when movement frequency increases arise because the latter entails a simpler temporal representation (Spencer et al., 2007).
A similar perspective has been offered to account for bimanual interactions observed in the spatial domain (Ivry, Diedrichsen, Spencer, Hazeltine, & Semjen, 2004). Consider a task in which a person must simultaneously draw two three-sided squares (figure 44.4A). Performance is fluid when the patterns are symmetric (e.g., U and U). In contrast, when the patterns are orthogonal (e.g., U and C), severe limitations
Figure 44.4 (A) In this task, the participant must simultaneously draw the shape on the left with the left hand and the shape on the right with the right hand. Representative trajectories produced by a control participant and a callosotomy patient are shown. (B) Symbolically cued actions produce stronger activation across the left intraparietal sulcus/superior parietal lobule and left premotor cortex than do directly cued actions. (Adapted from Diedrichsen et al., 2006.)
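The coupled-oscillator account of rhythmic bimanual movement mentioned in this section can be made concrete with a small simulation. The sketch below uses the Haken-Kelso-Bunz (HKB) equation for the relative phase between the hands, dphi/dt = -a sin(phi) - 2b sin(2 phi), which is one widely used model of this kind; treating it as the model the chapter has in mind is an assumption, and the parameter values and function names are chosen only for illustration. In the HKB formulation the ratio b/a is usually taken to fall as movement frequency rises, and the antiphase pattern (phi = pi) is stable only while b/a exceeds 1/4.

```python
import math


def hkb_rate(phi: float, a: float, b: float) -> float:
    """Rate of change of relative phase in the HKB equation."""
    return -a * math.sin(phi) - 2.0 * b * math.sin(2.0 * phi)


def settle(phi0: float, a: float, b: float,
           dt: float = 0.01, steps: int = 20000) -> float:
    """Integrate the relative phase with forward Euler and return its final value."""
    phi = phi0
    for _ in range(steps):
        phi += dt * hkb_rate(phi, a, b)
    return phi


def antiphase_is_stable(a: float, b: float) -> bool:
    """Linear stability of phi = pi: the slope there is a - 4b."""
    return a - 4.0 * b < 0.0


start = math.pi - 0.1  # begin slightly off the antiphase pattern

# Illustrative values only: a large b/a stands in for slow movement,
# a small b/a for fast movement.
slow = settle(start, a=1.0, b=1.0)
fast = settle(start, a=1.0, b=0.1)

print(antiphase_is_stable(1.0, 1.0), round(slow, 3))
print(antiphase_is_stable(1.0, 0.1), round(fast, 3))
```

With the larger ratio the relative phase returns to antiphase, whereas with the smaller ratio the same starting point drifts to the in-phase pattern, which is the dynamical counterpart of the transition from antiphase to in-phase movement at higher frequencies.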
are observed; the time to initiate each segment increases dramatically, and spatial distortions are observed such that the trajectories for the two hands become assimilated (Albert & Ivry, 2009; Franz, Eliasson, Ivry, & Gazzaniga, 1996). These effects can also be observed with simpler patterns; for example, reaction times are much slower when two linear movements follow orthogonal trajectories or are of different amplitudes than when they are symmetric (Heuer, Kleinsorge, Spijkers, & Steglich, 2001). However, these costs are essentially abolished when stimuli appear at the endpoint location, serving as direct cues for the required movements (Diedrichsen, Hazeltine, Kennerley, & Ivry, 2001).
The fact that the constraints are highly dependent on the manner in which the actions are cued indicates that the limitations here are not related to processes that are typically associated with motor programming and execution. Rather, they arise at a more abstract level, one associated with the goal of the action. This may be related to the sensory consequences of the movements (Franz, Zelaznik, Swinnen, & Walter, 2001; Mechsner, Kerzel, Knoblich, & Prinz, 2001) or to the manner in which the movements themselves are conceptualized (e.g., as movements to produce trajectories or movements to locations; see Ivry et al., 2004).
The neural locus of bimanual coordination has been the subject of considerable study. Within a traditional hierarchical perspective, the debate has centered on whether certain neural regions are specialized for coordinating the gestures of the two hands. Much of this work has focused on the supplementary motor area, motivated by anatomical, lesion, and neuroimaging evidence involving sequential or rhythmic movements (e.g., Brinkman, 1984; Sadato, Yonekura, Waki, Yamada, & Ishii, 1997). Diedrichsen and colleagues (2006) looked at simpler movements, comparing conditions in which reaching movements were either directly cued (e.g., targets specified by the locations of the stimuli) or symbolically cued (e.g., locations specified by letters indicating target locations). The SMA showed no difference between conditions requiring unimanual or bimanual movements. However, a large parietal region extending along the intraparietal sulcus as well as premotor cortex showed greater activation in the symbolic condition than in the direct condition (figure 44.4B). Interestingly, this activation was much stronger in the left hemisphere than in the right hemisphere, and the magnitude of the activation was greater for bimanual movements that were incongruent (e.g., orthogonal directions) than when they were congruent (e.g., parallel conditions).
In terms of the focus of this chapter, these findings speak to three issues. First, contrary to predictions derived from a traditional anatomical-inspired framework, there was no region, including SMA, that appeared to be specifically sensitive to the contrast of unimanual and bimanual movements. Thus, at least for reaching, the evidence fails to support the hypothesis that there exists a neural region that is specialized for bimanual coordination. Second, manipulation of the task goals did not engage new neural regions but rather led to modulation of the magnitude of neural activity. That is, a similar network was recruited for reaching movements in response to direct and symbolic cues, the activation in these areas being greater in the latter conditions. Thus, similar to our review of reaching and grasping, the representation of the task goal and control
grafton, aziz-zadeh, and ivry: hierarchies and the representation of action 647
achieved. Rather, repetition of the kinematics (e.g., push or pull) led to a reduced BOLD response in the left middle intraparietal sulcus, left lateral occipital cortex, and left superior temporal sulcus.
Taken together, these three experiments support a model of representational hierarchy that distinguishes action means, kinematics, object-centered behavior, and ultimately, action consequences. The decoding of object-centered action appears to be strongly left lateralized, whereas the decoding of more complex action intentions arising as a consequence of the action engaged bilateral frontal-parietal circuits. The bilateral recruitment that is observed in this latter condition is quite different from the relative hierarchies described in the other sections of this chapter. One explanation focuses on perceptual factors. Complex intentions might require more global perceptual analysis (Ivry & Robertson, 1998). An alternative explanation is that the right hemisphere plays a central role in representing more complex action goals. Most studies of action understanding or production focus on simple object-centered actions rather than complex goals and do not address this hypothesis. This explanation is supported by patient studies. As Hartmann, Goldenberg, Daumüller, and Hermsdörfer (2005, p. 625) recently emphasized, “It takes the whole brain to make a cup of coffee.”

Action semantics

How might language semantics fit into the representational hierarchy of motor control? It seems plausible that a word such as hammering could summon the actions associated with this concept. Thus, when one hears the word, an entire action plan would be activated, one composed of various subcomponents: retrieving the required tools, grasping the hammer with one hand and the nail with the other, striking the nail by pounding the hammer. The hypothesis of an interaction between semantic processing and action planning is supported by evidence from various methodologies. For example, adjectives related to object properties have been found to influence movement execution (Gentilucci, Benuzzi, Bertolani, Daprati, & Gangitano, 2000; Gentilucci & Gangitano, 1998; Glover & Dixon, 2002). A subject’s initial grasp kinematics is influenced by seeing the word large or small printed over the target object. Similarly, initial reach kinematics to an object are altered if the word far or near is printed adjacent to the object. These findings indicate that semantic processing, even when not explicitly related to the motor task, influences motor planning. As such, they reveal how language provides another representational system through which motor plans are organized and influenced.
The form of these interactions has been the focus of numerous recent investigations. One hypothesis is that language and motor systems constitute two parallel systems (figure 44.5). Task requirements determine which system is used. For example, consider a task in which a person is asked to imitate gestures. If the gestures are meaningless, then it is thought that imitation must occur via a direct visuomotor route. If the gestures are meaningful, however, then imitation could be achieved either by this direct visuomotor route or by accessing long-term semantic memory (Rumiati & Tessari, 2002; Rumiati et al., 2005). Behavioral studies of imitation of meaningful and meaningless gestures (Tessari & Rumiati, 2004) support the theory that actions can be organized by these two systems.

Figure 44.5 Two routes, one mediated by semantics, for building up a motor representation. (Adapted from Tessari & Rumiati, 2004.)

Is processing in these pathways independent, or do the systems share some common neural substrates? A visuomotor route would seem to involve motor-related areas. To what degree does a semantic route use (some of) the same motor-related brain regions? Does hearing the word hammering directly activate motor-related brain areas? One way to explore this question is to return to the study of action comprehension. If an action-related semantic area is independent of motor-related areas, then comprehension should remain possible if the motor regions are damaged, at least for meaningful actions. In this view, words related to actions are processed by nonmotor, language-related areas; their effect on motor performance is indirect, perhaps occurring via spreading activation to motor regions. Alternatively, if action semantics is intimately linked with motor-based representations, then lesions of the motor regions should disrupt comprehension. That is, in this view, semantic knowledge cannot be separated from the systems that are involved in producing the actions themselves, a form of embodied cognition (e.g., Gallese & Lakoff, 2005). As such, lesions to these areas will affect both action production and action comprehension.
This question has been asked in a number of neuropsychological studies. Some of this work has focused on patients with apraxia, a disorder defined by impairments in the production of gestures that cannot be attributed to problems in the actual control of the effectors. With regard to their motor output, many of these patients appear to have lost their knowledge of action semantics; for example, they are unable to pantomime familiar gestures or use tools. Ideomotor
Figure 44.6 (A) Observation of movements performed by the hand, mouth, or foot was used to localize regions of interest (ROIs) in the premotor cortex. (B) The same participants read phrases related to hand, mouth, or foot actions. (C) The left hemisphere ROI associated with hand action observation was most active when the participants read phrases described as hand-based actions. The ROI associated with mouth action observation was most active for mouth-related phrases; similarly, the region defined by foot action observation was most active in reading of foot-based actions. (D) No significant effects were observed in the right hemisphere ROIs. (Adapted from Aziz-Zadeh et al., 2006.)

ability to retrieve action knowledge, using a task in which the participants matched pictures that depicted related actions. Based on a lesion overlap approach, the highest incidence of impairment was associated with damage to the left premotor/prefrontal cortex, the left parietal region, and the white matter underneath the left posterior middle temporal region. A similar dual pattern of deficit was reported in a study in which aphasic patients were tested for their comprehension of visually or verbally presented actions. Patients with lesions of premotor or parietal areas were impaired on these tasks, although lesions in premotor areas were more predictive of the observed deficits (Saygin, Wilson, Dronkers, & Bates, 2004).
Taken together, these studies indicate that action comprehension deficits can be observed in patients who have damage to areas associated with planning actions, in particular premotor and parietal regions. Consistent with the arguments raised in our earlier discussion of hierarchies, these results further challenge a traditional view in which motor control and language are segregated into separate modules, with the latter occupying a supraordinate position that can access the former. Rather, the evidence is more consistent with an embodied cognition framework, one in which our conceptual knowledge of actions is dependent on the systems that are required to produce actions. In the more extreme form, this embodiment would extend to our linguistic knowledge of actions (Feldman, 2006).

Summary

Hierarchy as a word was first used around 1380 to describe the strict relationship between the three layers of angels (seraphim, cherubim, and thrones) ascending toward heaven. Each was subordinate yet dependent on the lower level. In this chapter, we have argued for the existence of a hierarchy in the human brain for organizing complex motor behavior that, like the angels, carries with it distinct functional dependencies. However, unlike the angels, the anatomy of the motor system and the multitude of solutions for achieving complex behaviors suggest that the supraordinate or subordinate roles played by different layers of functional hierarchy can be readily interchanged.
Hamilton, A. F., & Grafton, S. T. (2008). Action outcomes are represented in human inferior frontoparietal cortex. Cereb. Cortex, 18, 1160–1168.
Hartmann, K., Goldenberg, G., Daumüller, M., & Hermsdörfer, J. (2005). It takes the whole brain to make a cup of coffee: The neuropsychology of naturalistic actions involving technical devices. Neuropsychologia, 43, 625–637.
Hauk, O., Johnsrude, I., & Pulvermuller, F. (2004). Somatotopic representation of action words in human motor and premotor cortex. Neuron, 41, 301–307.
Heilman, K. M., Maher, L. M., Greenwald, M. L., & Rothi, L. J. (1997). Conceptual apraxia from lateralized lesions. Neurology, 49, 457–464.
Heuer, H., Kleinsorge, T., Spijkers, W., & Steglich, W. (2001). Static and phasic cross-talk effects in discrete bimanual reversal movements. J. Motor Behav., 33, 67–85.
Ivry, R. B., Diedrichsen, J., Spencer, R. C. M., Hazeltine, E., & Semjen, A. (2004). A cognitive neuroscience perspective on bimanual coordination. In S. Swinnen & J. Duysens (Eds.), Neurobehavioral determinants of interlimb coordination (pp. 259–295). Boston: Kluwer.
Ivry, R. B., & Robertson, L. C. (1998). The two sides of perception. Cambridge, MA: MIT Press.
Jeannerod, M. (1984). The timing of natural prehension movements. J. Motor Behav., 16, 235–254.
Jeannerod, M. (1986). The formation of finger grip during prehension: A cortically mediated visuomotor pattern. Behav. Brain Res., 19, 99–116.
Jeannerod, M. (1997). The cognitive neuroscience of action. Oxford, UK: Blackwell.
Jeannerod, M., Arbib, M. A., Rizzolatti, G., & Sakata, H. (1995). Grasping objects: The cortical mechanisms of visuomotor transformation. Trends Neurosci., 18, 314–320.
Johnson, S. H., & Grafton, S. T. (2003). From “acting on” to “acting with”: The functional anatomy of object-oriented action schemata. Prog. Brain Res., 142, 127–139.
Karni, A., Meyer, G., Rey-Hipolito, C., Jezzard, P., Adams, M. M., Turner, R., et al. (1998). The acquisition of skilled motor performance: Fast and slow experience-driven changes in primary motor cortex. Proc. Natl. Acad. Sci. USA, 95, 861–868.
Keele, S. W., Cohen, A., & Ivry, R. (1990). Motor programs: Concepts and issues. In M. Jeannerod (Ed.), Attention and performance: Vol. 13. Motor representation and control (pp. 77–110). Hillsdale, NJ: Lawrence Erlbaum.
Kourtzi, Z., & Kanwisher, N. (2000). Cortical regions involved in perceiving object shape. J. Neurosci., 20, 3310–3318.
Lashley, K. S. (1951). The problem of serial order in behavior. In L. A. Jeffress (Ed.), Cerebral mechanisms in behavior (pp. 112–136). New York: Wiley.
Lu, X., & Ashe, J. (2005). Anticipatory activity in primary motor cortex codes memorized movement sequences. Neuron, 45, 967–973.
Matsuzaka, Y., Picard, N., & Strick, P. L. (2007). Skill representation in the primary motor cortex after long-term practice. J. Neurophysiol., 97, 1819–1832.
Mechsner, F., Kerzel, D., Knoblich, G., & Prinz, W. (2001). Perceptual basis of bimanual coordination. Nature, 414, 69–73.
Ochipa, C., Rothi, L. J., & Heilman, K. M. (1989). Ideational apraxia: A deficit in tool selection and use. Ann. Neurol., 25, 190–193.
Rice, N. J., Tunik, E., & Grafton, S. T. (2006). The anterior intraparietal sulcus mediates grasp execution, independent of requirement to update: New insights from transcranial magnetic stimulation. J. Neurosci., 26, 8176–8182.
Rizzolatti, G., & Luppino, G. (2001). The cortical motor system. Neuron, 31, 889–901.
Rizzolatti, G., & Matelli, M. (2003). Two different streams form the dorsal visual system: Anatomy and functions. Exp. Brain Res., 153, 146–157.
Roland, P. E., Larsen, B., Lassen, N. A., & Skinhoj, E. (1980). Supplementary motor area and other cortical areas in organization of voluntary movements in man. J. Neurophysiol., 43, 118–136.
Roland, P. E., Skinhøj, E., Lassen, N. A., & Larsen, B. (1980). Different cortical areas in man in organization of voluntary movements in extrapersonal space. J. Neurophysiol., 43, 137–150.
Rosenbaum, D. A., Meulenbroek, R. G., & Vaughan, J. (2001). Planning reaching and grasping movements: Theoretical premises and practical implications. Motor Control, 5, 99–115.
Rosenbaum, D. A., Vaughan, J., Barnes, H. J., & Jorgensen, M. J. (1992). Time course of movement planning: Selection of handgrips for object manipulation. J. Exp. Psychol. Learn. Mem. Cogn., 18, 1058–1073.
Rumiati, R. I., & Tessari, A. (2002). Imitation of novel and well-known actions: The role of short-term memory. Exp. Brain Res., 142, 425–433.
Rumiati, R. I., Weiss, P. H., Tessari, A., Assmus, A., Zilles, K., Herzog, H., et al. (2005). Common and differential neural mechanisms supporting imitation of meaningful and meaningless actions. J. Cogn. Neurosci., 17, 1420–1431.
Sadato, N., Yonekura, Y., Waki, A., Yamada, H., & Ishii, Y. (1997). Role of the supplementary motor area and the right premotor cortex in the coordination of bimanual finger movements. J. Neurosci., 17, 9667–9674.
Saygin, A. P., Wilson, S. M., Dronkers, N. F., & Bates, E. (2004). Action comprehension in aphasia: Linguistic and non-linguistic deficits and their lesion correlates. Neuropsychologia, 42, 1788–1804.
Schöner, G., & Kelso, J. A. (1988). Dynamic pattern generation in behavioral and neural systems. Science, 239, 1513–1520.
Semjen, A., & Ivry, R. B. (2001). The coupled oscillator model of between-hand coordination in alternate-hand tapping: A reappraisal. J. Exp. Psychol. Hum. Percept. Perform., 27, 251–265.
Spencer, R. C. M., Semjen, A., Yang, S., & Ivry, R. B. (2007). An event-based account of coordination stability. Psychon. Bull. & Rev., 13, 702–710.
Tessari, A., & Rumiati, R. I. (2004). The strategic control of multiple routes in imitation of actions. J. Exp. Psychol. Hum. Percept. Perform., 30, 1107–1116.
Tettamanti, M., Buccino, G., Saccuman, M. C., Gallese, V., Danna, M., Scifo, P., et al. (2005). Listening to action-related sentences activates fronto-parietal motor circuits. J. Cogn. Neurosci., 17, 273–281.
Tranel, D., Kemmerer, D., Damasio, H., Adolphs, R., & Damasio, A. R. (2003). Neural correlates of conceptual knowledge for actions. Cogn. Neuropsychol., 20, 409–432.
Tunik, E., Frey, S. H., & Grafton, S. T. (2005). Virtual lesions of the anterior intraparietal area disrupt goal-dependent on-line adjustments of grasp. Nat. Neurosci., 8, 505–511.
Wolpert, D. M., Ghahramani, Z., & Jordan, M. I. (1995). An internal model for sensorimotor integration. Science, 269, 1880–1882.
Wolpert, D. M., Goodbody, S. J., & Husain, M. (1998). Maintaining internal representations: The role of the human superior parietal lobe. Nat. Neurosci., 1, 529–533.
47 nader
49 kensinger
50 miller
51 schacter, addis, and buckner
Introduction
daniel l. schacter
656 memory
these observations to the idea that memory is a fundamentally constructive process, sometimes prone to errors and illusions. They consider the possibility that the flexible use of information from memory to simulate alternative future scenarios constitutes a key function of a constructive memory system.
The chapters in this section reveal expansions in both the depth and breadth of memory research, which bodes well for the future of the enterprise. We cannot know with any certainty what path memory research will follow in the upcoming years, but we can be confident that it will be exciting to find out.
wendy a. suzuki Center for Neural Science, New York University, New York, New York

abstract Detailed neuroanatomical studies that focused on the connections of the medial temporal lobe in monkeys provided critical clues toward identifying the structures important for normal declarative memory. These structures include the hippocampus together with the surrounding and strongly interconnected entorhinal, perirhinal, and parahippocampal cortices. Detailed anatomical descriptions of the connections of the analogous cortical regions in rats suggest both similarities and differences in the connections of these cortical medial temporal lobe areas across species. In this chapter we will review the quantitative anatomical studies describing the cortical inputs, intrinsic projections, and interconnections of the entorhinal, perirhinal, and parahippocampal cortices in monkeys and rats. A detailed understanding of the cross-species similarities and differences in the anatomical organization of these regions can provide valuable insight into understanding the core mnemonic functions of these areas.

The landmark description by Scoville and Milner (1957) of a group of brain-damaged patients including the well-known amnesic patient HM demonstrated for the first time that bilateral damage limited to the region of the medial temporal lobe in humans resulted in a permanent and devastating memory impairment. Later studies showed that patients with medial temporal lobe damage exhibited a memory loss that was selective for fact and event memory (i.e., declarative memory; Eichenbaum & Cohen, 2001; Gabrieli, 1998; Squire, Knowlton, & Musen, 1993; Squire, 1992). While the original report of Scoville and Milner (1957) first identified the region of the medial temporal lobe as key for memory function, a convergence of systematic anatomical and neurobehavioral studies in animal model systems was critical for identifying which of the structures in the medial temporal lobe, when damaged, were responsible for the severe declarative memory impairment seen in patient HM.
Early experimental lesion (Mishkin, 1978) and neurophysiology studies (O’Keefe & Nadel, 1978) tended to focus on the role of the hippocampus in declarative-like memory. Later studies, however, showed that selective hippocampal lesions in humans (Zola-Morgan, Squire, & Amaral, 1986) resulted in only mild memory impairment relative to the impairment seen in patient HM, a finding suggesting that brain areas beyond the hippocampus may also be involved. The anatomical studies of Amaral and colleagues (Amaral, Insausti, & Cowan, 1987; Insausti, Amaral, & Cowan, 1987a, 1987b) provided critical insight into which other medial temporal lobe areas might be participating in declarative memory. Specifically, their quantitative neuroanatomical studies showed that the monkey entorhinal cortex, the major source of cortical inputs to the hippocampus, received the vast majority of its cortical projections from the surrounding perirhinal and parahippocampal cortices. Further anatomical studies revealed that the perirhinal and parahippocampal cortices (Suzuki & Amaral, 1994a, 1994b) received a powerful convergence of unimodal and polymodal cortical inputs and in this way served as a critical relay for multimodal information into the hippocampal formation (i.e., hippocampus and entorhinal cortex). Taken together, these anatomical insights were critical in focusing attention on the possible mnemonic role of the entorhinal, perirhinal, and parahippocampal cortices. A convergence of subsequent lesion studies (Leonard, Amaral, Squire, & Zola-Morgan, 1995; Suzuki, Zola-Morgan, Squire, & Amaral, 1993; Meunier, Bachevalier, Mishkin, & Murray, 1993; Zola-Morgan, Squire, Amaral, & Suzuki, 1989; Murray &
Figure 45.1 (A) Left: Photograph illustrating the ventral view of the macaque monkey brain showing the locations of the entorhinal (EC), perirhinal (PR), and parahippocampal (PH) cortices surrounding the rhinal sulcus (rs). The shaded region at the level of the dorsal temporal pole corresponds to area 36d. Right: An unfolded representation of the same cortical medial temporal lobe areas shown on the left is illustrated along with major subdivisions. The perirhinal cortex is subdivided into areas 35, 36r, and 36c. The parahippocampal cortex includes areas TFl, TFm, and TH. The monkey entorhinal cortex is subdivided into the olfactory (EO), rostral (ER), lateral (EL), intermediate (EI), caudal (EC), and caudal limited (ECL) subdivisions. The location of area 36d is indicated on the unfolded map but not included within the boundaries of the perirhinal cortex for this chapter (see text for explanation). The stippled region within areas 36r and 36c shows the approximate extent and location of the disputed anterior and lateral borders of the perirhinal cortex. We will use the more anterior and lateral boundaries of the perirhinal cortex as described by Suzuki and Amaral (1994b, 2003a). (B) Left: Photograph illustrating the lateral view of a rat brain showing the locations of the entorhinal (EC), perirhinal (PR), and postrhinal (POR) cortices. Right: An illustration of an unfolded representation of the same cortical areas along with all major subdivisions. As in the monkey, the rat perirhinal cortex is subdivided in areas 35 and 36 while the entorhinal cortex is subdivided into the lateral entorhinal area (LEA) and the medial entorhinal area (MEA). The rat postrhinal cortex has not been subdivided further. Additional abbreviations: PaS, parasubiculum; A, anterior; P, posterior; M, medial; L, lateral.
nal cortex has been further subdivided into area 35, which forms a long and narrow strip of cortex situated in the fundus and lateral bank of the rhinal sulcus, and a larger, more laterally situated area 36. Area 36 has further been subdivided into two major subdivisions (areas 36r and 36c) based on cytoarchitectonic criteria. Two main controversies exist over the borders of the monkey perirhinal cortex. One concerns whether the cortex of the temporal pole adjacent to the rhinal sulcus should also be considered part of the perirhinal cortex (Saleem, Price, & Hashikawa, 2007; Kondo, Saleem, & Price, 2005, 2003; Suzuki & Amaral, 1994a; Insausti et al., 1987a). The disputed regions, illustrated in the unfolded map in
regions of medial area TE converge on all parts of area 36. The second strongest input to area 36 arises from area TF of the adjacent parahippocampal cortex with the strongest inputs originating from the anterior two-thirds of area TF and terminating throughout area 36. Only weak projections are seen from area TH. Moderate projections arise from the visual areas of the ventral bank of the superior temporal sulcus (STSv) that terminate anteriorly in the perirhinal cortex. Weak projections from the polymodal areas of the dorsal bank of the STS (STSd) and area 36d also terminate anteriorly in area 36. Weak projections from orbital frontal areas 11, 12, and 13 (Kondo et al., 2005), insular cortex, and anterior cingulate cortex terminate throughout area 36. In general, area 35 receives a pattern of cortical inputs similar to that of area 36, though it receives a relatively stronger input from the dorsal bank of the superior temporal sulcus (see box 45.1).

Intrinsic projections The perirhinal cortex also exhibits prominent intrinsic projections such that each major perirhinal subdivision (36r, 36c, and 35) has prominent interconnections within that subdivision and moderate projections to the other subdivisions (shading in figure 45.2A; Lavenex, Suzuki, & Amaral, 2004). Given the strong convergence of projections from areas TE and the parahippocampal cortex to all levels of the perirhinal cortex, this observation suggests that
Abbreviations
35 Area 35 of the perirhinal cortex
36d Dorsal division of area 36 of the perirhinal cortex
36r Rostral division of area 36 of the perirhinal cortex
36c Caudal division of area 36 of the perirhinal cortex
EC Caudal division of the entorhinal cortex
ECL Caudal limiting division of the entorhinal cortex
EI Intermediate division of the entorhinal cortex
EL Lateral division of the entorhinal cortex
EO Olfactory division of the entorhinal cortex
A Anterior
ER Entorhinal cortex
IB Intermediate band
L Lateral
LB Lateral band
LEA Lateral entorhinal area
M Medial
MB Medial band
MEA Medial entorhinal area
Motor Motor regions of the frontal cortex
OBF Orbitofrontal cortex
ORB Orbitofrontal cortex
P Posterior
PaS Parasubiculum
PH Parahippocampal cortex
PIR Piriform cortex
POR Postrhinal cortex
PR Perirhinal cortex
rs Rhinal sulcus
RSP Retrosplenial cortex
SS Somatosensory cortical areas
STG Superior temporal gyrus
STSd Dorsal bank of the superior temporal sulcus
STSv Ventral bank of the superior temporal sulcus
TE/TEO Visual areas in the ventral temporal lobe of the monkey
TEv Auditory, somatosensory and visual processing area in the rat temporal lobe
TH Subdivision of the parahippocampal cortex
TFl Lateral subdivision of area TF of the parahippocampal cortex
TFm Medial subdivision of area TF of the parahippocampal cortex
Vis Visual processing areas in the occipital lobe in rats
Figure 45.2 (A) Schematic representation of the topography and strength of cortical inputs to the monkey perirhinal areas 35, 36r, and 36c. The relative strength of the cortical projections is indicated by the size of the lettering, and the locations of the arrows indicate the relative topography of projections throughout the perirhinal cortex. The pattern of intrinsic projections is illustrated by the shading pattern within areas 35, 36r, and 36c. The strongest inputs arise from visual area TE and the adjacent area TF of the parahippocampal cortex that project to all levels of the perirhinal cortex. As indicated by the shading pattern, area 36r projects most strongly to itself and moderately to 36c and 35 and vice versa. (B) Schematic representation of the topography and strength of cortical inputs to the rat perirhinal cortex. All conventions are the same as in Panel A. Area 36 in rats receives its strongest input from area TEv (mainly from areas processing auditory and somatosensory information) with moderate inputs from insula and somatosensory areas (SS). Area 35 receives its strongest inputs from the piriform cortex (PIR) and insular cortex with moderate inputs from area TEv and the orbitofrontal areas of the frontal lobe (ORB). While dorsal/lateral regions of area 36 project mainly to more medial/ventral regions of area 36, the medial regions of area 36 project strongly to area 35 (shading). These intrinsic projections, however, are not strongly reciprocal. Additional abbreviations: Motor, motor areas of the frontal lobe; POR, postrhinal cortex; rs, rhinal sulcus; SS, somatosensory areas; STG, superior temporal gyrus; STSd, dorsal bank of the superior temporal sulcus; STSv, ventral bank of the superior temporal sulcus.
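The qualitative input strengths given in the figure 45.2B caption can be held in a small lookup table, which makes comparisons between areas 35 and 36 easy to query. The following is only a minimal sketch: the dictionary layout, the names `RAT_PERIRHINAL_INPUTS`, `inputs_with_strength`, and `shared_inputs`, and the two strength labels are illustrative assumptions and are not part of the chapter. Only the strongest and moderate inputs named in the caption are encoded.

```python
# Cortical inputs to rat perirhinal areas 35 and 36, transcribed from the
# figure 45.2B caption. Names and layout are hypothetical, chosen only for
# illustration; weaker inputs not named in the caption are omitted.
RAT_PERIRHINAL_INPUTS = {
    "36": {"TEv": "strong", "Insula": "moderate", "SS": "moderate"},
    "35": {"PIR": "strong", "Insula": "strong", "TEv": "moderate", "ORB": "moderate"},
}


def inputs_with_strength(area, label):
    """Return the sorted source regions that project to `area` at strength `label`."""
    return sorted(
        source
        for source, strength in RAT_PERIRHINAL_INPUTS[area].items()
        if strength == label
    )


def shared_inputs(area_a, area_b):
    """Return the sorted source regions that project to both areas at any strength."""
    return sorted(set(RAT_PERIRHINAL_INPUTS[area_a]) & set(RAT_PERIRHINAL_INPUTS[area_b]))


if __name__ == "__main__":
    print(inputs_with_strength("36", "strong"))
    print(inputs_with_strength("35", "strong"))
    print(shared_inputs("35", "36"))
```

Under these assumptions, a query for the strong inputs to each area separates area 36 (TEv) from area 35 (piriform and insular cortex), while the shared-input query shows where the two areas overlap.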
this convergent cortical input is further processed throughout large extents of the perirhinal cortex.

The Rat Perirhinal Cortex

Cortical afferents In contrast to the preponderance of visual object input to the monkey perirhinal cortex, the rat perirhinal cortex is characterized by a strong convergence of inputs from all sensory modalities (Furtak, Wei, Agster, & Burwell, 2007; Burwell & Amaral, 1998a; Deacon et al., 1983). The strongest inputs to area 36 of the perirhinal cortex arise from anterior and ventral temporal association areas known to receive strong projections from somatosensory (anterior TEv) and auditory areas (mid-rostrocaudal levels of TEv), respectively (Burwell & Amaral, 1998a). The projections from area TEv along with weak projections from cingulate and parietal cortex terminate throughout area 36. Weak projections from the postrhinal cortex as well as weak projections from posterior visual areas both terminate most strongly in caudal portions of area 36. Moderate projections are seen from insular cortex and somatosensory cortical areas (Burwell, 2001; Remple, Henry, & Catania, 2003; Shi & Cassell, 1998) that terminate more strongly in rostral portions of area 36. Frontal areas, including both orbitofrontal areas and frontal motor regions together with the piriform cortex, provide weak projections mainly to anterior levels of area 36. In contrast to area 36, area 35 receives its strongest cortical inputs from piriform cortex and insular cortex, with moderate inputs from TEv and orbitofrontal cortex and weak projections from parietal cortex, cingulate cortex, posterior visual areas, and postrhinal cortex. Thus, taken together, the afferent inputs to rat perirhinal areas 35 and 36 are dominated by sensory inputs from the olfactory, somatosensory, and auditory, as well as the visual, modalities (see box 45.1).

Intrinsic projections While the cortical inputs to area 36 of the rat perirhinal cortex tend to exhibit a more prominent rostrocaudal topography, the intrinsic projections of the perirhinal cortex have a clear dorsal-to-ventral gradient (shaded pattern in figure 45.2B; Burwell & Amaral, 1998b). Thus the most dorsal regions of area 36 project mainly to more ventral areas of 36 while the ventral areas of area 36 project strongly to area 35. In contrast, area 35 returns a weaker projection to area 36. Thus area 35 may be the ultimate site of convergence for all sensory modalities within the rat perirhinal cortex.

Summary and comparisons The monkey perirhinal cortex is dominated by high-level visual inputs from area TE as well as prominent polymodal inputs from the parahippocampal cortex. The rat perirhinal cortex, by contrast, receives a much more diverse range of sensory inputs from olfactory, somatosensory, and auditory, as well as visual, modalities. Thus, while monkey perirhinal cortex appears to be specialized in processing visual object information in memory as well as polymodal input from the parahippocampal cortex, the rat perirhinal cortex appears to be poised to integrate information from all sensory modalities in memory.

The Monkey Parahippocampal Cortex

Cortical afferents Like the perirhinal cortex, area TF of the parahippocampal cortex also receives its strongest single input from visual areas (Blatt et al., 2003; Suzuki & Amaral, 1994a), but the visual areas that project to the parahippocampal cortex (mainly areas V4 and TEO; figure 45.3A) are posterior to the visual areas that project to the perirhinal cortex (i.e., area TE; figure 45.2A). Moreover, these projections exhibit a clear medial-to-lateral topography, projecting more strongly to lateral portions of area TF than to medial portions. The next most prominent input to area TF comes from brain areas involved in the so-called dorsal visual processing pathway important for analyzing spatial information (the “where” pathway of Ungerleider & Mishkin, 1982). The most prominent of these dorsal stream inputs arises from the retrosplenial cortex, which projects to all levels of area TF. Similarly, area STSd, also considered a dorsal stream area, provides a moderate projection to all levels of area TF. Weak projections are also seen from the posterior parietal cortex, which terminate laterally in area TFl. Moderate projections also originate from area 36c of the perirhinal cortex and weak projections from area 36r that terminate most strongly in anterior portions of area TF while weak projections from area 36d tend to terminate more medially in area TF. Weak projections are also seen from frontal areas including area 46, orbital and medial prefrontal areas (Kondo et al., 2005), and the insular cortex. Area TH exhibits some differences in its cortical inputs relative to the cortical inputs of area TF. The most striking difference is that area TH receives only sparse input from visual area V4, but similar to area TF, it receives prominent projections from the retrosplenial cortex. Another striking difference is the moderate input from auditory association areas of the STG, which appears to constitute the strongest direct auditory projections to any parahippocampal region. Moderate inputs are seen from STSd, and weak inputs arise from insular cortex, the perirhinal cortex (including area 36d), and similar portions of the frontal lobe that project to area TF (see box 45.1).

Intrinsic projections The intrinsic projections of the parahippocampal cortex (illustrated by the shading in figure 45.3A) parallel the medial-to-lateral topography of the inputs to this region. Thus area TFl (lateral portions of area TF) has the strongest connections with itself, moderate interconnections with area TFm, and only weak projections with area TH. Similarly, area TFm projects most strongly to itself, but
Figure 45.3 (A) Cortical inputs of the monkey parahippocampal cortical areas TFl, TFm, and TH. All conventions are the same as in figure 45.2. The strongest inputs to area TF arise from visual areas V4 and the retrosplenial cortex (RSP) with moderate inputs from visual areas TE/TEO and polymodal inputs from the dorsal bank of the superior temporal sulcus (STSd). Area TH receives its strongest single input from the retrosplenial cortex (RSP), with moderate input from the superior temporal gyrus (STG) and the dorsal bank of the STS (STSd). Intrinsic projections are strongest within each subdivision and progressively weaker to more distantly located subdivisions (shading). (B) The cortical inputs to the rat postrhinal cortex are strongest from primary posterior visual areas (Vis) and area TEv, with moderate inputs from the retrosplenial cortex. No topography of inputs or intrinsic projections is seen. Additional abbreviations: PIR, piriform cortex; SS, somatosensory areas.
moderately with both area TH and area TFl. Finally, like area TFl, area TH projects most strongly with itself, moderately with area TFm, and only weakly with area TFl. Thus, while information arriving only in lateral area TF does not have strong direct interactions with area TH (and vice versa), this information can reach area TH by way of intermediate connections with area TFm.

The Rat Postrhinal Cortex

Cortical afferents Similar to the monkey parahippocampal cortex, the rat postrhinal cortex is dominated by secondary visual inputs from both occipital cortex and temporal lobe visual areas (TEv) as well as from dorsal stream areas including the retrosplenial cortex and parietal cortex. Weak projections are seen from frontal and insular cortices. The perirhinal cortex provides a weak projection to anterior regions of the postrhinal cortex, with the strongest projections arising from area 36. No strong topography of projections to the POR cortex was seen, with all afferent regions projecting to most or all of the postrhinal cortex with the exception of the perirhinal cortex, which tended to project to more anterior regions of the postrhinal cortex. Thus, like the parahippocampal cortex in monkeys, the postrhinal
cortex in rats appears to be a strong site of convergence for both visual and visuospatial input (see box 45.1).

Intrinsic projections The postrhinal cortex does not exhibit any strong topography or polarity in its intrinsic projections. Thus all regions of the postrhinal cortex appear to project to all other regions.

Summary and comparison Striking similarities are seen in the patterns of connections of the monkey parahippocampal cortex and the rat postrhinal cortex. Both areas are dominated by visual input from posterior visual areas together with visuospatial input from so-called dorsal stream areas. More specifically, area TF of the parahippocampal cortex has the strongest resemblance to the postrhinal cortex in rats. In contrast, the two regions differ in that the monkey parahippocampal cortex exhibits a striking topography of inputs and intrinsic connections that is not seen in the rat postrhinal cortex.

The Monkey Entorhinal Cortex

Afferents The cortical inputs of the monkey entorhinal cortex were first studied using anterograde degeneration techniques (Van Hoesen et al., 1975; Van Hoesen & Pandya, 1975a, 1975b) and later using WGA-HRP and fluorescent retrograde tracers (Insausti et al., 1987a). These comprehensive studies showed that the entorhinal cortex is the recipient of prominent input from higher-level polymodal association areas. If one does not include the temporal pole as part of the perirhinal cortex, then the perirhinal and parahippocampal cortices together make up about half of all the cortical input to the monkey entorhinal cortex, with somewhat more than half of that proportion arising from the parahippocampal cortex and the remainder arising from the perirhinal cortex (Table 1 of Insausti et al., 1987a). The cortex of the temporal pole contributes only weak projections to the entorhinal cortex. Moreover, the perirhinal and parahippocampal cortices exhibit a clear topography of projection, with the parahippocampal cortex projecting most prominently to the posterior entorhinal cortex with weaker projections to anterior and lateral regions and the perirhinal cortex projecting most prominently to anterior and lateral portions of the entorhinal cortex with weaker projections more laterally and caudally. Another prominent input comes from the retrosplenial cortex, which, like the parahippocampal cortex, projects most prominently posteriorly in the entorhinal cortex. Weaker projections are seen from the superior temporal gyrus (STG) and STSd, which project posteriorly, and the insular cortex, which projects anteriorly and laterally. Weak projections are also seen from the piriform cortex to the olfactory subdivision of the entorhinal cortex (EO). Weak inputs from visual area TE have also been described projecting anteriorly and laterally in the entorhinal cortex (Mohedano-Moriano et al., 2008, 2007). Insausti and colleagues (Mohedano-Moriano et al., 2007) have highlighted the lateral band of the monkey entorhinal cortex as receiving the strongest convergence of afferent input from widespread cortical areas (see box 45.1).

Intrinsic projections and connections with the hippocampus Given that the lateral half of the entorhinal cortex receives the strongest convergent input, another important question concerns how that convergent information is processed intrinsically within the entorhinal cortex. Chrobak and Amaral (2007) showed that the intrinsic entorhinal connections in the monkey are organized into rostrocaudally oriented bands where each band extends for about half the anterior-posterior extent of the entorhinal cortex (shaded regions in figure 45.4A). There is also a clear mediolateral topography such that two adjacent bands situated end to end cover the lateral portion of the entorhinal cortex, two more bands situated end to end cover the mid-mediolateral portion of the entorhinal cortex, and a single band covers the most rostral and medial entorhinal cortex at the level of area EO. Interestingly, the projections from the perirhinal and parahippocampal cortices terminate in multiple bands spanning the mediolateral extent of the entorhinal cortex. Additional studies in monkeys showed that the three mediolaterally oriented bands in the entorhinal cortex project in a topographic fashion to different anterior-posterior levels of the hippocampus as illustrated in figure 45.4A (Witter, Van Hoesen, & Amaral, 1989). Thus information originating from the perirhinal and parahippocampal cortices is ultimately processed in the posterior half of the hippocampus.

The Rat Entorhinal Cortex

Afferents The rat entorhinal cortex is subdivided into the medial entorhinal area (MEA) and the lateral entorhinal area (LEA), and these two areas have been further separated into lateral, intermediate, and medial bands that project to different septotemporal regions of the dentate gyrus (Dolorfo & Amaral, 1998b). Given this striking and well-described topography, we will summarize the projections to the entorhinal cortex with respect to the different entorhinal subdivisions (MEA and LEA) as well as the different hippocampal projection bands. By far the strongest cortical input to the LEA originates from the piriform cortex, with moderate inputs also arising from the perirhinal cortex, insula, and frontal cortices. Both the piriform and perirhinal cortices project most strongly to the lateral and intermediate bands and more weakly to the medial band. Parietal cortex and the postrhinal cortex provide a weak input to the LEA with the same overall termination pattern as the piriform and perirhinal cortices. The insular input projects most strongly
Figure 45.4 (A) The cortical inputs of the monkey entorhinal cortex arise predominantly from the parahippocampal (PH) and retrosplenial (RSP) cortices that project posteriorly and laterally as well as from the perirhinal cortex (PR) that projects anteriorly and laterally. The shading illustrates three mediolaterally differentiated bands of intrinsic entorhinal projections. The most lateral and middle mediolateral bands (dark and medial dark shades) also exhibit intrinsic connections that are arranged rostrocaudally. The most medial band (lightest shading) has only a single module. These mediolaterally oriented bands also provide a clear topographic projection to different rostro-caudal levels of the hippocampus such that posterior hippocampus receives inputs from the most lateral bands of the entorhinal cortex, mid anterior-posterior hippocampal areas receive projections from the mid-mediolateral bands, and the most anterior portions of the hippocampus receive inputs from the most medially situated entorhinal band. (B) Top: Illustration of the cortical connections of the lateral entorhinal area (LEA) of the rat. Also illustrated are the locations of the lateral (LB), intermediate (IB), and medial band (MB) that represent both the pattern of intrinsic projections that are maintained mainly within a single band as well as the topographic projection to different dorsoventral levels of the dentate gyrus (shown schematically at the right). The LEA region receives its strongest cortical input from piriform cortex, with moderate projections from the perirhinal cortex, insula, and frontal cortices. The arrows in this figure illustrate the relative strength of projections to different bands with relatively stronger projections illustrated as solid lines and relatively weaker projections illustrated with dashed lines. Bottom: Illustration of the cortical inputs to the medial entorhinal area (MEA). All conventions are the same as in the top panel. The strongest cortical input to the MEA originates in the piriform cortex (PIR), with moderate projections seen from the cingulate cortex and posterior visual areas (Vis). Note the weak projections from both the perirhinal (PR) and postrhinal (POR) cortices to the MEA. All additional abbreviations are the same as in figures 45.2 and 45.3.
to the medial band, while the frontal and temporal projections terminate similarly in all three bands. Weak projections from cingulate cortex and visual cortex (Vis) project mainly to the intermediate band.

Similar to the LEA, the most prominent cortical projection of the MEA originates in the piriform cortex, terminating mainly in the medial and intermediate bands. Moderate inputs that arise from the cingulate cortex and visual cortex (Vis) mainly target either the lateral and medial or the lateral band, respectively. Weak inputs are seen from ventral temporal area TEv, and parietal, frontal, and insular cortices. Postrhinal and perirhinal cortices provide only a weak input to the MEA, mainly to the intermediate band (see box 45.1).

subcortical inputs to the rat entorhinal cortex is illustrated in figure 45.5. Similarly, only about half of all afferent inputs to the rat MEA arise from cortical areas. Although a parallel quantification of all cortical and subcortical inputs to the monkey entorhinal cortex has not been done (Insausti et al., 1987a, 1987b), estimations based on the illustrations of Insausti and colleagues (1987b) suggest that subcortical projections make up a much smaller proportion of inputs to the monkey entorhinal cortex compared to the rat entorhinal cortex (figure 45.5). Thus the rat entorhinal cortex not only receives different patterns of cortical inputs but also appears to be influenced much more by its subcortical projections compared to the monkey entorhinal cortex.
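The qualitative strengths of the cortical inputs to the rat LEA and MEA described above can be collected into a small lookup table, which makes the two areas easier to compare side by side. The following is only an illustrative sketch: the dictionary layout, the function names, and the three-level ordinal ranking are assumptions made for this example and are not part of the anatomical data, and "strong" simply stands in for the single strongest (most prominent) input named in the text. The temporal projections to the LEA are left out because the text gives no strength for them.

```python
# Ordinal ranking used only for this illustration (an assumption, not a measured quantity).
STRENGTH_RANK = {"weak": 1, "moderate": 2, "strong": 3}

# Qualitative strength of cortical inputs to the rat entorhinal areas,
# transcribed from the description in the text.
RAT_ENTORHINAL_INPUTS = {
    "LEA": {
        "piriform": "strong",
        "perirhinal": "moderate",
        "insula": "moderate",
        "frontal": "moderate",
        "parietal": "weak",
        "postrhinal": "weak",
        "cingulate": "weak",
        "visual": "weak",
    },
    "MEA": {
        "piriform": "strong",
        "cingulate": "moderate",
        "visual": "moderate",
        "TEv": "weak",
        "parietal": "weak",
        "frontal": "weak",
        "insula": "weak",
        "postrhinal": "weak",
        "perirhinal": "weak",
    },
}


def inputs_with_strength(area, label):
    """Return the alphabetically sorted sources whose input to `area` has the given label."""
    return sorted(src for src, s in RAT_ENTORHINAL_INPUTS[area].items() if s == label)


def strongest_input(area):
    """Return the source with the highest-ranked input to `area`."""
    inputs = RAT_ENTORHINAL_INPUTS[area]
    return max(inputs, key=lambda src: STRENGTH_RANK[inputs[src]])


if __name__ == "__main__":
    for area in ("LEA", "MEA"):
        print(area, "strongest:", strongest_input(area),
              "| moderate:", inputs_with_strength(area, "moderate"))
```

Read this way, the table shows that both areas share the piriform cortex as their most prominent cortical input, while the perirhinal input is moderate to the LEA and only weak to the MEA.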
Figure 45.5 Schematic illustration of the patterns and relative strength of inputs to the parahippocampal region in monkeys and rats. Similarities are seen in the general patterns of inputs to the perirhinal and parahippocampal/postrhinal cortices in monkeys and rats. However, more striking differences are noted in the patterns of inputs to the entorhinal cortex. To better illustrate one of the key differences, we show the relative strength of the cortical and subcortical projections of the monkey and rat entorhinal cortex, with the relative thickness of the arrows illustrating the relative strength of projections (see box 45.1 for a description of the calculation of the relative strength of cortical inputs). All quantitative data from the rat entorhinal inputs taken from Kerr et al. (2007). However, the weak projections from subcortical regions to the monkey entorhinal cortex are only estimations since similar quantitative comparisons have not been published. Note that the subcortical inputs to the perirhinal (PR) and parahippocampal (PH) or postrhinal (POR) cortices are not illustrated. Additional abbreviations: D. thalamus, dorsal thalamus; RSP, retrosplenial.
auditory). It will be of interest to compare and contrast the full range of mnemonic signals seen in the monkey and rat perirhinal cortex.

Perhaps the most striking cross-species similarities in cortical afferent inputs are seen in the parahippocampal and postrhinal cortices (figure 45.5). These areas in both monkeys and rats receive prominent visual inputs as well as visuospatial inputs from dorsal stream structures including the retrosplenial and parietal cortices. Consistent with these anatomical inputs, the parahippocampal cortex in both monkeys and humans has most commonly been associated with spatial memory functions (Malkova & Mishkin, 2003; Burgess, Maguire, Spiers, & O’Keefe, 2001; Bohbot, Allen, & Nadel, 2000; Johnsrude, Owen, Crane, Milner, & Evans, 1999; Maguire, Frackowiak, & Frith, 1997; Aguirre, Detre, Alsop, & D’Esposito, 1996), and in humans it has also been associated with episodic memory (Squire, Stark, & Clark, 2004; Ranganath et al., 2004; Schacter & Wagner, 1999). A growing body of lesion studies in rats suggests a role of the postrhinal cortex in memory for context (Eacott & Easton, 2007; Burwell et al., 2004; Bucci et al., 2002, 2000). However, reports on the contribution of the postrhinal cortex to spatial memory as measured by water maze tasks have been mixed (Burwell et al., 2004; Liu & Bilkey, 2002). Consistent with these lesion studies in rats, the important role of the human parahippocampal cortex in processing contextual associations has recently been highlighted in a series of fMRI studies. Bar and colleagues (Bar, Aminoff, & Ishai, 2008; Aminoff, Gronau, & Bar, 2007; Bar & Aminoff, 2003) report that the parahippocampal cortex is activated in response to objects highly associated with particular contexts (e.g., a traffic light) irrespective of whether the context is spatial (e.g., swings associated with a playground) or nonspatial (e.g., birthday cake associated with a birthday party). Based on these findings, this group has proposed a contextual associative theory of parahippocampal function whereby the parahippocampal cortex is thought to mediate contextual associative processing, an important component of both spatial memory and episodic memory. It will be fascinating to test this contextual associative theory of parahippocampal functions in both monkeys and rats. Neuroanatomical data support the idea that contextual signals could be seen across both the monkey parahippocampal cortex and the rat postrhinal cortex.

While the parahippocampal/postrhinal cortices in monkeys and rats exhibit clear similarities in their cortical inputs, more striking differences are seen in both the pattern and relative strength of cortical inputs to the entorhinal cortex (figure 45.5). For example, monkey entorhinal inputs are dominated by the prominent projections from the perirhinal and parahippocampal cortices together with strong inputs from the retrosplenial cortex. In contrast, the entorhinal cortex in rats receives its strongest cortical inputs from piriform cortex with moderate inputs from the perirhinal cortex (specifically to LEA), insula, frontal cortex, cingulate cortex, and visual cortical areas. Postrhinal cortex only projects weakly to the entorhinal cortex (figure 45.4B). The differences in the patterns of inputs are even more striking if one takes into account that while the monkey entorhinal cortex appears to receive the majority of its inputs from other unimodal and multimodal cortical areas, the rat LEA and MEA receive only about half of all their inputs from cortical structures, with the remaining half arising from subcortical structures (figure 45.5).

Despite the differences in the details of the anatomical connections, there remain intriguing parallels between the rat and monkey entorhinal cortex. For example, it is clear from recent physiological studies focused on the rat entorhinal cortex that cells in LEA and MEA process distinct kinds of information, with the MEA contributing to spatial information, including the striking grid cells specific for the MEA (Hafting et al., 2008; Fyhn et al., 2007; Hafting et al., 2005; Hargreaves, Rao, Lee, & Knierim, 2005; Fyhn et al., 2004). Though recent evidence suggests that cells in LEA are not spatial (Hargreaves et al., 2005), the nature of the input that maximally activates these cells has yet to be identified. The spatial versus nonspatial dissociation in the MEA and LEA in rats parallels the anatomical data in monkeys that the posterior entorhinal cortex receives projections from the strongly visuospatial parahippocampal cortex while the anterior entorhinal cortex receives its strongest projections from the visual-object-processing areas of the perirhinal cortex. These striking topographic projections suggest an anterior-posterior differentiation in object and spatial memory functions, respectively, of the monkey entorhinal cortex (Suzuki & Amaral, 1994b). Indeed, one physiology study attempted to test this anatomical prediction directly by examining responses in the monkey entorhinal cortex during an object version of the delayed-match-to-sample task as well as a spatial version of the same task (Suzuki, Miller, & Desimone, 1997). While neurons throughout the entorhinal cortex signaled memory for both the object and spatial versions of the task, no rostrocaudal topography was found. Because the delayed-match-to-place task required only egocentric and not allocentric spatial memory strategies, this task may not have engaged the particular form of spatial or contextual memory processed by the entorhinal cortex.

The anatomical observations in the monkey entorhinal cortex taken together with the physiological findings from the rat MEA and LEA suggest the possibility that despite the differences in the patterns of inputs, similarities in the functional organization of the entorhinal cortex across species may be present. The entorhinal cortex is one of the least studied medial temporal lobe areas in the monkey, and given the renewed interest in this structure with the discovery of grid cells in the rat MEA, it will be important to continue to explore the functions of the monkey entorhinal cortex. It will also be critical to define any mnemonic role the grid cells may play in the processing of spatial information. Could there be grid cells in the monkey entorhinal cortex, or will they look more like spatial context cells? How will these entorhinal cells in both monkeys and rats participate in memory? Only further studies will tell.

REFERENCES

Aguirre, G. K., Detre, J. A., Alsop, D. C., & D’Esposito, M. (1996). The parahippocampus subserves topographical learning in man. Cereb. Cortex, 6, 823–829.
Amaral, D. G., Insausti, R., & Cowan, W. M. (1987). The entorhinal cortex of the monkey. I. Cytoarchitectonic organization. J. Comp. Neurol., 264, 326–355.
Aminoff, E., Gronau, N., & Bar, M. (2007). The parahippocampal cortex mediates spatial and nonspatial associations. Cereb. Cortex, 17, 1493–1503.
Bar, M., & Aminoff, E. (2003). Cortical analysis of visual context. Neuron, 38, 347–358.
Bar, M., Aminoff, E., & Ishai, A. (2008). Famous faces activate contextual associations in the parahippocampal cortex. Cereb. Cortex, 18, 1233–1238.
Barker, G. R., Bird, F., Alexander, V., & Warburton, E. C. (2007). Recognition memory for objects, place, and temporal order: A disconnection analysis of the role of the medial prefrontal cortex and perirhinal cortex. J. Neurosci., 27, 2948–2957.
Barker, G. R., & Warburton, E. C. (2008). NMDA receptor plasticity in the perirhinal and prefrontal cortices is crucial for the acquisition of long-term object-in-place associative memory. J. Neurosci., 28, 2837–2844.
Baylis, G. C., & Rolls, E. T. (1987). Responses of neurons in the inferior temporal cortex in short term and serial recognition memory tasks. Exp. Brain Res., 65, 614–622.
Blackstad, T. W. (1956). Commissural connections of the hippocampal region in the rat, with special reference to their mode of termination. J. Comp. Neurol., 105, 417–537.
Blatt, G. J., Pandya, D. N., & Rosene, D. L. (2003). Parcellation of cortical afferents to three distinct sectors in the parahippocampal gyrus of the rhesus monkey: An anatomical and neurophysiological study. J. Comp. Neurol., 466, 161–179.
Bohbot, V. D., Allen, J. J., & Nadel, L. (2000). Memory deficits characterized by patterns of lesions to the hippocampus and parahippocampal cortex. Ann. NY Acad. Sci., 911, 355–368.
Brown, M. W., Wilson, F. A. W., & Riches, I. P. (1987). Neuronal evidence that inferomedial temporal cortex is more important than hippocampus in certain processes underlying recognition memory. Brain Res., 409, 158–162.
Brown, M. W., & Xiang, J. Z. (1998). Recognition memory: Neuronal substrates of the judgement of prior occurrence. Prog. Neurobiol., 55, 149–189.
Bucci, D. J., Phillips, R. G., & Burwell, R. D. (2000). Contributions of postrhinal and perirhinal cortex to contextual information processing. Behav. Neurosci., 114, 882–894.
Bucci, D. J., Saddoris, M. P., & Burwell, R. D. (2002). Contextual fear discrimination is impaired by damage to the postrhinal or perirhinal cortex. Behav. Neurosci., 116, 479–488.
Malkova, L., & Mishkin, M. (2003). One-trial memory for object-place associations after separate lesions of hippocampus and posterior parahippocampal region in the monkey. J. Neurosci., 23, 1956–1965.
Martin-Elkins, C. L., & Horel, J. A. (1992). Cortical afferents to behaviorally defined regions of the inferior temporal and parahippocampal gyri as demonstrated by WGA-HRP. J. Comp. Neurol., 321, 177–192.
Meunier, M., Bachevalier, J., Mishkin, M., & Murray, E. A. (1993). Effects on visual recognition of combined and separate ablations of the entorhinal and perirhinal cortex in rhesus monkeys. J. Neurosci., 13, 5418–5432.
Miller, E. K., Li, L., & Desimone, R. (1991). A neural mechanism for working and recognition memory in inferior temporal cortex. Science, 254, 1377–1379.
Miller, E. K., Li, L., & Desimone, R. (1993). Activity of neurons in anterior inferior temporal cortex during a short-term memory task. J. Neurosci., 13, 1460–1478.
Mishkin, M. (1978). Memory in monkeys severely impaired by combined but not by separate removal of amygdala and hippocampus. Nature, 273, 297–298.
Mohedano-Moriano, A., Martinez-Marcos, A., Pro-Sistiaga, P., Blaizot, X., Arroyo-Jimenez, M. M., Marcos, P., Artacho-Perula, E., & Insausti, R. (2008). Convergence of unimodal and polymodal sensory input to the entorhinal cortex in the fascicularis monkey. Neuroscience, 151, 255–271.
Mohedano-Moriano, A., Pro-Sistiaga, P., Arroyo-Jimenez, M. M., Artacho-Perula, E., Insausti, A. M., Marcos, P., Cebada-Sanchez, S., Martinez-Ruiz, J., Munoz, M., Blaizot, X., Martinez-Marcos, A., Amaral, D. G., & Insausti, R. (2007). Topographical and laminar distribution of cortical input to the monkey entorhinal cortex. J. Anat., 211, 250–260.
Mumby, D. G., & Pinel, J. P. J. (1994). Rhinal cortex lesions and object recognition in rats. Behav. Neurosci., 108, 11–18.
Mumby, D. G., Piterkin, P., Lecluse, V., & Lehmann, H. (2007). Perirhinal cortex damage and anterograde object-recognition in rats after long retention intervals. Behav. Brain Res., 185, 82–87.
Murray, E. A., Gaffan, D., & Mishkin, M. (1993). Neural substrates of visual stimulus-stimulus association in rhesus monkeys. J. Neurosci., 13, 4549–4561.
Murray, E. A., & Mishkin, M. (1986). Visual recognition in monkeys following rhinal cortical ablations combined with either amygdalectomy or hippocampectomy. J. Neurosci., 6, 1991–2003.
Naya, Y., Sakai, K., & Miyashita, Y. (1996). Activity of primate inferotemporal neurons related to a sought target in pair-association task. Proc. Natl. Acad. Sci. USA, 93, 2664–2669.
Naya, Y., Yoshida, M., & Miyashita, Y. (2003). Forward processing of long-term associative memory in monkey inferotemporal cortex. J. Neurosci., 23, 2861–2871.
O’Keefe, J., & Nadel, L. (1978). The hippocampus as a cognitive map. New York: Oxford University Press.
Ranganath, C., Yonelinas, A. P., Cohen, M. X., Dy, C. J., Tom, S. M., & D’Esposito, M. (2004). Dissociable correlates of recollection and familiarity within the medial temporal lobes. Neuropsychologia, 42, 2–13.
Remple, M. S., Henry, E. C., & Catania, K. C. (2003). Organization of somatosensory cortex in the laboratory rat (Rattus norvegicus): Evidence for two lateral areas joined at the representation of the teeth. J. Comp. Neurol., 467, 105–118.
Riches, I. P., Wilson, F. A., & Brown, M. W. (1991). The effects of visual stimulation and memory on neurons of the hippocampal formation and the neighboring parahippocampal gyrus and inferior temporal cortex of the primate. J. Neurosci., 11, 1763–1779.
Rockland, K. S., Saleem, K. S., & Tanaka, K. (1994). Divergent feedback connections from areas V4 and TEO in the macaque. Visual Neurosci., 11, 579–600.
Rolls, E. T., & Xiang, J. Z. (2005). Reward-spatial view representations and learning in the primate hippocampus. J. Neurosci., 25, 6167–6174.
Sakai, K., & Miyashita, Y. (1991). Neural organization for the long-term memory of paired associates. Nature, 354, 152–155.
Saleem, K. S., Kondo, H., & Price, J. L. (2008). Complementary circuits connecting the orbital and medial prefrontal networks with the temporal, insular, and opercular cortex in the macaque monkey. J. Comp. Neurol., 506, 659–693.
Saleem, K. S., Price, J. L., & Hashikawa, T. (2007). Cytoarchitectonic and chemoarchitectonic subdivisions of the perirhinal and parahippocampal cortices in macaque monkeys. J. Comp. Neurol., 500, 973–1006.
Saleem, K. S., & Tanaka, K. (1996). Divergent projections from the anterior inferotemporal area TE to the perirhinal and entorhinal cortices in the macaque monkey. J. Neurosci., 16, 4757–4775.
Schacter, D. L., & Wagner, A. D. (1999). Medial temporal lobe activations in fMRI and PET studies of episodic encoding and retrieval. Hippocampus, 9, 7–24.
Scoville, W. B., & Milner, B. (1957). Loss of recent memory after bilateral hippocampal lesions. J. Neurol. Neurosurg. Psychiatry, 20, 11–21.
Seltzer, B., & Pandya, D. N. (1976). Some cortical projections to the parahippocampal area in the rhesus monkey. Exp. Neurol., 50, 146–160.
Shapiro, M. L., Tanila, H., & Eichenbaum, H. (1997). Cues that hippocampal place cells encode: Dynamic and hierarchical representation of local and distal stimuli. Hippocampus, 7, 624–642.
Shi, C. J., & Cassell, M. D. (1998). Cascade projections from somatosensory cortex to the rat basolateral amygdala via the parietal insular cortex. J. Comp. Neurol., 399, 469–491.
Squire, L. R. (1992). Memory and the hippocampus: A synthesis from findings with rats, monkeys, and humans. Psychol. Rev., 99, 195–231. [Erratum, Psychol. Rev., 99(3), 582.]
Squire, L. R., Knowlton, B., & Musen, G. (1993). The structure and organization of memory. Annu. Rev. Psychol., 44, 453–495.
Squire, L. R., Stark, C. E., & Clark, R. E. (2004). The medial temporal lobe. Annu. Rev. Neurosci., 27, 279–306.
Stefanacci, L., Buffalo, E. A., Schmolck, H., & Squire, L. R. (2000). Profound amnesia after damage to the medial temporal lobe: A neuroanatomical and neuropsychological profile of patient E.P. J. Neurosci., 20, 7024–7036.
Suzuki, W. A., & Amaral, D. G. (1994a). Perirhinal and parahippocampal cortices of the macaque monkey: Cortical afferents. J. Comp. Neurol., 350, 497–533.
Suzuki, W. A., & Amaral, D. G. (1994b). Topographic organization of the reciprocal connections between monkey entorhinal cortex and the perirhinal and parahippocampal cortices. J. Neurosci., 14, 1856–1877.
Suzuki, W. A., & Amaral, D. G. (2003a). The perirhinal and parahippocampal cortices of the macaque monkey:
46 Medial Temporal Lobe Function and Human Memory

Yael Shrager and Larry R. Squire

Abstract The hippocampus and anatomically related structures in the medial temporal lobe support the capacity for conscious recollection (declarative memory). This chapter considers a number of topics that have been prominent in recent discussions of medial temporal lobe function: visual perception, working memory, habit learning, recollection and familiarity, path integration, remote memory, and conscious awareness.

lection. The different forms of nondeclarative memory are supported by specific brain systems outside of the medial temporal lobe memory system (Eichenbaum & Cohen, 2001) (figure 46.1B).

Intact visual perception

Memory-impaired patients with medial temporal lobe
The importance of the medial temporal lobe for memory damage have consistently exhibited intact intellectual and
was established in 1957 when Brenda Milner described the perceptual functions. Thus the ability to acquire new memo-
profound effects of medial temporal lobe resection on ries appears to be a distinct cerebral function, independent
memory in a patient who became known as HM (Scoville of other perceptual and cognitive functions. This fundamen-
& Milner, 1957; Squire, 2009). Subsequently, animal models tal principle of brain organization has been revisited recently,
of human memory impairment identified the anatomical as there has been interest in the possibility that medial tem-
structures within the medial temporal lobe that are impor- poral lobe structures might be involved in visual perception
tant for understanding HM’s memory impairment: the hip- in addition to memory. Initially the focus was on perirhinal
pocampal region (hippocampus proper, dentate gyrus, and cortex. Whereas some experimental studies with monkeys
subicular complex) and the perirhinal, entorhinal, and para- underscored the role of perirhinal cortex in memory and not
hippocampal cortices. These structures comprise the medial visual perception (Buffalo et al., 1999; Hampton & Murray,
temporal lobe memory system (Lavenex & Amaral, 2000; 2002), others have implicated a role for the perirhinal cortex
Squire & Zola-Morgan, 1991) (figure 46.1A). in visual perception (Buckley, Booth, Rolls, & Gaffan, 2001;
Medial temporal lobe damage impairs only declarative Buckley & Gaffan, 1998; Bussey & Saksida, 2002; Bussey,
memory (Schacter & Tulving, 1994; Squire, 1992). Declara- Saksida, & Murray, 2003; Murray & Bussey, 1999). Yet it is
tive memory refers to the capacity to recollect facts and difficult to test experimental animals for the ability to identify
events. Its contents are thought to be accessible to conscious visual stimuli independent of the ability to learn about them,
recollection. The stored representations are flexible and can and it has been pointed out that impairments in monkeys
guide successful performance in a wide range of conditions. that have been attributed to a perceptual deficit could have
Declarative memory can be contrasted with nondeclarative resulted from impaired learning (Hampton, 2005).
memory, a collection of memory abilities including skills A distinction between perception and learning can be
and habits, simple forms of conditioning, priming, and drawn easily in studies of humans because humans can be
other instances where experience changes how we interact instructed about the requirements of the task. A number of
with the world. Nondeclarative memory occurs as modifica- studies of patients with medial temporal lobe lesions have
tions within specialized performance systems, and what is found intact perceptual abilities (Holdstock, Gutnikov,
learned is expressed through performance rather than recol- Gaffan, & Mayes, 2000; Levy, Shrager, & Squire, 2005;
Stark & Squire, 2000). Yet some work in humans found that
yael shrager Department of Neurosciences, University of a group of memory-impaired patients with damage report-
California San Diego, La Jolla, California. Now at Department of edly involving either the hippocampus, or the hippocampus
Psychology, Harvard University and Howard Hughes Medical plus additional medial temporal lobe structures, were
Institute
impaired on tests of perceptual abilities that involved diffi-
larry r. squire Veterans Affairs Healthcare System, San
Diego, California; Department of Psychiatry, Department of cult-to-discriminate faces, objects, and scenes (Lee, Buckley,
Neurosciences, Department of Psychology, University of et al., 2005; Lee, Bussey, et al., 2005). This newer work,
California, San Diego, La Jolla, California which involved rather complex visual stimuli, raised the
shrager and squire: medial temporal lobe function and human memory 675
A
Figure 46.1 (A) A schematic view of the medial temporal lobe term memory systems. The taxonomy lists the brain structures
memory system for declarative memory, which is composed of the thought to be especially important for each form of declarative
hippocampal region together with the perirhinal, entorhinal, and and nondeclarative memory. In addition to its central role in
parahippocampal cortices. (From Manns & Squire, 2002.) The emotional learning, the amygdala is able to modulate the strength
hippocampal region is composed of the dentate gyrus (DG), the CA of both declarative and nondeclarative memory. (From Squire &
fields, and the subiculum (S). (B) A taxonomy of mammalian long- Knowlton, 2000.)
possibility that appropriate tests can reveal perceptual deficits distinct source images, so that the stimuli presented on con-
that had not been detected by conventional tests of visual secutive trials were derived from the same pair of source
perception (Lee, Barense, & Graham, 2005). These new find- images and were therefore quite similar to one another.
ings therefore challenge the long-standing idea that memory Accordingly, the question arises whether memory for pre-
impairment can occur as a circumscribed disorder. vious trials could benefit test performance. To test visual
Some issues arise in interpreting these studies. First, one perception without testing memory ability, it would be
wonders if additional damage outside of the medial temporal advantageous to use unique stimuli on each trial.
lobe could underlie the visual perceptual deficits (discussed These issues were explored in a recent study of six
in the following paragraphs). Second, in these particular memory-impaired patients with well-characterized lesions
studies, the stimuli were created by morphing together two (Shrager, Gold, Hopkins, & Squire, 2006). Two of these
patients (EP and GP) are severely amnesic and have large
bilateral lesions of the medial temporal lobe resulting
from herpes simplex encephalitis. Both patients have exten-
sive, virtually complete bilateral damage to the hippocam-
pus, amygdala, entorhinal cortex, and perirhinal cortex, as
well as the majority of the parahippocampal cortex. Four
of the patients have damage thought to be limited to the
hippocampus.
The six patients and eight matched controls were tested
with morphed grayscale images from three categories
(faces, objects, and scenes), similar to those used in the earlier
work that reported impairment (Lee, Bussey, et al., 2005).
The morphed images were created by gradually morphing
one distinct grayscale image into another (e.g., one hat
into a different hat or a lemon into a tennis ball) across a
100-step series.
In one experiment, three images were presented on each
trial (figure 46.2A). Two morphed images were presented
below one of the distinct images from which the morphed
images were derived, and participants were asked to indicate
which of the two morphed images was more similar to the
distinct image. Critically, on each trial, each pair of morphed
images was derived from a unique pair of distinct images. Thus
participants could not benefit from their memory of images they had
seen in previous trials. All patients performed as well as controls in
all three stimulus categories (faces, objects, and scenes).

On each trial in another experiment, a target image, chosen from the
100-step morphed-image series, was presented at the top of the screen
(figure 46.2B). In addition, a single image from the same series was
presented below the target image. Participants were asked to match the
lower image to the target by scrolling through the ordered series of
100 morphed images, viewing only one image at a time, and to select
the image that was identical to the target. Performance was scored as
the number of image steps between the image that was selected and the
target image (thus lower scores indicate better performance). All
patients performed as well as controls in all three stimulus
categories.

Figure 46.2 (A) Trial-unique visual discrimination. On each of 120
unique trials, two morphed images were presented below a single
distinct image. Participants were asked to choose the lower image
(here identified by a +) that appeared more similar to the upper
image. (B) Visual matching. On each of 45 unique trials, a target
image was presented above a single image. Both images were derived
from a unique pair of distinct images (01 and 100). In the case
illustrated, the target image is image number 63 in the 100-image
series, and the bottom image is image number 51 from the same series.
Participants were asked to scroll through the ordered series of 100
images to find the image that matched the target image. (From Shrager,
Gold, Hopkins, & Squire, 2006.)

Aside from the possible importance of trial-unique stimuli, it is
possible that differences in the patient groups might explain the
discrepancy between the findings of Shrager and colleagues (2006) and
the findings of Lee and colleagues (Lee, Buckley, et al., 2005; Lee,
Bussey, et al., 2005), as well as related findings (Graham et al.,
2006). The lesions in the patients studied by Lee and colleagues (Lee,
Buckley, et al., 2005; Lee, Bussey, et al., 2005) were characterized
by visual ratings of magnetic resonance images (the ratings were made
on a 4- or 5-point scale). These ratings, based on visual inspection,
are not the same as quantitative brain measurements. Also, the ratings
given for each patient were based on a single coronal section for each
structure of interest in the medial and lateral temporal lobe, leaving
a considerable amount of tissue unexamined. Furthermore, even by these
assessments, the damage in some patients extended beyond the brain
structures that defined the groups. Without thorough, quantitative
assessment of the lesions, the possibility remains that there is
additional damage in the patients and that such damage might underlie
the visual perceptual deficits that were observed.

In contrast, the lesions of the patients in Shrager and colleagues
(2006) were rigorously measured using quantitative volumetric analysis
of magnetic resonance images (Bayley, Gold, Hopkins, & Squire, 2005;
Gold & Squire, 2005). For each patient, approximately 60 sections were
measured in 1-mm intervals rostrocaudally through the medial and
lateral temporal lobes. The measurements were taken in every section
in which a structure of interest was present. In addition, volumes
were calculated for the insular cortex, the fusiform gyrus, and the
frontal, parietal, and occipital lobes.

Over the past 40 years, numerous studies of memory-impaired patients
with lesions of the medial temporal lobe
have found visual perceptual function to be intact (Corkin, 1984; Levy
et al., 2005; Milner, Corkin, & Teuber, 1968; Stark & Squire, 2000).
It was this early work that led to the principle that memory can be
severely impaired without impairing other intellectual or perceptual
functions. More recently, visual perception has been challenged with
newer, more difficult tasks than had been used previously. This new
work provides additional support for the principle that memory
impairment can occur in the absence of impaired visual perception.

Working memory and brain systems

Working memory refers to the capacity to maintain temporarily a
limited amount of information in mind. This information can then be
used to support various cognitive abilities, including learning and
reasoning (Baddeley & Hitch, 1974). Amnesic patients with damage to
the medial temporal lobe have consistently exhibited intact working
memory despite grave impairment in long-term memory (Drachman & Arbit,
1966; Milner, 1972). Thus working memory has been thought to be
independent of the medial temporal lobe and has come to be defined as
a kind of memory that is spared in patients with medial temporal lobe
damage (Atkinson & Shiffrin, 1968; Milner, 1972; Pashler & Carrier,
1996).

These ideas have been challenged recently by the proposal that working
memory might sometimes depend on medial temporal lobe structures.
Specifically, patients with medial temporal lobe damage were found to
be impaired at remembering information across brief time intervals
(Hannula, Tranel, & Cohen, 2006; Hartley et al., 2007; Nichols, Kao,
Verfaellie, & Gabrieli, 2006; Olson, Moore, Stark, & Chatterjee, 2006;
Olson, Page, Moore, Chatterjee, & Verfaellie, 2006). The
interpretation that these impairments result from impaired working
memory would require a revision of the long-standing principle that
working memory is separable from long-term memory and is independent
of the medial temporal lobe. Yet it is possible that the impairments
might have occurred because the capacity for working memory was
exceeded and performance in these cases depended on long-term memory.
This possibility draws attention to the fact that there is a
circularity in the way that working memory is often defined. Working
memory has been characterized as a kind of memory that is spared in
amnesia, but amnesia is traditionally characterized as a condition in
which working memory is intact. It would be useful to have a method
for identifying and measuring working memory that is independent of
the performance of amnesic patients.

A recent study used distraction between study and test to measure
working memory in controls and also tested the performance of amnesic
patients (Shrager, Levy, Hopkins, & Squire, 2008). Amnesic patients
with medial temporal lobe damage (EP, GP, and six patients with damage
limited to the hippocampus) and controls were tested across short
delays in four different tasks. Next, the effect of distraction on
control performance was tested in the same tasks. The reasoning was as
follows: If amnesic patients perform well on tasks when they can
operate within working-memory capacity (i.e., by active maintenance),
then controls given the same tasks should be impaired when distraction
is interposed between study and test because distraction should
disrupt the active maintenance process. Conversely, if amnesic
patients perform poorly when their working-memory capacity is
exceeded, then controls given the same tasks should be minimally
affected by distraction between study and test (because performance is
now supported more by long-term memory than by active maintenance).

Memory was first tested for names and faces in the patients and their
controls. Participants studied either three names presented one at a
time or a single face. After a 14-second delay, memory was tested with
a single probe stimulus, and participants indicated whether the probe
stimulus (a name in the names test and a face in the faces test) had
just been presented in the study phase. The patients performed as well
as controls in the names test (patients scored 94.4%, and controls
scored 94.5% correct), and they were impaired in the faces test
(patients scored 93.2%, and controls scored 98.0% correct) (figure
46.3A).

The question of interest was whether the impairment in the faces test
resulted from a working-memory deficit or a long-term-memory deficit.
Accordingly, the effect of distraction on control performance was
tested in both the names and faces tests. Controls were again asked to
study either three names or a single face. During the delay, on half
the trials, controls were distracted. Control performance was impaired
by distraction in the names condition (96.4% versus 87.5% correct) but
not in the faces condition (96.4% correct for the no-distraction
condition versus 95.3% correct for the distraction condition) (figure
46.3B). Performance on the distracter (counting) tasks was comparable
in the names and faces tests.

These results revealed a correspondence between the performance of
amnesic patients and the effect of distraction on controls.
Distraction impaired controls on the names test, presumably because
the distraction interfered with an active maintenance process that is
based on rehearsal. Distraction did not affect performance on the
faces test, presumably because the information is difficult to
maintain actively (rehearse) and must depend on long-term memory
shortly after the information is presented (Warrington & Taylor,
1973). We suggest that amnesic patients were intact when task
performance was supported by rehearsal (working memory for names) but
were impaired when rehearsal was less effective and performance had to
depend on long-term memory (in the case of faces). The same finding
was obtained in a related set of experiments that tested memory for
objects
and object locations (a form of relational memory) (Shrager, Levy,
et al., 2008). Together, the findings support a brain-based
distinction between working memory and long-term memory, as well as
the idea that working memory is independent of medial temporal lobe
structures.

Figure 46.3 (A) Percent correct scores for controls (CON) and patients
with medial temporal lobe lesions (MTL) when asked to remember either
three surnames or a single face for 14 seconds. (B) Percent correct
scores for controls on trials with and without distraction when asked
to remember three surnames or a single face for 14 seconds. Error bars
indicate standard error. Asterisks indicate p < 0.05. (From Shrager,
Levy, Hopkins, & Squire, 2008.)

Habit learning

Some tasks are acquired by humans as declarative knowledge through
memorization but nevertheless can be acquired nondeclaratively by
experimental animals. On such tasks, amnesic patients with medial
temporal lobe lesions perform poorly, whereas monkeys with medial
temporal lobe lesions acquire the task at the rate of unoperated
monkeys. These findings raise the question whether patients with
profound amnesia, with no capacity for declarative memory, could
acquire such a task nondeclaratively in the way that the monkey learns
it. If so, is the learning done consciously or unconsciously? Or is it
the case that, in humans, one memory system cannot readily substitute
for another? Perhaps in humans, some forms of nondeclarative memory
are not well developed, or perhaps a capacity for nondeclarative
learning is easily overridden by the tendency to engage a conscious
declarative memory strategy.

One example of a task that is approached differently by humans and
nonhuman primates is concurrent discrimination learning, a standard
task for studying mammalian memory for more than 50 years. In a common
version of this task, eight pairs of objects are presented five times
each day, one pair at a time in a mixed order, totaling 40 trials each
day. One object in each pair is always correct, and a choice of the
correct object results in a reward. Humans readily learn this task
after one or two days of training, scoring about 90% correct. The task
ordinarily depends on declarative memory, as indicated by the fact
that task performance is correlated with the ability to describe the
objects and by the fact that amnesic patients perform quite poorly
(Hood, Postle, & Corkin, 1999; Squire, Zola-Morgan, & Chen, 1988).

In contrast to the findings in humans, monkeys learned the same
concurrent discrimination task gradually, across several hundred
trials. Furthermore, monkeys with medial temporal lobe lesions learned
this task and a similar version of the same task at normal rates
(Buffalo, Stefanacci, Squire, & Zola, 1998; Malamut, Sanders, &
Mishkin, 1984; Teng, Stefanacci, Squire, & Zola, 2000). For monkeys,
learning proceeded by trial and error (sometimes termed habit
learning), and learning was impaired by basal ganglia lesions
(Fernandez-Ruiz, Wang, Aigner, & Mishkin, 2001; Teng et al., 2000).
Habit memory is proposed to involve slowly acquired associations
between stimuli and responses that develop outside of awareness and
that are rigidly organized, with the result that what is learned is
not readily expressed unless the task is presented just as it was
during training.

A recent study asked whether severely amnesic patients can learn this
task and, if so, whether the learning has the characteristics of
nondeclarative (unconscious) memory. Two patients with large medial
temporal lobe lesions, EP and GP, and four controls successfully
learned the concurrent discrimination task (eight pairs, five
presentations of each pair per session) (Bayley, Frascino, & Squire,
2005). Two sessions were scheduled each week. The controls learned the
task quickly during three testing sessions (figure 46.4A). In
contrast, EP and GP learned gradually during 36 and 28 sessions,
respectively, and reached a performance level of 85.0% and 92.5%
correct (figure 46.4B,C).

The learning exhibited by the patients across weeks was not
accompanied by declarative knowledge of the task. Thus neither patient
recognized that he had been tested in previous sessions, and neither
patient could describe the testing procedure. An additional condition
tested whether the knowledge that had been acquired was rigidly
organized, as is thought to occur for habit learning, or whether it
could be used flexibly. Three to six days after the conclusion of
formal training, participants were given a sorting task. They were
presented with all 16 objects mixed together (the eight pairs they had
learned) and were asked to sort them into two
groups: one containing the correct objects and another con-
taining the incorrect objects. Controls succeeded, scoring
95.3% correct, while the patients failed altogether (EP scored
56.3%, and GP scored 50.0% correct; chance = 50%) (figure
46.4B,C). Thus the learned information could not be used
flexibly. Both patients were able to perform well when asked
to verbalize their responses instead of reaching for the
objects, but the objects needed to be presented as pairs in
order for performance to succeed (figure 46.4B,C). Seven-
teen days later, EP and GP failed the sorting task again and
then succeeded once more when the task was presented in
its original format (figure 46.4B,C).
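A quick calculation illustrates why the patients' sorting scores are treated as failure. The sketch below assumes, for illustration only, that each of the 16 objects is scored independently as correct or incorrect, so that 56.3% corresponds to 9 of 16 and 50.0% to 8 of 16. The chapter does not report such a test, and the function name is hypothetical.

```python
from math import comb

def prob_at_least(k, n=16, p=0.5):
    """Probability of k or more correct placements out of n by pure guessing.

    Assumes independent placements with success probability p (an
    illustrative simplification, not the scoring rule of the study).
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Assumed raw scores inferred from the reported percentages.
p_ep = prob_at_least(9)  # 9 of 16 = 56.25%
p_gp = prob_at_least(8)  # 8 of 16 = 50.0%
```

Under these assumptions a pure guesser would sort 9 or more objects correctly on roughly 40% of attempts, so neither score can be distinguished from chance.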
These findings demonstrated a robust capacity for habit
learning that can operate outside awareness and indepen-
dently of declarative memory and the medial temporal lobe.
The knowledge acquired by both patients was rigidly orga-
nized and most accessible when the task was structured just
as it was during training. These results provide a particularly
compelling example of the distinction between declarative
(and conscious) and nondeclarative (and unconscious) learn-
ing systems.
values are plotted across the confidence levels to construct
an ROC.
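As a rough illustration of this construction, the sketch below turns confidence ratings into cumulative hit and false alarm rates, one ROC point per confidence criterion. It assumes a 6-point scale on which 6 is the most confident "old" response, as in the rating procedure described later in this section. The function name and the sample ratings are hypothetical and are not data from any study cited here.

```python
from collections import Counter

def roc_points(target_ratings, foil_ratings, n_levels=6):
    """Return (false_alarm_rate, hit_rate) pairs, strictest criterion first.

    Each point counts as "old" every rating at or above one confidence
    level, starting with the highest level and relaxing step by step.
    """
    target_counts = Counter(target_ratings)
    foil_counts = Counter(foil_ratings)
    n_targets, n_foils = len(target_ratings), len(foil_ratings)
    points = []
    hits = false_alarms = 0
    # Criteria at 6, >=5, ..., >=2; the final level would always give (1, 1).
    for level in range(n_levels, 1, -1):
        hits += target_counts[level]
        false_alarms += foil_counts[level]
        points.append((false_alarms / n_foils, hits / n_targets))
    return points

# Hypothetical ratings for 10 studied words (targets) and 10 new words (foils).
targets = [6, 6, 6, 5, 5, 4, 3, 2, 1, 6]
foils = [1, 1, 2, 2, 3, 3, 4, 5, 1, 6]
points = roc_points(targets, foils)
```

Plotting the hit rate against the false alarm rate for these five points gives the empirical ROC for one hypothetical participant.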
The ROC of normal individuals has been compared to the
ROC of memory-impaired patients (Yonelinas, Kroll,
Dobbins, Lazzara, & Knight, 1998; Yonelinas et al., 2002)
and rats with hippocampal lesions (for rats, decision criteria
are manipulated by other methods) (Fortin et al., 2004).
These ROCs were curvilinear, as is typical, but they differed
in their degree of symmetry. As is usually the case, the ROC
of controls was asymmetrical, but the ROC of patients and
rats with hippocampal lesions was symmetrical (figure 46.5).
These data have sometimes been interpreted according to a
high-threshold/signal detection model (Yonelinas et al.,
1998), which takes the degree of asymmetry in an ROC to
reflect the degree to which the recollection process contrib-
utes to recognition memory performance. Specifically, a sym-
metrical ROC indicates that recollection was absent and that
recognition memory was based only on familiarity, whereas
an asymmetrical ROC indicates that recollection also
occurred to some extent. Thus, by the high-threshold/signal
detection model, the finding that memory-impaired patients,
as well as rats with hippocampal lesions, produce a symmetri-
cal ROC suggests that the recollection process is impaired.
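The link between recollection and ROC asymmetry under this kind of model can be sketched numerically. The code below uses one common parameterization of a dual-process model, in which a recollection probability R is added to an equal-variance familiarity process with sensitivity d'. The parameterization, the function names, and all parameter values are assumptions made for illustration; they are not fits reported in the chapter. Asymmetry is summarized here by the least-squares slope of the ROC in z-coordinates.

```python
from statistics import NormalDist

_NORMAL = NormalDist()  # standard normal distribution

def dual_process_point(recollection, d_prime, criterion):
    """One (false_alarm_rate, hit_rate) point under the assumed model."""
    familiarity_hit = _NORMAL.cdf(d_prime / 2 - criterion)
    false_alarm = _NORMAL.cdf(-d_prime / 2 - criterion)
    # Recollected targets are always called "old"; the rest rely on familiarity.
    hit = recollection + (1 - recollection) * familiarity_hit
    return false_alarm, hit

def zroc_slope(points):
    """Least-squares slope of z(hit rate) regressed on z(false alarm rate)."""
    xs = [_NORMAL.inv_cdf(fa) for fa, _ in points]
    ys = [_NORMAL.inv_cdf(hit) for _, hit in points]
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

criteria = [-1.0, -0.5, 0.0, 0.5, 1.0]  # hypothetical confidence criteria
familiarity_only = [dual_process_point(0.0, 1.0, c) for c in criteria]
with_recollection = [dual_process_point(0.3, 1.0, c) for c in criteria]
slope_familiarity_only = zroc_slope(familiarity_only)    # symmetrical ROC
slope_with_recollection = zroc_slope(with_recollection)  # asymmetrical ROC
```

With R set to zero the slope is 1.0, which corresponds to a symmetrical ROC; with R above zero the slope falls below 1.0, which is the asymmetry that the model attributes to recollection.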
Although the ROC curves of patients and their controls
(and lesioned rats and their controls) did differ qualitatively
with respect to symmetry, they also differed quantitatively.
The patients and the lesioned rats had weaker memories
than their respective controls. Indeed, the standard signal
detection model of recognition memory (Macmillan & Creel-
man, 2005) explains the difference between asymmetrical
and symmetrical ROCs as a difference in memory strength. An
asymmetrical ROC reflects high memory strength, and a symmetrical ROC
reflects lower memory strength (Glanzer, Kim, Hilford, & Adams, 1999).
If the symmetry of the ROC is related to memory strength, then the
difference in symmetry between controls and memory-impaired patients
(or lesioned rats) might simply reflect the difference between strong
and weak memories, rather than a qualitative difference between the
underlying component processes of recognition memory.

Figure 46.5 Hypothetical ROC data illustrating symmetrical and
asymmetrical ROC curves. The degree of asymmetry evident in an ROC is
typically quantified by a “slope” parameter obtained by fitting the
standard signal detection model (Macmillan & Creelman, 2005) to the
data. A slope of 1.0 denotes a symmetrical ROC, whereas a slope less
than 1.0 denotes an asymmetrical ROC. The high-threshold/signal
detection model (Yonelinas, Kroll, Dobbins, Lazzara, & Knight, 1998)
would yield a recollection parameter estimate of 0 for the symmetrical
ROC (top panel) and an estimate greater than 0 for the asymmetrical
ROC (bottom panel). (From Wais, Wixted, Hopkins, & Squire, 2006.)

This idea was tested in a study of controls and memory-impaired
patients with circumscribed hippocampal lesions (Wais, Wixted,
Hopkins, & Squire, 2006). The question of interest was how the shape
of the ROC changes as a function of memory strength for patients with
hippocampal lesions and how the performance of patients compares with
the performance of controls. If recollection is selectively impaired
in the patients, then the ROC should be symmetrical regardless of
memory strength. Alternatively, if the hippocampus does not
selectively support recollection, then the patients with hippocampal
lesions should produce asymmetrical ROCs like the controls once
differences in memory strength are accounted for.

Six patients with damage thought to be limited to the hippocampus
participated. Participants first studied 50 words. After a
three-minute interval, 50 target words were intermixed with 50 foil
words, and participants assigned a confidence rating to each word from
1 (“definitely new”) to 6 (“definitely old”). As expected, the
patients performed more poorly than controls (H-50 versus C-50, figure
46.6). Patients were then given a second, easier recognition-memory
test involving only 10 words (plus four untested filler words, two at
the beginning and two at the end of the list). On this test, patient
performance improved to a level similar to that of controls (H-10
versus C-50, figure 46.6).
Figure 46.6 Recognition memory performance of hippocampal
patients and controls. Patients were tested with 50-item lists (H-50
condition) or 10-item lists (H-10 condition). Controls were tested
with 50-item lists (C-50 condition). The retention interval was 3
minutes. The mean score of the controls (C-50) was greater than
that of the patients in the H-50 condition, but similar to the score
obtained by the patients in the H-10 condition. The score in the
H-10 condition was also greater than the score in the H-50 con-
dition. Error bars represent standard errors. (From Wais, Wixted,
Hopkins, & Squire, 2006.)
The ROCs for the patients and controls were all curvilin-
ear (figure 46.7). The ROC from the H-50 condition was
symmetrical, but the ROCs from the H-10 and the C-50
conditions were asymmetrical to a similar extent. Thus the
ROC of the hippocampal patients was symmetric when
memory was weak but asymmetric when memory was strong
(H-50 versus H-10, respectively). Moreover, when memory
performance was similar for patients and controls (the H-10
and C-50 conditions), the degree of asymmetry in the ROC
was similar as well.
To derive theoretical estimates of recollection and famil-
iarity, the ROC data were first fitted by the high-threshold/
signal detection model. In the H-50 condition, the recollec-
tion parameter estimate was equal to zero, and in the C-50
condition it was greater than zero (0.23). Similarly, the famil-
iarity parameter estimate was lower in the H-50 condition
than in the C-50 condition (0.83 versus 1.64). Importantly,
in the H-10 condition, the parameter estimates for both
recollection and familiarity were similar to the estimates for
the C-50 condition (recollection estimate of 0.22 and 0.23
for H-10 and C-50, respectively, and familiarity estimate of
1.21 and 1.64 for H-10 and C-50, respectively, p = 0.11). Thus,
according to the high-threshold/signal detection model, the
recollection process is present in both patients and controls.
Furthermore, when memory performance was matched between patients and
controls (H-10 and C-50), the nearly identical recollection estimates
(0.22 and 0.23) offered no evidence of a selective deficit in
recollection after hippocampal lesions.

Figure 46.7 ROC data produced by the hippocampal patients and
controls. The top panel shows the data for hippocampal patients in the
50-item condition, the middle panel shows the data for hippocampal
patients in the 10-item condition, and the bottom panel shows the data
for controls in the 50-item condition. The H-50 ROC was symmetric
(slope = 1.14). The H-10 ROC and the C-50 ROC were both asymmetric
(slope = 0.83 for both groups) and also more asymmetric than the ROC
of the H-50 group. (From Wais, Wixted, Hopkins, & Squire, 2006.)

In contrast to the high-threshold/signal detection
model, the traditional signal detection model (Macmillan &
Creelman, 2005) does not dictate how recollection and familiarity
combine to produce an ROC curve. The fact that patients and controls
exhibited similar ROCs as a function of memory strength nevertheless
suggests that the component processes of recognition are both
operative in the patients. If the asymmetry of the ROC curve is taken
as an indicator of recollection, then these results challenge the idea
that the hippocampus subserves a recollection process and that
hippocampal patients do not have this process. The findings are not an
argument against the utility of the constructs of recollection and
familiarity. Rather, they challenge the idea that recollection and
familiarity can be dichotomized and assigned to separate brain
structures in the medial temporal lobe (Squire et al., 2007).

Path integration

During the past several decades, there have been two influential
traditions about the function of the hippocampus, entorhinal cortex,
and related medial temporal lobe structures. One tradition emphasizes
the importance of these structures for memory (Scoville & Milner,
1957; Squire, Stark, & Clark, 2004). The other emphasizes their
importance for spatial cognition (Etienne & Jeffery, 2004; McNaughton,
Battaglia, Jensen, Moser, & Moser, 2006; O’Keefe & Nadel, 1978;
Whitlock, Sutherland, Witter, Moser, & Moser, 2008). An important part
of spatial cognition is path integration, the ability to use internal
cues during movement (i.e., self-motion cues) to keep track of a
reference location. Because many tasks of spatial cognition, including
path integration, require memory, these two traditions are compatible
with each other to a large extent.

The view that medial temporal lobe structures are important for memory
makes a key distinction between short-term (or working) memory and
long-term memory (see the section on working memory in this chapter).
Patients with damage to the medial temporal lobe, including damage to
the hippocampus or entorhinal cortex, are thought to have intact
working memory, and they perform poorly only when demands are made on
long-term memory. This idea is meant to apply even to tasks that
require spatial cognition, such as path integration.

In contrast, the view that the hippocampus and entorhinal

(Shrager, Kirwan, & Squire, 2008). Two patients with large medial
temporal lobe lesions (EP and GP), three patients with hippocampal
lesions, and seven controls were tested for their path integration
ability. In the first condition (standard), participants wore a
blindfold and earphones to reduce external cues, and they were led in
a laboratory space along 16 different paths that averaged 4.3 meters
in length and involved either 1 or 2 turns. At the end of each path,
participants stepped onto a platform (5 cm above the floor and
equipped with handlebars for stability) and were asked to point to
their start location. An error measure was then computed as the
difference between the participant’s pointing direction and the
correct direction. Participants were encouraged to actively maintain
the path in mind as they walked, so that performance might be
supported by working memory (mean trial duration was 33.4 seconds).

The patients performed as accurately as controls (mean pointing
direction for patients, −4°; controls, +4°) (figure 46.8A).
Furthermore, the variability in performance across the 16 trials was
similar for both groups (patients, 31.3; controls, 30.5) (figure
46.8D). Debriefings of the two most severely memory-impaired patients
(EP and GP) and four controls indicated that subjects tried to keep
track of their position in space as they moved, continually updating
their position relative to the start point.

Path integration was further challenged in two additional conditions.
In one condition, participants were blindfolded and led in the
laboratory along 16 paths involving 3 turns (compared to 1 or 2 turns
in the standard condition; mean trial duration, 26.0 seconds). In
another condition, participants were blindfolded and led in an outdoor
space along 8 paths that were nearly four times as long (15 meters) as
the paths in the standard condition. Mean trial duration in this case
was 29.7 seconds. In both conditions, the patients pointed to their
start location as accurately as controls, and they also exhibited
variability similar to that of controls (for 3 turns, controls, +6°,
variability, 32.2; patients, −7°, variability, 31.7; for the longer
paths, controls, +9°, variability, 35.0; patients, −15°, variability,
27.2). In a fourth condition, participants were led along 8 paths in
the laboratory environment (4 involving 1 turn, 4 involving 2 turns,
for a path length averaging 4.2 m). At the end of each path,
participants estimated their distance from the start location (instead
cortex are important for path integration often includes the of pointing). Some paths ended far from the start point, and
suggestion that the path integrator is located in these struc- some ended near the start point. Again, the patients were as
tures (Etienne & Jeffery, 2004; McNaughton et al., 2006). accurate as controls (both groups averaged 0.7 m error for
By this view, patients with damage to the hippocampus and distances that averaged 2.8 m).
entorhinal cortex should be impaired at path integration, A separate condition served as a key control to ensure that
and this impairment should occur regardless of whether participants were in fact path integrating, that is, relying on
demands are made on long-term memory. internal cues rather than on external cues that were beyond
These ideas were tested by asking whether the hippocam- experimental control. Blindfolded participants were led in
pus and entorhinal cortex are essential for path integration the laboratory environment along 16 paths and, at the end
even when the task can be managed within working memory of each path, stepped onto the platform and held onto the
handlebars. The platform was then slowly rotated by remote control through 190°, after which participants tried to point to their start location. Pilot experiments indicated that, after the rotation, participants had difficulty knowing how far they had been turned. Accordingly, one would expect that, if participants were in fact relying on path integration (internal cues) to point to their start location, they should have difficulty in the rotation condition. Mean trial duration was the same as in the original, standard condition (32.4 seconds). The result was that performance of both patients and controls was substantially compromised. Neither group exhibited a significant pointing direction (i.e., pointing across participants was random), and variability increased (patients, 54.9; controls, 61.5) (figure 46.8B,E).

In a final condition, path integration was tested when demands on long-term memory were increased by increasing the duration of each trial and by introducing distraction during the delay (mean trial duration, 1 minute 10 seconds). The controls performed as well in the distraction condition as in the standard condition. Their mean pointing direction was +1° (compared to +4° in the standard condition), and variability was 30.1 (compared to 30.5 in the standard condition) (figure 46.8C,F). In contrast, the patients had difficulty in the distraction condition. Their mean pointing direction was −14° (numerically worse than their pointing direction in the standard condition, −4°), and variability was 57.1 (significantly worse than in the standard condition, 31.3, and significantly worse than controls in the distraction condition, 30.1) (figure 46.8C,F).

These results indicate that patients with lesions of the medial temporal lobe can path integrate as well as controls when the task can be managed within working memory. When demands on long-term memory were increased, the patients were impaired. These findings suggest that medial temporal lobe structures are not unique, essential sites where computations needed for path integration are carried out. These computations likely occur upstream of the medial temporal lobe, perhaps in parietal cortex. The medial temporal lobe then operates on this information, much as it operates on information from other sensory modalities, in order to transform on-line perceptual information into long-term memory.

Figure 46.8 Circular means of each participant’s 16 pointing directions in the standard, rotation, and distraction conditions for patients with damage to the medial temporal lobe (MTL, filled circles) and controls (CON, unfilled circles). 0° indicates the correct direction. Group pointing directions are also indicated (solid arrow, CON; broken arrow, MTL). Shorter arrows denote greater variability (dispersion) in the group’s pointing direction (following Moore’s test for nonuniformity, Batschelet, 1981). In B, X indicates individuals who did not exhibit a significant pointing direction. The standard deviation of pointing directions around each participant’s circular mean was calculated, and the individual standard deviations were then averaged for each group (D,E,F). Asterisk (*) indicates p < 0.05, CON versus MTL groups. Brackets indicate standard error. (From Shrager, Kirwan, & Squire, 2008.)

Remote memory

Damage to the hippocampus and related medial temporal lobe structures not only impairs new learning capacity but also impairs memory for information acquired before the damage occurred (retrograde amnesia). Early clinical descriptions of retrograde amnesia led to the proposal that recently acquired memories are typically more impaired than remotely acquired memories (Ribot, 1881), and a large experimental literature has supported this idea (Frankland & Bontempi, 2005; Squire & Bayley, 2007). Yet questions remain about whether medial temporal lobe damage can sometimes cause extensive and ungraded retrograde memory loss and about the status of remote autobiographical memory.

Some have concluded that retrograde amnesia is temporally ungraded and that recent and remote memories are similarly impaired across the life span (Sanders & Warrington, 1971; Warrington, 1996). Others have concluded that retrograde amnesia is temporally limited and related to the extent and locus of the damage (Eichenbaum,
Dudchenko, Wood, Shapiro, & Tanila, 1999; Squire et al., 2004). There are two reasons why this issue has been difficult to settle. First, memory has not always been assessed at early enough time periods to permit a firm conclusion that memory loss is ungraded. Second, the relationship between the extent of retrograde memory loss and the extent of medial temporal lobe damage has not always been clearly identified.

A recent study tested memory for past news events in two patients with large medial temporal lobe lesions (EP and GP), six patients with limited hippocampal lesions, and matched controls (Bayley, Hopkins, & Squire, 2006). The test involved up to 300 questions about news events that had occurred from early life to the current year. The patients with hippocampal lesions performed poorly during the period of anterograde amnesia (after the onset of amnesia) and exhibited temporally limited retrograde amnesia covering a period of about 5 years before the onset of amnesia (figure 46.9). For more remote time periods, the patients performed as well as controls.

Figure 46.9 Recall performance on a test of 279 news events that occurred from 1951 to 2005. The scores for controls (CON) and six patients with damage limited to the hippocampus (H) have been aligned relative to the onset of amnesia so that performance can be shown for the time period after the onset of amnesia and in 5-year intervals for the time preceding the onset of amnesia. The data point at −5 represents 1–5 years before amnesia, the point at −10 represents 6–10 years before amnesia, and so on. Error bars indicate standard error. (From Bayley, Hopkins, & Squire, 2006.)

EP and GP also performed poorly during their period of anterograde amnesia and in addition exhibited extensive retrograde amnesia covering many years before the onset of amnesia (figure 46.10). Nevertheless, both patients performed better when the questions covered the most remote time periods. GP performed within 1.1 standard deviations of controls in the time period 21 to 25 years before amnesia (when he would have been 17 to 21 years old), and he performed as well as controls in the time period 26 to 30 years before amnesia. EP reached normal levels of performance when the questions covered the period 46 to 50 years before amnesia (when he would have been 20 to 24 years old).

With respect to autobiographical memory, it has been proposed (usually in single-case studies) that medial temporal lobe damage, and even limited hippocampal damage, leads to impaired memory for personal events that extends into early life (Cipolotti et al., 2001; Hirano & Noguchi, 1998; Moscovitch, Nadel, Winocur, Gilboa, & Rosenbaum, 2006; Steinvorth, Levine, & Corkin, 2005). Findings from group studies, however, suggest that both patients with limited hippocampal lesions and patients with large medial temporal lobe lesions have intact autobiographical memory of early life (Bayley, Gold, et al., 2005; Bayley, Hopkins, & Squire, 2003; Bright et al., 2006; Eslinger, 1998; Rempel-Clower, Zola, Squire, & Amaral, 1996). For example, in one study, six patients with limited hippocampal lesions, two patients with large medial temporal lobe lesions, and 25 controls were given 24 cue words and, for each word, were asked to recollect a specific event from the first third of their lives that involved the word (Bayley et al., 2003). Narratives were first scored on a 4-point scale (scores of 0 to 3 points) according to how well participants described an event that was specific to time and place. Patients and controls produced a similar number of well-formed (3-point) memories and were able to produce 3-point memories in response to most of the key words. The narratives were then submitted to a detailed analysis of content. The narratives of patients and controls contained the same number of details and were similar on several other measures as well.

In an effort to maximize the sensitivity with which the assessment of remote memory is carried out, it is also possible to use techniques that ask for a single memory from a given time period (instead of 24 memories, as before) and then probe extensively to obtain as many as 50 details for each memory (the Autobiographical Interview; Levine, Svoboda, Hay, Winocur, & Moscovitch, 2002). This test was given to three patients with damage limited to the hippocampus, two patients with large lesions of the medial temporal lobe, and five controls (Kirwan, Bayley, Galván, & Squire, 2008).
Participants were asked to provide one memory from each of five time periods: childhood (up to age 11 years), teenage years (age 12–17), early adulthood (age 18–35), middle age (age 36–55), and the year before testing. The patients with hippocampal lesions were impaired only at the most recent time period, and the patients with larger medial temporal lobe lesions were impaired at the two most recent time periods (figure 46.11). Both groups of patients performed as well as controls in the three earliest time periods.

Figure 46.10 Recall performance on a test of news events that occurred from 1938 to 2005 (for EP, 300 events) and from 1951 to 2005 (for GP, 279 events). The scores for the two patients with large medial temporal lobe lesions and controls (CON) have been aligned relative to the onset of amnesia (see caption for figure 46.9). Error bars indicate standard error. (From Bayley, Hopkins, & Squire, 2006.)

Figure 46.11 Total number of (A) episodic and (B) semantic details across time periods. Patients with damage thought to be limited to the hippocampus (H), patients with larger medial temporal lobe lesions (MTL), and controls (Con) were asked to retrieve one autobiographical memory from each of five time periods. Error bars indicate standard error. (From Kirwan, Bayley, Galván, & Squire, 2008.)

Impaired remote autobiographical memory does occur when the brain damage extends beyond the medial temporal lobe. Another study assessed remote autobiographical memory in three patients with medial temporal lobe damage plus significant damage to the neocortex (Bayley, Gold, et al., 2005). As in an earlier study (Bayley et al., 2003), the patients were asked to recall a childhood memory in response to a cue word for each of 24 words. As described previously, patients with damage limited to the medial temporal lobe and their controls produced well-formed memories in response to most of the 24 word cues (21.6 for the patients, 22.9 for the controls). In contrast, the patients with significant neocortical damage outside of the medial temporal lobe were severely impaired and provided a mean of only 4.0 unique, well-formed memories. These patients were able to recall some general information in response to the cue words but had marked difficulty providing memories that were specific to a particular time and place.

Similar findings were obtained with the Autobiographical Memory Interview (AMI; Kopelman, Wilson, & Baddeley, 1989), a standardized test that facilitates comparison of performance across laboratories. In the childhood portion of this test, patients are asked to recall three unique events from their childhood. Patients with medial temporal lobe damage plus significant neocortical damage performed poorly, whereas patients with limited medial temporal lobe damage performed well (Bayley, Gold, et al., 2005; figure 46.12). These findings suggest that patients who fail the AMI (Childhood Portion), or who otherwise have difficulty recollecting events from their early life, have damage outside the medial temporal lobe.

Awareness and memory

Declarative memory has ordinarily been viewed as memory that is accompanied by knowledge or awareness of what
has been learned, and the availability of learned material to awareness has been considered one of its key features (Eichenbaum, 1997; Gabrieli, 1998; Squire, 1992; Tulving & Schacter, 1990). In some cases, when behavior is changed by experience, it is unclear what kind of memory is being expressed. Consider the case of eye movements. When individuals view novel scenes, familiar scenes, or familiar scenes in which a change has been introduced, eye movements differ depending on the viewing history of each scene. What kind of memory is indexed by eye movements, and is this kind of memory accessible to awareness?

Two studies addressed this issue by asking what kind of memory is operating when eye movements change as the result of experience (Smith, Hopkins, & Squire, 2006; Smith & Squire, 2008). Amnesic patients and controls viewed scenes that were either novel, repeated, or manipulated. For manipulated scenes, an element was either added to or removed from a previously presented scene. The first finding was that the patients were impaired at remembering whether a scene was new or old and also whether a scene had been manipulated or not. The second finding was that control subjects, but not amnesic patients, examined scenes differently depending on whether the scenes were new or old. Specifically, during a 5-second viewing period, controls made fewer fixations and sampled fewer regions when scenes were repeated than when they were novel. Importantly, these effects occurred only when individuals were aware that a scene was novel or repeated. The third finding was that, when scenes were manipulated, healthy participants made more fixations in the manipulated region, spent more time looking at the manipulated region, and made more transitions into and out of the manipulated region than in corresponding, unmanipulated regions in the repeated scenes. Again, these effects occurred only when individuals were aware of the manipulation. Participants who were unaware that a scene had been changed looked at it in the same way that they looked at repeated scenes (figure 46.13). The fourth finding was that these effects occurred even when the scenes were presented without any indication that memory was being tested or that individuals should try to detect which scenes were new, old, or manipulated.

Thus there was no indication that eye movements reveal an unaware (unconscious) form of memory. Instead, eye movements reflected declarative (conscious) memory. These findings support the principle that hippocampus-dependent memory is accessible to awareness. Recent studies of transitive inference (Smith & Squire, 2005) and eyeblink conditioning (Smith, Clark, Manns, & Squire, 2005) are consistent with this idea. See Smith, Hopkins, and Squire (2006) for discussion of two studies that reached different conclusions (Greene, Spellman, Dusek, Eichenbaum, & Levy, 2001; Ryan, Althoff, Whitlow, & Cohen, 2000).

Summary

This chapter reviewed a number of recent findings pertinent to the organization of memory and the function of the medial temporal lobe. These findings indicated that (1) visual perception is independent of the medial temporal lobe; (2) working memory can be identified independently of the performance of amnesic patients, and when this identification is accomplished, working memory is found to be independent of the medial temporal lobe; (3) humans have a robust capacity for habit learning that operates outside of awareness and is independent of the medial temporal lobe; (4) path integration, a form of spatial cognition, is independent of the hippocampus and entorhinal cortex when information can be maintained within working memory, but path integration depends on these structures when demands are made on long-term memory; (5) the hippocampus is required for recognition memory, regardless of whether decisions are based on recollection or familiarity; (6) the medial temporal lobe plays a time-limited role in declarative memory such that very remote memory, including remote autobiographical memory, is intact after medial temporal lobe damage; and (7) the kind of declarative memory that is dependent on the hippocampus is accessible as conscious, aware knowledge of what has been learned.

Acknowledgments

This work was supported by the Medical Research Service of the Department of Veterans Affairs, NIMH, the Metropolitan Life Foundation, and an NSF predoctoral fellowship (Y.S.).

Figure 46.12 Performance on the Childhood Portion of the Autobiographical Memory Interview (maximum score, 9). Each participant’s score is represented by a circle, and patients are identified by initials. MTL, patients with medial temporal lobe lesions; MTL+, patients with medial temporal lobe lesions and additional lesions in neocortex; CON, controls. (From Bayley, Gold, Hopkins, & Squire, 2005.)
Figure 46.13 Eye movement traces (black lines) and fixations (diamonds) for four different individuals who viewed this image for 5 seconds. (A) An individual for whom this image was novel. (B) An individual for whom this image was familiar. (C) An individual for whom the image was familiar but altered (a man with a dolly is no longer in the back of the truck) and who was aware of the change. (D) An individual for whom the image was familiar but altered and who was unaware of the change. Aware individuals spent more time looking at the altered region of the image than unaware individuals or individuals who had never seen the version with the man in the truck. In each panel, the critical region is identified by a black square, but the square did not appear during testing. (From Smith, Hopkins, & Squire, 2006.)
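
As an illustrative aside, the measures described for the path integration experiments (the homing direction after a walked path, the signed pointing error, each participant's circular mean, and the dispersion around that mean) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the analysis code of Shrager, Kirwan, and Squire (2008). The function names, the sign convention (counterclockwise positive), the example path, and the use of the common circular standard deviation sqrt(-2 ln r) are all assumptions introduced here for illustration.

```python
import math


def wrap_deg(angle_deg):
    """Wrap an angle in degrees to the interval [-180, 180)."""
    return (angle_deg + 180.0) % 360.0 - 180.0


def homing_vector(segments):
    """Dead-reckon a path; return (distance_m, bearing_deg) back to the start.

    segments: list of (turn_deg, length_m) pairs. Each turn is made before
    walking the segment; counterclockwise is positive (assumed convention).
    The bearing is expressed relative to the walker's final heading.
    """
    x = y = heading = 0.0
    for turn_deg, length_m in segments:
        heading += math.radians(turn_deg)
        x += length_m * math.cos(heading)
        y += length_m * math.sin(heading)
    distance = math.hypot(x, y)
    bearing = math.degrees(math.atan2(-y, -x) - heading)
    return distance, wrap_deg(bearing)


def pointing_error(pointed_deg, correct_deg):
    """Signed difference between pointed and correct direction (0 = correct)."""
    return wrap_deg(pointed_deg - correct_deg)


def circular_mean(errors_deg):
    """Return (mean direction in degrees, mean resultant length r in [0, 1])."""
    s = sum(math.sin(math.radians(e)) for e in errors_deg)
    c = sum(math.cos(math.radians(e)) for e in errors_deg)
    r = min(math.hypot(s, c) / len(errors_deg), 1.0)  # guard against rounding
    return math.degrees(math.atan2(s, c)), r


def circular_sd(errors_deg):
    """Circular standard deviation in degrees (assumed definition: sqrt(-2 ln r))."""
    _, r = circular_mean(errors_deg)
    if r == 0.0:
        return math.inf  # directions are uniformly spread; no preferred direction
    return math.degrees(math.sqrt(max(0.0, -2.0 * math.log(r))))


# Hypothetical two-segment path: walk 3 m, turn 90 degrees left, walk 4 m.
distance_m, bearing_deg = homing_vector([(0.0, 3.0), (90.0, 4.0)])
# Hypothetical pointing errors (degrees) for one participant across trials.
trial_errors = [pointing_error(p, bearing_deg) for p in (150.0, 120.0, 160.0)]
mean_error_deg, r = circular_mean(trial_errors)
spread_deg = circular_sd(trial_errors)
```

Angles are averaged as unit vectors rather than as plain numbers so that errors on either side of ±180° do not cancel misleadingly; a short mean resultant length r corresponds to the short group arrows (greater dispersion) described in the caption for figure 46.8.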
REFERENCES
Atkinson, R. C., & Shiffrin, R. M. (1968). Human memory: A proposed system and its control processes. In K. W. Spence & J. T. Spence (Eds.), The psychology of learning and motivation: Advances in research and theory (pp. 89–195). New York: Academic Press.
Baddeley, A., & Hitch, G. J. (1974). Working memory. In G. A. Bower (Ed.), Recent advances in learning and motivation (pp. 47–89). New York: Academic Press.
Batschelet, E. (1981). Circular statistics in biology. London: Academic Press.
Bayley, P. J., Frascino, J. C., & Squire, L. R. (2005). Robust habit learning in the absence of awareness and independent of the medial temporal lobe. Nature, 436, 550–553.
Bayley, P. J., Gold, J. J., Hopkins, R. O., & Squire, L. R. (2005). The neuroanatomy of remote memory. Neuron, 46, 799–810.
Bayley, P. J., Hopkins, R. O., & Squire, L. R. (2003). Successful recollection of remote autobiographical memories by amnesic patients with medial temporal lobe lesions. Neuron, 38, 135–144.
Bayley, P. J., Hopkins, R. O., & Squire, L. R. (2006). The fate of old memories after medial temporal lobe damage. J. Neurosci., 26, 13311–13317.
Bright, P., Buckman, J., Fradera, A., Yoshimasu, H., Colchester, A. C., & Kopelman, M. D. (2006). Retrograde amnesia in patients with hippocampal, medial temporal, temporal lobe, or frontal pathology. Learn. Memory, 13, 545–557.
Brown, M. W., & Aggleton, J. P. (2001). Recognition memory: What are the roles of the perirhinal cortex and hippocampus? Nat. Rev. Neurosci., 2, 51–61.
Buckley, M. J., Booth, M. C., Rolls, E. T., & Gaffan, D. (2001). Selective perceptual impairments after perirhinal cortex ablation. J. Neurosci., 21, 9824–9836.
Buckley, M. J., & Gaffan, D. (1998). Perirhinal cortex ablation impairs visual object identification. J. Neurosci., 18, 2268–2275.
Buffalo, E. A., Ramus, S. J., Clark, R. E., Teng, E., Squire, L. R., & Zola, S. M. (1999). Dissociation between the effects of damage to perirhinal cortex and area TE. Learn. Memory, 6, 572–599.
Buffalo, E. A., Stefanacci, L., Squire, L. R., & Zola, S. M. (1998). A reexamination of the concurrent discrimination learning task: The importance of anterior inferotemporal cortex, area TE. Behav. Neurosci., 112, 3–14.
Bussey, T. J., & Saksida, L. M. (2002). The organization of visual object representations: A connectionist model of effects of lesions in perirhinal cortex. Eur. J. Neurosci., 15, 355–364.
Bussey, T. J., Saksida, L. M., & Murray, E. A. (2003). Impairments in visual discrimination after perirhinal cortex lesions: Testing “declarative” vs. “perceptual-mnemonic” views of perirhinal cortex function. Eur. J. Neurosci., 17, 649–660.
Cipolotti, L., Shallice, T., Chan, D., Fox, N., Scahill, R., Harrison, G., Stevens, J., & Rudge, P. (2001). Long-term retrograde amnesia . . . the crucial role of the hippocampus. Neuropsychologia, 39, 151–172.
Corkin, S. (1984). Lasting consequences of bilateral medial temporal lobectomy: Clinical course and experimental findings in H.M. Sem. Neurol., 4, 249–258.
Drachman, D. A., & Arbit, J. (1966). Memory and the hippocampal complex. II. Is memory a multiple process? Arch. Neurol., 15, 52–61.
Eichenbaum, H. (1997). Declarative memory: Insights from cognitive neurobiology. Annu. Rev. Psychol., 48, 547–572.
Eichenbaum, H., & Cohen, N. J. (2001). From conditioning to conscious recollection: Memory systems of the brain. New York: Oxford University Press.
Eichenbaum, H., Dudchenko, P., Wood, E., Shapiro, M., & Tanila, H. (1999). The hippocampus, memory, and place cells: Is it spatial memory or a memory space? Neuron, 23, 209–226.
Eichenbaum, H., Yonelinas, A. P., & Ranganath, C. (2007). The medial temporal lobe and recognition memory. Annu. Rev. Neurosci., 30, 123–152.
Eslinger, P. (1998). Autobiographical memory after temporal lobe lesions. Neurocase, 4, 481–495.
Etienne, A. S., & Jeffery, K. J. (2004). Path integration in mammals. Hippocampus, 14, 180–192.
Fernandez-Ruiz, J., Wang, J., Aigner, T. G., & Mishkin, M. (2001). Visual habit formation in monkeys with neurotoxic lesions of the ventrocaudal neostriatum. Proc. Natl. Acad. Sci. USA, 98, 4196–4201.
Fortin, N. J., Wright, S. P., & Eichenbaum, H. (2004). Recollection-like memory retrieval in rats is dependent on the hippocampus. Nature, 431, 188–191.
Frankland, P. W., & Bontempi, B. (2005). The organization of recent and remote memories. Nat. Rev. Neurosci., 6, 119–130.
Gabrieli, J. D. (1998). Cognitive neuroscience of human memory. Annu. Rev. Psychol., 49, 87–115.
Glanzer, M., Kim, K., Hilford, A., & Adams, J. (1999). Slope of the receiver-operating characteristic in recognition memory. J. Exp. Psychol. Learn. Mem. Cogn., 25, 500–513.
Gold, J. J., & Squire, L. R. (2005). Quantifying medial temporal lobe damage in memory-impaired patients. Hippocampus, 15, 79–85.
Graham, K. S., Scahill, V. L., Hornberger, M., Barense, M. D., Lee, A. C., Bussey, T. J., & Saksida, L. M. (2006). Abnormal categorization and perceptual learning in patients with hippocampal damage. J. Neurosci., 26, 7547–7554.
Greene, A. J., Spellman, B. A., Dusek, J. A., Eichenbaum, H. B., & Levy, W. B. (2001). Relational learning with and without awareness: Transitive inference using nonverbal stimuli in humans. Mem. Cogn., 29, 893–902.
Hampton, R. R. (2005). Monkey perirhinal cortex is critical for visual memory, but not for visual perception: Reexamination of the behavioural evidence from monkeys. Q. J. Exp. Psychol. [B], 58, 283–299.
Hampton, R. R., & Murray, E. A. (2002). Learning of discriminations is impaired, but generalization to altered views is intact, in monkeys (Macaca mulatta) with perirhinal cortex removal. Behav. Neurosci., 116, 363–377.
Hannula, D. E., Tranel, D., & Cohen, N. J. (2006). The long and the short of it: Relational memory impairments in amnesia, even at short lags. J. Neurosci., 26, 8352–8359.
Hartley, T., Bird, C. M., Chan, D., Cipolotti, L., Husain, M., Vargha-Khadem, F., & Burgess, N. (2007). The hippocampus is required for short-term topographical memory in humans. Hippocampus, 17, 34–48.
Hirano, M., & Noguchi, K. (1998). Dissociation between specific personal episodes and other aspects of remote memory in a patient with hippocampal amnesia. Percept. Mot. Skills, 87, 99–107.
Holdstock, J. S., Gutnikov, S. A., Gaffan, D., & Mayes, A. R. (2000). Perceptual and mnemonic matching-to-sample in humans: Contributions of the hippocampus, perirhinal and other medial temporal lobe cortices. Cortex, 36, 301–322.
Hood, K. L., Postle, B. R., & Corkin, S. (1999). An evaluation of the concurrent discrimination task as a measure of habit learning: Performance of amnesic subjects. Neuropsychologia, 37, 1375–1386.
Kirwan, C. B., Bayley, P. J., Galván, V. V., & Squire, L. R. (2008). Detailed recollection of remote autobiographical memory after damage to the medial temporal lobe. Proc. Natl. Acad. Sci. USA, 105, 2676–2680.
Kopelman, M. D., Wilson, B. A., & Baddeley, A. D. (1989). The Autobiographical Memory Interview: A new assessment of autobiographical and personal semantic memory in amnesic patients. J. Clin. Exp. Neuropsychol., 11, 724–744.
Lavenex, P., & Amaral, D. G. (2000). Hippocampal-neocortical interaction: A hierarchy of associativity. Hippocampus, 10, 420–430.
Lee, A. C., Barense, M. D., & Graham, K. S. (2005). The contribution of the human medial temporal lobe to perception: Bridging the gap between animal and human studies. Q. J. Exp. Psychol. [B], 58, 300–325.
Lee, A. C., Buckley, M. J., Pegman, S. J., Spiers, H., Scahill, V. L., Gaffan, D., Bussey, T. J., Davies, R. R., Kapur, N., Hodges, J. R., & Graham, K. S. (2005). Specialization in the medial temporal lobe for processing of objects and scenes. Hippocampus, 15, 782–797.
Lee, A. C., Bussey, T. J., Murray, E. A., Saksida, L. M., Epstein, R. A., Kapur, N., Hodges, J. R., & Graham, K. S. (2005). Perceptual deficits in amnesia: Challenging the medial temporal lobe “mnemonic” view. Neuropsychologia, 43, 1–11.
Levine, B., Svoboda, E., Hay, J. F., Winocur, G., & Moscovitch, M. (2002). Aging and autobiographical memory: Dissociating episodic from semantic retrieval. Psychol. Aging, 17, 677–689.
Levy, D. A., Shrager, Y., & Squire, L. R. (2005). Intact visual discrimination of complex and feature-ambiguous stimuli in the absence of perirhinal cortex. Learn. Memory, 12, 61–66.
Macmillan, N., & Creelman, C. (2005). Detection theory: A user’s guide. Mahwah, NJ: Lawrence Erlbaum Associates.
Malamut, B. L., Saunders, R. C., & Mishkin, M. (1984). Monkeys with combined amygdalo-hippocampal lesions succeed in object discrimination learning despite 24-hour intertrial intervals. Behav. Neurosci., 98, 759–769.
Mandler, G. (1980). Recognizing: The judgment of previous occurrence. Psychol. Rev., 87, 252–271.
Manns, J. R., Hopkins, R. O., Reed, J. M., Kitchener, E. G., & Squire, L. R. (2003). Recognition memory and the human hippocampus. Neuron, 37, 171–180.
Manns, J. R., & Squire, L. R. (2002). The medial temporal lobe and memory for facts and events. In A. D. Baddeley, M. D. Kopelman, & B. A. Wilson (Eds.), The handbook of memory disorders (pp. 81–99). New York: John Wiley & Sons.
McNaughton, B. L., Battaglia, F. P., Jensen, O., Moser, E. I., & Moser, M. B. (2006). Path integration and the neural basis of the “cognitive map.” Nat. Rev. Neurosci., 7, 663–678.
Milner, B. (1972). Disorders of learning and memory after temporal lobe lesions in man. Clin. Neurosurg., 19, 421–446.
Milner, B., Corkin, S., & Teuber, H.-L. (1968). Further analysis of the hippocampal amnesic syndrome: 14-year follow-up study of H.M. Neuropsychologia, 6, 215–234.
Moscovitch, M., Nadel, L., Winocur, G., Gilboa, A., & Rosenbaum, R. S. (2006). The cognitive neuroscience of remote episodic, semantic and spatial memory. Curr. Opin. Neurobiol., 16, 179–190.
Murray, E. A., & Bussey, T. J. (1999). Perceptual-mnemonic functions of the perirhinal cortex. Trends Cogn. Sci., 3, 142–151.
Nichols, E. A., Kao, Y. C., Verfaellie, M., & Gabrieli, J. D. (2006). Working memory and long-term memory for faces: Evidence from fMRI and global amnesia for involvement of the medial temporal lobes. Hippocampus, 16, 604–616.
O’Keefe, J., & Nadel, L. (1978). The hippocampus as a cognitive map. Oxford, UK: Clarendon.
Olson, I. R., Moore, K. S., Stark, M., & Chatterjee, A. (2006). Visual working memory is impaired when the medial temporal lobe is damaged. J. Cogn. Neurosci., 18, 1087–1097.
Olson, I. R., Page, K., Moore, K. S., Chatterjee, A., & Verfaellie, M. (2006). Working memory for conjunctions relies on the medial temporal lobe. J. Neurosci., 26, 4596–4601.
Pashler, H., & Carrier, M. (1996). Memory. In E. Bjork & R. Bjork (Eds.), Handbook of perception and cognition (pp. 3–29). San Diego: Academic Press.
Rempel-Clower, N. L., Zola, S. M., Squire, L. R., & Amaral, D. G. (1996). Three cases of enduring memory impairment after bilateral damage limited to the hippocampal formation. J. Neurosci., 16, 5233–5255.
Smith, C. N., & Squire, L. R. (2008). Experience-dependent eye movements for old and new scenes reflect hippocampus-dependent, declarative memory, even in the absence of memory instructions. J. Neurosci., 28, 12825–12833.
Squire, L. R. (1992). Memory and the hippocampus: A synthesis from findings with rats, monkeys, and humans. Psychol. Rev., 99, 195–231.
Squire, L. R. (2004). The legacy of patient H. M. for neuroscience. Neuron, 61, 6–9.
Squire, L. R., & Bayley, P. J. (2007). The neuroscience of remote memory. Curr. Opin. Neurobiol., 17, 185–196.
Squire, L. R., & Knowlton, B. J. (2000). The medial temporal lobe, the hippocampus, and the memory systems of the brain. In M. S. Gazzaniga (Ed.), The new cognitive neurosciences (pp. 765–779). Cambridge, MA: MIT Press.
Squire, L. R., Stark, C. E., & Clark, R. E. (2004). The medial temporal lobe. Annu. Rev. Neurosci., 27, 279–306.
Squire, L. R., Wixted, J. T., & Clark, R. E. (2007). Recognition memory and the medial temporal lobe: A new perspective. Nat. Rev. Neurosci., 8, 872–883.
Squire, L. R., & Zola-Morgan, S. (1991). The medial temporal lobe memory system. Science, 253, 1380–1386.
Squire, L. R., Zola-Morgan, S., & Chen, K. S. (1988). Human amnesia and animal models of amnesia: Performance of amnesic patients on tests designed for the monkey. Behav. Neurosci., 102, 210–221.
Stark, C. E., & Squire, L. R. (2000). Intact visual perceptual discrimination in humans in the absence of perirhinal cortex. Learn. Memory, 7, 273–278.
Steinvorth, S., Levine, B., & Corkin, S. (2005). Medial temporal lobe structures are needed to re-experience remote autobiographical memories: Evidence from H.M. and W.R. Neuropsychologia, 43, 479–496.
Ribot, T. (1881). Les maladies de la memoire [Diseases of memory]. Teng, E., Stefanacci, L., Squire, L. R., & Zola, S. M. (2000).
New York: Appleton-Century-Crofts. Contrasting effects on discrimination learning after hippocampal
Ryan, J. D., Althoff, R. R., Whitlow, S., & Cohen, N. J. (2000). lesions and conjoint hippocampal-caudate lesions in monkeys.
Amnesia is a deficit in relational memory. Psychol. Sci., 11, J. Neurosci., 20, 3853–3863.
454–461. Tulving, E., & Schacter, D. L. (1990). Priming and human
Sanders, H. I., & Warrington, E. K. (1971). Memory for remote memory systems. Science, 247, 301–306.
events in amnesic patients. Brain, 94, 661–668. Wais, P. E., Wixted, J. T., Hopkins, R. O., & Squire, L. R. (2006).
Schacter, D. L., & Tulving, E. (1994). Memory systems. Cambridge, The hippocampus supports both the recollection and the
MA: MIT Press. familiarity components of recognition memory. Neuron, 49,
Scoville, W. B., & Milner, B. (1957). Loss of recent memory after 459–466.
bilateral hippocampal lesions. J. Neurol. Neurosurg. Psychiatry, 20, Warrington, E. K. (1996). Studies of retrograde memory:
11–21. A long-term view. Proc. Natl. Acad. Sci. USA, 93, 13523–13526.
Shrager, Y., Gold, J. J., Hopkins, R. O., & Squire, L. R. (2006). Warrington, E. K., & Taylor, A. M. (1973). Immediate memory
Intact visual perception in memory-impaired patients with for faces: Long- or short-term memory? Q. J. Exp. Psychol., 25,
medial temporal lobe lesions. J. Neurosci., 26, 2235–2240. 316–322.
Shrager, Y., Kirwan, C. B., & Squire, L. R. (2008). The neural Whitlock, J. R., Sutherland, R. J., Witter, M. P., Moser,
basis of the cognitive map: Path integration does not require M. B., & Moser, E. I. (2008). Navigating from hippocampus to
hippocampus or entorhinal cortex. Proc. Natl. Acad. Sci. USA, 105, parietal cortex. Proc. Natl. Acad. Sci. USA, 105, 14755–14762.
12034–12038. Wixted, J. T., & Squire, L. R. (2004). Recall and recognition are
Shrager, Y., Levy, D. A., Hopkins, R. O., & Squire, L. R. (2008). equally impaired in patients with selective hippocampal damage.
Working memory and the organization of brain systems. J. Neu- Cogn. Affective Behav. Neurosci., 4, 58–66.
rosci., 28, 4818–4822. Yonelinas, A. P., Kroll, N. E., Dobbins, I., Lazzara, M., &
Smith, C., & Squire, L. R. (2005). Declarative memory, awareness, Knight, R. T. (1998). Recollection and familiarity deficits in
and transitive inference. J. Neurosci., 25, 10138–10146. amnesia: Convergence of remember-know, process dissociation,
Smith, C. N., Clark, R. E., Manns, J. R., & Squire, L. R. (2005). and receiver operating characteristic data. Neuropsychology, 12,
Acquisition of differential delay eyeblink classical conditioning is 323–339.
independent of awareness. Behav. Neurosci., 119, 78–86. Yonelinas, A. P., Kroll, N. E., Quamme, J. R., Lazzara,
Smith, C. N., Hopkins, R. O., & Squire, L. R. (2006). Experi- M. M., Sauve, M. J., Widaman, K. F., & Knight, R. T. (2002).
ence-dependent eye movements, awareness, and hippocampus- Effects of extensive temporal lobe damage or mild hypoxia on
dependent memory. J. Neurosci., 26, 11304–11312. recollection and familiarity. Nat. Neurosci., 5, 1236–1241.
690 memory
47 Reconsolidation: A Possible Bridge between Cognitive and Neuroscientific Views of Memory
Karim Nader
abstract The field of reconsolidation is one of the fastest growing fields in memory research. Students of memory find themselves in an extremely exciting period because memory research is beginning to be revealed at the neurobiological level as a fundamentally dynamic process. A neurobiological model of memory is emerging that can accommodate the dynamic nature of memory revealed in a long tradition of cognitive-oriented studies of human memory (Bartlett, 1932). This chapter will briefly address the history of consolidation and reconsolidation. It will describe the basis for which a reconsolidation phenomenon is thought to exist and address some central issues and unresolved problems of memory reconsolidation.

Karim Nader, Psychology Department, McGill University, Montreal, Quebec

When students are asked to give analogies of how the brain processes memories, they often suggest that memories are like pictures stored in a filing cabinet. Remembering is suggested to consist of pulling the correct files from the brain for examination and then filing them again, unchanged, back into the storage banks of the brain. Memory, that is, the recall of previously stored memory contents, is thus simply faithful readout, just like opening a data file on a computer. Depending on the tradition in which the student was trained, this analogy is half wrong, but different parts of the analogy might be wrong in different traditions.

The cognitive tradition views memory as a reconstructive process, in which memories and their content are subject to change. This view was first developed by Bartlett in his seminal book Remembering (1932). He suggested that each (episodic) memory recall represents essentially a re-creation, based on one’s current assumptions and beliefs about the world (schemata). A large body of work now supports this initial claim, indicating that under several conditions, memories can be easily corrupted (Loftus, 1997; Schacter, 1999). False memory paradigms can instill false memories within minutes, and subjects who are sensitive to these effects are often amazed and sometimes alarmed by how easily their memories can be manipulated without their consciously noticing the modifications. These and similar demonstrations established that remembering is not akin to a passive readout of a stored file; rather it is a reconstructive process in which new information is combined with current and previous experiences.

The physiological tradition, along with what is now referred to as systems neuroscience, somewhat limits the scope of the effects of malleability documented in cognitive psychology. Memory models developed in physiological psychology suggest that memories may only be substantially manipulated during a transient period of instability that follows their initial acquisition. It is assumed that memories initially exist in an unstable (labile) state and then become stable (consolidated) over time. This assumption explains why, if you are trying to remember a phone number and you are distracted within a few minutes of learning the phone number, you will likely forget the number in part or totally because the memory for the number was still in an unstable state when the interference occurred. If the same distraction were to happen 24 hours after the initial learning, chances are that the memory for the phone number would not be affected by the same distracter. Again, distracters are only effective shortly after new learning as only then is memory in an unstable state, whereas, after a few hours have passed, memory would have been stored, or “wired,” into the brain, resistant to distracters. In other words, once memory is stabilized, it becomes fixed and cannot be changed as easily as it can shortly after learning (McGaugh, 2000). The processes involved in memory “fixation” have been the main focus in a highly successful research program in the physiological tradition. How do memories become fixed over time in the brain? What are the cellular and molecular mechanisms mediating this transformation from a labile to a consolidated state?

It is obvious that there are only small areas of overlap between these two traditions. On the one hand, cognitive psychology reminds us that our memories are not snapshots of the past but elaborate reconstructions. On the other hand, many physiologists and neuroscientists study the mechanisms
[Figure residue, image not reproduced. Panel A contrasts STM (1. lasts seconds to hours; 2. labile; 3. does not require new RNA or protein synthesis) with LTM (1. develops over hours; 2. stable/fixed; 3. does require new RNA or protein synthesis). Panel B is a plot whose contents are not recoverable here.]

electroconvulsive shock (ECS), which impaired memory when applied shortly after training, suggesting that the memory was consolidated about one day after learning. However, if they presented the animals 24 hours after training with a reminder just prior to ECS, amnesia would follow (Misanin et al., 1968). According to consolidation theory, as the memory had already become insensitive to ECS, and thus consolidated, it should also remain so following memory retrieval. The authors suggested that transformation of a memory from an unstable to a stable state occurred not only for new memories but also for consolidated memories after they had been recalled. This effect, which was originally called “cue-induced amnesia,” led to a large number of studies. As with many (new) phenomena, the effect was not replicated in some paradigms (Dawson & McGaugh, 1969; Gold & King, 1972; Squire, Slater, & Chace, 1976). Nevertheless, it was found across a range of species and amnesic treatments (Lewis, 1979; Spear & Mueller, 1984), suggesting
STM) and again at 24 hours after reactivation (postreactivation LTM, PR-LTM). These were analogues to STM and LTM tests, respectively. We predicted that if reactivation of a consolidated memory initiated a second time-dependent memory process, then anisomycin infusions into the LBA should have no effect on PR-STM but should impair PR-LTM performance. This was the pattern of results that we found (figure 47.2B; Nader et al., 2000a). Furthermore, in the absence of memory reactivation, the same anisomycin injections did not induce amnesia, indicating that the memory was consolidated at that time. Based on these and other findings, we concluded that reactivation of a consolidated memory initiates a second time-dependent memory process that requires protein synthesis in the amygdala in order to be restabilized. The restabilization process is now called reconsolidation. The term was used at least as early as 1973 (Spear, 1973) and has been used in more recent times by Przybyslawski and Sara (1997) and Rodriguez and colleagues (1993). Again it is important to point out that making a memory “labile” is not consistent with it being weakened in any way, as suggested, for example, by Rudy, Biedenkapp, Moineau, and Bolding (2006), who argue that lability requires that memories be eliminated and then put back into the brain. This position is clearly at odds with the data and the proposed explanations of the phenomenon (Nader et al.).

[Figure 47.2 image not reproduced. Recoverable plot labels for panel B: y-axis, percent freezing; x-axis, PR-STM and PR-LTM; groups, control and anisomycin.]
Figure 47.2 (A) A schematic of Donald Lewis’s theory of memory processing. Active memories (AM) are considered to be unstable or labile. They are thought to last on the order of minutes to hours. Currently the evidence suggests that their expression does not require any new RNA or protein synthesis. Inactive memories (IM) are thought to develop over a few hours. Currently the data suggest that they require new protein and RNA synthesis in order to become stable. (B) Data from Nader, Schafe, and LeDoux (2000a) demonstrating that postreactivation infusions of the protein synthesis inhibitor anisomycin into the basolateral amygdala block reconsolidation of auditory fear conditioning. Note that the data meet the operational definition of a consolidation blockade, intact PR-STM and impaired PR-LTM.

Our findings have again led to a large number of studies, and again the evidence is much the same as after the original description of the phenomenon by Misanin and colleagues (1968). A time-dependent behavioral impairment has now been demonstrated across a variety of tasks (figure 47.3), such as object recognition (Kelly, Laroche, & Davis, 2003), incentive learning (Sangha, Scheibenstock, & Lukowiak, 2003), inhibitory avoidance (Anokhin, Tiunova, & Rose, 2002), motor sequence learning (Walker, Brakefield, Hobson, & Stickgold, 2003), appetitive conditioning (Wang, Ostlund, Nader, & Balleine, 2005), episodic memories (Forcato et al., 2007; Hupbach, Gomez, Hardt, & Nadel, 2007), and memories of drug reward (Lee, Di Ciano, Thomas, & Everitt, 2005; C. Miller & Marshall, 2005). A variety of different amnesic treatments have been shown to be effective in blocking reconsolidation, such as targeted protein (Nader et al., 2000a) or RNA synthesis inhibition (Duvarci, Nader, & LeDoux, 2008; Sangha et al., 2003), pharmacological inhibition of kinase activity (Kelly et al.), protein knockout mice (Bozon, Davis, & Laroche, 2003), inducible knockout mice (Kida et al., 2002), beta-adrenergic antagonists (Przybyslawski & Sara, 1997), and simply an interference by new learning (Forcato et al.; Hupbach et al.; Walker et al., 2003). In addition, reconsolidation has been reported across a broad spectrum of species, such as snails (Sangha et al., 2003), sea slugs (Child, Epstein, Kuzirian, & Alkon, 2003), crabs (Pedreira, Perez-Cuesta, & Maldonado, 2002), chicks (Anokhin et al.), mice (Kida et al.), rats (Nader et al., 2000a), and most importantly humans (Brunet et al., 2008; Forcato et al.; Hupbach et al.; Walker et al.). While most of the aforementioned studies have focused on blockade of reconsolidation, there is also evidence that reconsolidation can be enhanced by postreactivation increases in kinase activity (Tronson, Wiseman, Olausson, & Taylor, 2006).

[Figure 47.3 plots not reproduced. Each panel compares PR-STM with PR-LTM. Recoverable panel information:
Auditory fear conditioning, rats, intra-amygdala infusions: percent freezing, control vs. anisomycin (Nader et al., 2000a).
Context fear conditioning, rats, intra-hippocampus infusions: percent freezing, vehicle vs. anisomycin (Debiec, LeDoux, & Nader, 2002).
Panel title not recoverable: control vs. Zif268 KO (Bozon, Davis, & Laroche, 2003).
Panel title not recoverable: control vs. a group labeled “CREBI” in the extracted text (Kida et al., 2002).
Conditioned malaise, sea slugs: control vs. aniso (Child, Epstein, Kuzirian, & Alkon, 2003).
Motor sequence learning, human: control vs. interference (Walker, Brakefield, Hobson, & Stickgold, 2003).
Axis labels extracted for the last two panels: “Changes in Body Length,” “From Reactivation,” and “Percent Change.”]
Figure 47.3 Representative data from some of the studies reporting a reconsolidation impairment. Notice that they all show intact postreactivation STM (PR-STM) and impaired postreactivation LTM (PR-LTM). Labels above each graph refer to the paradigm and the nature of the amnesia treatment. Below each are the citations for the data.

Constraints on Reconsolidation The data set shows that reconsolidation can be found across a large number of paradigms and species, suggesting that reconsolidation is a fundamental phenomenon. However, it is apparently not ubiquitous, as some have failed to demonstrate it in the late phase of instrumental conditioning or memory of context (Biedenkapp & Rudy, 2004; Hernandez, Sadeghian, & Kelley, 2002). This fact would indicate that perhaps not all memories undergo reconsolidation. Furthermore, there is a large amount of exciting work demonstrating that the ability of a memory to undergo reconsolidation can be influenced by certain experimental parameters. For example, some older memories may be more resistant to reconsolidation (Eisenberg & Dudai,
[Figure 47.4 image not reproduced. Recoverable labels: panel A shows the chain of stimuli A, B, C; panel B lists the reactivation cue against the test for freezing to C when B is presented (reactivation with B: impaired; reactivation with A: no effect).]
Figure 47.4 (A) Cartoon of the chain of associations created to test whether being reactivated in an indirect manner changed a memory’s propensity to undergo reconsolidation (Doyere, Debiec, Monfils, Schafe, & LeDoux, 2007). (B) Presenting B during the reactivation directly elicits the B → C association that undergoes reconsolidation. Amnesic treatments would then induce an impairment in responding if B was presented on test. Indirect reactivation of the same association by presenting A during reactivation causes the directly reactivated A → B memory to undergo reconsolidation but not the indirectly reactivated association, B → C. Therefore, amnesic agents had no effect on responding to B.
from-amnesia paradigm. Furthermore, within this paradigm the consolidation-impairment view of amnesia only makes negative predictions, which do not prove that a memory is not present (Nader & Wang, 2006; Squire, 2006). For this reason, even at the present time, there are completely opposed views on the nature of experimental amnesia (for examples of the discordance on the issue see the special section in Learning and Memory, 13[5], 2006).

Consequently, we cannot address whether amnesia induced by postreactivation treatments is due to impairing reconsolidation or retrieval processes. However, what we can do is compare this amnesia to amnesia for new memories and apply the accepted standards in the field of consolidation to conclude that the former deficit is due to impaired memory storage. At the behavioral level, amnesia for new memories (Anokhin et al., 2002; Lattal & Abel, 2004; Quartermain & McEwen, 1970; Squire & Barondes, 1972) and reactivated memories (Anokhin et al.; Lattal & Abel; Quartermain & McEwen, 1970; Squire & Barondes, 1972) are similar, as they can both show recovery. Recovery from amnesia, however, can be consistent with both storage and retrieval-impairment views of amnesia (Nader & Wang, 2006; Squire, 2006).

Because the recovery-from-amnesia paradigm did not resolve the issue, the field has also examined whether the molecular and cellular changes that occur during the posttraining stabilization, or consolidation, period are lost in amnesic animals (Squire, 2006). For example, in Aplysia the consolidation of long-term facilitation is accompanied by an increase in synapse number (Bailey & Kandel, 1993). However, in amnesic preparations, in which a memory deficit was induced by inhibiting CREB phosphorylation, increase in synapse number is not observed (Bailey & Kandel, 1993). Such data are considered exclusively as evidence supporting the view that amnesia represents impairment of memory storage (Squire, 2006). While this is certainly a very positive step to advancing the issue, there are still theories of memory processing at the behavioral level (Lewis, 1979; R. Miller & Marlin, 1984; Spear, 1973) that may provide alternative interpretations of this kind of data that remain consistent with the behavioral impairment being a retrieval failure (Nader & Wang, 2006).

A few scientists have begun to examine what happens to established molecular and cellular correlates of LTM when reconsolidation is blocked. Importantly, long-term potentiation in its late phase (L-LTP) can undergo a reconsolidation-like process (Fonseca, Nagerl, & Bonhoeffer, 2006). In addition, learning-induced increases in field potentials in the amygdala are decreased when reconsolidation is blocked for that memory (Doyere, Debiec, Monfils, Schafe, & LeDoux, 2007). Molecular correlates of LTM have also been shown to return toward baseline levels when reconsolidation is blocked (C. Miller & Marshall, 2005; Rose & Rankin, 2006; Valjent et al., 2006), as if the brain area targeted by the amnesic agent reduced the plasticity in those circuits. Within
Figure 47.5 Systems reconsolidation in the hippocampus. Data
from a contextual fear-conditioning paradigm demonstrating
systems reconsolidation. Training (CS-US) consisted of 8 shock
presentations in a conditioning chamber. (A) 45 days after condi-
tioning, electrolytic lesions of the dorsal hippocampus (lesion)
immediately after memory reactivation (CS) produced a significant
impairment. Conversely, the same lesions had no effect when the
reactivation session (no CS) was omitted, demonstrating that 45
days after conditioning the memory is independent of the hippo-
campus. Thus reactivation of a hippocampus-independent memory
returns it to being hippocampus-dependent, an example of systems
reconsolidation. (B) Reactivation of the remote memory returned it
to being dependent on the hippocampus for less than 2 days. (C)
Memory model of the hippocampus demonstrating both-systems
reconsolidation. Over time the neocortex (possibly the anterior cin-
gulate) becomes competent to mediate a simple response and might
no longer need the hippocampus, at which point it is a remote
memory (top arrow). Reactivation of the remote memory causes the
cortical trace, which remains in the cortex, to require hippocampus
feedback (bottom arrow) over the next 2 days. (From Debiec,
LeDoux, & Nader, 2002.)
Debiec, J., LeDoux, J. E., & Nader, K. (2002). Cellular and systems reconsolidation in the hippocampus. Neuron, 36(3), 527–538.
Doyere, V., Debiec, J., Monfils, M. H., Schafe, G. E., & LeDoux, J. E. (2007). Synapse-specific reconsolidation of distinct fear memories in the lateral amygdala. Nat. Neurosci., 10(4), 414–416.
Dudai, Y. (2004). The neurobiology of consolidations, or, how stable is the engram? Annu. Rev. Psychol., 55, 51–86.
Dudai, Y., & Morris, R. (2000). To consolidate or not to consolidate: What are the questions? In J. Bolhuis (Ed.), Brain, perception, memory: Advances in cognitive sciences (pp. 149–162). Oxford, UK: Oxford University Press.
Duncan, C. P. (1949). The retroactive effect of electroconvulsive shock. J. Comp. Physiol. Psychol., 42, 32–44.
Duque, J., Mazzocchio, R., Stefan, K., Hummel, F., Olivier, E., & Cohen, L. G. (2008). Memory formation in the motor cortex ipsilateral to a training hand. Cereb. Cortex, 18(6), 1395–1406.
Duvarci, S., Mamou, C. B., & Nader, K. (2006). Extinction is not a sufficient condition to prevent fear memories from undergoing reconsolidation in the basolateral amygdala. Eur. J. Neurosci., 24(1), 249–260.
Duvarci, S., Nader, K., & LeDoux, J. E. (2005). Activation of extracellular signal-regulated kinase-mitogen-activated protein kinase cascade in the amygdala is required for memory reconsolidation of auditory fear conditioning. Eur. J. Neurosci., 21(1), 283–289.
Duvarci, S., Nader, K., & LeDoux, J. E. (2008). A comparison of the sensitivities of consolidation and reconsolidation to RNA synthesis inhibition. Learn. Memory, 15(10), 747–755.
Ebbinghaus, M. (1885). Über das Gedächtnis. Leipzig: K. Buehler.
Eichenbaum, H., Otto, T., & Cohen, N. J. (1994). Two functional components of the hippocampal memory system. Behav. Brain Sci., 17, 449–518.
Eisenberg, M., & Dudai, Y. (2004). Reconsolidation of fresh, remote, and extinguished fear memory in Medaka: Old fears don’t die. Eur. J. Neurosci., 20(12), 3397–3403.
Eisenberg, M., Kobilo, T., Berman, D. E., & Dudai, Y. (2003). Stability of retrieved memory: Inverse correlation with trace dominance. Science, 301(5636), 1102–1104.
Eisenhardt, D., & Menzel, R. (2007). Extinction learning, reconsolidation and the internal reinforcement hypothesis. Neurobiol. Learn. Mem., 87(2), 167–173.
Fischer, A., Sananbenesi, F., Schrick, C., Spiess, J., & Radulovic, J. (2004). Distinct roles of hippocampal de novo protein synthesis and actin rearrangement in extinction of contextual fear. J. Neurosci., 24(8), 1962–1966.
Flexner, L. B., Flexner, J. B., & Stellar, E. (1965). Memory and cerebral protein synthesis in mice as affected by graded amounts of puromycin. Exp. Neurol., 13(3), 264–272.
Fonseca, R., Nagerl, U. V., & Bonhoeffer, T. (2006). Neuronal activity determines the protein synthesis dependence of long-term potentiation. Nat. Neurosci., 9(4), 478–480.
Forcato, C., Burgos, V. L., Argibay, P. F., Molina, V. A., Pedreira, M. E., & Maldonado, H. (2007). Reconsolidation of declarative memory in humans. Learn. Memory, 14(4), 295–303.
Glickman, S. (1961). Perseverative neural processes and consolidation of the memory trace. Psychol. Bull., 58, 218–233.
Goelet, P., Castellucci, V. F., Schacher, S., & Kandel, E. R. (1986). The long and short of long-term memory: A molecular framework. Nature, 322(July), 419–422.
Gold, P. E., & King, R. A. (1972). Amnesia: Tests of the effect of delayed footshock-electroconvulsive shock pairings. Physiol. Behav., 8(5), 797–800.
Gold, P. E., & King, R. A. (1974). Retrograde amnesia: Storage failure versus retrieval failure. Psychol. Rev., 81(5), 465–469.
Gordon, W. C. (1977a). Similarities of recently acquired and reactivated memories in interference. Am. J. Psychol., 90(2), 231–242.
Gordon, W. C. (1977b). Susceptibility of a reactivated memory to the effects of strychnine: A time-dependent phenomenon. Physiol. Behav., 18(1), 95–99.
Hebb, D. O. (1949). The organization of behavior. New York: Wiley.
Hernandez, P. J., Sadeghian, K., & Kelley, A. E. (2002). Early consolidation of instrumental learning requires protein synthesis in the nucleus accumbens. Nat. Neurosci., 5(12), 1327–1331.
Hupbach, A., Gomez, R., Hardt, O., & Nadel, L. (2007). Reconsolidation of episodic memories: A subtle reminder triggers integration of new information. Learn. Memory, 14(1–2), 47–53.
Kandel, E. R. (2001). The molecular biology of memory storage: A dialogue between genes and synapses. Science, 294(5544), 1030–1038.
Kelly, A., Laroche, S., & Davis, S. (2003). Activation of mitogen-activated protein kinase/extracellular signal-regulated kinase in hippocampal circuitry is required for consolidation and reconsolidation of recognition memory. J. Neurosci., 23(12), 5354–5360.
Kida, S., Josselyn, S. A., de Ortiz, S. P., Kogan, J. H., Chevere, I., Masushige, S., et al. (2002). CREB required for the stability of new and reactivated fear memories. Nat. Neurosci., 5(4), 348–355.
Kim, J. J., & Fanselow, M. S. (1992). Modality-specific retrograde amnesia of fear. Science, 256, 675–677.
Land, C., Bunsey, M., & Riccio, D. C. (2000). Anomalous properties of hippocampal lesion-induced retrograde amnesia. Psychobiology, 28, 476–485.
Lattal, K. M., & Abel, T. (2004). Behavioral impairments caused by injections of the protein synthesis inhibitor anisomycin after contextual retrieval reverse with time. Proc. Natl. Acad. Sci. USA, 101, 4667–4672.
LeDoux, J. E. (2000). Emotion circuits in the brain. Annu. Rev. Neurosci., 23, 155–184.
Lee, J. L., Di Ciano, P., Thomas, K. L., & Everitt, B. J. (2005). Disrupting reconsolidation of drug memories reduces cocaine-seeking behavior. Neuron, 47(6), 795–801.
Lewis, D. J. (1979). Psychobiology of active and inactive memory. Psychol. Bull., 86(5), 1054–1083.
Loftus, E. F. (1997). Creating false memories. Sci. Am., 277(3), 70–75.
Loftus, E. F., & Yuille, J. C. (1984). Departures from reality in human perception and memory. In H. Weingartner & E. S. Parker (Eds.), Memory consolidation: Psychobiology of cognition (pp. 163–184). Hillsdale, NJ: Lawrence Erlbaum Associates.
Malinow, R., & Malenka, R. C. (2002). AMPA receptor trafficking and synaptic plasticity. Annu. Rev. Neurosci., 25, 103–126.
Maren, S. (2001). Neurobiology of Pavlovian fear conditioning. Annu. Rev. Neurosci., 24, 897–931.
Martin, S. J., Grimwood, P. D., & Morris, R. G. (2000). Synaptic plasticity and memory: An evaluation of the hypothesis. Annu. Rev. Neurosci., 23, 649–711.
McClelland, J. L., McNaughton, B. L., & O’Reilly, R. C. (1995). Why there are complementary learning systems in the hippocampus and neocortex: Insights from the successes and
Stollhoff, N., Menzel, R., & Eisenhardt, D. (2005). Spontaneous recovery from extinction depends on the reconsolidation of the acquisition memory in an appetitive learning paradigm in the honeybee (Apis mellifera). J. Neurosci., 25(18), 4485–4492.
Suzuki, A., Josselyn, S. A., Frankland, P. W., Masushige, S., Silva, A. J., & Kida, S. (2004). Memory reconsolidation and extinction have distinct temporal and biochemical signatures. J. Neurosci., 24(20), 4787–4795.
Tronson, N. C., Wiseman, S. L., Olausson, P., & Taylor, J. R. (2006). Bidirectional behavioral plasticity of memory reconsolidation depends on amygdalar protein kinase A. Nat. Neurosci., 9(2), 167–169.
Tulving, E. (2002). Episodic memory: From mind to brain. Annu. Rev. Psychol., 53, 1–25.
Tulving, E., & Thomson, D. M. (1973). Encoding specificity and retrieval processes in episodic memory. Psychol. Rev., 80(5), 359–380.
Valjent, E., Aubier, B., Corbille, A. G., Brami-Cherrier, K., Caboche, J., Topilko, P., et al. (2006). Plasticity-associated gene Krox24/Zif268 is required for long-lasting behavioral effects of cocaine. J. Neurosci., 26(18), 4956–4960.
Walker, M. P., Brakefield, T., Hobson, J. A., & Stickgold, R. (2003). Dissociable stages of human memory consolidation and reconsolidation. Nature, 425(6958), 616–620.
Wang, S. H., Ostlund, S. B., Nader, K., & Balleine, B. W. (2005). Consolidation and reconsolidation of incentive learning in the amygdala. J. Neurosci., 25(4), 830–835.
Wiltgen, B. J., & Silva, A. J. (2007). Memory for context becomes less specific with time. Learn. Memory, 14(4), 313–317.
Winocur, G., Moscovitch, M., & Sekeres, M. (2007). Memory consolidation or transformation: Context manipulation and hippocampal representations of memory. Nat. Neurosci., 10(5), 555–557.
abstract Cognitive control refers to the set of processes that guide thought and action in accordance with current goals. In this chapter we consider the manner in which cognitive control mechanisms guide mnemonic processing. First, we consider the architecture of prefrontal cortex (PFC) and review leading theories of how PFC operations support distinct forms of control. Next, we consider two illustrative and well-characterized situations in which PFC control guides mnemonic processing: (1) when competition between memories creates interference, and (2) when ineffective retrieval cues yield uncertainty. Finally, we consider the ways in which prior mnemonic experiences may reduce future interference and uncertainty, thereby easing the demands placed on PFC control mechanisms. Together, these considerations highlight the dynamic interplay between cognitive control and memory.
Cognitive control refers to the set of processes that guide thought and action in accordance with current goals. Central to higher cognitive function, cognitive control allows organisms to represent task demands, flexibly work with memory, and promote context- and goal-relevant information processing in the face of distraction. Control mechanisms are particularly important in unfamiliar situations or changing environments when acquired knowledge provides either insufficient or inappropriate information to satisfy current demands. The prefrontal cortex (PFC) is a fundamental component of the neural circuitry supporting cognitive control. By orchestrating the influence of past experience on present behavior, PFC mechanisms configure neural processing to optimize behavior.
In this chapter we explore the dynamic interaction between control mechanisms and memory, with a specific focus on prefrontal contributions to cognitive control. We begin with a brief description of the neural circuitry supporting cognitive control, focusing on the anatomy and connectivity of subregions within PFC. Next, we discuss current theories that characterize the mechanisms, functional organization, and regulation of cognitive control. Finally, we review functional neuroimaging and lesion evidence for the interaction between cognitive control and memory, with an emphasis on the interplay between mnemonic uncertainty, interference, and PFC-mediated control functions.
PFC anatomy and connectivity
This chapter will focus on the function of four main subregions within PFC that have been implicated in cognitive control: ventrolateral, dorsolateral, frontopolar, and medial PFC (figure 48.1). Ventrolateral PFC (VLPFC) corresponds to the inferior frontal gyrus, encompassing pars orbitalis (area 47/12 in Petrides & Pandya, 2002), pars triangularis (∼Brodmann’s area [BA] 45), and pars opercularis (∼BA 44). Following Badre and Wagner (2007), we refer to pars orbitalis as anterior VLPFC and pars triangularis as mid-VLPFC (note that these two regions have been collectively termed mid-VLPFC by Petrides & Pandya, 2002) and to pars opercularis as posterior VLPFC. Dorsolateral PFC (DLPFC) refers to regions within the middle frontal gyrus (areas 8, 9/46, and 46; Petrides & Pandya, 1999). In humans, the ventral bound of this region is defined by the inferior frontal sulcus and the dorsal bound by the superior frontal sulcus. Frontopolar cortex (∼BA 10) corresponds to the most rostral portion of PFC, including portions of middle frontal gyrus. The medial wall of PFC includes portions of BAs 8, 9, and 10 and the anterior cingulate cortex (ACC; BAs 24 and 32). Though anatomically distinct, lateral and medial PFC subregions have been shown to be interconnected both with each other and with more posterior regions of cortex, including medial and lateral temporal cortex and posterior parietal cortex (Petrides & Pandya, 1999, 2002, 2007).
race, kuhl, badre, and wagner: cognitive control and memory 705
Figure 48.1 Anatomical subdivisions of the PFC. (A) Lateral view of left PFC depicting cytoarchitectonic areas (numbered). Anterior VLPFC corresponds to areas 47/12, mid-VLPFC corresponds to area 45, and posterior VLPFC corresponds to area 44. DLPFC corresponds to middle frontal gyrus including areas 8, 9/46, and 46. FPC corresponds to area 10. (B) Medial view of right PFC. ACC corresponds to areas 24 and 32. FPC corresponds to area 10. VLPFC, ventrolateral prefrontal cortex. DLPFC, dorsolateral prefrontal cortex. FPC, frontopolar cortex. ACC, anterior cingulate cortex. (Reprinted from M. Petrides & D. N. Pandya, 1999. Dorsolateral prefrontal cortex: Comparative cytoarchitectonic analysis in the human and the macaque brain and corticocortical connection patterns. Eur. J. Neurosci., 11, 1011–1036. Copyright 1999, with permission from Blackwell Synergy.)
Figure 48.2 Model of PFC and anterior cingulate involvement during performance of the Stroop task. Circles represent processing units, which correspond to a population of neurons assumed to code a given piece of information. Lines represent connections between units, with heavier lines indicating stronger connections. Looped connections with black circles indicate mutual inhibition among units within that layer. In the Stroop task, subjects must name the ink color in which a word is presented, rather than read the word. The presentation of a conflict stimulus (the word “blue” displayed in red ink) activates (indicated by gray fill) input layer units representing “red ink” and the word “blue.” The “colors” task demand unit is activated in PFC (gray fill), representing the current goal to name the color of the ink, and passes activation to the intermediate units in the color-naming pathway (indicated by arrows), increasing the activation of those units and biasing processing in favor of activity flowing along the color-naming pathway. This bias favors activation of the response unit (“red”) corresponding to the color input (red ink), even though the connection weights in this pathway are weaker than in the word-reading pathway that would favor a response based on reading the word (“blue”). By computing the level of conflict (or the presence of simultaneously active representations in the response layer), ACC initially detects the need for this top-down bias from PFC. ACC, anterior cingulate cortex. (Adapted with permission from M. M. Botvinick, T. S. Braver, D. M. Barch, C. S. Carter, & J. D. Cohen, 2001. Conflict monitoring and cognitive control. Psychol. Rev., 108, 624–652. Copyright 2001, American Psychological Association.)
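The biasing and conflict-detection ideas described in the figure 48.2 caption can be illustrated with a toy calculation. The Python sketch below is not the published model: the weights, the resting bias, the logistic activation rule, and the use of a simple product as the conflict signal are all assumptions chosen for illustration, and the names `stroop_trial`, `conflict`, and `winner` are hypothetical. It only shows how top-down input to a weaker pathway could let the goal-relevant response win, and how co-active responses could yield a nonzero conflict value.

```python
import math

# Hypothetical pathway strengths: word reading is more practiced than color naming.
W_WORD = 4.0
W_COLOR = 2.0
RESTING_BIAS = -4.0  # hypothetical negative resting input to pathway units


def logistic(x):
    """Squash net input into an activation between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def stroop_trial(ink, word, color_task_bias):
    """Return response activations for one display ("red" or "blue" only).

    color_task_bias stands in for the PFC "colors" task-demand unit:
    extra top-down input delivered only to the color-naming pathway.
    """
    responses = {"red": 0.0, "blue": 0.0}
    color_unit = logistic(W_COLOR + RESTING_BIAS + color_task_bias)
    word_unit = logistic(W_WORD + RESTING_BIAS)
    responses[ink] += color_unit   # color-naming pathway votes for the ink color
    responses[word] += word_unit   # word-reading pathway votes for the printed word
    return responses


def conflict(responses):
    """Illustrative conflict signal: product of the competing response activations."""
    return responses["red"] * responses["blue"]


def winner(responses):
    """Return the response with the highest activation."""
    return max(responses, key=responses.get)
```

Under these assumed values, the word “blue” in red ink produces the word response when `color_task_bias` is 0.0 and the color response when it is 4.0. A congruent display (“red” in red ink) gives a conflict value of zero because only one response unit is active.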
706 memory
Specifically, the maintenance of task-relevant contextual representations in PFC has been proposed to bias establishment of appropriate mappings between sensory inputs, internal states, and motor outputs. In the absence of cognitive control, behavior is driven in an automatic, bottom-up fashion by representations that are most strongly activated by input cues. However, when weakly established (but task-relevant) representations must be selected in the face of competition from stronger (but task-irrelevant) representations, PFC control signals are thought to bias the flow of information processing to enhance the strength of the relevant representations and overcome the task-irrelevant competitors (Cohen, Dunbar, & McClelland, 1990).
Illustrative of this putative bias mechanism, consider the Stroop paradigm, wherein subjects are presented color words in different ink colors and are asked to name the ink color (figure 48.2). Presentation of a word strongly elicits the prepotent response to read the word, because subjects have more experience reading words than naming the color of word print. Thus, if the ink color is incongruent with the color word (e.g., “BLUE” in red ink), a prepotent response (“blue”) must be overcome in favor of a weaker response (“red”). Biased competition theory proposes that lateral PFC represents the current task goal (e.g., name the ink color) and biases processing in color-naming pathways to favor the weaker but goal-relevant response (Cohen, Dunbar, & McClelland, 1990).
Importantly, top-down bias mechanisms have been argued to support a variety of functions, including working memory, selective attention, controlled retrieval from long-term memory, task switching, response inhibition, and response selection. While the biased competition theory proposes a central mechanism for cognitive control, there may be multiple types of control that differ in their form or domain. In the next sections, we describe several theories that focus on the functional architecture of control and its relationship to the organization of PFC.
The Dorsal-Ventral Hypothesis A complementary perspective on cognitive control suggests that dorsal and ventral regions of lateral PFC mediate dissociable, but interactive, forms of control (Petrides, 1994; Owen, Evans, & Petrides, 1996). In this view, control mechanisms supported by VLPFC and DLPFC operate over different loci or types of representations (Petrides, 1996). VLPFC mechanisms have been proposed to support controlled retrieval and selection of long-term knowledge stored in posterior cortices and the maintenance of these representations within working memory, while DLPFC mechanisms have been proposed to support the monitoring and manipulation of the representations retrieved and maintained by VLPFC (e.g., D’Esposito et al., 1998; Petrides, 2002).
Neuroimaging and lesion data support the proposal that DLPFC and VLPFC functionally differ. For example, lesions of mid-DLPFC (areas 9/46 and 46) produce impairments in the ability to order information in working memory (Petrides, 2000), and functional magnetic resonance imaging (fMRI) studies indicate that DLPFC activity increases during complex working memory tasks, such as when working memory loads are high (Rypma, Prabhakaran, Desmond, Glover, & Gabrieli, 1999), as well as when representations held in working memory must be reordered (D’Esposito, Postle, Ballard, & Lease, 1999; Postle, Berger, & D’Esposito, 1999; Wagner, Maril, Bjork, & Schacter, 2001) or updated (Salmon et al., 1996; Garavan, Ross, Li, & Stein, 2000). Similarly, within episodic retrieval tasks, DLPFC activation has often been associated with monitoring retrieved mnemonic information (e.g., Henson, Rugg, Shallice, & Dolan, 2000; Fletcher & Henson, 2001; Dobbins, Foley, Schacter, & Wagner, 2002; Rugg, Henson, & Robb, 2003; Achim & Lepage, 2005; Dobbins, Simons, & Schacter, 2004). In contrast, neuroimaging studies indicate that activity within VLPFC increases during the controlled retrieval and selection of information from long-term memory, as well as in the presence of mnemonic interference (Thompson-Schill, D’Esposito, Aguirre, & Farah, 1997; Jonides, Smith, Marshuetz, Koeppe, & Reuter-Lorenz, 1998; Bunge, Ochsner, Desmond, Glover, & Gabrieli, 2001; Badre & Wagner, 2002). We will further discuss VLPFC contributions to mnemonic control in the section on the interaction between control and memory.
Rostrocaudal Hierarchies In addition to apparent dorsal/ventral dissociations, accumulating evidence suggests that hierarchically organized cognitive control processes map to a functional gradient along the rostrocaudal axis of lateral frontal cortex (Christoff & Gabrieli, 2000; Fuster, 2001; Koechlin, Ody, & Kouneiher, 2003; Wood & Grafman, 2003; Bunge & Zelazo, 2006; Koechlin & Jubault, 2006; Petrides, 2006; Badre & D’Esposito, 2007; Botvinick, 2007; Koechlin & Summerfield, 2007; Badre, 2008). More caudal regions of frontal cortex, inclusive of premotor cortex, are thought to control processing at “lower” levels of representation in the stimulus-action processing hierarchy, such as response selection (figure 48.3). Progressively more anterior regions of frontal cortex are proposed to support control mechanisms that operate upon increasingly “higher” levels of representation (Christoff & Gabrieli, 2000; Badre, 2008), including more abstract higher-order plans or complex schemas. Functional organization along the horizontal axis of lateral PFC has also been characterized as mediating cross-temporal contingencies between past, present, and future events, with caudal PFC mechanisms guiding behavior based upon the immediate context in
Figure 48.3 Hierarchical organization of cognitive representations in lateral cortex. (A) Schema of two hierarchies of cortical memory, executive memory, and perceptual memory, and the distribution of these hierarchies in frontal and posterior cortical regions, respectively. In frontal cortex, representations that are “higher” in the processing hierarchy are mapped to more rostral regions, and “lower”-level representations are mapped to more caudal regions. (Reprinted from J. M. Fuster, 2001, The prefrontal cortex—An update: Time is of the essence, Neuron, 30, 319–333. Copyright 2001, with permission from Elsevier.) (B) Neuroimaging data providing evidence for representational hierarchies in frontal cortex. Spheres from Badre and D’Esposito (2007) (red) reflect foci of activation with experimental manipulations at different levels of representation: A, the response level; C, the feature level; E, the dimension level; G, the context level. Spheres from Koechlin, Ody, and Kouneiher (2003) (blue) reflect foci of activation with manipulations of different levels of control: B, sensory control; D, contextual control; F, episodic control. (Adapted with permission from D. Badre & M. D’Esposito, 2007, Functional magnetic resonance imaging evidence for a hierarchical organization of the prefrontal cortex, J. Cogn. Neurosci., 19, 2082–2099. Copyright 2007, with permission from the MIT Press.) (See color plate 60.)
which a stimulus occurs and more rostral PFC regions processing information that is successively more remote in time (Fuster, 2001; Braver, Reynolds, & Donaldson, 2003; Koechlin et al., 2003; Koechlin & Summerfield, 2007).
With its location at the most rostral extent of PFC, frontopolar cortex (FPC; ∼BA 10; figure 48.1) may be positioned at the apex of the putative control hierarchy (Koechlin & Summerfield, 2007). While the precise functions of FPC remain to be determined, neuroimaging studies have consistently observed FPC activation during higher-level cognitive tasks, complex working memory tasks, and episodic retrieval (Fletcher & Henson, 2001; Ramnani & Owen, 2004). For example, FPC is recruited when previously selected goals or task-relevant information must be maintained in a pending state until ongoing subtasks are executed (Koechlin, Basso, Pietrini, Panzer, & Grafman, 1999; Braver & Bongiolatti, 2002; Badre & Wagner, 2004; Koechlin & Hyafil, 2007). Similarly, FPC has been associated with higher-order functions such as integrating across multiple sources of information (Christoff et al., 2001; Bunge, Wendelken, Badre, & Wagner, 2004; Ramnani & Owen, 2004; Green, Fugelsang, Kraemer, Shamosh, & Dunbar, 2006; De Pisapia, Slomski, & Braver, 2007) or evaluating the products of internally generated information (Christoff, Ream, Geddes, & Gabrieli, 2003).
Regulation of Control While control mechanisms supported by lateral PFC are thought to drive goal-relevant behavior, equally important are the mechanisms through which control is regulated. Substantial evidence indicates that regions within medial PFC, including the anterior cingulate cortex, serve this modulatory role (but see Fellows & Farah, 2005). Specifically, ACC computations have been alternately proposed to detect the presence of conflict (Botvinick, Cohen, & Carter, 2004; Kerns et al., 2004; MacDonald, Cohen, Stenger, & Carter, 2000), error likelihood (Brown & Braver, 2005), or uncertainty (Walton, Devlin, & Rushworth, 2004), and to signal lateral PFC mechanisms to increase top-down biasing of task-appropriate representations. For example, ACC may detect the presence of simultaneously active, competing representations (such as conflicting responses elicited by incongruent trials in the Stroop paradigm) and provide feedback signals to lateral PFC that up-regulate control (figure 48.2). Consistent with this proposal, imaging studies have documented functional coactivation of ACC and lateral PFC under situations of response and mnemonic conflict (e.g., Bunge, Burrows, & Wagner, 2004; Badre & Wagner, 2004; Kerns et al., 2004; Kuhl, Dudukovic, Kahn, & Wagner, 2007).
The basal ganglia (BG) have also been implicated in regulating PFC-mediated control processes. For example, in situations of response inhibition it has been argued that the subthalamic nucleus (a component of the BG) interacts with right VLPFC and preSMA such that initiated motor responses can be terminated (Aron & Poldrack, 2006; Aron et al., 2007). It has been argued, through computational models, that PFC-BG interactions also support cognitive operations, such as working memory performance (O’Reilly & Frank, 2006; Hazy, Frank, & O’Reilly, 2007). Specifically, this work has suggested that the BG gate PFC processing depending on task demands, with BG “learning” which PFC mechanisms to gate through dopamine-mediated reinforcement learning. This hypothesis has received support from recent evidence that PFC-BG interactions support working memory performance and that BG activation prior to the onset of working memory trials is predictive of the extent to which task-irrelevant information is successfully gated, or denied processing (McNab & Klingberg, 2008).
Interactions between control and memory
Having surveyed leading theories of how PFC implements cognitive control, we now consider the manner in which prefrontal control interacts with mnemonic operations. However, because there are numerous examples of such interactions across multiple forms and stages of memory and involving multiple PFC subregions (for reviews see Fletcher & Henson, 2001; Wagner, 2002; Buckner, 2003; Simons & Spiers, 2003), we restrict our focus to two examples of PFC involvement in mnemonic processing. Specifically, we consider how VLPFC mechanisms contribute to performance (1) when memory representations interfere with each other and (2) when ineffective retrieval cues yield uncertainty.
Figure 48.4 Damage to mid- and posterior VLPFC in humans
impairs the ability to select relevant semantic information under
high-selection demands. (A) Location of PFC lesions in patients
with selection deficits on a verb-generation task. Scale represents
amount of lesion overlap across patient group. (B) Mean number
of errors in a task requiring subjects to generate semantically appro-
priate verbs for concrete nouns under high-selection demands
(filled bars) versus low-selection demands (unfilled bars). Nouns in
the high-selection group had a lower response-strength ratio (ratio
of the relative response frequency of the most common completion
to the relative response frequency of the second most common
completion) than did nouns in the low-selection condition. Subject
groups were composed of patients with lesions restricted to left
inferior frontal gyrus (left IFG group), patients with frontal lesions
outside of left IFG (frontal controls), and healthy older adults
(elderly controls). (Adapted with permission from S. L. Thompson-
Schill, D. Swick, M. J. Farah, M. D’Esposito, I. P. Kan, & R. T.
Knight, 1998, Verb generation in patients with focal frontal lesions:
A neuropsychological test of neuroimaging findings, Proc. Natl. Acad.
Sci. USA, 95, 15855–15860. Copyright 1999, with permission from
National Academy of Sciences, U.S.A.)
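The response-strength ratio defined in the figure 48.4 caption is a simple calculation, sketched below in Python. The function name `response_strength_ratio` and the idea of passing in a flat list of generated verbs are assumptions made for illustration; the source does not describe how the norms were tabulated. The nouns “scissors” and “wheel” are the chapter's own examples, but any counts supplied to the function would be hypothetical.

```python
from collections import Counter


def response_strength_ratio(completions):
    """Ratio of the relative frequency of the most common completion
    to the relative frequency of the second most common completion.

    `completions` is a list of verbs generated for one noun
    (a hypothetical input format, assumed for this sketch).
    """
    counts = Counter(completions)
    if len(counts) < 2:
        raise ValueError("need at least two distinct completions")
    (_, first), (_, second) = counts.most_common(2)
    total = sum(counts.values())
    # Relative frequencies are computed explicitly to mirror the caption's
    # definition, although the shared total cancels in the ratio.
    return (first / total) / (second / total)
```

A noun with one dominant verb yields a high ratio (a low-selection item), whereas a noun whose responses are spread across several verbs yields a ratio near 1 (a high-selection item).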
Figure 48.5 Evidence for left mid-VLPFC involvement in resolving interference during the Sternberg working memory task. (A) The interference variant of the Sternberg working memory paradigm in which subjects maintain a set of four letters in working memory until a probe letter appears, at which point they indicate whether the probe is a member of the currently maintained set (positive probe) or is not a member of the currently relevant set (negative probe). Interference occurs when a negative probe is a member of the immediately preceding set (negative recent) relative to a negative probe that is not a member of the immediately preceding set (negative nonrecent). (B) Greater activation in left mid-VLPFC (circled), as measured by fMRI, occurs during negative recent (hatched bar) compared to negative nonrecent trials (unfilled bar), reflecting greater recruitment of left mid-VLPFC in the presence of interference in working memory. (A and B adapted from D. Badre & A. D. Wagner, 2005, Frontal lobe mechanisms that resolve proactive interference, Cereb. Cortex, 15, 2003–2012. Copyright 2005, with permission from Oxford University Press.) (C) Damage to left middle and inferior frontal gyri in patient R.C. impairs the ability to successfully reject negative recent probes. Patient R.C., a 51-year-old male with a significant lesion in left middle and inferior frontal gyri, showed a pronounced interference effect in both response times (left panel) and accuracy (right panel) compared to four control groups: control subjects that were matched in age and education to R.C. (Controls: CN); frontal patients with damage outside of left mid-VLPFC (Frontal Patients: FR); older adults matched in age and education to the frontal patient group (Elderly: EA); and a group of young adults (Young: YA). (Adapted with permission from S. L. Thompson-Schill, J. Jonides, C. Marshuetz, E. E. Smith, M. D’Esposito, I. P. Kan, R. T. Knight, & D. Swick, 2002, Effects of frontal lobe damage on interference effects in working memory, Cogn. Affective Behav. Neurosci., 2, 109–120. Copyright 2002, with permission from Psychonomic Society, Inc.)
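The probe types defined in panel A of figure 48.5 amount to a small decision rule, sketched here in Python. The function name `classify_probe` and the representation of memory sets as Python sets of letters are assumptions for illustration only. The sketch labels trials and does not model behavior or neural activity.

```python
def classify_probe(probe, current_set, previous_set):
    """Label a probe in the interference variant of the Sternberg task.

    current_set  -- letters maintained on the current trial (trial N)
    previous_set -- letters maintained on the preceding trial (trial N - 1)
    """
    if probe in current_set:
        # Member of the maintained set, whether or not it also appeared earlier.
        return "positive"
    if probe in previous_set:
        # Familiar from the last trial but not in the current set: interference.
        return "negative recent"
    return "negative nonrecent"


# Hypothetical letter sets for two consecutive trials.
previous_set = {"B", "K", "T", "R"}
current_set = {"D", "M", "P", "S"}
```

With these assumed sets, the probe “K” is a negative recent probe, “Z” is a negative nonrecent probe, and “M” is a positive probe.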
the extent that semantic decisions require selecting goal-relevant information in the face of competition (Thompson-Schill et al., 1997). For example, in one task subjects were shown nouns and required to generate semantically related verbs; critically, some of the nouns were associated with a dominant verb (e.g., “scissors” strongly elicits “cut”; a low-selection situation), whereas other nouns were associated with multiple verbs (e.g., “wheel” may elicit “turn,” “steer,” and “drive”; a high-selection situation). Functional MRI revealed greater left mid- and posterior VLPFC activation during generation under high- relative to low-selection demands (for related findings, see Thompson-Schill, D’Esposito, & Kan, 1999; Badre, Poldrack, Paré-Blagoev, Insler, & Wagner, 2005). Subsequent work demonstrated that damage to left mid- and posterior VLPFC in humans impairs the ability to select relevant semantic representations, specifically when competition is present, establishing the necessity of this region for resolving semantic interference (figure 48.4; Thompson-Schill et al., 1998).
Additional evidence for the role of left mid-VLPFC in resolving interference comes from studies using the interference variant of the Sternberg working memory paradigm (figure 48.5). In this paradigm, each trial requires the encoding and maintenance of a set of stimuli in working memory and determination of whether a subsequently presented test probe is or is not a member of the currently maintained set (trial N). Interference occurs when the test probe is not a member of the currently maintained set but was a member of the previously maintained set (trial N − 1); such probes are termed “negative recent” probes. The now classic finding is that “negative recent” probes elicit greater activation in left mid-VLPFC than do “negative nonrecent” probes, that is, trials requiring the same decision but without interference (figure 48.5; e.g., Jonides et al., 1998; D’Esposito, Postle, Jonides, & Smith, 1999; Bunge et al., 2001; Badre & Wagner, 2005; Nee, Jonides, & Berman, 2007). Moreover, the ability to successfully reject negative recent probes is compromised by left mid-VLPFC damage (Thompson-Schill et al., 2002; figure 48.5) or disruption by means of transcranial magnetic stimulation (Feredoes, Tononi, & Postle, 2006). Mechanistically, it has been argued that rejecting negative recent probes engages left mid-VLPFC because accurate task performance requires identifying (selecting) the relevant context for the familiar negative probe (i.e., that it appeared in the last trial) so that it can be appropriately rejected (Badre & Wagner, 2005; for alternative interpretations, see Jonides & Nee, 2006).
Within the domain of episodic memory, selection to overcome interference likely plays a role during both encoding and retrieval. In a classic PET study, Dolan and Fletcher (1997) measured neural responses during the encoding of word pairs, manipulating the extent to which prior learning interfered with current encoding (i.e., proactive interference). Specifically, subjects first studied a list of word pairs (e.g., “dog-boxer”); next, a second list of word pairs was studied, containing either repeated pairs, completely novel pairs, or pairs that partially overlapped with previously studied pairs (e.g., “sportsman-boxer”; the proactive interference condition). Dolan and Fletcher observed that left lateral PFC, inclusive of left mid-VLPFC, was highly sensitive to the presence of interference, as this region was differentially engaged when subjects were encoding word pairs that overlapped with previously studied pairs. Additional findings relating left mid-VLPFC to the resolution of proactive interference have been reported in more recent fMRI studies of episodic encoding (Fletcher, Shallice, & Dolan, 2000; Henson, Shallice, Josephs, & Dolan, 2002), complementing neuropsychological observations that damage to lateral PFC results in an increased susceptibility to proactive interference (e.g., Shimamura, Jurica, Mangels, Gershberg, & Knight, 1995; Smith, Leonard, Crane, & Milner, 1995). It has been argued that, during episodic encoding, left mid-VLPFC-mediated selection may allow for relevant semantic associations between word pairs to be favored in the face of interference from previously learned, irrelevant associations (Henson et al., 2002).
Left mid-VLPFC engagement has also been observed during episodic retrieval situations that are well characterized as requiring selection. For example, with an increase in the number of competing associates that interfere with retrieval of a target associate, left lateral PFC, inclusive of left mid-VLPFC, displays a corresponding increase in retrieval-related activation (Sohn, Goode, Stenger, Carter, & Anderson, 2003; Sohn et al., 2005; Danker, Gunn, & Anderson, 2008). Likewise, when a retrieval task involves recollecting a specific detail of an encoding event over other possible event details (e.g., as in source memory tasks), left mid-VLPFC, among other regions, is engaged (e.g., Nolde, Johnson, & D’Esposito, 1998; Dobbins et al., 2002; Cabeza, Locantore, & Anderson, 2003; Dobbins & Wagner, 2005; Lundstrom, Ingvar, & Petersson, 2005). Importantly, left mid-VLPFC is distinguished from other lateral PFC regions engaged during source retrieval in that it supports source recollection in a domain-general manner (Dobbins & Wagner, 2005). These data complement neuropsychological observations that patients with lateral PFC damage are particularly impaired at attributing retrieved information to its relevant source (Janowsky, Shimamura, & Squire, 1989).
In summary, extant data provide strong support for the hypothesis that left mid-VLPFC mediates the resolution of interference by selecting goal-relevant representations in the face of competition from irrelevant representations. While we have focused on the role of selection in working-memory, semantic-retrieval, and episodic-memory paradigms, it is worth noting that mid-VLPFC selection has also been associated with overcoming proactive interference during task switching (Badre & Wagner, 2006). As such, this selection mechanism does not appear to support retrieval, per se (Thompson-Schill et al., 1997); rather, selection likely operates postretrieval such that goal-relevant representations can be favored over goal-irrelevant representations (Badre & Wagner, 2007).
Uncertainty While left mid-VLPFC (∼BA 45) is thought to support selection between activated representations, a central question is whether there are additional PFC mechanisms that support the top-down activation of representations under other situations of uncertainty. Here we define uncertainty as the situation in which goal-relevant representations are not automatically activated because of ineffective triggering cues. Under such situations, strategic activation, or controlled retrieval, of goal-relevant representations is required to recover relevant knowledge (Wagner, Paré-Blagoev, Clark, & Poldrack, 2001; Badre & Wagner, 2002; Badre et al., 2005; Badre & Wagner, 2007). Extant data indicate that anterior VLPFC (area 47/12) mediates controlled retrieval, with the left homologue differentially supporting such retrieval from semantic memory and the right homologue from visual associative memory.
Evidence for the distinction between selection and controlled retrieval comes from an fMRI study that varied demands on each of these putative control processes (Badre et al., 2005). In that study, controlled retrieval demands were manipulated by varying the strength of the semantic association between a cue and target in a task in which subjects were required to identify semantic associates (targets) of particular cues. For example, identifying the semantic relationship between strongly associated nouns such as “candle” and “flame” places low demands on controlled retrieval, relative to weakly associated nouns such as “candle” and “halo.” The difference in controlled retrieval demands is due to the fact that “candle” is more likely to generate bottom-up activation of the associated concept “flame,” thereby facilitating identification of a semantic relationship; “candle,” however, is less likely to elicit bottom-up activation of weakly associated concepts such as “halo,” meaning that identification of a semantic relationship between these stimuli requires top-down semantic search. Within this same decision task, selection demands were independently manipulated by varying the extent to which irrelevant semantic information was likely to interfere (e.g., by including distracters that were either strongly or weakly interfering). Consistent with the selection literature, Badre and colleagues (2005) reported increases in left mid-VLPFC activity as selection demands increased (figure 48.6). In contrast, increases in controlled retrieval demands were associated with increased engagement of left anterior VLPFC and middle temporal cortex, regions that were not modulated by selection (see also Wagner, Paré-Blagoev, et al., 2001). The coactivation of left anterior VLPFC and middle temporal cortex suggests a frontal-temporal interaction in which left anterior VLPFC provides a top-down bias that activates semantic representations stored in temporal cortex.
Functional dissociations between left mid-VLPFC and left anterior VLPFC have also been observed in the context of short-term semantic priming (Gold et al., 2006) and episodic retrieval (Danker et al., 2008). For example, Gold and colleagues (2006) used a lexical decision priming task to identify regions in which neural processing demands were (1) decreased with the presentation of semantically related primes and (2) increased with the presentation of semantically unrelated (interfering) primes. These two situations provide a compelling parallel to the controlled retrieval and selection distinction explored by Badre and colleagues (2005). For example, when a “related” semantic prime is presented (e.g., “spoon” as a prime for the target “fork”), the prime should elicit bottom-up semantic activation that reduces the demand for controlled retrieval once the target appears (i.e., the prime has already activated the relevant semantic information). On the other hand, “unrelated” semantic primes (e.g., “spoon” as a prime for “coat”) elicit activation of irrelevant semantic information that may interfere with access to target-related information, thus requiring subsequent selection of relevant target-related information in the face of irrelevant information. Strikingly, the presentation of “related” primes resulted in reduced engagement of left anterior VLPFC and middle temporal cortex, presumably because of reduced controlled retrieval demands, relative to a neutral prime control condition. In contrast, the increased selection demands associated with “unrelated” primes resulted in increased engagement of left mid-VLPFC, relative to the neutral prime condition. Paralleling these findings, Danker, Gunn, and Anderson (2008) observed that left mid-VLPFC and anterior VLPFC functionally dissociate during episodic retrieval, with the former being sensitive to mnemonic competition (fan size) and associative memory strength and the latter being selectively sensitive to associative memory strength.
Together, these studies of semantic retrieval (Badre et al., 2005; Gold et al., 2006) and episodic retrieval (Danker, Gunn, & Anderson, 2008; see also Dobbins & Wagner, 2005) provide compelling evidence for a dissociation between a selection mechanism supported by left mid-VLPFC and a controlled retrieval mechanism supported by left anterior VLPFC that interacts with middle temporal cortex. The argument that left anterior VLPFC, in particular, supports controlled semantic retrieval is also supported by evidence that neural disruption (by means of transcranial magnetic stimulation) of left anterior VLPFC, but not left posterior VLPFC, interferes with semantic, but not phonological, processing (Gough, Nobre, & Devlin, 2005).
race, kuhl, badre, and wagner: cognitive control and memory 713
Figure 48.6 Left ventrolateral PFC is differentially engaged during controlled retrieval and selection from semantic memory. During a semantic decision task, participants were presented with target words beneath a cue word. On each trial, participants determined which of the target words was semantically related to the cue. Selection demands were manipulated by varying the task requirements for each trial (either a global relatedness judgment or a more specific feature similarity judgment that entailed higher selection demands) and by varying the extent to which irrelevant semantic information was likely to interfere with the decision (i.e., the distracter could be a preexperimental associate of the cue, but not along the relevant dimension). Controlled retrieval demands were manipulated by varying the strength of the association between a cue and the correct target. Greater controlled retrieval is necessary under conditions of weak cue-target associative strength because of diminished bottom-up activation of relevant knowledge. The top panel shows the location of anterior VLPFC and mid-VLPFC regions of interest. The fMRI data from these regions of interest (bottom panel) reveal a crossover interaction wherein anterior VLPFC displays greater activity with high control demands than with high selection demands, and mid-VLPFC displays greater activity with high selection demands than with high control demands. (Adapted from D. Badre & A. D. Wagner, 2007, Left ventrolateral prefrontal cortex and the cognitive control of memory, Neuropsychologia, 45, 2883–2901. Copyright 2007, with permission from Elsevier.)
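The crossover interaction named in the caption of figure 48.6 can be made concrete with a minimal sketch. The activation values below are hypothetical placeholders chosen only to show the pattern; they are not data from Badre and colleagues (2005), and the function names (`demand_effect`, `is_crossover`, `interaction_contrast`) are illustrative assumptions rather than part of any published analysis.

```python
# Hypothetical mean activations (arbitrary units). These numbers are
# invented for illustration and are NOT values reported in the study.
activity = {
    "anterior_VLPFC": {"high_control": 0.8, "high_selection": 0.3},
    "mid_VLPFC": {"high_control": 0.4, "high_selection": 0.9},
}


def demand_effect(region):
    """Activity under high control demands minus high selection demands."""
    cond = activity[region]
    return cond["high_control"] - cond["high_selection"]


def is_crossover(region_a, region_b):
    """True when the two regions show demand effects of opposite sign."""
    return demand_effect(region_a) * demand_effect(region_b) < 0


def interaction_contrast(region_a, region_b):
    """Region-by-demand interaction: difference of the two demand effects."""
    return demand_effect(region_a) - demand_effect(region_b)


if __name__ == "__main__":
    print(is_crossover("anterior_VLPFC", "mid_VLPFC"))
    print(interaction_contrast("anterior_VLPFC", "mid_VLPFC"))
```

A crossover requires the demand effects in the two regions to have opposite signs; a nonzero interaction contrast alone would also arise if both regions moved in the same direction by different amounts.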
It should be noted, however, that controlled retrieval does not render selection unnecessary. That is, the combination of automatic and controlled semantic retrieval may result in the activation of multiple representations, from which a subset must be selected. Indeed, Badre and colleagues (2005) describe conditions in which both selection and controlled retrieval demands were high, and these situations engaged both left anterior VLPFC and left mid-VLPFC (see also Danker et al., 2008). Thus, while distinct VLPFC subregions appear to support dissociable forms of cognitive control, these functionally separable regions may act in concert when automatic retrieval is insufficient to arrive at mnemonic goals (Kostopoulos & Petrides, 2003, 2008). Moreover, given the dorsal-ventral hypothesis of prefrontal contributions to cognitive control, it is worth noting that PFC correlates of controlled retrieval and selection have been concentrated in VLPFC, rather than DLPFC, subregions.

Decreased PFC demands through mnemonic suppression and prediction

Thus far, we have described how the recruitment of PFC control processes facilitates achievement of current mnemonic goals. In this final section, we consider how past experience can favor goal-appropriate representations and reduce future demands on cognitive control. We describe evidence for modulation of control by (1) prior acts of selection that strengthen relevant memories and weaken interfering memories, and (2) experience-dependent plasticity that strengthens memory-based predictions to reduce uncertainty at multiple levels of processing between stimulus input and response output.

Reduced Interference

Although the presence of competition during retrieval may require PFC mechanisms
that implement interference resolution (e.g., Thompson-Schill et al., 1997; Sohn et al., 2003; Dobbins & Wagner, 2005; Sohn et al., 2005), demands on PFC control mechanisms often change with experience. For example, memories that are repeatedly selected during retrieval accrue a competitive advantage over other memories that are selected against. This advantage stems from both the strengthening of selected memories (e.g., Roediger & Karpicke, 2006) and the weakening of interfering, selected-against memories (M. Anderson, 2003). These adaptive changes in memory strength are thought to “benefit” future processing by favoring memories that are likely to be relevant in the future (J. Anderson, 2007) and reducing interference from memories that are likely to remain irrelevant. Indeed, general support for the processing benefits associated with prior acts of selection comes from fMRI observations of reduced lateral PFC engagement across repeated acts of episodic retrieval relative to initial acts (e.g., Henson et al., 2002; Law et al., 2005). Moreover, electrophysiological evidence indicates that the engagement of PFC during initial selective retrieval is predictive of later forgetting (weakening) of interfering memories, suggesting that reductions in interference occur as a result of prior PFC-mediated mnemonic selection (Johansson, Aslan, Bäuml, Gabel, & Mecklinger, 2007).

Building on these observations, a recent fMRI study examined whether the PFC control mechanisms that support initial mnemonic selection also “benefit” (in terms of reduced subsequent processing demands) from the weakening of interfering memories (Kuhl et al., 2007). At a behavioral level, Kuhl and colleagues (2007) observed that repeated selective retrieval of target memories elicits forgetting of interfering memories, replicating prior observations of retrieval-induced forgetting (M. Anderson, Bjork, & Bjork, 1994; Levy & Anderson, 2002). Critically, when this behavioral effect was related to functional activation during the repeated acts of selective retrieval, the data revealed that the extent to which interfering memories were forgotten was tightly correlated with PFC processing benefits that occurred across the repeated acts of selective retrieval. Specifically, ACC and right anterior VLPFC displayed robust decreases in engagement during future target memory remembering to the extent that interfering memories were forgotten (figure 48.7).

While Kuhl and colleagues’ (2007) data reveal the neural processing benefits of mnemonic filtering (for related findings, see M. Anderson et al., 2004; Depue, Curran, & Banich, 2007), it is important to emphasize that these benefits are obtained only when one’s memory goals remain constant. By contrast, when previously interfering and selected-against memories later become goal-relevant, the weakening that these memories suffered results in increased demands on ACC and right anterior VLPFC processes during their subsequent retrieval (Kuhl, Kahn, Dudukovic, & Wagner, 2008). This dynamic interplay between cognitive control and memory highlights how experience-dependent changes in memory strength and mnemonic competition yield cognitive control benefits and costs, as evidenced by decreasing and increasing demands on PFC control mechanisms during future acts of remembering.

Reduced Uncertainty

Experience-dependent learning also reduces demands on PFC-mediated control by decreasing uncertainty associated with previously encountered stimuli. Illustrative of this point is the phenomenon of repetition priming, a form of nondeclarative (or implicit) memory that is expressed behaviorally as faster reaction times, increased response accuracy, or otherwise biased responding when stimuli are repeatedly processed (Tulving & Schacter, 1990; Roediger & McDermott, 1993). For instance, stimulus classification decisions (e.g., “Is a horse animate?”) are speeded with repetition, reflecting the behavioral benefits of previous stimulus processing. At the neural level, cortical regions that are active during initial stimulus processing frequently show reduced responses during subsequent stimulus processing (e.g., Raichle et al., 1994; Gabrieli et al., 1996; Schacter & Buckner, 1998; Wiggs & Martin, 1998; Henson, 2003), a phenomenon that has been referred to as repetition suppression, neural priming, or fMRI adaptation. For example, stimulus repetition in the visual domain is associated with reduced activation in visual cortical areas, as expressed in reduced neural firing rates (Desimone, 1996) and reduced PET/fMRI activation (Wiggs & Martin, 1998; Wagner & Koutstaal, 2002). These neural activation reductions are generally thought to reflect computational savings or more efficient processing in neural networks supporting stimulus perception.

While perceptual priming facilitates processing in sensory cortical regions, other forms of priming are associated with repetition suppression in lateral PFC. In particular, conceptual priming (implicit memory at the level of semantic or conceptual information) is typically associated with activation reductions in left-lateralized frontotemporal regions (figure 48.8A), including left VLPFC and regions within inferior and lateral temporal cortex (Demb et al., 1995; Wagner, Desmond, Demb, Glover, & Gabrieli, 1997; Buckner et al., 1998; Gabrieli et al., 1996). Conceptual priming is dissociable from perceptual priming in that conceptual priming is invariant to changes in perceptual input across repetitions (e.g., priming will occur across stimulus modality changes such as auditory to visual), whereas perceptual priming occurs to the extent that there is perceptual overlap across repetitions (e.g., words appearing in the same font or same modality) (Roediger & McDermott,
Figure 48.7 Demands on PFC control mechanisms that support initial mnemonic selection are reduced with the weakening of interfering memories. During repeated, selective retrieval of goal-relevant memories, fMRI activation reductions in (A) ACC and (B) right anterior VLPFC were correlated with the behavioral evidence that interfering memories were later forgotten, suggesting that processing demands on these regions were reduced to the extent that irrelevant memories were forgotten. (Adapted from B. A. Kuhl, N. M. Dudukovic, I. Kahn, & A. D. Wagner, 2007, Decreased demands on cognitive control reveal the neural processing benefits of forgetting. Nat. Neurosci., 10, 908–914. Copyright 2007, reprinted by permission from Macmillan Publishers, Ltd.)
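The brain-behavior relationship summarized in figure 48.7 is a correlation across participants between two measures: how much interfering memories were forgotten and how much activation declined across repeated retrievals. The sketch below shows one way such a relationship could be quantified, assuming a simple Pearson correlation; the per-participant values are hypothetical, and the source does not specify that this is the exact statistic used by Kuhl and colleagues (2007).

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


# Hypothetical per-participant measures (invented for illustration only):
# forgetting of interfering memories, and the decrease in activation from
# the first to the last selective retrieval of the target memories.
forgetting = [0.02, 0.05, 0.08, 0.11, 0.15]
activation_decrease = [0.10, 0.18, 0.22, 0.35, 0.41]

r = pearson_r(forgetting, activation_decrease)

if __name__ == "__main__":
    print(round(r, 3))
```

A positive `r` in this toy setup corresponds to the pattern described in the caption: larger activation reductions accompany greater forgetting of interfering memories.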
1993; Badgaiyan, Schacter, & Alpert, 2001; Carlesimo et al., 2003).

Although the repetition suppression in left VLPFC and lateral temporal cortex that accompanies conceptual priming is consistent with the hypothesis that these regions interact during controlled retrieval of semantic information (Badre et al., 2005; Gold et al., 2006), at present there is debate regarding the processes underlying repetition suppression in these cortical areas. The dominant, or traditional, view is that representations in cortical regions that store conceptual information are “tuned” with experience, such that previously accessed information is more effectively activated during future processing (Wiggs & Martin, 1998; Grill-Spector, Henson, & Martin, 2006; figure 48.8). Several mechanisms have been proposed to support such cortical “tuning” within a population of neurons, including reductions in overall activation (fatigue model), a reduction in the number of responsive neurons (sharpening model), and faster processing or settling time (facilitation model). Viewed in this light, left VLPFC reductions in conceptual priming tasks may reflect reduced control demands owing to increased availability of item-related knowledge.

By contrast, an alternative, though not mutually exclusive, account of repetition suppression in VLPFC is that prior processing of a stimulus results in “stimulus-response” learning that facilitates subsequent mappings between the stimulus and a decision or response (Dobbins, Schnyer, Verfaellie, & Schacter, 2004; Schacter, Dobbins, & Schnyer, 2004). For example, when repeatedly asked, “Is a horse animate?” subsequent performance can be facilitated by direct retrieval of a learned association between the “stimulus” and the relevant “response” (“yes”). Thus, while the retrieval of response information previously associated with a stimulus does not reflect facilitated conceptual processing (rather, it may enable the bypassing of controlled semantic retrieval), “stimulus-response” learning may nonetheless reduce demands on PFC control mechanisms that support decision or response selection (Schacter et al., 2004; Schacter, Wig, & Stevens, 2007).
Figure 48.8 Repetition priming paradigm, neural priming effects, and hypothesized mechanisms of “cortical tuning.” (A) In semantic priming tasks, subjects initially study a set of stimuli (e.g., pictures or words), making a semantic decision (e.g., size judgment) about those stimuli. Subsequently, during the critical test phase, semantic decisions are made about previously studied (primed) and novel (unprimed) stimuli. Typically, improved behavioral performance measures (e.g., reaction times and accuracy) are observed for primed compared to unprimed stimuli. (B) Functional MRI scanning during the test phase of a semantic classification priming task revealed activation reductions in fusiform (circled) and left ventrolateral PFC (arrow) for primed compared to unprimed stimuli. (Data from W. Koutstaal, A. D. Wagner, M. Rotte, A. Maril, R. L. Buckner, & D. L. Schacter, 2001, Perceptual specificity in visual object priming: Functional magnetic resonance imaging evidence for a laterality difference in fusiform cortex, Neuropsychologia, 39, 184–199. Copyright 2001, with permission from Elsevier.) (C) Proposed experience-dependent changes in a neural network representing visual object features. First presentation of a stimulus activates a network of neurons (circles) coding for relevant and irrelevant features of the stimulus. Repeated presentation “tunes” the stimulus representation, reducing the overall firing rate across this network as well as the associated fMRI signal. Possible mechanisms supporting cortical “tuning” in a population of neurons with repeated stimulus presentation include less overall activation (fatigue model), a reduction in the number of responsive neurons (sharpening model), and faster processing or settling time (facilitation model). (Adapted with permission from K. Grill-Spector, R. Henson, & A. Martin, 2006, Repetition and the brain: Neural models of stimulus-specific effects, Trends Cogn. Sci., 10, 14–23. Copyright 2006, with permission from Elsevier.)
Figure 48.9 Contributions of “response learning” to neural priming during a semantic classification task. Subjects semantically classified visually presented objects (“Bigger than a shoe box?”) that were presented once (unprimed) or three times (primed) and responded with a yes/no response. During a subsequent cue reversal phase the task cue was inverted (“Smaller than a shoe box?”), and half of the items from the previous priming phase were re-presented along with a new set of unprimed items. (A) Functional MRI scanning revealed regions displaying reductions in the neural priming signal (difference in activation between primed and unprimed trials) in the cue reversal relative to the priming phase (left panel arrow points to left posterior VLPFC [BA 9/44]; right panel arrow points to left fusiform [BA 37]). (B) Hemodynamic time courses from the two regions of interest indicated in A. Both posterior VLPFC and fusiform cortex showed significant neural priming during the priming phase when the classification rule was held constant. Inversion of the classification rule in the cue inversion phase reduced neural priming in posterior VLPFC and eliminated priming in fusiform cortex. The disruption of neural priming in the cue reversal phase suggests that subjects could no longer use learned “responses” as a route to action and that neural priming in these regions during the priming phase reflects stimulus-response learning rather than priming of conceptual information. (Adapted with permission from I. G. Dobbins, D. M. Schnyer, M. Verfaellie, & D. L. Schacter, 2004, Cortical activity reductions during repetition priming can result from rapid response learning, Nature, 428, 316–319. Copyright 2004, reprinted by permission from Macmillan Publishers, Ltd.)
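The caption of figure 48.9 defines the neural priming signal as the difference in activation between primed and unprimed trials, and reports that this signal shrinks in the cue reversal phase. A minimal sketch of that arithmetic follows. The activation values are hypothetical, and the sign convention (unprimed minus primed, so that repetition suppression yields a positive number) is an assumption made here for readability, not something the source specifies.

```python
# Hypothetical mean activations for one region (arbitrary units).
# These numbers are invented for illustration; they are NOT data from
# Dobbins, Schnyer, Verfaellie, and Schacter (2004).
activation = {
    "priming_phase": {"unprimed": 1.0, "primed": 0.6},
    "cue_reversal_phase": {"unprimed": 1.0, "primed": 0.9},
}


def neural_priming(phase):
    """Neural priming signal: unprimed minus primed activation.

    Positive values indicate repetition suppression (assumed convention).
    """
    cond = activation[phase]
    return cond["unprimed"] - cond["primed"]


def priming_reduction():
    """How much neural priming shrinks from the priming to the cue reversal phase."""
    return neural_priming("priming_phase") - neural_priming("cue_reversal_phase")


if __name__ == "__main__":
    print(neural_priming("priming_phase"))
    print(neural_priming("cue_reversal_phase"))
    print(priming_reduction())
```

Under a pure conceptual-tuning account, inverting the cue should leave the priming signal largely intact because the same conceptual information is accessed; a positive `priming_reduction()` is the pattern that the response-learning account predicts.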
The role of response learning in conceptual priming tasks has received support from a study by Dobbins, Schnyer, and colleagues (2004). In this study (figure 48.9), stimuli (e.g., “Bulldozer”) were repeatedly semantically classified (e.g., “Larger than a shoebox?”), with the specific classification decision and the corresponding response either being held constant across repetitions or changed across repetitions (e.g., “Smaller than a shoebox?”). While repetition of a stimulus with the identical decision cue was associated with robust repetition suppression in left VLPFC, repetition of a stimulus with the inverted decision cue was associated with diminished repetition suppression in this region. Because the same conceptual information is accessed across the decision cues, the disruption of priming with cue inversion suggests that the left VLPFC repetition suppression effects typically observed in conceptual priming tasks are at least partially attributable to stimulus-response learning rather than priming of conceptual information. While these data provide an important challenge to accounts of left VLPFC priming that focus only on the reduction in cognitive control demands following cortical tuning of semantic representations, one caveat is that the design used by Dobbins and colleagues covaried repetition at the “decision” and “response” levels. That is, switching the decision from “Larger than a shoebox?” to “Smaller than a shoebox?” requires both a decision switch and a response switch (Schacter et al., 2004; Schnyer, Dobbins, Nicholls, Schacter, & Verfaellie, 2006). Indeed, behavioral evidence suggests that priming at the decision level can be dissociated from response repetition (Schnyer et al., 2007).

Together, extant evidence suggests that prior conceptual processing can reduce demands on PFC control mechanisms during future conceptual processing. However, additional work is needed to establish the extent to which these PFC activation reductions reflect priming at different levels of processing (i.e., conceptual, decision, or response). An intriguing hypothesis is that these distinct levels of learning might give rise to dissociable forms of neural priming. For example, priming at the conceptual level may reduce demands on processing in left anterior VLPFC (a region that has repeatedly been implicated in controlled semantic retrieval), whereas learning at the response level may reduce demands on processing in regions more directly related to response selection (e.g., premotor areas) (Race, Shanker, & Wagner, 2008). Of additional interest is whether these distinct forms of priming, from higher-level conceptual priming to lower-level response learning, correspond to a representational hierarchy within PFC (Fuster, 2001; Badre & D’Esposito, 2007; Koechlin & Summerfield, 2007), perhaps organized along an anterior (higher-level) to posterior (lower-level) gradient (Race et al., 2008). Insight into these questions will provide a more complete understanding of the multiple ways in which learning from past experiences can reduce future uncertainty, and thus demands on PFC-mediated control.

Conclusion

In this chapter we reviewed influential theories of cognitive control and considered the specific manner in which VLPFC control mechanisms serve to resolve interference and reduce uncertainty during mnemonic processing. While our focus on VLPFC operations reflects the considerable progress that has been made in understanding VLPFC function (for reviews, see Petrides, 2005; Badre & Wagner, 2007), it should be emphasized that other PFC mechanisms work in conjunction with VLPFC to achieve mnemonic goals (for reviews, see Fletcher & Henson, 2001; Wagner, 2002; Buckner, 2003; Simons & Spiers, 2003). For example, it has been argued that while VLPFC supports “active retrieval” of mnemonic representations, DLPFC subserves the complementary role of monitoring mnemonic representations once activated (Petrides, 1996, 2005). To the extent that DLPFC supports the monitoring of mnemonic information (Henson et al., 2000; Fletcher & Henson, 2001; Dobbins et al., 2002; Rugg et al., 2003; Achim & Lepage, 2005; Dobbins, Simons, et al., 2004), this argument would suggest a hierarchical, but interactive, relationship between VLPFC and DLPFC retrieval operations. Along similar lines, it has been suggested that VLPFC and DLPFC are hierarchically organized during episodic encoding, with VLPFC serving a general role in encoding (e.g., Wagner et al., 1998; Brewer, Zhao, Desmond, Glover, & Gabrieli, 1998), but DLPFC selectively recruited when encoding involves processing the relationship between multiple stimuli (Blumenfeld & Ranganath, 2006; Murray & Ranganath, 2007). Further delineation of the contributions of DLPFC to mnemonic processing, as well as the nature of DLPFC-VLPFC interactions, remains an important avenue for future research. Finally, frontopolar cortex has frequently been implicated in higher-order forms of mnemonic processing (for reviews, see Rugg & Wilding, 2000; Fletcher & Henson, 2001; Buckner, 2003; Ramnani & Owen, 2004), though ambiguity remains concerning the specific nature of frontopolar interactions with “lower” forms of mnemonic control. Further advances in our understanding of the interplay between PFC control and mnemonic processing will require consideration of both the computations supported by specific PFC subregions and the manner in which coordinated processing across these subregions allows for the achievement of mnemonic goals.

Acknowledgments

This work was supported by grants from the National Institute of Mental Health (5R01-MH076932-02; 5R01-MH080309-02), the Alfred P. Sloan Foundation, and the National Alliance for Research on Schizophrenia and Depression.
REFERENCES

Achim, A. M., & Lepage, M. (2005). Dorsolateral prefrontal cortex involvement in memory post-retrieval monitoring revealed in both item and associative recognition tests. NeuroImage, 24, 1113–1121.
Anderson, J. R. (2007). How can the human mind occur in the physical universe? Oxford, UK: Oxford University Press.
Anderson, M. C. (2003). Rethinking interference theory: Executive control and the mechanisms of forgetting. J. Mem. Lang., 49, 415–445.
Anderson, M. C., Bjork, R. A., & Bjork, E. L. (1994). Remembering can cause forgetting: Retrieval dynamics in long-term memory. J. Exp. Psychol. Learn. Mem. Cogn., 20, 1063–1087.
Anderson, M. C., Ochsner, K. N., Kuhl, B., Cooper, J., Robertson, E., Gabrieli, S. W., Glover, G. H., & Gabrieli, J. D. (2004). Neural systems underlying the suppression of unwanted memories. Science, 303, 232–235.
Aron, A. R., Durston, S., Eagle, D. M., Logan, G. D., Stinear, C. M., & Stuphorn, V. (2007). Converging evidence for a frontal-basal-ganglia network for inhibitory control of action and cognition. J. Neurosci., 27, 11860–11864.
Aron, A. R., & Poldrack, R. A. (2006). Cortical and subcortical contributions to Stop signal response inhibition: Role of the subthalamic nucleus. J. Neurosci., 26, 2424–2433.
Badgaiyan, R. D., Schacter, D. L., & Alpert, N. M. (2001). Priming within and across modalities: Exploring the nature of rCBF increases and decreases. NeuroImage, 13, 272–282.
Badre, D. (2008). Cognitive control, hierarchy, and the rostro-caudal organization of the frontal lobes. Trends Cogn. Sci., 12, 193–200.
Badre, D., & D’Esposito, M. (2007). Functional magnetic resonance imaging evidence for a hierarchical organization of the prefrontal cortex. J. Cogn. Neurosci., 19, 2082–2099.
Badre, D., Poldrack, R. A., Paré-Blagoev, E. J., Insler, R. Z., & Wagner, A. D. (2005). Dissociable controlled retrieval and generalized selection mechanisms in ventrolateral prefrontal cortex. Neuron, 47, 907–918.
Badre, D., & Wagner, A. D. (2002). Semantic retrieval, mnemonic control, and prefrontal cortex. Behav. Cogn. Neurosci. Rev., 1, 206–218.
Badre, D., & Wagner, A. D. (2004). Selection, integration, and conflict monitoring: Assessing the nature and generality of prefrontal cognitive control mechanisms. Neuron, 41, 473–487.
Badre, D., & Wagner, A. D. (2005). Frontal lobe mechanisms that resolve proactive interference. Cereb. Cortex, 15, 2003–2012.
Badre, D., & Wagner, A. D. (2006). Computational and neurobiological mechanisms underlying cognitive flexibility. Proc. Natl. Acad. Sci. USA, 103, 7186–7191.
Badre, D., & Wagner, A. D. (2007). Left ventrolateral prefrontal cortex and the cognitive control of memory. Neuropsychologia, 45, 2883–2901.
Blumenfeld, R. S., & Ranganath, C. (2006). Dorsolateral prefrontal cortex promotes long-term memory formation through its role in working memory organization. J. Neurosci., 26, 916–925.
Botvinick, M. M. (2007). Multilevel structure in behaviour and in the brain: A model of Fuster’s hierarchy. Philos. Trans. R. Soc. Lond. B Biol. Sci., 362, 1615–1626.
Botvinick, M. M., Cohen, J. D., & Carter, C. S. (2004). Conflict monitoring and anterior cingulate cortex: An update. Trends Cogn. Sci., 8, 539–546.
Braver, T. S., & Bongiolatti, S. R. (2002). The role of frontopolar cortex in subgoal processing during working memory. NeuroImage, 15, 523–536.
Braver, T. S., Reynolds, J. R., & Donaldson, D. I. (2003). Neural mechanisms of transient and sustained cognitive control during task switching. Neuron, 39, 713–726.
Brewer, J. B., Zhao, Z., Desmond, J. E., Glover, G. H., & Gabrieli, J. D. (1998). Making memories: Brain activity that predicts how well visual experience will be remembered. Science, 281, 1185–1187.
Brown, J. W., & Braver, T. S. (2005). Learned predictions of error likelihood in the anterior cingulate cortex. Science, 307, 1118–1121.
Buckner, R. L. (2003). Functional-anatomic correlates of control processes in memory. J. Neurosci., 23, 3999–4004.
Buckner, R. L., Goodman, J., Burock, M., Rotte, M., Koutstaal, W., Schacter, D., Rosen, B., & Dale, A. M. (1998). Functional-anatomic correlates of object priming in humans revealed by rapid presentation event-related fMRI. Neuron, 20, 285–296.
Bunge, S. A., Burrows, B., & Wagner, A. D. (2004). Prefrontal and hippocampal contributions to visual associative recognition: Interactions between cognitive control and episodic retrieval. Brain Cogn., 56, 141–152.
Bunge, S. A., Ochsner, K. N., Desmond, J. E., Glover, G. H., & Gabrieli, J. D. (2001). Prefrontal regions involved in keeping information in and out of mind. Brain, 124, 2074–2086.
Bunge, S. A., Wendelken, C., Badre, D., & Wagner, A. D. (2004). Analogical reasoning and prefrontal cortex: Evidence for separable retrieval and integration mechanisms. Cereb. Cortex, 15, 239–249.
Bunge, S. A., & Zelazo, P. D. (2006). A brain-based account of the development of rule use in childhood. Curr. Dir. Psychol. Sci., 15, 118–121.
Cabeza, R., Locantore, J. K., & Anderson, N. D. (2003). Lateralization of prefrontal activity during episodic memory retrieval: Evidence for the production-monitoring hypothesis. J. Cogn. Neurosci., 15, 249–259.
Carlesimo, G. A., Turriziani, P., Paulesu, E., Gorini, A., Caltagirone, C., Fazio, F., & Perani, D. (2003). Brain activity during intra- and cross-modal priming: New empirical data and review of the literature. Neuropsychologia, 42, 14–24.
Christoff, K., & Gabrieli, J. D. E. (2000). The frontopolar cortex and human cognition: Evidence for a rostrocaudal hierarchical organization within the human prefrontal cortex. Psychobiology, 28, 168–186.
Christoff, K., Prabhakaran, V., Dorfman, J., Zhao, Z., Kroger, J. K., Holyoak, K. J., & Gabrieli, J. D. (2001). Rostrolateral prefrontal cortex involvement in relational integration during reasoning. NeuroImage, 14, 1136–1149.
Christoff, K., Ream, J. M., Geddes, L. P., & Gabrieli, J. D. (2003). Evaluating self-generated information: Anterior prefrontal contributions to human cognition. Behav. Neurosci., 117, 1161–1168.
Cohen, J. D., Dunbar, K., & McClelland, J. L. (1990). On the control of automatic processes: A parallel distributed processing account of the Stroop effect. Psychol. Rev., 97, 332–361.
Cohen, J. D., & Servan-Schreiber, D. (1992). Context, cortex, and dopamine: A connectionist approach to behavior and biology in schizophrenia. Psychol. Rev., 99, 45–77.
Danker, J. F., Gunn, P., & Anderson, J. R. (2008). A rational account of memory predicts left prefrontal activation during controlled retrieval. Cereb. Cortex, 18, 2674–2685.
Demb, J. B., Desmond, J. E., Wagner, A. D., Vaidya, C. J., Glover, G. H., & Gabrieli, J. D. (1995). Semantic encoding and retrieval in the left inferior prefrontal cortex: A functional MRI study of task difficulty and process specificity. J. Neurosci., 15, 5870–5878.
De Pisapia, N., Slomski, J. A., & Braver, T. S. (2007). Functional specializations in lateral prefrontal cortex associated with the integration and segregation of information in working memory. Cereb. Cortex, 17, 993–1006.
Depue, B. E., Curran, T., & Banich, M. T. (2007). Prefrontal regions orchestrate suppression of emotional memories via a two-phase process. Science, 317, 215–219.
Desimone, R. (1996). Neural mechanisms for visual memory and their role in attention. Proc. Natl. Acad. Sci. USA, 93, 13494–13499.
Desimone, R., & Duncan, J. (1995). Neural mechanisms of selective visual attention. Annu. Rev. Neurosci., 18, 193–222.
D’Esposito, M., Aguirre, G. K., Zarahn, E., Ballard, D., Shin, R. K., & Lease, J. (1998). Functional MRI studies of spatial and nonspatial working memory. Brain Res. Cogn. Brain Res., 7, 1–13.
D’Esposito, M., Postle, B. R., Ballard, D., & Lease, J. (1999). Maintenance versus manipulation of information held in working memory: An event-related fMRI study. Brain Cogn., 41, 66–86.
D’Esposito, M., Postle, B. R., Jonides, J., & Smith, E. E. (1999). The neural substrate and temporal dynamics of interference effects in working memory as revealed by event-related functional MRI. Proc. Natl. Acad. Sci. USA, 96, 7514–7519.
Dobbins, I. G., Foley, H., Schacter, D. L., & Wagner, A. D. (2002). Executive control during episodic retrieval: Multiple prefrontal processes subserve source memory. Neuron, 35, 989–996.
Dobbins, I. G., Schnyer, D. M., Verfaellie, M., & Schacter, D. L. (2004). Cortical activity reductions during repetition priming can result from rapid response learning. Nature, 428, 316–319.
Dobbins, I. G., Simons, J. S., & Schacter, D. L. (2004). fMRI evidence for separable and lateralized prefrontal memory monitoring processes. J. Cogn. Neurosci., 16, 908–920.
Dobbins, I. G., & Wagner, A. D. (2005). Domain-general and domain-sensitive prefrontal mechanisms for recollecting events and detecting novelty. Cereb. Cortex, 15, 1768–1778.
Dolan, R. J., & Fletcher, P. C. (1997). Dissociating prefrontal and hippocampal function in episodic memory encoding. Nature, 388, 582–585.
Garavan, H., Ross, T. J., Li, S. J., & Stein, E. A. (2000). A parametric manipulation of central executive functioning. Cereb. Cortex, 10, 585–592.
Gold, B. T., Balota, D. A., Jones, S. J., Powell, D. K., Smith, C. D., & Andersen, A. H. (2006). Dissociation of automatic and strategic lexical-semantics: Functional magnetic resonance imaging evidence for differing roles of multiple frontotemporal regions. J. Neurosci., 26, 6523–6532.
Gough, P. M., Nobre, A. C., & Devlin, J. T. (2005). Dissociating linguistic processes in the left inferior frontal cortex with transcranial magnetic stimulation. J. Neurosci., 25, 8010–8016.
Green, A. E., Fugelsang, J. A., Kraemer, D. J., Shamosh, N. A., & Dunbar, K. N. (2006). Frontopolar cortex mediates abstract integration in analogy. Brain Res., 1096, 125–137.
Grill-Spector, K., Henson, R., & Martin, A. (2006). Repetition and the brain: Neural models of stimulus-specific effects. Trends Cogn. Sci., 10, 14–23.
Hazy, T. E., Frank, M. J., & O’Reilly, R. C. (2007). Towards an executive without a homunculus: Computational models of the prefrontal cortex/basal ganglia system. Philos. Trans. R. Soc. Lond. B Biol. Sci., 362, 1601–1613.
Henson, R. N. (2003). Neuroimaging studies of priming. Prog. Neurobiol., 70, 53–81.
Henson, R. N. A., Rugg, M. D., Shallice, T., & Dolan, R. J. (2000). Confidence in recognition memory for words: Dissociating right prefrontal roles in episodic retrieval. J. Cogn. Neurosci., 12, 913–923.
Henson, R. N. A., Shallice, T., Josephs, O., & Dolan, R. J. (2002). Functional magnetic resonance imaging of proactive interference during spoken cued recall. NeuroImage, 17, 543–558.
Janowsky, J. S., Shimamura, A. P., & Squire, L. R. (1989). Source memory impairment in patients with frontal lobe lesions. Neuropsychologia, 27, 1043–1056.
Johansson, M., Aslan, A., Bäuml, K. H., Gabel, A., & Mecklinger, A. (2007). When remembering causes forgetting: Electrophysiological correlates of retrieval-induced forgetting. Cereb. Cortex, 17, 1335–1341.
Jonides, J., & Nee, D. E. (2006). Brain mechanisms of proactive interference in working memory. Neuroscience, 139, 181–193.
Jonides, J., Smith, E. E., Marshuetz, C., Koeppe, R. A., & Reuter-Lorenz, P. A. (1998). Inhibition in verbal working memory revealed by brain activation. Proc. Natl. Acad. Sci. USA, 95, 8410–8413.
Kerns, J. G., Cohen, J. D., MacDonald, A. W., 3rd, Cho,
Fellows, L. K., & Farah, M. J. (2005). Is anterior cingulate cortex R. Y., Stenger, V. A., & Carter, C. S. (2004). Anterior cingu-
necessary for cognitive control? Brain, 128, 788–796. late conflict monitoring and adjustments in control. Science, 303,
Feredoes, E., Tononi, G., & Postle, B. R. (2006). Direct 1023–1026.
evidence for a prefrontal contribution to the control of proactive Koechlin, E., Basso, G., Pietrini, P., Panzer, S., & Grafman,
interference in verbal working memory. Proc. Natl. Acad. Sci. USA, J. (1999). The role of the anterior prefrontal cortex in human
103, 19530–19534. cognition. Nature, 399, 148–151.
Fletcher, P. C., & Henson, R. N. (2001). Frontal lobes and human Koechlin, E., & Hyafil, A. (2007). Anterior prefrontal function
memory: Insights from functional neuroimaging. Brain, 124, and the limits of human decision-making. Science, 318,
849–881. 594–598.
Fletcher, P. C., Shallice, T., & Dolan, R. J. (2000). “Sculpting Koechlin, E., & Jubault, T. (2006). Broca’s area and the
the response space”—An account of left prefrontal activation at hierarchical organization of human behavior. Neuron, 50, 963–
encoding. NeuroImage, 12, 404–417. 974.
Fuster, J. M. (2001). The prefrontal cortex—An update: Time is Koechlin, E., Ody, C., & Kouneiher, F. (2003). The architecture
of the essence. Neuron, 30, 319–333. of cognitive control in the human prefrontal cortex. Science, 302,
Gabrieli, J. D. E., Desmond, J. E., Demb, J. B., Wagner, 1181–1185.
A. D., Stone, M. V., Vaidya, C. J., & Glover, G. H. (1996). Koechlin, E., & Summerfield, C. (2007). An information theoreti-
Functional magnetic resonance imaging of semantic memory cal approach to prefrontal executive function. Trends Cogn. Sci.,
processes in the frontal lobes. Psychol. Sci., 7, 278–283. 11, 229–235.
race, kuhl, badre, and wagner: cognitive control and memory 721
Kostopoulos, P., & Petrides, M. (2003). The mid-ventrolateral Petrides, M. (2006). The rostro-caudal axis of cognitive control
prefrontal cortex: Insights into its role in memory retrieval. Eur. processing within lateral frontal cortex. In S. Dehaene,
J. Neurosci., 17, 1489–1497. J.-R. Duhamel, M. D. Hauser, & G. Rizzolatti (Eds.), From monkey
Kostopoulos, P., & Petrides, M. (2008). Left mid-ventrolateral brain to human brain: A Fyssen Foundation Symposium (pp. 293–314).
prefrontal cortex: Underlying principles of function. Eur. Cambridge, MA: MIT Press.
J. Neurosci., 27, 1037–1049. Petrides, M., & Pandya, D. N. (1999). Dorsolateral prefrontal
Kuhl, B. A., Dudukovic, N. M., Kahn, I., & Wagner, A. D. cortex: Comparative cytoarchitectonic analysis in the human
(2007). Decreased demands on cognitive control reveal and the macaque brain and corticocortical connection patterns.
the neural processing benefits of forgetting. Nat. Neurosci., 10, Eur. J. Neurosci., 11, 1011–1036.
908–914. Petrides, M., & Pandya, D. N. (2002). Comparative cytoarchitec-
Kuhl, B. A., Kahn, I., Dudukovic, N. M., & Wagner, A. D. tonic analysis of the human and the macaque ventrolateral pre-
(2008). Overcoming suppression in order to remember: Contri- frontal cortex and corticocortical connection patterns in the
butions from anterior cingulate and ventrolateral prefrontal monkey. Eur. J. Neurosci., 16, 291–310.
cortex. Cogn. Affective Behav. Neurosci., 8, 211–221. Petrides, M., & Pandya, D. N. (2007). Efferent association path-
Law, J. R., Flanery, M. A., Wirth, S., Yanike, M., Smith, ways from the rostral prefrontal cortex in the macaque monkey.
A. C., Frank, L. M., Suzuki, W. A., Brown, E. N., & Stark, J. Neurosci., 27, 11573–11586.
C. E. L. (2005). Functional magnetic resonance imaging activity Postle, B. R., Berger, J. S., & D’Esposito, M. (1999). Functional
during the gradual acquisition and expression of paired-associate neuroanatomical double dissociation of mnemonic and execu-
memory. J. Neurosci., 25, 5720–5729. tive control processes contributing to working memory perfor-
Levy, B. J., & Anderson, M. C. (2002). Inhibitory processes mance. Proc. Natl. Acad. Sci. USA, 96, 12959–12964.
and the control of memory retrieval. Trends Cogn. Sci., 6, 299– Race, E., Shanker, S., & Wagner, A. D. (2008). Neural priming
305. in human frontal cortex: Multiple forms of learning reduce
Lundstrom, B. N., Ingvar, M., & Petersson, K. M. (2005). The demands on the prefontal executive system. J. Cogn. Neurosci.,
role of precuneus and left inferior frontal cortex during source 1–16.
memory episodic retrieval. NeuroImage, 27, 824–834. Raichle, M. A., Feiz, J. A., Videen, T. O., MacLeod, A. M. K.,
MacDonald, A. W., 3rd, Cohen, J. D., Stenger, V. A., & Carter, Pardo, J. V., Fox, P. T., & Petersen, S. E. (1994).
C. S. (2000). Dissociating the role of the dorsolateral prefrontal Practice-related changes in human functional anatomy during
and anterior cingulate cortex in cognitive control. Science, 288, non-motor learning. Cereb. Cortex, 4, 8–26.
1835–1838. Ramnani, N., & Owen, A. M. (2004). Anterior prefrontal cortex:
McNab, F., & Klingberg, T. (2008). Prefrontal cortex and basal Insights into function from anatomy and neuroimaging. Nat. Rev.
ganglia control access to working memory. Nat. Neurosci., 11, Neurosci., 5, 184–194.
103–107. Roediger, H. L., & Karpicke, J. D. (2006). Test-enhanced learn-
Miller, E. K., & Cohen, J. D. (2001). An integrative theory of ing: Taking memory tests improves long-term retention. Psychol.
prefrontal cortex function. Annu. Rev. Neurosci., 24, 167–202. Sci., 17, 249–255.
Murray, L. J., & Ranganath, C. (2007). The dorsolateral Roediger, H. L., III, & McDermott, K. B. (1993). Implicit
prefrontal cortex contributes to successful relational memory memory in normal human subjects. In H. Spinnler & F. Boller
encoding. J. Neurosci., 27, 5515–5522. (Series Eds.) & F. Boller & J. Grafman (Vol. Eds.), Handbook of
Nee, D. E., Jonides, J., & Berman, M. G. (2007). Neural neuropsychology (pp. 63–131). Amsterdam: Elsevier.
mechanisms of proactive interference-resolution. NeuroImage, 38, Rugg, M. D., Henson, R. N., & Robb, W. G. (2003). Neural cor-
740–751. relates of retrieval processing in the prefrontal cortex during
Nolde, S. F., Johnson, M. K., & D’Esposito, M. (1998). Left recognition and exclusion tasks. Neuropsychologia, 41, 40–52.
prefrontal activation during episodic remembering: An event- Rugg, M. D., & Wilding, E. L. (2000). Retrieval processing and
related fMRI study. NeuroReport, 9, 3509–3514. episodic memory. Trends Cogn. Sci., 4, 108–115.
O’Reilly, R. C., & Frank, M. J. (2006). Making working memory Rypma, B., Prabhakaran, V., Desmond, J. E., Glover, G. H., &
work: A computational model of learning in the prefrontal cortex Gabrieli, J. D. (1999). Load-dependent roles of frontal brain
and basal ganglia. Neural Comput., 18, 283–328. regions in the maintenance of working memory. NeuroImage, 9,
Owen, A. M., Evans, A. C., & Petrides, M. (1996). Evidence for 216–226.
a two-stage model of spatial working memory processing within Salmon, E., Van der Linden, M., Collette, F., Delfiore, G.,
the lateral frontal cortex: A positron emission tomography study. Maquet, P., Degueldre, C., Luxen, A., & Franck, G. (1996).
Cereb. Cortex, 6, 31–38. Regional brain activity during working memory tasks. Brain, 119,
Petrides, M. (1994). Frontal lobes and behaviour. Curr. Opin. 1617–1625.
Neurobiol., 4, 207–211. Schacter, D. L., & Buckner, R. L. (1998). Priming and the brain.
Petrides, M. (1996). Specialized systems for the processing of mne- Neuron, 20, 185–195.
monic information within the primate frontal cortex. Philos. Schacter, D. L., Dobbins, I. G., & Schnyer, D. M. (2004). Speci-
Trans. R. Soc. Lond. B Biol. Sci., 351, 1455–1461. ficity of priming: A cognitive neuroscience perspective. Nat. Rev.
Petrides, M. (2000). The role of the mid-dorsolateral prefrontal Neurosci., 5, 853–862.
cortex in working memory. Exp. Brain Res., 133, 44–54. Schacter, D. L., Wig, G. S., & Stevens, W. D. (2007). Reductions
Petrides, M. (2002). The mid-ventrolateral prefrontal cortex in cortical activity during priming. Curr. Opin. Neurobiol., 17,
and active mnemonic retrieval. Neurobiol. Learn. Mem., 78, 171–176.
528–538. Schnyer, D. M., Dobbins, I. G., Nicholls, L., Davis, S.,
Petrides, M. (2005). Lateral prefrontal cortex: Architectonic and Verfaellie, M., & Schacter, D. L. (2007). Item to
functional organization. Philos. Trans. R. Soc. Lond. B Biol. Sci., decision mapping in rapid response learning. Mem. Cogn., 35,
360, 781–795. 1472–1482.
722 memory
Schnyer, D. M., Dobbins, I. G., Nicholls, L., Schacter, in patients with focal frontal lesions: A neuropsychological
D. L., & Verfaellie, M. (2006). Rapid response learning in test of neuroimaging findings. Proc. Natl. Acad. Sci. USA, 95,
amnesia: Delineating associative learning components in repeti- 15855–15860.
tion priming. Neuropsychologia, 44, 140–149. Tulving, E., & Schacter, D. L. (1990). Priming and human
Shimamura, A. P., Jurica, P. J., Mangels, J. A., Gershberg, F. B., memory systems. Science, 247, 301–306.
& Knight, R. T. (1995). Susceptibility to memory interference Wagner, A. D. (2002). Cognitive control and episodic memory:
effects following frontal lobe damage: Findings from tests of Contributions from prefrontal cortex. In L. R. Squire & D. L.
paired-associate learning. J. Cogn. Neurosci., 7, 144–152. Schacter (Eds.), Neuropsychology of memory (3rd ed., pp. 174–192).
Simons, J. S., & Spiers, H. J. (2003). Prefrontal and medial tem- New York: Guilford Press.
poral lobe interactions in long-term memory. Nat. Rev. Neurosci., Wagner, A. D., Desmond, J. E., Demb, J. B., Glover, G. H., &
4, 637–648. Gabrieli, J. D. E. (1997). Semantic repetition priming for verbal
Smith, M. L., Leonard, G., Crane, J., & Milner, B. (1995). The and pictorial knowledge: A functional MRI study of left inferior
effects of frontal- or temporal-lobe lesions on susceptibility to prefrontal cortex. J. Cogn. Neurosci., 9, 714–726.
interference in spatial memory. Neuropsychologia, 33, 275–285. Wagner, A. D., & Koutstaal, W. (2002). Priming. In V. S.
Sohn, M. H., Goode, A., Stenger, V. A., Carter, C. S., Ramachandran (Ed.), Encyclopedia of the human brain (vol. 4, pp.
& Anderson, J. R. (2003). Competition and representation 27–46). San Diego: Academic Press.
during memory retrieval: Roles of the prefrontal cortex and Wagner, A. D., Maril, A., Bjork, R. A., & Schacter, D. L.
the posterior parietal cortex. Proc. Natl. Acad. Sci. USA, 100, (2001). Prefrontal contributions to executive control: fMRI
7412–7417. evidence for functional distinctions within lateral prefrontal
Sohn, M. H., Goode, A., Stenger, V. A., Jung, K. J., Carter, cortex. NeuroImage, 14, 1337–1347.
C. S., & Anderson, J. R. (2005). An information-processing Wagner, A. D., ParÉ-Blagoev, E. J., Clark, J., & Poldrack,
model of three cortical regions: Evidence in episodic memory R. A. (2001). Recovering meaning: Left prefrontal cortex guides
retrieval. NeuroImage, 25, 21–33. controlled semantic retrieval. Neuron, 31, 329–338.
Thompson-Schill, S. L., D’Esposito, M., Aguirre, G. K., & Wagner, A. D., Schacter, D. L., Rotte, M., Koutstaal, W.,
Farah, M. J. (1997). Role of left inferior prefrontal cortex in Maril, A., Dale, A. M., Rosen, B. R., & Buckner, R. L.
retrieval of semantic knowledge: A reevaluation. Proc. Natl. Acad. (1998). Building memories: Remembering and forgetting of
Sci. USA., 94, 14792–14797. verbal experiences as predicted by brain activity. Science, 281,
Thompson-Schill, S. L., D’Esposito, M., & Kan, I. P. (1999). 1188–1191.
Effects of repetition and competition on activity in left prefrontal Walton, M. E., Devlin, J. T., & Rushworth, M. F. (2004). Inter-
cortex during word generation. Neuron, 23, 513–522. actions between decision making and performance monitoring
Thompson-Schill, S. L., Jonides, J., Marshuetz, C., Smith, E. E., within prefrontal cortex. Nat. Neurosci., 7, 1259–1265.
D’Esposito, M., Kan, I. P., Knight, R. T., & Swick, D. (2002). Wiggs, C. L., & Martin, A. (1998). Properties and mechanisms of
Effects of frontal lobe damage on interference effects in working perceptual priming. Curr. Opin. Neurobiol., 8, 227–233.
memory. Cogn. Affective Behav. Neurosci., 2, 109–120. Wood, J. N., & Grafman, J. (2003). Human prefrontal cortex:
Thompson-Schill, S. L., Swick, D., Farah, M. J., D’Esposito, M., Processing and representational perspectives. Nat. Rev. Neurosci.,
Kan, I. P., & Knight, R. T. (1998). Verb generation 4, 139–147.
race, kuhl, badre, and wagner: cognitive control and memory 723
49 Phases of Influence: How Emotion Modulates the Formation and Retrieval of Declarative Memories
Elizabeth A. Kensinger
Department of Psychology, Boston College, Chestnut Hill; Athinoula A. Martinos Center for Biomedical Imaging, Charlestown, Massachusetts
Abstract. We tend to remember emotional experiences long after we have forgotten more mundane ones. The beneficial effects of emotion on memory appear to arise through influences between emotion-specific processes and domain-general sensory and mnemonic processes. These interactions arise at every phase of memory, including encoding, consolidation, and retrieval. As this chapter describes, emotion heightens perception and attention during encoding and enhances the likelihood that information is elaborated and organized. Emotion also modulates postencoding consolidation processes, increasing the likelihood that an emotional event is maintained in a durable memory trace. Emotion continues to wield its influence at retrieval, increasing the likelihood that information is retrieved and also augmenting the subjective vividness associated with the retrieved memory. This chapter discusses the neural processes that underlie these effects of emotion on memory. Particular emphasis is placed on understanding the role of the amygdala in emotional memory and how the amygdala exerts its effects by means of interactions with other sensory and mnemonic regions.
Events often elicit short-lived cognitive, physiological, and somatic reactions, otherwise known as emotions (see Barrett, 2006; Izard, 2007; Frijda & Sundararajan, 2007; Panksepp, 2007; Scherer, 2000, for discussion of the best way to think about the term). Emotional reactions accompany many of life’s experiences, particularly those we care most about remembering. It is, therefore, critical to understand how emotion influences memory processes, as without this knowledge, it would be nearly impossible to discern how memory operates in everyday life. This realization has sparked interest in the study of “emotional memory,” or the examination of how memories for experiences that triggered an emotional response are formed and retrieved.
Extensive research on emotional memory demonstrates that emotional experiences tend to be remembered better than experiences that lack emotional importance, an effect referred to as “emotional memory enhancement” (reviewed by Buchanan & Adolphs, 2004). This mnemonic benefit conveyed by emotion has long been acknowledged (see Colgrove, 1899, for a study examining memory for the assassination of President Abraham Lincoln), but it is only within recent decades that research has begun to elucidate the processes that give rise to it. Though animal research has clarified many of the mechanisms that support emotion’s influence on memory (reviewed by McGaugh, 2004; Phelps & LeDoux, 2005), this chapter focuses exclusively on the effects of emotion on declarative memory in humans, describing how behavioral, neuropsychological, neuropharmacological, and neuroimaging studies have elucidated how emotion influences each phase of memory (see figure 49.1). Because the study of emotional memory in humans is still a relatively young topic of investigation, this chapter concludes with a discussion of directions for future research, including the need to consider individual differences when assessing the effects of emotion on memory.
The influence of emotion during encoding
It is well known that the way in which information is processed initially has downstream consequences on the likelihood that the information is remembered later (Craik & Lockhart, 1972), with information that is detected, attended, and elaborated upon being the most likely to be remembered (Craik, Govoni, Naveh-Benjamin, & Anderson, 1996). Many of emotion’s effects on memory appear to arise through broader influences on the way in which emotional information is detected and attended at the outset. Emotional stimuli are noticed more quickly and more often than nonemotional ones (Anderson, 2005; Fox, Russo, Bowles, & Dutton, 2001; Leclerc & Kensinger, 2008; Öhman, Flykt, & Esteves, 2001; Phelps, Ling, & Carrasco, 2006; Williams, Mathews, & MacLeod, 1996), and the processing of emotional information is prioritized so that it can occur even when attentional resources are taxed (reviewed by Dolan & Vuilleumier, 2003; Pessoa, 2005; Vuilleumier & Driver, 2007). Once an emotional item is detected, attention
also is more likely to be focused and sustained on it (e.g., Armony & Dolan, 2002; Mogg, Bradley, de Bono, & Painter, 1997), and individuals are more likely to elaborate on the emotional information, connecting it with existing semantic or autobiographical information (e.g., Buchanan, Etzel, Adolphs, & Tranel, 2006; Talmi & Moscovitch, 2004; Talmi, Schimmack, Paterson, & Moscovitch, 2007). Each of these factors can increase the likelihood that emotional information is encoded into a stable memory trace. In fact, Talmi, Luk, McGarry, & Moscovitch (2007) have proposed that direct modulation of memory may not be required for short-term enhancements in the retention of emotional information. Rather, emotion’s modulation of domain-general processes (enhanced attention, distinctive encoding, and information elaboration and organization) may be sufficient to mediate emotion’s benefit on retention of information over relatively short delays. As will be described subsequently, neuroimaging may provide one means to clarify the extent to which emotion’s influence on memory is mediated through influences on information processing rather than dependent on direct modulation of memory binding and consolidation processes (see also Talmi, Anderson, Riggs, Caplan, & Moscovitch, 2008).
The Effect of Emotion on Information Detection and Attention Allocation
Many of emotion’s influences on detection and attention appear to arise through interactions between the amygdala and other sensory regions. It is proposed that once the amygdala is activated by emotional stimuli, it can modulate the functioning of sensory cortices to ensure that emotional information is attended (LeDoux, 1995). This hypothesis is anatomically plausible, because the amygdala has strong reciprocal connections with most sensory regions (Amaral, Price, Pitkänen, & Carmichael, 1992; Amaral, 2003). The hypothesis also is supported by neuroimaging studies that reveal strong correlations between the amount of activity in the amygdala and in visual processing regions including the fusiform gyrus (e.g., Noesselt, Driver, Heinze, & Dolan, 2005; Vuilleumier, Richardson, Armony, Driver, & Dolan, 2004) and occipital lobe (Tabert et al., 2001; figure 49.2A) during the processing of emotional information. Although these correlations cannot establish the directionality of the modulation, they are consistent with the proposal that the amygdala can modulate sensory functioning.
Stronger evidence for an amygdala-mediated influence on sensory activity came from a study in which Vuilleumier and colleagues (2004) asked individuals with varying amounts of amygdala damage to view fearful and neutral faces while in an fMRI scanner. Only patients with a functioning amygdala showed fusiform modulation in response to the facial expression, with greater fusiform activity to fearful than to neutral faces. In fact, there was a strong correlation between the amount of intact amygdala and the amount of fusiform modulation in response to the fearful faces, consistent with the proposal that the amygdala has a modulatory effect on visual processing regions, increasing the likelihood that emotional information is detected and processed.
Interactions between the amygdala and sensory regions also seem to enhance memory for the visual details of emotional stimuli (Mickley & Kensinger, 2008; Kensinger, Garoff-Eaton, & Schacter, 2007b). Participants are more likely to remember the precise visual attributes of a negative item as compared to a neutral one; for example, they recognize exactly which grenade they have seen more often than they recognize which blender they have seen (Kensinger, Garoff-Eaton, & Schacter, 2006). This effect appears to arise from interactions between the amygdala and the fusiform gyrus during encoding. As compared to the processing of neutral items, during the processing of negative items that will later be remembered with precise visual detail, there is increased activity in the amygdala and the right fusiform gyrus. There also is a strong correlation between the amount of activity in these two regions during the processing of negative items, whereas no such correlation exists during the
[Figure 49.2 plots. Panel A axis: relative pixel intensity in occipital cortex. Panel B axis: signal change in R fusiform.]
Figure 49.2 During the processing of negatively emotional information, there often are robust correlations between amygdala activity and activity in sensory processing regions (panel A, adapted from Tabert et al., 2001; images depict coordinates reported in that paper). These correlations are particularly strong during the encoding of negative items that will later be remembered with specific visual details (panel B, data from Kensinger, Garoff-Eaton, & Schacter, 2007b).
processing of neutral items (Kensinger et al., 2007b; figure 49.2B). The right fusiform gyrus is a region that is associated with the processing of visually specific details (e.g., Koutstaal et al., 2001) and with subsequent memory for the visual details of neutral items (Garoff, Slotnick, & Schacter, 2005). Therefore, it makes sense that enhanced activity within this region could increase the likelihood that the visual details of a negatively emotional item are remembered.
Modulation of sensory processes does not appear to be the only avenue by which emotion enhances memory for visual detail. The ability to remember the visual details of emotional items also may be tied to the way in which attention is allocated during encoding. Emotion does not seem to uniformly enhance memory for all aspects of an experience. Rather, some event details are remembered well and others are readily forgotten (reviewed by Buchanan & Adolphs, 2002; Kensinger, 2007; Mather, 2007; Reisberg & Heuer, 2004). For example, when presented with complex visual scenes, it often is the case that the visual details of the emotional aspects are remembered well but the visual details of the nonemotional aspects are remembered poorly (e.g., Kensinger, Garoff-Eaton, & Schacter, 2007a; Payne, Stickgold, Swanberg, & Kensinger, 2008; figure 49.3). An fMRI study revealed that activity in an affective-attentional network, including the right orbitofrontal cortex, the anterior cingulate gyrus, and the caudate nucleus, corresponds with the ability to remember the visual details of an emotional item but also with the inability to remember other aspects associated with the item’s presentation, such as what decision a person made about an item (Kensinger et al., 2007b; figure 49.4). This finding suggests that emotional items may be remembered in a detailed fashion because attention is focused on the intrinsic details of those items; however, by focusing on those emotional elements, other event details may be missed or easily forgotten.
It is interesting to note that this attentional focusing on emotional items does not arise through engagement of the same frontoparietal attention circuits that guide attention toward nonemotional, task-relevant information (reviewed by Corbetta & Shulman, 2002). Rather, when attention is focused on emotional information, it appears to be through engagement of emotion-specific processes that are brought online when a task requires engagement of motivational processes and of attention to affective stimuli (e.g., Robbins & Everitt, 1996; Schultz, 2000). This dissociation suggests that emotional information may be attended as a result of the engagement of emotion-specific processes rather than the domain-general ones that guide attention toward any task-relevant piece of information (and see Vuilleumier & Driver, 2007, for further discussion). Thus, when an individual is affectively focused on an item, attention appears to be drawn to the intrinsic attributes of that emotional item. This selective attention seems to have downstream mnemonic consequences, leading those intrinsic item details to be remembered better than elements only extrinsically linked to the emotional item (see Kensinger, 2007; Mather, 2007, for further discussion). These findings highlight that even when emotion’s effects on memory seem to be mediated by influences on domain-general processes (such as attention allocation), this mediation may actually reflect emotion-specific modulation of sensory and attentional processes.
Figure 49.3 After experiencing an emotional event, such as a car accident (A), participants may retain good memory for the details of the accident itself, but poor memory for the contextual details, such as what the street looked like (B). Modulation of attentional focusing at encoding, as well as of consolidation processes, appears to contribute to this effect.
Effects on Elaboration and Organization of Input
In addition to these effects of emotion on information detection and attention allocation, emotion also appears to influence the likelihood that information is elaborated and organized. It is well known that events that elicit negative emotions are elaborated and rehearsed more often than events that elicit no emotion (reviewed by Ochsner & Schacter, 2003). Emotional items also benefit from organizational clustering to a greater degree than do nonemotional items (Buchanan, Etzel, et al., 2006; Talmi & Moscovitch, 2004; Talmi, Schimmack, et al., 2007). Because information that is elaborated and well organized is more likely to be remembered (Craik & Lockhart, 1972), it makes sense that if emotion provides an organizing structure or a basis for elaboration, these features would convey benefits to memory. Indeed, a number of behavioral studies have demonstrated an important role for elaborative and organizational processes in enhancing emotional memory (Phelps et al., 1998; Talmi & Moscovitch, 2004), and neuroimaging studies have confirmed that regions implicated in elaborative processing, including the lateral prefrontal cortex, often are disproportionately recruited during the successful encoding of emotional information (e.g., Dolcos, LaBar, & Cabeza, 2004; Kensinger & Corkin, 2004; Maratos, Allan, & Rugg, 2000). These elaborative processes appear to be particularly essential for boosting the encoding of emotional information that is not highly arousing (e.g., Buchanan, Etzel, et al., 2006; Bush & Geer, 2001; Kensinger & Corkin, 2003; Talmi, Schimmack, et al., 2007), perhaps because these items do not benefit from the same amygdala-mediated enhancements in detection and attention (reviewed by Kensinger, 2004). Thus, when elaborative processes cannot be engaged easily (for example, when attention is divided), the mnemonic enhancement for nonarousing emotional items disappears (Bush & Geer, 2001; Kensinger & Corkin, 2004; Kern, Libkuman, Otoni, & Holmes, 2005).
Conclusions About Encoding
Nearly all studies examining memory for emotional information have revealed a strong correlation between how active the amygdala is during the encoding of emotional information and how well that emotional information is remembered (reviewed by Hamann, 2001; LaBar & Cabeza, 2006). These correlations exist both across participants (e.g., Cahill et al., 1996) and within a single participant (e.g., Canli, Zhao, Brewer, Gabrieli, & Cahill, 2000). They arise in tasks using verbal stimuli (e.g., Erk et al., 2003; Kensinger & Corkin, 2004), slide shows (e.g., Cahill et al., 1996), facial expressions (e.g., Sergerie, Lepage, & Armony, 2006), and colored photographs (e.g., Dolcos et al., 2004; Sharot, Delgado, & Phelps, 2004), and they hold across a range of encoding tasks. Though many of these studies have focused on the amygdala’s modulation of hippocampal binding and consolidation processes (an issue we will return to in the next section), the studies reviewed have revealed that emotion can exert many of its influences on memory by means of alterations in earlier stages of information processing. In particular, engagement of emotion-specific processes, implemented by the amygdala and orbitofrontal cortex, can influence memory by modulating sensory and conceptual processes. These interactions can ensure that emotional information is detected, attended, and elaborated (see also Duncan & Barrett, 2007; Talmi et al., 2008). Thus, regardless of the particular stimuli, encoding instructions, or task design, the engagement of emotion-specific processes during
information is more likely to be detected, attended, and
elaborated, then it would make sense that the information
would be remembered well after both short and long
delays. Interestingly, however, the effects of emotion often
become exaggerated after a long delay (e.g., Kleinsmith
& Kaplan, 1963; Walker & Tarte, 1963; Sharot, Verfaellie,
& Yonelinas, 2007; Sharot & Yonelinas, 2008), and damage
to the amygdala tends to disproportionately influence the
retention of emotional information over long delays
while having a lesser influence on the ability to retain
emotional information for only a short duration of time
(Phelps, LaBar, & Spencer, 1997; Phelps et al., 1998; LaBar
& Phelps, 1998). These findings cannot easily be explained
by effects of emotion on encoding processes. Rather, these
results point to the ability of emotion to influence the
likelihood that memories are solidified into stable long-
term traces. If emotion—through the actions of the amyg-
dala—serves to increase the probability that a memory
is consolidated, then it should follow that the benefit for
emotional compared to nonemotional memories increases
as the retention interval lengthens and that amygdala damage
disrupts this effect. Indeed, as will be described later, there
is abundant evidence to suggest that the amygdala modu-
lates consolidation processes, enhancing the likelihood
that an emotional memory can be remembered after a long
delay.
Figure 49.5 When participants study visual scenes and are tested on their memory for
those scenes after either a 12-hour delay including a night of sleep or a 12-hour
period of time spent awake, memory for the negative objects within scenes is
selectively enhanced across a sleeping as compared to a waking delay. Memory for the
backgrounds of those same scenes is unaffected by whether the delay included time
spent awake or time spent asleep. (Data from Payne, Stickgold, Swanberg, & Kensinger,
2008.)

appreciation for the fact that not all aspects of an emotional experience are equally
likely to be remembered (e.g., Reisberg & Heuer, 2004; Kensinger, 2007; Mather, 2007).
Although many of these selective effects likely arise from attentional focusing at
encoding (as discussed in the previous section), some of the effects may also arise
through focal influences on consolidation. As noted earlier, sleep does not appear to
benefit consolidation of all event attributes to an equal degree; rather, the benefits
seem to be particularly pronounced for those details that are intrinsic to the
emotional items. Further research is needed to reveal the extent to which encoding
processes versus postencoding consolidation mechanisms lead to the focal enhancements
in emotional memory, leading only some attributes of an emotional event to be
remembered well.

The influence of emotion during retrieval

In comparison to the extensive number of studies that have examined the effects of
emotion on encoding and consolidation processes, relatively few studies have
investigated the influence of emotion during memory retrieval. However, the extant
data indicate that emotion can modulate retrieval processes. The role of emotion in
memory retrieval recently has been reviewed thoroughly (by Buchanan, 2007), and so
here I will focus on the specific question of the amygdala’s role during retrieval.

It is well established that the amygdala is active during the retrieval of emotional
memories. Neuroimaging studies have revealed that amygdala engagement occurs both when
the retrieval cue itself is emotional (e.g., Dolan, Lane, Chua, & Fletcher, 2000;
Kensinger & Schacter, 2005b) and when the cue is neutral but the associated study
context is emotional (e.g., Maratos, Dolan, Morris, Henson, & Rugg, 2001; Smith,
Henson, Dolan, & Rugg, 2004; A. Smith, Henson, Rugg, & Dolan, 2005; Somerville, Wig,
Whalen, & Kelley, 2006; Sterpenich et al., 2006). In one study, participants were
asked to view objects that were presented against either neutral or emotional
backgrounds. During recognition, they were shown the objects in isolation, and they
had to indicate whether each object had been studied previously. The critical finding
was that amygdala activity was greater during retrieval of items that had been studied
with an emotional context than during retrieval of items that had been studied with a
nonemotional context (A. Smith et al., 2004). The fact that amygdala activity was
influenced by the study context, even when the retrieval cue itself was neutral,
suggests that amygdala engagement during retrieval may not merely represent an
emotional response to a retrieval cue. Rather, amygdala activity may be directly tied
to the recovery of emotionally relevant information present during the encoding
episode.

Though neuroimaging studies indicate that the amygdala is involved in the retrieval of
emotional memories, they cannot speak to the necessity of the region. Indeed, there
have been extensive discussions about whether the amygdala is essential for the
retrieval of emotional memories (see Nader, 2003; LeDoux, 2000, for discussion). At
least with regard to the retrieval of emotional autobiographical memories, recent
patient studies have provided evidence that this region does play an essential role.
Patients with damage to the amygdala have difficulty retrieving emotional memories,
even of events that were experienced prior to the onset of their amygdala damage
(Buchanan, Tranel, & Adolphs, 2005, 2006). Even when they do recall emotional
experiences, patients with amygdala lesions rate them as being less emotional, as well
as less vivid, than do control participants, suggesting that without the amygdala,
emotional memories cannot be remembered as often or with the same qualitative richness
as with an intact amygdala.

These studies cannot clarify the specific role played by the amygdala during the
retrieval of emotional memories. In particular, it is unclear whether the amygdala’s
retrieval-related activity leads to or is caused by successful retrieval. It is widely
accepted that memory retrieval consists of at
Conclusions About Retrieval  Though it is clear that amygdala engagement enhances
encoding processes and facilitates consolidation, it is more widely debated whether
the amygdala confers a benefit upon emotional memory retrieval. There is some evidence
to suggest that limbic engagement primarily inflates a person’s confidence in a memory
(e.g., Sharot et al., 2004); but there is other evidence that limbic engagement at
retrieval may be tied to remembering event details (e.g., Kensinger & Schacter, 2005b,
2008; A. Smith et al., 2006). It seems likely that, just as with its modulation of
encoding and consolidation processes, the amygdala’s influence during retrieval may
critically depend on the types of details that a person is trying to recover. Perhaps
amygdala engagement during retrieval facilitates the recovery of details intrinsically
linked to an experience (e.g., the details of the emotional aspect of the event) but
does not help with the recovery of details more peripheral to the elicited emotion
(e.g., the nonemotional context in which the event occurred). Indeed, a study by
Sharot, Martorella, Delgado, and Phelps (2007) revealed that enhanced amygdala
activity during retrieval was associated with a reduction of activity in regions
associated with retrieval of broader spatiotemporal context. It may be that when the
amygdala is engaged, details intrinsic to the emotional aspects of the event are
remembered, whereas retrieval of more peripheral, contextual details is impeded.
Future research will do well to examine the validity of this hypothesis.

Concluding remarks and future directions

Emotion appears to influence the processes engaged during every phase of memory, but
there are still many unanswered questions regarding how emotion exerts its influence.
First, as alluded to in the preceding sections, we do not yet have a firm
understanding of when emotion enhances, hinders, or exerts no influence on the
likelihood of remembering information. It is well known that emotion does not lead to
a picture-perfect memory (reviewed by Mather, 2007; Reisberg & Heuer, 2004).
Nevertheless, emotional information, and particularly negative information, can be
remembered with greater accuracy than nonemotional information (reviewed by Kensinger,
2007). Additional research is needed to understand which types of details are
remembered well for emotional experiences and at which memory phases emotion conveys
its mnemonic advantage. Future research will do well to investigate these issues not
only through presentation of controlled stimuli within a laboratory setting, but also
through assessment of participants’ memories for emotional, autobiographical
experiences.

Second, though researchers often assume that memory processes are consistent from one
individual to the next, when it comes to emotion-memory interactions, there appear to
be important individual differences. The sex of an individual can influence the neural
processes that correspond with emotional memory enhancement, with men often showing
more right-lateralized amygdala activity and women showing more left-lateralized
amygdala activity (reviewed by Cahill, 2003; Hamann, 2005). Sex also can influence the
magnitude of memory enhancement or memory trade-off elicited by emotion (discussed in
Hamann, 2005). Personality characteristics, such as how neurotic someone is, also seem
to influence the amount of amygdala activity elicited by stimuli (Hamann & Canli,
2004) and the likelihood that emotional information is detected (discussed by Duncan &
Barrett, 2007), perhaps having downstream effects on the magnitude of emotional memory
enhancement demonstrated. A person’s level of anxiety or cognitive abilities also can
influence emotional memory enhancement and the extent of mnemonic trade-off elicited
when an emotional item is embedded in a nonemotional context (Waring, Payne, Schacter,
& Kensinger, in press). A person’s age also has fundamental influences on how
emotional information is processed and remembered (reviewed by Kensinger & Leclerc, in
press; Mather, 2006). These studies emphasize that research must examine not only how
emotion impacts memory across all individuals, but also how individual differences
influence the nature of emotion-memory interactions.

Third, as the resolution of MRI scans increases, it will be important for future
research to move beyond thinking about the amygdala and the hippocampal memory system
as single entities and to more thoroughly investigate how reciprocal influences are
likely to depend on the particular subdivisions of each of these regions. Animal
research has suggested that not all regions of the amygdala play the same modulatory
role and that amygdalar interactions may not be equivalently strong with all medial
temporal lobe structures (Davachi, 2006; McDonald, 2003). A finer appreciation of
these anatomical distinctions within the human brain may go a long way toward
revealing how emotion exerts its complex influences on memory formation,
consolidation, and retrieval.

Acknowledgments  I thank Keely Muscatell, Jessica Payne, and Daniel Schacter for
helpful discussion and for assistance in the preparation of this chapter. I gratefully
acknowledge funding from the National Science Foundation (grant BCS-0542694) and the
National Institute of Mental Health (grant MH080833).

REFERENCES

Amaral, D. G. (2003). The amygdala, social behavior, and danger detection. Ann. NY Acad. Sci., 1000, 337–347.
Amaral, D., Price, J., Pitkanen, A., & Carmichael, S. (1992). The amygdala: Neurobiological aspects of emotion, memory,
Hamann, S., & Canli, T. (2004). Individual differences in emotion processing. Curr. Opin. Neurobiol., 14, 233–238.
Hu, P., Stylos-Allan, M., & Walker, M. P. (2006). Sleep facilitates consolidation of emotionally arousing declarative memory. Psychol. Sci., 10, 891–898.
Izard, C. E. (2007). Basic emotions, natural kinds, emotion schemas, and a new paradigm. Perspect. Psychol. Sci., 2, 260–280.
Kahn, I., Davachi, L., & Wagner, A. D. (2004). Functional-neuroanatomic correlates of recollection: Implications for models of recognition memory. J. Neurosci., 24, 4172–4180.
Kensinger, E. A. (2004). Remembering emotional experiences: The contribution of valence and arousal. Rev. Neurosci., 15, 241–251.
Kensinger, E. A. (2007). How negative emotion affects memory accuracy: Behavioral and neuroimaging evidence. Curr. Dir. Psychol. Sci., 16, 213–218.
Kensinger, E. A., & Corkin, S. (2003). Memory enhancement for emotional words: Are emotional words more vividly remembered than neutral words? Mem. Cogn., 31, 1169–1180.
Kensinger, E. A., & Corkin, S. (2004). Two routes to emotional memory: Distinct neural processes for valence and arousal. Proc. Natl. Acad. Sci. USA, 101, 3310–3315.
Kensinger, E. A., Garoff-Eaton, R. J., & Schacter, D. L. (2006). Memory for specific visual details can be enhanced by negative arousing content. J. Mem. Lang., 54, 99–112.
Kensinger, E. A., Garoff-Eaton, R. J., & Schacter, D. L. (2007a). Effects of emotion on memory specificity: Memory trade-offs elicited by negative visually arousing stimuli. J. Mem. Lang., 56, 575–591.
Kensinger, E. A., Garoff-Eaton, R. J., & Schacter, D. L. (2007b). How negative emotion enhances the visual specificity of a memory. J. Cogn. Neurosci., 19, 1872–1887.
Kensinger, E. A., & Leclerc, C. M. (in press). Age-related changes in the neural mechanisms supporting emotion processing and emotional memory. Eur. J. Cogn. Psychol.
Kensinger, E. A., & Schacter, D. L. (2005a). Emotional content and reality-monitoring ability: FMRI evidence for the influence of encoding processes. Neuropsychologia, 43, 1429–1443.
Kensinger, E. A., & Schacter, D. L. (2005b). Retrieving accurate and distorted memories: Neuroimaging evidence for effects of emotion. NeuroImage, 27, 167–177.
Kensinger, E. A., & Schacter, D. L. (2007). Remembering the specific visual details of presented objects: Neuroimaging evidence for effects of emotion. Neuropsychologia, 45, 2951–2962.
Kensinger, E. A., & Schacter, D. L. (2008). Neural processes supporting young and older adults’ emotional memories. J. Cogn. Neurosci., 7, 1–13.
Kern, R. P., Libkuman, T. M., Otoni, H., & Holmes, K. (2005). Emotional stimuli, divided attention, and memory. Emotion, 5, 408–417.
Kilpatrick, L., & Cahill, L. (2003). Amygdala modulation of parahippocampal and frontal regions during emotionally influenced memory storage. NeuroImage, 20, 2091–2099.
Kleinsmith, L. J., & Kaplan, S. (1963). Paired-associate learning as a function of arousal and interpolated interval. J. Exp. Psychol., 65, 190–193.
Koutstaal, W., Wagner, A. D., Rotte, M., Maril, A., Buckner, R. L., & Schacter, D. L. (2001). Perceptual specificity in visual object priming: Functional magnetic resonance imaging evidence for a laterality difference in fusiform cortex. Neuropsychologia, 39, 184–199.
LaBar, K. S., & Cabeza, R. (2006). Cognitive neuroscience of emotional memory. Nat. Neurosci. Rev., 7, 54–56.
LaBar, K. S., & Phelps, E. A. (1998). Arousal-mediated memory consolidation: Role of the medial temporal lobe in humans. Psychol. Sci., 9, 490–493.
Lang, P. J., Bradley, M. M., & Cuthbert, B. N. (1999). International Affective Picture System (IAPS): Technical manual and affective ratings. Gainesville, FL: Center for Research in Psychophysiology.
Leclerc, C. M., & Kensinger, E. A. (2008). Age-related differences in medial prefrontal activation in response to emotional images. Cogn. Affective Behav. Neurosci., 8, 153–164.
LeDoux, J. E. (1995). Emotion: Clues from the brain. Annu. Rev. Psychol., 46, 209–235.
LeDoux, J. E. (2000). Emotion circuits in the brain. Annu. Rev. Neurosci., 23, 155–184.
Maratos, E. J., Allan, K., & Rugg, M. D. (2000). Recognition memory for emotionally negative and neutral words: An ERP study. Neuropsychologia, 38, 1452–1465.
Maratos, E. J., Dolan, R. J., Morris, J. S., Henson, R. N., & Rugg, M. D. (2001). Neural activity associated with episodic memory for emotional context. Neuropsychologia, 39, 910–920.
Marshall, L., & Born, J. (2007). The contribution of sleep to hippocampus-dependent memory consolidation. Trends Cogn. Sci., 11, 442–450.
Mather, M. (2006). Why memories may become more positive with age. In B. Uttl, N. Ohta, & A. L. Siegenthaler (Eds.), Memory and emotion: Interdisciplinary perspectives (pp. 135–159). Malden, MA: Blackwell.
Mather, M. (2007). Emotional arousal and memory binding: An object-based framework. Perspect. Psychol. Sci., 2, 33–52.
McDonald, A. J. (2003). Is there an amygdala and how far does it extend? Ann. NY Acad. Sci., 985, 1–21.
McGaugh, J. L. (2004). The amygdala modulates the consolidation of memories of emotionally arousing experiences. Annu. Rev. Neurosci., 27, 1–28.
Mickley, K. R., & Kensinger, E. A. (2008). Emotional valence influences the neural correlates associated with remembering and knowing. Cogn. Affective Behav. Neurosci., 8, 143–152.
Mogg, K., Bradley, B. P., de Bono, J., & Painter, M. (1997). Time course of attentional bias for threat information in nonclinical anxiety. Behav. Res. Ther., 35, 297–303.
Nader, K. (2003). Memory traces unbound. Trends Neurosci., 26, 65–72.
Noesselt, T., Driver, J., Heinze, H. J., & Dolan, R. (2005). Asymmetrical activation in the human brain during processing of fearful faces. Curr. Biol., 15, 424–429.
Nyberg, L., Habib, R., McIntosh, A. R., & Tulving, E. (2000). Reactivation of encoding-related brain activity during memory retrieval. Proc. Natl. Acad. Sci. USA, 97, 11120–11124.
Ochsner, K. N., & Schacter, D. L. (2003). Remembering emotional events: A social cognitive neuroscience approach. In R. J. Davidson, H. Goldsmith, and K. R. Scherer (Eds.), Handbook of the affective sciences (pp. 643–660). New York: Oxford University Press.
Ohman, A., Flykt, A., & Esteves, F. (2001). Emotion drives attention: Detecting the snake in the grass. J. Exp. Psychol. Gen., 130, 466–478.
Panksepp, J. (2007). Neurologizing the psychology of affects: How appraisal-based constructivism and basic emotion theory can coexist. Perspect. Psychol. Sci., 2, 281–296.
Walker, E. L., & Tarte, R. D. (1963). Memory storage as a function of arousal and time with homogeneous and heterogeneous lists. J. Verb. Learn. Verb. Behav., 2, 113–119.
Waring, J., Payne, J. D., Schacter, D. L., & Kensinger, E. A. (in press). Impact of individual differences upon emotion-induced memory trade-offs. Cogn. Emotion.
Wheeler, M. E., Petersen, S. E., & Buckner, R. L. (2000). Memory’s echo: Vivid remembering reactivates sensory-specific cortex. Proc. Natl. Acad. Sci. USA, 97, 11125–11129.
Williams, J. M. G., Mathews, A., & MacLeod, C. (1996). The emotional Stroop task and psychopathology. Psychol. Bull., 120, 3–24.
Michael B. Miller  Department of Psychology, University of California, Santa Barbara,
California

Abstract  There is a wealth of information at the individual level regarding the
neural basis of episodic memory that may be lost by relying on group averages. The
topography of brain activity underlying an episodic memory task is enormously variable
from individual to individual. Despite this variability, individual patterns of brain
activity are relatively stable over time. This stability suggests that there are
systematic factors, either cognitive or physiological, between individuals that can
account for the variability. We have found that individual differences in memory
strategy, as well as other factors, can account for a significant portion of the
variance between individuals in their patterns of brain activity. These findings
demonstrate that, while performance of a typical episodic memory task engages
widespread specialized brain regions throughout most of the cortex, different
strategies may differentially engage these various brain regions.

It can be just as important to study the things that make us different from each other
as it is to study the things that we have in common. This fact has been appreciated
since at least 1911, when the eminent learning theorist E. L. Thorndike wrote, “If we
could thus adequately describe each of a million human beings, . . . the million men
would be found to differ widely. . . . We may study the features of intellect and
character which are common to all men; or we may study the differences in intellect
and character which distinguish individual men.” One of the important things that make
us different is the unique ways in which we remember past events. Further, these
uniquely individual approaches to memory likely result in extensive variability in the
engagement of specialized, universal brain regions. This is particularly evident in
the variable pattern of brain activity observed in fMRI studies across individuals
performing an episodic memory task. Patterns of individual brain activity are as
unique and persistent as fingerprints. Yet, unlike fingerprints, these unique and
persistent patterns of brain activity may also be quite informative about individuals
and how they go about remembering past events. And, while variability is often treated
as a nuisance, controlled by averaging across individuals, relying on a group map can
also be a lost opportunity to realize the full extent of the brain’s involvement in
particular tasks, particularly a task as dynamic, complex, and strategic as episodic
memory.

Group maps of whole-brain activity during an episodic retrieval task, no matter how
sophisticated and rigorous the statistical analyses, have been shown to be poor
representations of the pattern of activations and deactivations that occur at the
individual level (Miller et al., 2002; Miller et al., in press). The individual
differences in the patterns of activity were observed to be so extensive that one
subject, for example, had significant activity in the dorsolateral regions of the
prefrontal and parietal cortex while another subject had significant activity in the
ventrolateral regions only. Yet, when we brought the subjects back for another session
months later, the individual patterns of activity were relatively stable, indicating
that a significant portion of the variance between individuals was not due to random
fluctuations of noise. We replicated this finding in a recent fMRI study that compared
individual patterns of brain activity across repeated sessions (Miller et al., in
press). As shown in figure 50.1, the patterns of activations and deactivations were
quite unique from individual to individual. Yet the individual patterns of activity
persisted over an extended period of time (in this case, between 2 and 4 months). This
stability indicated that the variations observed between individuals were not due to
random fluctuations, but represented some systematic differences between individuals
that were greatly affecting the pattern of brain activity across the whole brain. It
has been our observation across several studies of episodic memory that the group maps
are not representative of the patterns of activations occurring at the individual
level.

The dangers of averaging across subjects

Within the field of psychology, there have been numerous examples over the years of
erroneous conclusions based on averaged data. In one compelling example, Gallistel,
Fairhurst, and Balsam (2004) demonstrated in a basic animal learning paradigm that the
negatively accelerated, gradually increasing learning curve (a basic assumption of
most learning theorists) is an artifact of averaging across individuals. Using
conditioned responses in pigeons (pecking a key), Gallistel and colleagues effectively
showed that learning within individual birds was actually abrupt and steplike. An
abrupt and steplike learning process is a fundamentally different psychological
process from a gradual learning process over time. William Estes recently noted that,
whereas a model built on group data can illustrate real trends, it can be a poor fit
to individual data, and it can be a major source of distortion. In discussing the
efforts of a number of inves-
tigators in the 1950s to raise awareness about the dangers of group data, Estes wrote,
“It is not easy, however, to change the habits of people who are comfortable with
traditional ways of doing things, and developers of cognitive models have continued to
rely for support mainly on the fitting of functions such as curves of learning,
retention, and generalization to averaged data” (Estes, 2002, p. 6).

The issue of relying on commonalities across individuals has also been debated
repeatedly in the fields of neurology and neuropsychology for over a century
(Caramazza, 1986; Sokol, McCloskey, Cohen, & Aliminosa, 1991; Robertson, Knight,
Rafal, & Shimamura, 1993). In the mid-1800s, Paul Broca argued that speech production
could be localized to the third convolution of the left inferior frontal gyrus by
examining the common area of damage across a group of patients exhibiting similar
speech production deficits. At the same time, however, another neurologist, John
Hughlings Jackson, argued against a centralized region for speech based on his
observations of wide variations in the extent and location of damage in patients
exhibiting similar aphasic symptoms and wide variations in symptoms in patients with
similar damage. Contrary to Broca, Jackson proposed that speech was a widely
distributed function within the brain (Critchley & Critchley, 1998). Of course,
Broca’s view held sway for the next century and eventually led to models of language,
such as the Wernicke-Geschwind model, that relied on these distinct and localized
modules of language function. However, more recent studies have suggested that,
although this classic brain-language model is useful as a heuristic, it is empirically
wrong because it cannot account for the range of aphasic symptoms, it is
underspecified linguistically and anatomically, and it does not take into account the
extensive individual variability in symptoms of brain-damaged patients and in the
location and extent of their damage (Benson, 1985; Poeppel & Hickok, 2004). For
example, Nina Dronkers (1996) reported on 22 patients with lesions in Broca’s area
with only 10 having Broca’s aphasia. Ojemann, Ojemann, Lettich, and Berger (2008)
reported on cortical stimulation during neurosurgery of 117 patients and found that
language disruption occurred in individualized mosaics of cortex with substantial
variability in the location of these mosaics, some of which correlated with sex and
intelligence. It is clear that, on one hand, localizing brain functions by isolating
common areas of brain damage across patients with similar deficits has been useful
and, to a certain extent, necessary given that brain damage is rarely confined to one
specific functional region, and that reliance on case studies can “sacrifice
generalizability, predictability, and the possibility of refutation” due to subject
variability (Robertson et al., 1993, p. 716). On the other hand,
case-study-by-case-study approaches can accomplish a similar winnowing and modeling of
function/brain relationships while preserving much of the information that is at the
individual level but lost in a group average (Caramazza, 1986; Sokol et al., 1991).

Individual differences and neuroimaging

Neuroimaging faced a similar issue in its early years. Marc Raichle has commented that
many of the early researchers worried that the creation of group maps by averaging
neuroimaging data across subjects would greatly diminish the signal due to inherently
high individual variability (Raichle, 1997). Yet those early studies reliably
demonstrated retinotopic mapping of the primary visual cortex (Fox et al., 1986), as
well as mapping of higher-order association areas (Petersen, Fox, Posner, Mintun, &
Raichle, 1988), using group maps. Since that time, group maps have become much more
sophisticated and population-based, and more emphasis has been placed on them given
the struggle to overcome the inherently low overall signal-to-noise ratio of
neuroimaging data. It is interesting to note, however, that many vision researchers
have reverted to relying on individual data by retinotopically mapping individuals and
testing hypotheses on an individual basis with many trials (Warnking et al., 2002). In
addition, many researchers are relying on functional localizers in several tasks
because of individual differences in the specific location of specialized regions
within the brain (Saxe, Brett, & Kanwisher, 2006). We argue that a general reliance on
group maps to represent the pattern of activity across the whole brain for a
particular task needs to be reevaluated.

Few studies have attempted to systematically examine the individual variability of
brain activity associated with a cognitive task. In general, most neuroimaging studies
involving individual differences can be divided into four categories: (1) studies that
correlate a particular behavioral performance with modulated activity in a specific
brain region; (2) studies that divide subjects into smaller groups based on a
behavioral measure and then look for differences in activations between the groups;
(3) studies that look at the overlap of individual brain activations and variations of
activity around a circumscribed region; and (4) studies that look at the degree to
which group activations are reproducible. Each of these techniques has been reviewed
previously (Miller & Van Horn, 2007), and each can be a useful analytical tool. For
example, a convincing way to demonstrate the function of a given brain region is to
show that the activity in that region is modulated by individual differences in
behavior, as has been demonstrated in numerous neuroimaging studies from correlations
of individual differences in procedural learning and the modulation of activity in the
motor cortex (Grafton, Woods, & Tyszka, 1994) to individual differences in memory
performance and modulation of the medial temporal lobe (Nyberg, McIntosh, Houle,
Nilsson, & Tulving, 1996; Tulving, Habib, Nyberg, Lepage, & McIntosh, 1999). Some
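
The averaging artifact attributed to Gallistel, Fairhurst, and Balsam (2004) in the section on the dangers of averaging can be illustrated with a small simulation. The sketch below is not their analysis and uses no real data. It assumes hypothetical subjects whose response rate jumps from a low to a high value in a single step, at a change-point trial drawn from an exponential distribution; the function names, rates, and distribution are illustrative assumptions only.

```python
import random


def simulate_step_learners(n_subjects=200, n_trials=100, low=0.05, high=0.90,
                           mean_change_point=20.0, seed=0):
    """Simulate hypothetical subjects whose response rate jumps in one step.

    Each subject responds at rate `low` until a private change-point trial and
    at rate `high` from that trial onward. Change points are drawn from an
    exponential distribution (an assumption made only for illustration).
    """
    rng = random.Random(seed)
    curves = []
    for _ in range(n_subjects):
        draw = 1 + int(rng.expovariate(1.0 / mean_change_point))
        change_point = min(draw, n_trials - 1)  # keep the step inside the session
        curves.append([low if t < change_point else high for t in range(n_trials)])
    return curves


def group_average(curves):
    """Average the response rate across subjects, trial by trial."""
    n_subjects = len(curves)
    n_trials = len(curves[0])
    return [sum(c[t] for c in curves) / n_subjects for t in range(n_trials)]


def largest_jump(curve):
    """Largest change in response rate between two consecutive trials."""
    return max(abs(b - a) for a, b in zip(curve, curve[1:]))


curves = simulate_step_learners()
average_curve = group_average(curves)

# Every individual curve contains one abrupt step; the average rises gradually.
individual_jumps = [largest_jump(c) for c in curves]
print("smallest single-trial jump in any individual curve:",
      round(min(individual_jumps), 3))
print("largest single-trial jump in the group average:",
      round(largest_jump(average_curve), 3))
```

Under these assumptions every simulated individual changes in one large step, whereas the group average changes only slightly from trial to trial and flattens as fewer subjects remain to make the transition. The smooth curve is therefore a property of the distribution of change points across subjects, not of any single subject's learning.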
in press). The relative stability of the individual patterns of activity over long
periods of time suggested that unique individual activations were not necessarily
noise but instead were likely to reflect cognitive processing that was unique to the
individual and was related to how the individual performed the task and/or other
unique physiological properties related to that individual. The persistence of these
uniquely individual patterns of activity should be viewed as an opportunity and not as
a nuisance. The opportunity it affords us is the ability to explore the fundamentally
different ways we remember past events and the unique brain regions we recruit to
accomplish that task.

The inherently variable nature of episodic memory and the brain regions underlying it

Episodic memory “stores and makes possible subsequent recovery of information about
personal experiences from the past. It enables people to travel back in time, as it
were, into their personal past, and to become consciously aware of having witnessed or
participated in events and happenings at earlier times” (Tulving, 1989, p. 362).
However, the methods used to probe episodic memory experimentally, such as a standard
recognition test, utilize not only bits of information from episodic memory but bits
of information from other systems as well, such as semantic memory. As Endel Tulving
once wrote, “It is probably as difficult to find ‘pure’ episodic-memory tasks and
‘pure’ semantic tasks as it is to find sodium and chlorine as free elements in nature,
although their compound, NaCl, is found in abundance” (Tulving, 1983, p. 55). A
“remembered” response on a recognition test can be influenced by a variety of
nonepisodic processes like semantic associations (Underwood, 1965), schematic
reconstructions (Brewer & Treyens, 1981; Miller & Gazzaniga, 1998), perceptual fluency
(Jacoby & Dallas, 1981), shifting criterion (Miller & Wolford, 1999), and so on (for
reviews see Roediger, 1996; Schacter, 1999).

Further, episodic memory is widely distributed throughout the cortex, with different
regions storing different aspects of the complete memory trace (Squire, 1987). It
relies on an extensive hippocampal-cortical network for the consolida-

(Squire, Stark, & Clark, 2004; Eichenbaum, Yonelinas, & Ranganath, 2007). Many memory
researchers have suggested that prefrontal and parietal areas support episodic memory
with cognitive processes peripheral to the actual retrieval process, with evidence
derived from brain-damaged patient studies (Incisa della Rocchetta & Milner, 1993;
Janowsky, Shimamura, Kritchevsky, & Squire, 1989; Petrides, 1996; Ranganath, Johnson,
& D’Esposito, 2003; Knight, 1991) and neuroimaging studies (Nyberg et al., 1995;
Buckner, Koutstaal, Schacter, Wagner, & Rosen, 1998; Rugg et al., 1998; Fletcher,
Shallice, Frith, Frackowiak, & Dolan, 1998; Cabeza et al., 2003; Nolde, Johnson, &
D’Esposito, 1998; Henson, Shallice, & Dolan, 1999; Dobbins, Rice, Wagner, & Schacter,
2003). One potential implication of this architecture is that one and the same
behavioral outcome (such as an “old” response on a recognition test) could be based on
a distinct set of information and a distinct combination of neural circuits in two
different individuals.

Therefore the emerging picture of the neural basis of episodic retrieval is that it
comprises several distinct brain regions and that these distinct brain regions may be
engaged differentially depending on unique individual strategies and demands. There is
substantial evidence that people will employ a multitude of strategies during the
encoding and retrieval phases of a standard memory task (Stoff & Eagle, 1971; Battig,
1975; Weinstein, Underwood, Wicker, & Cubberly, 1979; Paivio, 1983; Reder, 1987; Graf
& Birt, 1996). There is also substantial evidence from neuroimaging studies that
individual differences in memory strategy can alter which brain regions become
activated (Savage et al., 2001; Casasanto et al., 2002; Speer, Jacoby, & Braver, 2003;
Kondo et al., 2005; Tsukiura, Mochizuki-Kawai, & Fujii, 2005). One notable study
(Kirchhoff & Buckner, 2006) identified the various strategies people adopt during an
unconstrained encoding of unrelated pairs of pictures. They found that two strategies
in particular, verbal elaboration and visual inspection, correlated with memory
performance and with brain activity in distinct regions: verbal elaboration correlated
with activity in prefrontal regions associated with controlled verbal processing,
whereas visual inspection correlated with activity in the extrastriate cortex.
tion, storage, and utilization of information, and the hippo- The variable, unconstrained, and widely distributed
campus is not involved in the permanent storage of nature of brain activity during an episodic retrieval task is
information per se, but rather serves to facilitate consolida- particularly evident in the reported sites of activations across
tion of a distributed cortical memory trace (Squire et al., studies when compared to the reported sites of activations
1992; Wittenberg & Tsien, 2002; but see also Nadel & from other tasks, such as semantic retrieval. Cabeza and
Moscovitch, 1997). A principal characteristic of this distrib- Nyberg (2000) categorized hundreds of neuroimaging studies
uted network is that it affords the rapid and flexible forma- by cognitive domain and then plotted the reported sites of
tion of multimodal memories. In addition to widely activations for each study within each cognitive domain as
distributed information, there is also a broad network of a point on a glass brain. A cursory review of their findings
specialized brain regions that are influential in, but not reveals a general consistency in the localization of activity
necessary for, the completion of the task. After all, only across studies for most of the cognitive domains, but not
damage to the medial temporal lobe causes severe amnesia for episodic retrieval. Even in more constrained versions of
744 memory
Figure 50.2 A comparison of random-effects group maps and variance maps across
three memory tasks. The random-effects maps are a statistically thresholded
(p < .001 uncorrected for multiple comparisons) representation of the common
areas of brain activity across 14 individuals. The variance maps to the right
display the standard deviations across the 14 individuals at each voxel above a
threshold of 2 standard deviations. As the variance maps indicate, individuals
variably engaged much wider regions of the cortex during episodic retrieval
than during semantic retrieval or working memory. (See color plate 62.)

instructed to learn the words for a later recognition memory test and hence
were free to choose whatever strategy came most naturally. During the episodic
retrieval task, subjects simply made an “old/new” recognition judgment of the
words, half of which were previously studied. In a hierarchical regression
analysis, we included several factors to assess their relative contribution to
the observed variability: anatomical similarity, connectivity similarity
(measured by computing fractional anisotropy maps from DTI images), default
mode network similarity, encoding strategy, visualizer/verbalizer trait factor
scores, and performance measures. As predicted, we found that the more similar
two individuals’ tendency to visualize, the more similar their patterns of
brain activity.
Table 50.1
Factors that are related to the degree of similarity between any two brain volumes of activity
during an episodic memory task

Factors Related to Variability in Brain Activity                  ΔR²
Situational Factors
    Experimental design: blocked or event-related                 40%
    Stimulus type (faces or words)                                28%
    Different sessions                                            n.s.
    Different tasks                                               8%
    Task difficulty                                               5%
Individual Differences in Physiology and Anatomy
    Structural anatomy                                            n.s.
    Default mode network (coherence maps)                         4%
    White matter connectivity (fractional anisotropy)             7%
Individual Differences in Cognition and Information Processing
    Retrieval strategy (criteria)                                 8%
    Memory performance (d prime)                                  n.s.
    Reaction time                                                 n.s.
    Tendency to visualize                                         5%
    Tendency to verbalize                                         n.s.
Individual Deviations Unaccounted For                             16–44%

Data from Miller et al., in press; Donovan & Miller, 2008; Guerin & Miller, 2009. ΔR² values are from
hierarchical regression analyses conducted in each study with the variables entered in the order noted
on the table. Not all variables were represented in each study. The values varied considerably from study
to study, with representative values listed here. Factors with “n.s.” were not significant in any of the
studies.
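
As a rough illustration of how ΔR² values of this kind arise, the sketch below enters blocks of predictors one at a time into an ordinary least-squares model and records the gain in R² at each step. It is not the analysis pipeline used in the studies cited above: the data are synthetic, and the variable names (`same_design`, `connectivity`, `strategy`, `similarity`) and the figure of 91 pairs (all pairings of 14 individuals) are assumptions made only for the example.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an ordinary least-squares fit of y on X plus an intercept."""
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residual = y - design @ beta
    return 1.0 - (residual @ residual) / np.sum((y - y.mean()) ** 2)

def hierarchical_delta_r2(blocks, y):
    """Enter predictor blocks in the given order; return each block's R^2 gain."""
    entered = np.empty((len(y), 0))
    previous_r2 = 0.0
    gains = {}
    for name, block in blocks:
        entered = np.column_stack([entered, block])
        current_r2 = r_squared(entered, y)
        gains[name] = current_r2 - previous_r2
        previous_r2 = current_r2
    return gains

# Synthetic, purely illustrative data (hypothetical names and effect sizes).
rng = np.random.default_rng(0)
n_pairs = 91  # assumption: every pairing of 14 individuals
same_design = rng.integers(0, 2, n_pairs).astype(float)  # situational factor
connectivity = rng.normal(size=n_pairs)                  # e.g., FA-map similarity
strategy = rng.normal(size=n_pairs)                      # e.g., strategy similarity
similarity = (0.6 * same_design + 0.3 * connectivity
              + 0.3 * strategy + rng.normal(scale=0.5, size=n_pairs))

# Order of entry matters: earlier blocks absorb any variance they share
# with later blocks.
blocks = [("design", same_design),
          ("connectivity", connectivity),
          ("strategy", strategy)]
increments = hierarchical_delta_r2(blocks, similarity)
for name, gain in increments.items():
    print(f"{name}: delta R^2 = {gain:.3f}")
```

Because each model nests the previous one, the increments are non-negative and sum to the R² of the full model. Whatever variance remains after the last block is the portion left for any variables entered afterward, such as per-individual dummy variables.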
In terms of individual differences in physiology and anatomy, we have found
that similarity in white matter connectivity and in the default mode network
are both factors, but not individual differences in structural anatomy.
Although all brain volumes are spatially normalized before being analyzed,
there still exists a considerable difference between individuals in the
orientation and precise location of cortical landmarks. Yet those anatomical
differences are not predictive of functional activation differences. As for
individual differences in cognition and information processing, it is becoming
clear in our studies using episodic memory tasks that individual differences in
strategy have a significant effect, but not individual differences in memory
performance and accuracy. Much work still needs to be done to determine the
full range of factors that contribute to the individual variability. This need
is evident in the last factor that is listed in table 50.1. In all our
hierarchical regression analyses we include dummy variables for each
individual. These dummy variables are always entered last after accounting for
all the other individual difference factors. Yet these individual variables
still account for around 40% of the variance, suggesting that some individuals
are more deviant from the group than others and that we have yet to find the
factors that account for that fact.

It should also be noted that many of the factors that distinguish individuals
on an episodic memory task may also distinguish those individuals on other
tasks as well. We found in the study comparing activity across three different
memory tasks (an episodic retrieval task, a working memory task, and a semantic
retrieval task) that the brain activity of an individual performing an episodic
retrieval task is more similar on average to that same individual performing an
entirely different task than it is to a different individual performing the
same episodic retrieval task (Miller et al., in press).

Conclusion

Individuals vary enormously in their patterns of brain activity. This
variability is particularly widespread during an episodic memory task, and it
extends across most of the cortex. These fluctuations in activity between
individuals are not random because they are relatively stable over time.
Furthermore, we have been able to account for significant portions of this
variability, including differences in white matter connectivity and differences
in strategy. A clear and full understanding of all the sources of variability
between individuals will be necessary in order for us to determine whether a
pattern of brain activity observed during a memory task is truly reflective of
that individual’s thoughts and traits. It will be critical for future
neuroimaging studies of episodic memory to explore what makes us unique as well
as to explore what we have in common.

Acknowledgments
The author would like to acknowledge that the research discussed in this
chapter was supported by the Institute for Collaborative Biotechnologies
through contract no. W911NF-07-1-0072 from the U.S. Army Research Office.

REFERENCES
Battig, W. (1975). Within-individual differences in “cognitive” processes. In
    R. L. Solso (Ed.), Information processing and cognition (pp. 195–228).
    Hillsdale, NJ: Erlbaum.
Benson, D. F. (1985). Aphasia and related disorders: A clinical approach. In
    M.-M. Mesulam (Ed.), Principles of behavioral neurology (pp. 193–238).
    Philadelphia: Davis.
Brewer, W. F., & Treyens, J. C. (1981). Role of schemata in memory for places.
    Cogn. Psych., 13, 207–230.
Buckner, R. L., Koutstaal, W., Schacter, D. L., Wagner, A. D., & Rosen, B. R.
    (1998). Functional–anatomic study of episodic retrieval using fMRI. I.
    Retrieval effort vs. retrieval success. NeuroImage, 7, 151–162.
Cabeza, R., Dolcos, F., Prince, S. E., Rice, H. J., Weissman, D. H., & Nyberg,
    L. (2003). Attention-related activity during episodic memory retrieval: A
    cross-function fMRI study. Neuropsychologia, 41, 390–399.
Cabeza, R., & Nyberg, L. (2000). Imaging cognition. II. An empirical review of
    275 PET and fMRI studies. J. Cogn. Neurosci., 12, 1–47.
Caramazza, A. (1986). On drawing inferences about the structure of normal
    cognitive systems from the analysis of patterns of impaired performance:
    The case for single-patient studies. Brain Cogn., 5, 41–66.
Casasanto, D. J., Killgore, W. D. S., Maldjian, J. A., Glosser, G., Alsop,
    D. C., Cooke, A. M., Grossman, M., & Detre, J. A. (2002). Neural correlates
    of successful and unsuccessful verbal memory encoding. Brain Lang., 80,
    287–295.
Critchley, M., & Critchley, E. A. (1998). John Hughlings Jackson: Father of
    English neurology. New York: Oxford University Press.
Dobbins, I. G., Rice, H. J., Wagner, A. D., & Schacter, D. L. (2003). Memory
    orientation and success: Separable neurocognitive components underlying
    episodic recognition. Neuropsychologia, 41, 318–333.
Donovan, C. L., & Miller, M. B. (2008). Individual variability in brain
    activity during episodic encoding and retrieval: How it relates to anatomy,
    strategy, visual/verbal traits and personality. Soc. Neurosci. Abstracts,
    Washington, DC.
Dronkers, N. F. (1996). A new brain region for speech: The insula and
    articulatory planning. Nature, 384, 159–161.
Eichenbaum, H., Yonelinas, A. P., & Ranganath, C. (2007). The medial temporal
    lobe and recognition memory. Annu. Rev. Neurosci., 30, 123–152.
Estes, W. K. (2002). Traps in the route to models of memory and decision.
    Psychon. Bull. & Rev., 9(1), 3–25.
Fletcher, P. C., Shallice, T., Frith, C. D., Frackowiak, R. S. J., & Dolan,
    R. J. (1998). The functional roles of prefrontal cortex in episodic memory.
    II. Retrieval. Brain, 121, 1249–1256.
Fox, P. T., Mintun, M. A., Raichle, M. E., Miezin, F. M., Allman, J. M., & Van
    Essen, D. C. (1986). Mapping human visual cortex with positron emission
    tomography. Nature, 323, 806–809.
    Memory and learning: The Ebbinghaus Centennial Conference (pp. 203–220).
    Hillsdale, NJ: Lawrence Erlbaum.
Robertson, L. C., Knight, R. T., Rafal, R., & Shimamura, A. P. (1993).
    Cognitive neuropsychology is more than single-case studies. J. Exp.
    Psychol. Learn. Mem. Cogn., 19, 710–717.
Roediger, H. L., III (1996). Memory illusions. J. Mem. Lang., 35, 76–100.
Rugg, M. D., Fletcher, P. C., Allan, K., Frith, C. D., Frackowiak, R. S. J., &
    Dolan, R. J. (1998). Neural correlates of memory retrieval during
    recognition memory and cued recall. NeuroReport, 8, 262–273.
Savage, C. R., Deckersbach, T., Heckers, S., Wagner, A. D., Schacter, D. L.,
    Alpert, N. M., Fischman, A. J., & Rauch, S. L. (2001). Prefrontal regions
    supporting spontaneous and directed application of verbal learning
    strategies: Evidence from PET. Brain, 124, 219–231.
Saxe, R., Brett, M., & Kanwisher, N. (2006). Divide and conquer: A defense of
    functional localizers. NeuroImage, 30(4), 1088–1096.
Schacter, D. L. (1999). The seven sins of memory: Insights from psychology and
    cognitive neuroscience. Am. Psychol., 54, 182–203.
Schacter, D. L., Addis, D. R., & Buckner, R. L. (2008). Episodic simulation of
    future events: Concepts, data, and applications. The Year in Cognitive
    Neuroscience 2008. Ann. NY Acad. Sci., 1124, 39–60.
Smith, S. M., Beckmann, C. F., Ramnani, N., Woolrich, M. W., Bannister, P. R.,
    Jenkinson, M., Matthews, P. M., & McGonigle, D. J. (2005). Variability in
    fMRI: A re-examination of inter-session differences. Hum. Brain Mapping,
    24, 248–257.
Sokol, S. M., McCloskey, M., Cohen, N. J., & Aliminosa, D. (1991). Cognitive
    representations and processes in arithmetic: Inferences from the
    performance of brain-damaged subjects. J. Exp. Psychol. Learn. Mem. Cogn.,
    17, 355–376.
Speer, N., Jacoby, L., & Braver, T. (2003). Strategy-dependent changes in
    memory: Effects on behavior and brain activity. Cogn. Affective Behav.
    Neurosci., 3, 155–167.
Squire, L. R. (1987). Memory and brain. New York: Oxford University Press.
Squire, L. R., Ojemann, J. G., Miezin, F. M., Petersen, S. E., Videen, T. O., &
    Raichle, M. E. (1992). Activations of the hippocampus in normal humans: A
    functional anatomical study of memory. Proc. Natl. Acad. Sci. USA, 89,
    1837–1841.
Squire, L. R., Stark, C. E. L., & Clark, R. E. (2004). The medial temporal
    lobe. Annu. Rev. Neurosci., 27, 279–306.
Stoff, D. M., & Eagle, M. N. (1971). The relationship among reported
    strategies, presentation rate, and verbal ability and their effects on free
    recall learning. J. Exp. Psychol., 87, 423–428.
Thorndike, E. L. (1911). Individuality. Boston: Houghton Mifflin.
Tsukiura, T., Mochizuki-Kawai, H., & Fujii, T. (2005). The effect of encoding
    strategies on medial temporal lobe activations during the recognition of
    words: An event-related fMRI study. NeuroImage, 25, 452–461.
Tulving, E. (1983). Elements of episodic memory. New York: Oxford University
    Press.
Tulving, E. (1989). Remembering and knowing the past. Amer. Sci., 77, 361–367.
Tulving, E., Habib, R., Nyberg, L., Lepage, M., & McIntosh, A. R. (1999).
    Positron emission tomography correlations in and beyond medial temporal
    lobes. Hippocampus, 9, 71–82.
Underwood, B. J. (1965). False recognition produced by implicit verbal
    responses. J. Exp. Psychol., 70, 122–129.
Vincent, J. L., Snyder, A. Z., Fox, M. D., Shannon, B. J., Andrews, J. R.,
    Raichle, M. E., & Buckner, R. L. (2006). Coherent spontaneous activity
    identifies a hippocampal-parietal memory network. J. Neurophysiol., 96,
    3517–3531.
Wagner, A. D., Pare-Blagoev, E. J., Clark, J., & Poldrack, R. A. (2001).
    Recovering meaning: Left prefrontal cortex guides controlled semantic
    retrieval. Neuron, 31, 329–338.
Wagner, A. D., Schacter, D. L., Rotte, M., Koutstaal, W., Maril, A., Dale,
    A. M., Rosen, B. R., & Buckner, R. L. (1998). Building memories:
    Remembering and forgetting of verbal experiences as predicted by brain
    activity. Science, 281, 1188–1191.
Warnking, J., Dojat, M., Guérin-Dugué, A., Delon-Martin, C., Olympieff, S.,
    Richard, N., Chéhikian, A., & Segebarth, C. (2002). fMRI retinotopic
    mapping: Step by step. NeuroImage, 17, 1665–1683.
Weinstein, C. E., Underwood, V. L., Wicker, F. W., & Cubberly, W. E. (1979).
    Cognitive learning strategies: Verbal and imaginal elaboration. In H. F.
    O’Neil & C. D. Spielberger (Eds.), Cognitive and affective learning
    strategies (pp. 45–75). New York: Academic Press.
Wittenberg, G. M., & Tsien, J. Z. (2002). An emerging molecular and cellular
    framework for memory processing by the hippocampus. Trends Neurosci.,
    25(10), 501–505.
Abstract
Memory is widely conceived as a fundamentally constructive rather than purely
reproductive process. One well-known source of evidence for constructive
remembering is provided by various kinds of memory errors and illusions. A
second line of evidence, which has recently emerged into the forefront of
cognitive neuroscience, concerns the processes involved in imagining or
simulating future events and novel scenes. In this chapter we discuss recent
studies using various patient populations and neuroimaging techniques to
examine future-event simulation and its relation to episodic memory, and we
also link this research with earlier studies of constructive memory. Converging
evidence supports the idea that imagining possible future events depends on
much of the same neural machinery as does remembering past events, which we
refer to as the core network. We consider conceptual and theoretical issues
raised by this work, and also discuss adaptive functions of future-event
simulation and related processes in the context of a constructive approach to
memory.

Daniel L. Schacter, Department of Psychology, Harvard University, Cambridge,
Massachusetts; Athinoula A. Martinos Center for Biomedical Imaging,
Massachusetts General Hospital, Charlestown, Massachusetts
Donna Rose Addis, Department of Psychology, University of Auckland, Auckland,
New Zealand
Randy L. Buckner, Department of Psychology, Center for Brain Sciences, Harvard
University, Cambridge, Massachusetts; Athinoula A. Martinos Center for
Biomedical Imaging, Massachusetts General Hospital, Charlestown, Massachusetts;
Howard Hughes Medical Institute, Cambridge, Massachusetts

The first notion to get rid of is that memory is primarily or literally
reduplicative, or reproductive. In a world of constantly changing environment,
literal recall is extraordinarily unimportant . . . memory appears to be an
affair of construction rather than reproduction.
(Bartlett, 1932, pp. 204–205)

When Sir Frederic Bartlett drew on experimental observations of errors and
distortions in recall of complex stories to argue that memory is a
fundamentally constructive process, his claims had little influence on his
contemporaries. Psychological research on memory at the time was dominated by
studies of rote learning in simple paired-associate paradigms; Bartlett’s
methods and theories made little sense in the context of the prevailing
behaviorist zeitgeist. Several decades passed before Bartlett’s ideas about
constructive memory, revived by the publication of Neisser’s (1967) seminal
analysis of constructive processes in perception and memory, began to receive
support from cognitive studies during the 1970s (for historical reviews, see
Roediger, 1996; Schacter, 1995). Since that time, overwhelming cognitive
evidence has accumulated in favor of Bartlett’s claim that memory is “an affair
of construction rather than reproduction” (for overviews, see Brainerd & Reyna,
2005; Loftus, 2003; Schacter, 1996, 2001).

When we turn to the cognitive neuroscience of memory, the situation looks a bit
different. While cognitive neuroscientists have not opposed the idea that
memory involves constructive processes, sustained interest in constructive
aspects of memory has developed only recently. Of course, neurologists and
neuropsychologists have long been interested in the phenomenon of
confabulation, where patients with damage to various regions within prefrontal
cortex and related regions produce vivid but highly inaccurate “recollections”
of events that never happened. Clinicians have produced striking clinical
reports of confabulation (e.g., Talland, 1961), and more recently a number of
investigators have approached the phenomenon experimentally (for review, see
Schnider, 2008). During the past decade, investigations of memory distortions
in other patient populations, as well as neuroimaging studies of accurate
versus inaccurate remembering in healthy individuals, have greatly increased
our understanding of the cognitive neuroscience of constructive memory
(Schacter, Norman, & Koutstaal, 1998; Schacter & Slotnick, 2004).

Even more recently, during just the past few years, there has been a dramatic
increase in research on a related topic that also illuminates the constructive
nature of memory: the role of memory in imagining or simulating possible future
events (cf. Buckner & Carroll, 2007; Buckner, Andrews, & Schacter, 2008;
Schacter, Addis, & Buckner, 2007, 2008; Suddendorf & Corballis, 2007). Evidence
has rapidly accumulated to support the idea that memory (especially episodic
memory, the system that allows individuals to recollect past events) is also
critically involved in our ability to imagine future happenings and carry out
related kinds of mental simulations. Furthermore, brain regions traditionally
identified with memory, including the hippocampus, are similarly engaged when
people carry out various
schacter, addis, and buckner: constructive memory and simulation of future events 751
mental simulations. These investigations have provided new evidence concerning
constructive processes in memory and may even provide clues concerning the
functions of such constructive processes (Hassabis, Kumaran, & Maguire, 2007;
Schacter & Addis, 2007a, 2007b).

One important impetus for this new wave of research emerged from claims made by
Tulving (1983, 2002) that episodic memory supports “mental time travel” in both
the past and the future. Tulving, influenced by prior related ideas from the
Swedish neuroscientist David Ingvar (1979, 1985; for further discussion, see
Buckner & Carroll, 2007; Schacter et al., 2008), suggested that mental time
travel supports “autonoetic,” or self-knowing, consciousness, which allows
individuals to view themselves as temporally extended entities whose present
awareness is influenced by the recollected past and imagined future.

During the 1990s, these ideas about mental time travel became associated with
discussions concerning whether this capacity is unique to human beings or
whether nonhuman animals also possess some form of autonoetic consciousness
that allows them to revisit the past and anticipate the future. Suddendorf and
Corballis (1997, 2007) and Tulving (2002, 2005) have both argued forcefully
that mental time travel is restricted to human beings. While both Tulving
(2002, 2005) and Suddendorf and Corballis (1997, 2007) allow that nonhuman
animals can use semantic or procedural memory systems to gain access to stored
information, they assert that such processes need not involve either
recollecting a past event or using mental simulation to “preexperience” a
future event (the essence of mental time travel).

This strong claim has spurred considerable debate (see Clayton, Bussey, &
Dickinson, 2003, and commentaries on Suddendorf & Corballis, 2007) and will
likely be difficult to resolve definitively owing to limitations on our ability
to assess inner experience in nonhumans, which is central to the concept of
mental time travel. Some compelling experimental demonstrations, at the very
least, cast doubt on the strong claim for human uniqueness. For example,
Clayton and Dickinson (1998) showed that food-caching scrub jays are able to
retrieve detailed information about what food they cached as well as when and
where they cached it, and Raby, Alexis, Dickinson, and Clayton (2007) have
shown that conditions exist in which jays cache food in a way that appears to
indicate some type of planning for the future. In a related line of research
with rodents, several investigators have provided evidence indicating some type
of prospective coding, including evidence that hippocampal neurons encode not
only a rat’s current location and recent memory, but also prospective
information concerning where the rat needs to go in the immediate future (Diba
& Buzsáki, 2007; Ferbinteanu & Shapiro, 2003; Foster & Wilson, 2006; A. Johnson
& Redish, 2007; Pastalkova, Itskov, Amarasingham, & Buzsáki, 2008). Whether or
not such observations indicate the occurrence of mental time travel in rats,
they do suggest that the hippocampus may provide prospective signals that could
be used as a basis for making decisions.

Perhaps overlooked in the intensive discussion over whether animals can engage
in mental time travel is that relatively little is known about how humans use
memory to imagine or simulate future events. Although cognitive neuroscience
has made much progress in delineating the nature of remembering, it has barely
scratched the surface in studying how memory is used to imagine future events
and to engage in related forms of mental simulation. The upsurge in relevant
research during the past few years has begun to rectify the situation. In this
chapter, we will focus on recent cognitive neuroscience research that has
examined relations among memory, imagination, and future-event simulation.

Imagining future events: Findings and ideas

Insights into the nature of future-event simulation have been gained by
cognitive studies of healthy young adults and memory-impaired populations, and
more recently by neuroimaging studies. In the present chapter, we focus on
memory-impaired populations and neuroimaging studies (for more general reviews,
see Buckner et al., 2008; Schacter et al., 2008).

Studies of Future-Event Simulation in Memory-Impaired Populations
We consider here three memory-impaired populations in which future-event
simulation has been examined: amnesic patients, older adults, and
psychopathological populations.

Amnesic patients
It is well established that the amnesic syndrome resulting from damage to the
medial temporal lobes and related structures is associated with a severe
impairment in the ability to remember past experiences (see chapter 46 by
Shrager and Squire, this volume). Early clinical observations (Talland, 1965)
suggested that amnesic patients might also have problems envisioning their
personal futures and planning for upcoming events. Tulving (1985) reported that
the densely amnesic patient KC, who cannot remember any specific episodes from
his past (for a review of KC, see Rosenbaum et al., 2005), exhibits similar
problems envisioning any specific episodes in his future (Rosenbaum, Gilboa,
Levine, Winocur, & Moscovitch, in press; Tulving, 1985; Tulving, Schacter,
McLachlan, & Moscovitch, 1988). Note, however, that KC is characterized by
fairly extensive brain damage, including damage to medial temporal, prefrontal,
and other regions (see Rosenbaum et al.), thereby limiting the specificity with
which his problems remembering the past or imagining the future can be
associated with particular brain regions. A similar issue applies to a later
and more
systematic study by Klein, Loftus, and Kihlstrom (2002) concerning patient DB,
who became amnesic as a result of cardiac arrest and consequent anoxia. DB
showed marked deficits on a 10-item questionnaire probing past and future
events that were matched for temporal distance from the present (e.g., “What
did you do yesterday? What are you going to do tomorrow?”). The patient’s
deficit in simulating future events appeared to involve only his personal
future, since DB showed little difficulty imagining possible future happenings
in the public domain, such as political events.

More recently, Hassabis, Kumaran, Vann, and Maguire (2007) examined the ability
of five patients with documented bilateral hippocampal amnesia to imagine novel
experiences, such as “Imagine you’re lying on a white sandy beach in a
beautiful tropical bay.” The experimenters scored the constructions of patients
and controls based on the content, spatial coherence, and subjective qualities
of the imagined scenarios. Four of the five hippocampal patients produced
constructions that were significantly reduced in richness and content compared
with those of controls, especially for the measure of spatial coherence. The
single patient who performed normally on the imaginary scene task was
characterized by some residual hippocampal tissue. Because the lesions in the
other cases appear to specifically include the hippocampal formation, this
study strengthens the link between event simulation and hippocampal function.
Note, however, that Hassabis and colleagues did not specifically require
participants to imagine future events, indicating that the amnesic patients
suffer from an impairment in event simulation that is not restricted to a
particular time interval.

Older adults
It is well known that healthy older adults exhibit a variety of episodic memory
deficits (e.g., Craik & Salthouse, 2000), but little is known about
future-event simulation in aging. Addis, Wong, and Schacter (2008) recently
investigated the issue. They noted earlier work showing that aging is
associated with reduced specificity during the recall of past autobiographical
episodes. Levine, Svoboda, Hay, Winocur, and Moscovitch (2002) reported such
age-related changes in the episodic quality of past events using the
Autobiographical Interview (AI), a measure that distinguishes episodic
information from other “external” details (e.g., semantic information, other
external events, repetitions) that comprise a participant’s description of a
past event. Levine and colleagues observed that older adults recalled
significantly fewer internal/episodic details and tended to produce more
external/semantic information.

Addis, Wong, and Schacter (2008) used an adapted version of the
Autobiographical Interview that required young and older participants to
generate memories of past events and simulations of future events in response
to individual word cues. They allowed participants three minutes to describe
each episode, and transcriptions of the events were segmented into distinct
details that were classified as either internal (episodic) or external
(semantic). The key finding was that older adults generated fewer internal
details than younger adults; importantly, this effect was observed to the same
extent for future events as for past events. By contrast, older adults showed
small but significant increases relative to young adults in the production of
external details for both past and future events. Furthermore, there were
strong positive correlations across past and future events for both internal
and external detail scores, whereas internal and external detail scores were
not correlated with one another. Finally, the internal (but not external)
detail score correlated significantly with a measure of relational memory
(paired-associate learning), known to be dependent on the hippocampus, a point
to which we will return later when discussing theoretical accounts of
future-event simulation.

Overall, the results reveal a strong link between remembering the past and
imagining the future in older adults. These findings dovetail nicely with
observations from Spreng and Levine (2006), who reported similar temporal
distributions for past and future events in aging: when remembering past events
or imagining future events that are likely to happen, both older and younger
adults generated the highest number of events near the present, with the
frequency declining as a function of time in a manner well described by a power
function.

Psychopathological populations
A growing number of studies have examined future-event simulation in patients
with various forms of psychopathology. We have reviewed this literature in
detail elsewhere (Schacter et al., 2008) and summarize several key findings
here. Williams and colleagues (1996) reported a seminal study in which they
found that suicidally depressed patients have difficulty recalling specific
memories of past events and also in generating specific simulations of future
events. Compared to nondepressed controls, the past and future events generated
by depressed patients in response to cue words lacked specific detail and thus
were characterized as “overgeneral”; these reductions in specificity of past
and future events were significantly correlated. Similar findings have been
reported in milder forms of depression (e.g., Dickson & Bates, 2005; MacLeod,
Rose, & Williams, 1993) and also in anxious individuals (Stöber & Borkovec,
2002).

Williams and colleagues (1996) found that past and future events generated by
suicidally depressed patients were overgeneral for both positive and negative
events. However, others have reported effects of valence. For instance, MacLeod
and colleagues (1993) found that suicidally depressed patients were less able
to envision positive future episodes (see also Dickson & Bates, 2006). Indeed,
reduced access to positive future events correlates with the severity of
hopelessness (MacLeod & Cropley, 1995), suggesting that
schacter, addis, and buckner: constructive memory and simulation of future events 753
simulation deficits may help to maintain the sense of hope-
lessness that typically characterizes depression (for related
neuroimaging research concerning neural correlates of
optimism, see Sharot, Riccardi, Raio, & Phelps, 2007, and
commentary by Schacter & Addis, 2007c). Similarly,
increased access to simulations of negative future events is
characteristic of anxiety disorders (e.g., MacLeod, Tata,
Kentish, Carroll, & Hunter, 1997; Ruane, MacLeod, &
Holmes, 2005).

Such observations in patients with depression and anxiety
disorders have led to the proposal that the reduced specific-
ity of autobiographical memories and future-event simula-
tions reflects problems with affect regulation: patients
produce overgeneral events because they truncate search or
construction to protect themselves from experiencing poten-
tially destabilizing memories or simulations (Williams, 1996,
2006). Recently, D’Argembeau, Raffard, and Van der
Linden (2008) reported that schizophrenics generated sig-
nificantly fewer specific past and future events than did
healthy controls. Such findings are less likely to be attribut-
able to the kinds of affect-regulation problems that occur in
depression and anxiety. Moreover, the findings from depres-
sion, anxiety, and schizophrenia are quite similar to those
considered earlier from amnesic patients and older adults.

Taken together, these observations encourage further
consideration of the possible role of neuropsychological defi-
cits that may contribute to the reduced specificity of events
evident in both psychiatric and nonpsychiatric populations.
For example, we have suggested (Schacter et al., 2008) that
the aforementioned data from amnesic patients (and data to
be considered shortly from neuroimaging studies) implicat-
ing the hippocampus in event simulation raise the possibility
that hippocampal dysfunction might contribute to overgen-
eral simulations of past and future events. Hippocampal
atrophy is evident in a number of psychiatric conditions in
which simulation deficits have been documented, including
depression (Bremner et al., 2000; Campbell & Macqueen,
2004) and schizophrenia (Velakoulis et al., 2006), and it also
has been documented in older adults (e.g., Driscoll et al.,
2003; Golomb et al., 1993). It is therefore possible that
hippocampal dysfunction contributes to simulation deficits
observed across these varied populations.

Neuroimaging of Future-Event Simulation During
the past couple of years, several studies have used
neuroimaging techniques to compare the neural correlates
of imagining future events with those that characterize
remembering past events. We first review key experimental
findings and related observations before turning to some
emerging conceptual issues.

Basic findings: The core network A consistent finding across
studies has been that remembering the past and imagining
the future recruit a similar network of brain regions. Such
findings were reported initially in an early positron emission
tomography (PET) study from Okuda and colleagues (2003;
see also Partiot, Grafman, Sadato, Wachs, & Hallett, 1995,
for related early findings). During scanning, participants
talked freely about either the near past or future (i.e., the last
or next few days) or the distant past or future (i.e., the last
or next few years). Similar levels of activation were observed
during past and future conditions in several prefrontal
regions, as well as in the medial temporal lobe (right
hippocampus and bilateral parahippocampal gyrus). Note,
however, that because Okuda and colleagues used a rela-
tively unconstrained paradigm that did not probe partici-
pants about particular events, it is unclear whether these
reports consisted of episodic memories and simulations
(unique events specific in time and place) or general semantic
information about an individual’s past or future. More
recent fMRI studies have used event-related designs to yield
information regarding the neural bases of specific past and
future events.

Szpunar, Watson, and McDermott (2007) instructed par-
ticipants to remember specific events that occurred in their
personal past, imagine specific future events that might
occur in their personal future, or imagine specific events
involving a familiar individual (Bill Clinton) in response to
event cues (e.g., past birthday, retirement party). Consistent
with previous observations, there was considerable overlap
in activity associated with past and future events in the bilat-
eral frontopolar and medial temporal lobe regions, as well
as in posterior cingulate cortex. Note also that these regions
were not activated to the same degree when participants
imagined events involving Bill Clinton, seeming to demon-
strate a neural signature that is unique to the construction
of events in one’s personal past or future.

One general issue that applies to the foregoing studies,
and potentially to any neuroimaging study that compares
the neural correlates of remembering past events and imagin-
ing future events, is that remembering is usually associated
with greater levels of episodic detail than is imagining (e.g.,
M. Johnson, Foley, Suengas, & Raye, 1988). To the extent
that this outcome occurs, comparisons between past and
future events will be partly or entirely confounded by differ-
ences in level of detail. Using event-related fMRI, Addis,
Wong, and Schacter (2007) attempted to equate experimen-
tally the level of detail and related phenomenological fea-
tures of past and future events. Also, taking advantage of the
temporal resolution of fMRI, the past and future tasks were
divided into two phases: (1) an initial construction phase
during which participants generated a past or future event
in response to an event cue (e.g., “dress”) and pressed a
button when they had an event in mind, and (2) an elabora-
tion phase during which participants generated as much
detail as they could about the event.
754 memory
Figure 51.1 Sagittal slice (x = −4) illustrating the striking com-
monalities in the medial left prefrontal and parietal regions engaged
when remembering the past (left panel) and imagining the future
(right panel). These marked similarities of activation were also
evident in areas of the medial temporal lobe (left hippocampus,
bilateral parahippocampal gyrus) and lateral cortex (left temporal
pole and left bilateral inferior parietal cortex). This extensive
pattern of common activity was not present during the construction
of past and future events; it only emerged during the elaboration
of these events (shown here, relative to the elaboration phase of a
semantic and an imagery control task; significant at p < .001,
uncorrected; shown at p < .005, uncorrected.) (Originally published
in Addis, Wong, & Schacter, 2007.) (See color plate 63.)
The construction phase was associated with some common
past-future activity in posterior visual regions and left hip-
pocampus. During the elaboration phase, when participants
focused on generating details about the remembered or
imagined event, there was even more extensive overlap
between the past and future tasks (see figure 51.1). Both
event types were associated with activity in a network of
regions including medial temporal (hippocampus and para-
hippocampal gyrus) and prefrontal cortex, as well as medial
parietal and retrosplenial cortex.

Botzung, Denkova, and Manning (2008) have recently
reported data from an fMRI study that are mainly consistent
with those from the preceding studies. The day before
scanning, subjects initially reported on 20 past events
from the last week and 20 future events planned for the next
week. The experimenters constructed cue words for these
events that were presented to subjects the next day during
scanning, when they were instructed to think of past or
future events to each cue. Past and future events produced
activation in a network similar to that reported by Addis and
colleagues (2007).

Collectively, the results from the preceding imaging studies
consistently implicate a core network of structures in both
remembering the past and imagining the future (Buckner &
Carroll, 2007; Buckner et al., 2008; Schacter et al., 2007,
2008). This network consists of prefrontal and medial tem-
poral lobe regions, as well as posterior regions including
lateral parietal, posterior cingulate, and retrosplenial cortices
that have previously been observed as components of brain
networks important to memory retrieval (Cabeza & St
Jacques, 2007; Gilboa, 2004; Maguire, 2001; Spreng, Mar,
& Kim, in press; Svoboda, McKinnon, & Levine, 2006;
Wagner, Shannon, Kahn, & Buckner, 2005). Analyses of
the interactions among the brain regions within this core
network demonstrate that all of the component regions
are selectively correlated with one another within a large-
scale brain system that includes the hippocampal formation
(Greicius, Srivastava, Reiss, & Menon, 2004; Kahn,
Andrews-Hanna, Vincent, Snyder, & Buckner, 2008; Vincent
et al., 2006), and that the network likely consists of distinct
interacting subsystems (Buckner et al., 2008).

Although it seems clear that remembering the past and
imagining the future are both associated to some extent with
a common core network, neuroimaging studies have also
yielded a number of findings that point to possible differ-
ences between the two. First, direct comparisons have con-
sistently shown greater activity in several brain regions when
individuals imagine the future than when they remember the
past. For example, Okuda and colleagues (2003) reported
greater activity in frontopolar and medial temporal regions
when people talked about the future than the past; Szpunar
and colleagues (2007) reported that bilateral premotor cortex
and left precuneus were more active for future relative to
past events, but not vice versa; and Addis and colleagues
(2007) found that during the early construction phase of
future simulation, several regions showed greater activity for
future versus past events (but not the reverse), including right
hippocampus and frontopolar cortex.
In a more recent study, Addis, Cheng, and Schacter
(2008) contrasted activity when individuals were cued to
remember or imagine specific events, as in previous studies,
versus when they were cued to remember general, routine
events (e.g., having brunch after attending church) or to
imagine generic events that might occur sometime in their
personal futures (e.g., reading the newspaper each morning).
Addis and colleagues replicated the foregoing findings of
greater activity for future than past events during the early
phase of event construction. Furthermore, they found that
the left frontal pole showed this future > past effect for both
specific and generic events, suggesting a general role in
prospection irrespective of the specificity of the event.
However, the right hippocampus showed the future > past
effect only for specific events; in fact, there was no evidence
for right hippocampal activity during construction of generic
future events.

These observations are open to multiple interpretations
(note also that Botzung et al., 2008, reported evidence for
increased activity for past versus future events; but see
Schacter et al., 2008, for discussion of methodological issues
that complicate interpretation of this finding). For example,
Szpunar and colleagues (2007) suggested that a more active
type of imagery processing might be required by future than
past events. Addis and colleagues (2007) hypothesized that
more intensive constructive processes are required by imag-
ining future events than by retrieving past events. While both
past- and future-event tasks require the retrieval of informa-
tion from memory, thus engaging common memory net-
works, only the future task requires that event details gleaned
from various past events be flexibly recombined into an
imaginary event, perhaps resulting in increased activity
during future-event tasks. A related possibility is that imag-
ined future events are more novel than remembered past
events; increased activity during future-event tasks might
reflect some form of novelty encoding. This latter idea is
potentially applicable to findings of increased hippocampal
activation during future-event tasks, since it is well known
that encoding novel events can be associated with increased
hippocampal activity (e.g., Ranganath & Rainer, 2003).
Note, however, that Addis, Cheng, and Schacter’s (2008)
finding that increased right hippocampal activity for future
events was observed for specific but not generic events would
appear to be inconsistent with a simple novelty-encoding
account, because both the specific and generic future events
were novel.

Addis and Schacter (2008) report additional findings con-
cerning differential neural responses to past and future
events. They conducted parametric modulation analyses,
with temporal distance and detail as covariates, focusing on
the hippocampal and the frontopolar regions. They hypoth-
esized that reintegrating increasing amounts of detail for either
a past or future event would be associated with increasing
levels of hippocampal activity. By contrast, because future
events are thought to require more intensive recombining of
disparate details into a coherent event, the hippocampal
response to increasing amounts of future-event detail should
be larger than that for past-event detail. In addition, since
the frontal pole is thought to play a role in prospective think-
ing (e.g., Okuda et al., 2003), this region should also exhibit
a future > past detail response if it is specifically involved in
the generation of future details.

Consistent with predictions, the analysis showed that the
left posterior hippocampus was responsive to the amount of
detail comprising both past and future events. In contrast, a
separate region in the left anterior hippocampus responded
differentially to the amount of detail comprising future
events, possibly reflecting the recombination of details into
a novel future event. Moreover, the right frontal pole
responded significantly more to the generation of future-
relative to past-event details, again suggesting that this region
might be involved specifically in prospective thinking.

The parametric modulation analysis of temporal distance
revealed that the increasing recency of past events was asso-
ciated with activity in the right parahippocampal gyrus (BA
35/36), while activity in the bilateral hippocampus was asso-
ciated with the increasing remoteness of future events. Addis
and Schacter (2008) proposed that the hippocampal response
to the distance of future events reflects the increasing dispa-
rateness of details likely included in remote future events and
the intensive relational processing required for integrating
such details into a coherent episodic simulation of the future.
More generally, these results suggest that the core network
supporting past- and future-event simulation can be recruited
in different ways depending on whether the generated event
is in the past or future.

Conceptual issues: Past versus future or remembering versus
imagining? The preceding observations raise a general point
concerning the growing number of studies that have com-
pared remembering the past with imagining the future.
When differences between these two conditions are observed,
they are typically attributed to differences in the way that
the brain handles past and future events. However, in the
reviewed studies past events are remembered whereas future
events are imagined; accordingly, the differences could
equally well be attributed to differences between remember-
ing and imagining, rather than differences between past and
future per se. Of course, the future cannot be remembered
because it has not yet happened. However, both the past
and the future can be imagined. Furthermore, events can be
imagined without any specific reference to a particular time
point. Therefore, it would be of interest to determine whether
any of the foregoing findings are indeed specifically related
to imagining future events, or whether such findings are
observed when people imagine events (1) that lack a specific
temporal reference or (2) that might have occurred in their
personal pasts. Recent studies provide evidence concerning
both points.

Hassabis, Kumaran, and Maguire (2007) adapted the
experimental paradigm that they had used previously with
amnesic patients for an fMRI study with healthy volunteers
in which participants were asked to imagine novel, fictitious
scenes, without explicit reference to whether those scenes
should be placed in the past, present, or future. Subjects
were then scanned in a subsequent session in which they
were cued to remember the previously constructed fictitious
scenes, construct additional novel fictitious scenes, or recall
real episodic memories from their personal pasts. Hassabis
and colleagues found that all three conditions were associ-
ated with activations in some of the regions within the core
network that were associated with future-event simulation
in previously reviewed studies, including hippocampus,
parahippocampal gyrus, and retrosplenial cortex. The results
thus indicate that activity in these regions is not restricted
to conditions that explicitly require imagining future
events. However, Hassabis and colleagues also reported that
remembering “real” episodic memories yielded increased
activity in several core network regions (notably anterior
medial prefrontal cortex and posterior cingulate cortex) in
comparison with constructing fictitious events. We will
return shortly to the theoretical implications of these latter
findings.

In a related study, Addis, Pan, Vu, Laiser, and Schacter
(in press) attempted to disambiguate whether future-
event-related activity is specifically associated with pro-
spective thinking or with the more general demands of
imagining an episodic event in either temporal direction
by instructing subjects to imagine events that might occur
in their personal future or events that might have occurred
in their personal pasts. Prior to scanning, participants
provided episodic memories of actual experiences that
included details about a person, object, and place involved in
that event. During scanning, the subjects were cued to
recall some of the events that had actually occurred, and
for the conditions in which they imagined events, the experi-
menters randomly recombined details concerning person,
object, and place from separate episodes. Participants were
thus presented with cues for a person, object, and place
taken from multiple episodes, and were instructed to imagine
them together in a single, novel episode that included the
specified details.

Addis and colleagues (in press) reported that all regions
within the core network (including medial prefrontal and
frontopolar cortex, hippocampus, parahippocampal gyrus,
lateral temporal and temporopolar cortex, medial parietal
cortex including posterior cingulate and retrosplenial cortex,
and lateral parietal cortex) were similarly engaged when
participants imagined future and past events, suggesting that
the network can be used for event simulation regardless of
the temporal location of the event.

Theoretical implications: Future event simulation and
constructive memory

Neuroimaging and neuropsychological observations have
led to a number of new theoretical proposals, involving both
attempts to describe the critical cognitive processes associ-
ated with core network activation and attempts to consider
functional aspects of future-event simulation. We have
reviewed these proposals in detail elsewhere (Buckner &
Carroll, 2007; Buckner et al., 2008; Schacter et al., 2007,
2008). Here, we briefly summarize the main ideas. We first
consider two related attempts to delineate key processes
associated with the core network, and then describe a
related idea that attempts to link the core network and
future-event simulation with memory errors and related
constructive aspects of remembering.

Core Network Activation: Critical Cognitive
Processes The experimental work that we have reviewed
has sparked a number of attempts to characterize the key
processes subserved by the core network that is consistently
activated in recent studies of imagining and remembering
(Buckner & Carroll, 2007; Hassabis, Kumaran, & Maguire,
2007; Hassabis, Kumaran, Vann, et al., 2007; Hassabis &
Maguire, 2007; Schacter & Addis, 2007a, 2007b; Spreng
et al., in press). As we have noted in our previous reviews,
these perspectives share much in common and differ mainly
in points of emphasis and focus.

Buckner and Carroll (2007) and Buckner and colleagues
(2008) argued that the core network serves a common set of
processes by which past experiences are used adaptively to
imagine perspectives and events beyond those that emerge
from the immediate environment. By this view, the functions
of the core network are not restricted to tasks requiring
mental time travel. In addition to the network’s role in
remembering the past and envisioning the future, the core
network is hypothesized to contribute to more general func-
tions, extending to diverse tasks that require mental simula-
tion of alternative perspectives. They observed that some,
but not all, regions within the core network are engaged
during theory-of-mind tasks that require thinking about the
perspectives of others (e.g., Saxe & Kanwisher, 2003), and
they also noted that such regions may be engaged during
certain kinds of spatial navigation tasks (e.g., Byrne, Becker,
& Burgess, 2007). Buckner and Carroll suggested that the
core brain network is commonly engaged when individuals
are simulating alternative perspectives, including alterna-
tives in the present and possibilities in the future, a process
they provisionally termed self-projection. This view predicts
that activation of the core network should correspond to the
extent that a task encourages simulation of an alternative
perspective beyond the immediate environment. Spreng
and colleagues (in press) performed a meta-analysis of studies
that generally supported this broad hypothesis.

In a detailed analysis of the anatomy and functional con-
nections among the regions within the network, Buckner and
colleagues (2008) recently expanded on this perspective and
showed that the core network comprises at least two interact-
ing subsystems: the medial temporal lobe subsystem func-
tions to provide information from memory; the dorsal medial
prefrontal cortex subsystem participates to derive self-
relevant mental simulations. The two subsystems interact
through hubs, including the posterior cingulate, but can be
dissociated using the observation that dorsal medial prefron-
tal cortex and the medial temporal lobe are not intrinsically
correlated with one another. One possibility is that mental
simulations such as remembering and envisioning the future
draw heavily on contributions from both subsystems, whereas
other forms of task rely preferentially on one subsystem. For
example, theory-of-mind tasks that do not draw on memory
rely primarily on the medial prefrontal subsystem, as evi-
denced by strong activation of that system and not the
medial temporal lobe. Consistent with this idea, patients
with medial temporal lesions exhibit intact performance on
theory-of-mind tasks that do not draw on past memories
(Rosenbaum, Stuss, Levine, & Tulving, 2007). However, the
neuropsychological data also indicate that patients with
damage to medial prefrontal regions (Bird, Castelli, Malik,
Frith, & Husain, 2004) show intact performance on several
theory-of-mind tasks, which is perplexing in light of the
common activation of this region in imaging studies.

Hassabis and Maguire (2007) have argued that a process
they refer to as scene construction links together various tasks
that depend on many regions within the core network, in
particular those associated with the medial temporal subsys-
tem. Scene construction focuses on visuospatial aspects of
mental simulations and was motivated initially by the previ-
ously discussed finding that amnesic patients with medial
temporal damage show deficits when asked to imagine novel
scenes, with a disproportionate impairment in the spatial
coherence of the imagined scenes (Hassabis, Kumaran,
Vann, et al., 2007). Neuroimaging findings from the same
task likewise show core network activity (Hassabis, Kumaran,
& Maguire, 2007). Because the novel-scenes task does not
explicitly require mental time travel, Hassabis and Maguire
contended that projecting oneself into the past or the future
is not the critical process for activating the medial temporal
subsystem.

Taken in the context of the anatomic analysis of Buckner
and colleagues (2008), the collective results begin to con-
verge on the idea that the core network comprises at least
two subsystems that interact to accomplish autobiographical
remembering and envisioning the future but that are also
used to varying degrees across a diverse set of tasks that
extend well beyond forms of “mental time travel” (for further
discussion of the neural correlates of mental time travel, see
Arzy, Molnar-Szakacs, & Blanke, 2008).

Episodic Simulation, the Core Network, and
Constructive Memory Schacter and Addis (2007a,
2007b; for related ideas, see Dudai & Carruthers, 2005;
Suddendorf & Corballis, 1997; Suddendorf & Busby, 2005)
have linked findings concerning event simulation and core
network activity to the observation that memory involves a
constructive process of piecing together bits and pieces of
information. According to the constructive episodic simulation
hypothesis, imagining future events requires a system that
can flexibly recombine details from past events. From this
perspective, past and future events draw on similar
information stored in episodic memory and rely on similar
underlying processes; episodic memory supports the
construction of future events by extracting and recombining
stored information into a simulation of a novel event. The
adaptive value of such a system is that it enables past
information to be used flexibly in simulating alternative
future scenarios without engaging in actual behavior. A
potential downside of such a system, however, is that it is
vulnerable to memory errors, such as misattribution and
false recognition (for examples, see Schacter & Addis, 2007a,
2007b). This observation suggests, intriguingly, that certain
kinds of memory errors may be the by-product of a system
whose adaptive function is to make available information
from the past in a flexible form that supports simulations of
future events.

The constructive episodic simulation hypothesis receives
general support from the previously reviewed findings of
neural and cognitive overlap between past and future events;
and, because it emphasizes the importance of flexibly relat-
ing and recombining information from past episodes, the
hypothesis is more specifically supported by the mounting
evidence from amnesia (Hassabis, Kumaran, Vann, et al.,
2007), neuroimaging (Addis et al., 2007; Addis & Schacter,
2008; Botzung et al., 2008; Hassabis, Kumaran, & Maguire,
2007; Okuda et al., 2003), and aging (Addis, Wong, &
Schacter, 2008), linking hippocampal function and relational
processing with episodic simulation. The hippocampal
region is thought to support relational memory processes
(e.g., Eichenbaum & Cohen, 2001); and, according to the
constructive episodic simulation hypothesis, these processes
are critical for recombining stored information into future-
event simulations.

Because the constructive episodic simulation hypothesis
places great emphasis on the process of recombining event
details, it is critical to determine whether such recombination
processes are critical for future-event simulation, or whether typically activate as part of the core network when individu-
such simulations are based on retrieval of entire past epi- als imagine themselves in personal future events (Addis
sodes, or fragments of such episodes, which are simply recast et al., 2007; Szpunar et al., 2007). Interestingly, in a recent
as possible future events. Relevant data are provided by the study (Abraham, von Cramon, & Schubotz, 2008) where
aforementioned study by Addis and colleagues (in press) participants were asked to imagine scenarios that involved
using experimental recombination of details from distinct episodes: core network activation, including the hippocampus, was observed under conditions that effectively ruled out recasting of a single past episode as a future event.

Although further research is required to delineate the exact role of the hippocampus in mental simulation, it may be worth noting that research on other aspects of constructive memory has also highlighted the involvement of the hippocampus and related medial temporal lobe regions. For instance, some neuroimaging studies of false recognition, where individuals claim to have previously encountered a novel item that is conceptually or perceptually related to a previously studied item, have documented hippocampal/medial temporal lobe activation during false recognition of semantically associated words (e.g., Cabeza, Rao, Wagner, Mayer, & Schacter, 2001) or abstract shapes (Slotnick & Schacter, 2004). Similarly, several studies have shown that amnesic patients with medial temporal lobe damage show reduced levels of false recognition for various kinds of materials, suggesting that the hippocampal region is involved with encoding and/or retrieving the information that drives false recognition effects (e.g., Schacter, Verfaellie, & Pradere, 1996; Verfaellie, Page, Orlando, & Schacter, 2005). Taken together with the evidence for hippocampal involvement during simulation of future or novel events, it seems increasingly clear that the hippocampus is related importantly to constructive aspects of memory.

Studies of future-event simulation also bring into sharp focus fundamental issues concerning processes of reality monitoring, which allow us to distinguish between remembered and imagined events (Johnson & Raye, 1981). If remembering past events and imagining future or novel events recruit largely overlapping brain networks, how can individuals distinguish fantasy from reality?

Hassabis, Kumaran, and Maguire (2007) addressed this issue in the context of their neuroimaging study, where they found, as noted earlier, that anterior medial prefrontal cortex and posterior cingulate cortex showed greater activity when individuals recollected real episodic memories as compared to when they imagined novel scenes. Because their novel-scenes task does not require mental time travel or projection of the self, Hassabis, Kumaran, and Maguire suggested that anterior medial prefrontal cortex and posterior cingulate cortex “support episodic memory over and above scene construction” (2007, p. 14372), perhaps contributing to effective reality monitoring. While this conclusion may be accurate in the context of the scene-construction task, these regions

meeting real people (e.g., George Bush) versus fictional characters (e.g., Cinderella), anterior prefrontal and posterior cingulate were more active during the former than the latter condition, possibly indicating greater ease of self-projection when imagining oneself meeting an actual person (medial temporal regions were similarly active in the two conditions). Taken together, the foregoing studies suggest that additional areas and processes (beyond anterior medial prefrontal and posterior cingulate) must be recruited to allow one to distinguish an episodic memory from a realistic future simulation that engages the self.

Here, it seems likely that there is a role for the long-standing idea from research on reality monitoring that remembering events that one has actually experienced is associated with greater numbers of sensory and perceptual details than remembering previously imagined events (e.g., Johnson & Raye, 1981). This idea has received support from behavioral studies (e.g., Johnson et al., 1988) as well as neuroimaging research (Kensinger & Schacter, 2006; see also Slotnick & Schacter, 2004). Most directly related to the present concerns, Addis and colleagues (in press) report preliminary evidence that remembering actual autobiographical events is more strongly associated with activity in posterior visual cortex (and some medial temporal regions) than is imagining future or past events using the previously described procedure of cuing imagined events by recombining details from different actual events. In this study, remembered events were rated as significantly more detailed than imagined events, so it would make sense from the perspective of the reality-monitoring framework that regions associated with processing of sensory and contextual details would show greater activity for real events than for imagined ones.

Although still in its infancy, it seems clear that research on future-event simulation and related forms of internally directed cognition has much to offer memory research. At the very least, the striking similarities observed during remembering the past and imagining the future are consistent with Bartlett’s (1932) claim that “memory appears to be an affair of construction rather than reproduction.” We are optimistic that further study of such processes as future-event simulation, scene construction, and self-projection will teach us much about the constructive nature of memory.

acknowledgments Preparation of this chapter was supported by grants from the NIA, NIMH, and HHMI. We thank Adrian Gilmore for help with preparation of the manuscript.
schacter, addis, and buckner: constructive memory and simulation of future events 759
REFERENCES
Abraham, A., von Cramon, D. Y., & Schubotz, R. I. (2008). Meeting George Bush versus meeting Cinderella: The neural response when telling apart what is real from what is fictional in the context of our reality. J. Cogn. Neurosci., 20, 965–976.
Addis, D. R., Cheng, T., & Schacter, D. L. (2008). Episodic simulation of specific and general future events. Poster presented at the Annual Meeting of the Organization for Human Brain Mapping, Melbourne, Australia.
Addis, D. R., Pan, L., Vu, M. A., Laiser, N., & Schacter, D. L. (in press). Constructive episodic simulation of the future and the past: Distinct subsystems of a core brain network mediate imagining and remembering. Neuropsychologia.
Addis, D. R., & Schacter, D. L. (2008). Constructive episodic simulation: Temporal distance and detail of past and future events modulate hippocampal engagement. Hippocampus, 18, 227–237.
Addis, D. R., Wong, A. T., & Schacter, D. L. (2007). Remembering the past and imagining the future: Common and distinct neural substrates during event construction and elaboration. Neuropsychologia, 45, 1363–1377.
Addis, D. R., Wong, A. T., & Schacter, D. L. (2008). Age-related changes in the episodic simulation of future events. Psychol. Sci., 19, 33–41.
Arzy, S., Molnar-Szarkacs, I. M., & Blanke, O. (2008). Self in time: Imagined self-location influences neural activity related to mental time travel. J. Neurosci., 28, 6502–6507.
Bartlett, F. C. (1932). Remembering. Cambridge, UK: Cambridge University Press.
Bird, C. M., Castelli, F., Malik, O., Frith, U., & Husain, M. (2004). The impact of extensive medial frontal lobe damage on theory of mind and cognition. Brain, 127, 914–928.
Botzung, A., Denkova, E., & Manning, L. (2008). Experiencing past and future personal events: Functional neuroimaging evidence on the neural bases of mental time travel. Brain Cogn., 66, 202–212.
Brainerd, C. J., & Reyna, V. F. (2005). The science of false memory. New York: Oxford University Press.
Bremner, J. D., Narayan, M., Anderson, E. R., Staib, L. H., Miller, H. L., & Charney, D. S. (2000). Hippocampal volume reduction in major depression. Am. J. Psychiatry, 157, 115–117.
Buckner, R. L., Andrews, J. R., & Schacter, D. L. (2008). The brain’s default system: Anatomy, function, and relevance to disease. The Year in Cognitive Neuroscience, Ann. NY Acad. Sci., 1124, 1–38.
Buckner, R. L., & Carroll, D. C. (2007). Self-projection and the brain. Trends Cogn. Sci., 11, 49–57.
Byrne, P., Becker, S., & Burgess, N. (2007). Remembering the past and imagining the future: A neural model of spatial memory and imagery. Psychol. Rev., 114, 340–375.
Cabeza, R., Rao, S., Wagner, A. D., Mayer, A., & Schacter, D. L. (2001). Can medial temporal lobe regions distinguish true from false? An event-related fMRI study of veridical and illusory recognition memory. Proc. Natl. Acad. Sci. USA, 98, 4805–4810.
Cabeza, R., & St Jacques, P. (2007). Functional neuroimaging of autobiographical memory. Trends Cogn. Sci., 11, 219–227.
Campbell, S., & Macqueen, G. (2004). The role of the hippocampus in the pathophysiology of major depression. J. Psychiatry Neurosci., 29, 417–426.
Clayton, N. S., Bussey, T. J., & Dickinson, A. (2003). Can animals recall the past and plan for the future? Nat. Rev. Neurosci., 4, 685–691.
Clayton, N. S., & Dickinson, A. (1998). Episodic-like memory during cache recovery by scrub jays. Nature, 395, 272–274.
Craik, F. I. M., & Salthouse, T. A. (Eds.). (2000). Handbook of aging and cognition (2nd ed.). Hillsdale, NJ: Erlbaum.
D’Argembeau, A., Raffard, S., & Van der Linden, M. (2008). Remembering the past and imagining the future in schizophrenia. J. Abnorm. Psychol., 117, 247–251.
Diba, K., & Buzsáki, G. (2007). Forward and reverse hippocampal place-cell sequences during replay. Nat. Neurosci., 10, 1241–1242.
Dickson, J. M., & Bates, G. W. (2005). Influence of repression on autobiographical memories and expectations of the future. Aust. J. Psychol., 57, 20–27.
Dickson, J. M., & Bates, G. W. (2006). Autobiographical memories and views of the future: In relation to dysphoria. Int. J. Psychol., 41, 107–116.
Driscoll, I., Hamilton, D. A., Petropoulos, H., Yeo, R. A., Brooks, W. M., Baumgarter, R. N., et al. (2003). The aging hippocampus: Cognitive, biochemical, and structural findings. Cereb. Cortex, 13, 1344–1351.
Dudai, Y., & Carruthers, M. (2005). The Janus face of Mnemosyne. Nature, 434, 823–824.
Eichenbaum, H., & Cohen, N. J. (2001). From conditioning to conscious recollection: Memory systems of the brain. New York: Oxford University Press.
Ferbinteanu, J., & Shapiro, M. L. (2003). Prospective and retrospective memory coding in the hippocampus. Neuron, 40, 1227–1239.
Foster, D. J., & Wilson, M. A. (2006). Reverse replay of behavioral sequences in hippocampal place cells during the awake state. Nature, 440, 680–683.
Gilboa, A. (2004). Autobiographical and episodic memory: One and the same? Evidence from prefrontal activation in neuroimaging studies. Neuropsychologia, 42, 1336–1349.
Golomb, J., de Leon, M. J., Kluger, A., George, A. E., Tarshish, C., & Ferris, S. H. (1993). Hippocampal atrophy in normal aging: An association with recent memory impairment. Arch. Neurol., 50, 967–973.
Greicius, M. D., Srivastava, G., Reiss, A. L., & Menon, V. (2004). Default-mode network activity distinguishes Alzheimer’s disease from healthy aging: Evidence from functional MRI. Proc. Natl. Acad. Sci. USA, 101, 4637–4642.
Hassabis, D., Kumaran, D., & Maguire, E. A. (2007). Using imagination to understand the neural basis of episodic memory. J. Neurosci., 27, 14365–14374.
Hassabis, D., Kumaran, D., Vann, S. D., & Maguire, E. A. (2007). Patients with hippocampal amnesia cannot imagine new experiences. Proc. Natl. Acad. Sci. USA, 104, 1726–1731.
Hassabis, D., & Maguire, E. A. (2007). Deconstructing episodic memory with construction. Trends Cogn. Sci., 11, 299–306.
Ingvar, D. H. (1979). Hyperfrontal distribution of the cerebral grey matter flow in resting wakefulness: On the functional anatomy of the conscious state. Acta Neurol. Scand., 60, 12–25.
Ingvar, D. H. (1985). “Memory of the future”: An essay on the temporal organization of conscious awareness. Hum. Neurobiol., 4, 127–136.
Johnson, A., & Redish, A. D. (2007). Neural ensembles in CA3 transiently encode paths forward of the animal at a decision point. J. Neurosci., 27, 12176–12189.
Johnson, M. K., Foley, M. A., Suengas, A. G., & Raye, C. L. (1988). Phenomenal characteristics of memories for perceived and imagined autobiographical events. J. Exp. Psychol. Gen., 117, 371–376.
760 memory
Johnson, M. K., & Raye, C. L. (1981). Reality monitoring. Psychol. Rev., 88, 67–85.
Kahn, I., Andrews-Hanna, J. R., Vincent, J. L., Snyder, A. Z., & Buckner, R. L. (2008). Distinct cortical anatomy linked to subregions of the medial temporal lobe revealed by intrinsic functional connectivity. J. Neurophysiol., 100, 129–139.
Kensinger, E. A., & Schacter, D. L. (2006). Neural processes underlying memory attribution on a reality-monitoring task. Cereb. Cortex, 16, 1126–1133.
Klein, S. B., Loftus, J., & Kihlstrom, J. F. (2002). Memory and temporal experience: The effects of episodic memory loss in an amnesic patient’s ability to remember the past and imagine the future. Soc. Cogn., 20, 353–379.
Levine, B., Svoboda, E., Hay, J. F., Winocur, G., & Moscovitch, M. (2002). Aging and autobiographical memory: Dissociating episodic from semantic retrieval. Psychol. Aging, 17, 677–689.
Loftus, E. F. (2003). Make-believe memories. Am. Psychol., 58, 867–873.
MacLeod, A. K., & Cropley, M. L. (1995). Depressive future-thinking: The role of valence and specificity. Cogn. Ther. Res., 19, 35–50.
MacLeod, A. K., Rose, G., & Williams, J. M. (1993). Components of hopelessness about the future in parasuicide. Cogn. Ther. Res., 17, 441–455.
MacLeod, A. K., Tata, P., Kentish, J., Carroll, F., & Hunter, E. (1997). Anxiety, depression, and explanation-based pessimism for future positive and negative events. Clin. Psychol. Psychother., 4, 15–24.
Maguire, E. A. (2001). Neuroimaging studies of autobiographical event memory. Philos. Trans. R. Soc. Lond. B Biol. Sci., 356, 1441–1451.
Neisser, U. (1967). Cognitive psychology. New York: Appleton-Century-Crofts.
Okuda, J., Fujii, T., Ohtake, H., Tsukiura, T., Tanji, K., Suzuki, K., et al. (2003). Thinking of the future and the past: The roles of the frontal pole and the medial temporal lobes. NeuroImage, 19, 1369–1380.
Partiot, A., Grafman, J., Sadato, N., Wachs, J., & Hallett, M. (1995). Brain activation during the generation of emotional and non-emotional plans. NeuroReport, 6, 1269–1272.
Pastalkova, E., Itskov, V., Amarasingham, A., & Buzsáki, G. (2008). Internally generated cell assembly sequences in the rat hippocampus. Science, 321, 1322–1327.
Raby, C. R., Alexis, D. M., Dickinson, A., & Clayton, N. S. (2007). Planning for the future by western scrub-jays. Nature, 445, 919–921.
Ranganath, C., & Rainer, G. (2003). Neural mechanisms for detecting and remembering novel events. Nat. Rev. Neurosci., 4, 193–202.
Roediger, H. L., III. (1996). Memory illusions. J. Mem. Lang., 35, 76–100.
Rosenbaum, R. S., Gilboa, A., Levine, B., Winocur, G., & Moscovitch, M. (in press). Amnesia as an impairment of detail generation and binding: Evidence from personal, fictional, and semantic narratives in K.C. Neuropsychologia.
Rosenbaum, R. S., Kohler, S., Schacter, D. L., Moscovitch, M., Westmacott, R., Black, S. E., et al. (2005). The case of K.C.: Contributions of a memory-impaired person to memory theory. Neuropsychologia, 43, 989–1021.
Rosenbaum, R. S., Stuss, D. T., Levine, B., & Tulving, E. (2007). Theory of mind is independent of episodic memory. Science, 318, 1257.
Ruane, D., MacLeod, A. K., & Holmes, E. A. (2005). The simulation heuristic and visual imagery in pessimism for negative events in anxiety. Clin. Psychol. Psychother., 12, 313–325.
Saxe, R., & Kanwisher, N. (2003). People thinking about thinking people: The role of the temporo-parietal junction in “theory of mind.” NeuroImage, 19, 1835–1842.
Schacter, D. L. (1995). Memory distortion: History and current status. In D. L. Schacter (Ed.), Memory distortion: How minds, brains and societies reconstruct the past (pp. 1–43). Cambridge, MA: Harvard University Press.
Schacter, D. L. (1996). Searching for memory: The brain, the mind, and the past. New York: Basic Books.
Schacter, D. L. (2001). The seven sins of memory: How the mind forgets and remembers. Boston: Houghton Mifflin.
Schacter, D. L., & Addis, D. R. (2007a). The cognitive neuroscience of constructive memory: Remembering the past and imagining the future. Philos. Trans. R. Soc. Lond. B Biol. Sci., 362, 773–786.
Schacter, D. L., & Addis, D. R. (2007b). The ghosts of past and future. Nature, 445, 27.
Schacter, D. L., & Addis, D. R. (2007c). The optimistic brain. Nat. Neurosci., 10, 1345–1347.
Schacter, D. L., Addis, D. R., & Buckner, R. L. (2007). Remembering the past to imagine the future: The prospective brain. Nat. Rev. Neurosci., 8, 657–661.
Schacter, D. L., Addis, D. R., & Buckner, R. L. (2008). Episodic simulation of future events: Concepts, data, and applications. The Year in Cognitive Neuroscience, Ann. NY Acad. Sci., 1124, 39–60.
Schacter, D. L., Norman, K. A., & Koutstaal, W. (1998). The cognitive neuroscience of constructive memory. Annu. Rev. Psychol., 49, 289–318.
Schacter, D. L., & Slotnick, S. D. (2004). The cognitive neuroscience of memory distortion. Neuron, 44, 149–160.
Schacter, D. L., Verfaellie, M., & Pradere, D. (1996). The neuropsychology of memory illusions: False recall and recognition in amnesic patients. J. Mem. Lang., 35, 319–334.
Schnider, A. (2008). The confabulating mind. Oxford, UK: Oxford University Press.
Sharot, T., Riccardi, A. M., Raio, C. M., & Phelps, E. A. (2007). Neural mechanisms mediating optimism bias. Nature, 450, 102–105.
Slotnick, S. D., & Schacter, D. L. (2004). A sensory signature that distinguishes true from false memories. Nat. Neurosci., 7, 664–672.
Spreng, R. N., & Levine, B. (2006). The temporal distribution of past and future autobiographical events across the lifespan. Mem. Cogn., 34, 1644–1651.
Spreng, R. N., Mar, R. A., & Kim, A. S. N. (in press). The common neural basis of autobiographical memory, prospection, navigation, theory of mind, and the default mode: A quantitative meta-analysis. J. Cogn. Neurosci.
Stöber, J., & Borkovec, T. D. (2002). Reduced concreteness of worry in generalized anxiety disorder: Findings from a therapy study. Cogn. Ther. Res., 26, 89–96.
Suddendorf, T., & Busby, J. (2005). Making decisions with the future in mind: Developmental and comparative identification of mental time travel. Learn. Motivation, 36, 110–125.
Suddendorf, T., & Corballis, M. C. (1997). Mental time travel and the evolution of the human mind. Genet. Soc. Gen. Psychol. Monogr., 123, 133–167.
Suddendorf, T., & Corballis, M. C. (2007). The evolution of foresight: What is mental time travel and is it unique to humans? Behav. Brain Sci., 30, 299–313.
Svoboda, E., McKinnon, M. C., & Levine, B. (2006). The functional neuroanatomy of autobiographical memory: A meta-analysis. Neuropsychologia, 44, 2189–2208.
Szpunar, K. K., Watson, J. M., & McDermott, K. B. (2007). Neural substrates of envisioning the future. Proc. Natl. Acad. Sci. USA, 104, 642–647.
Talland, G. A. (1961). Confabulation in the Wernicke-Korsakoff syndrome. J. Nerv. Ment. Dis., 132, 361.
Talland, G. A. (1965). Deranged memory: A psychonomic study of the amnesic syndrome. New York: Academic Press.
Tulving, E. (1983). Elements of episodic memory. Oxford, UK: Clarendon Press.
Tulving, E. (1985). How many memory systems are there? Am. Psychol., 40, 385–398.
Tulving, E. (2002). Episodic memory: From mind to brain. Annu. Rev. Psychol., 53, 1–25.
Tulving, E. (2005). Episodic memory and autonoesis. In H. Terrance & J. Metcalfe (Eds.), The missing link in cognition: Origins of self-reflective consciousness (pp. 3–56). New York: Oxford University Press.
Tulving, E., Schacter, D. L., McLachlan, D. R., & Moscovitch, M. (1988). Priming of semantic autobiographical knowledge: A case study of retrograde amnesia. Brain Cogn., 8, 3–20.
Velakoulis, D., Wood, S. J., Wong, M. T., McGorry, P. D., Yung, A., Phillips, L., et al. (2006). Hippocampal and amygdala volumes according to psychosis stage and diagnosis: A magnetic resonance imaging study of chronic schizophrenia, first-episode psychosis, and ultra-high-risk individuals. Arch. Gen. Psychiatry, 63, 139–149.
Verfaellie, M., Page, K., Orlando, F., & Schacter, D. L. (2005). Impaired implicit memory for gist information in amnesia. Neuropsychology, 19, 760–769.
Vincent, J. L., Snyder, A. Z., Fox, M. D., Shannon, B. J., Andrews, J. R., Raichle, M. E., et al. (2006). Coherent spontaneous activity identifies a hippocampal-parietal memory network. J. Neurophysiol., 96, 3517–3531.
Wagner, A. D., Shannon, B. J., Kahn, I., & Buckner, R. L. (2005). Parietal lobe contributions to episodic memory retrieval. Trends Cogn. Sci., 9, 445–453.
Williams, J. M. (1996). Depression and the specificity of autobiographical memory. In D. C. Rubin (Ed.), Remembering our past: Studies in autobiographical memory (pp. 244–267). Cambridge, UK: Cambridge University Press.
Williams, J. M. (2006). Capture and rumination, functional avoidance, and executive control (CaRFAX): Three processes that underlie overgeneral memory. Cogn. Emotion, 20, 548–568.
Williams, J. M., Ellis, N. C., Tylers, C., Healy, H., Rose, G., & MacLeod, A. K. (1996). The specificity of autobiographical memory and imageability of the future. Mem. Cogn., 24, 116–125.
VII
LANGUAGE
Chapter 52 hickok 767
55 caplan 805
56 hagoort, baggio, and willems 819
57 kuhl 837
59 fitch 873
Introduction
alfonso caramazza
766 language
52 The Cortical Organization of
Phonological Processing
gregory hickok
abstract Phonological processing refers to mechanisms involved in representing, accessing, or manipulating information related to the sound structure of language. The goal of this chapter is to review what is known about the neural basis of phonological processes in three broad domains: speech recognition, speech production, and verbal short-term memory. In particular, we will outline evidence showing that phonological processing is task dependent, that phonological-level aspects of speech recognition are bilaterally organized (but computationally asymmetric), and that posterior phonological information interacts with frontal motor systems by means of a sensory-motor integration network that supports aspects of speech production and verbal working memory. These findings are organized theoretically into a dual-stream model, which is closely related to dual-stream models proposed in the visual domain.

gregory hickok Center for Cognitive Neuroscience, University of California, Irvine, California

Phonological processing refers to mechanisms involved in representing, accessing, or manipulating information related to the sound structure of language.1 As such, phonological processes are involved in a range of language abilities. The goal of this chapter is to review what is known about the neural basis of phonological processes in three broad domains: speech recognition, speech production, and verbal short-term memory. We will also examine the relation between these various domains and explore possible parallels and connections between phonological processing networks and cortical systems outside the domain of speech and language.

Phonological processing is task dependent

Given that phonological information is involved in a broad range of linguistic abilities, it is perhaps not surprising that we should find evidence for task dependence in the neural systems recruited to perform this range of tasks. For example, the set of neural circuits involved in, say, verbatim repetition of a heard phonological word form must be at least partially different from the set of neural circuits involved in comprehending the meaning of a heard phonological word form, as the former involves mapping phonological information onto motor-articulatory mechanisms, whereas the latter involves mapping phonological information onto lexical-semantic representations. It is an open question whether the phonological representations involved in input-related processes, output-related processes, or other processes are shared or distinct, for example, whether there are distinct phonological lexicons (Hickok, 2001; Shelton & Caramazza, 1999), but it is clear that there are, minimally, distinct and task-dependent interfaces that phonological representations enter into (figure 52.1).

A relevant observation regarding task differences in phonological processing comes from a set of studies that were conducted in the late 1970s and early 1980s that showed a double dissociation in two tasks involving phonological processing (Basso, Casati, & Vignolo, 1977; Blumstein, Cooper, Zurif, & Caramazza, 1977; Miceli, Gainotti, Caltagirone, & Masullo, 1980). These studies examined the ability of aphasic patients to perform a syllable discrimination task (e.g., decide whether pairs of syllables such as /da/–/ta/ are the same or different). A prominent theory at the time was that auditory comprehension deficits in aphasia resulted from a deficit in the ability to perceive phonological information in speech (Luria, 1970). Such an account predicted that deficits in syllable discrimination would be strongly associated with auditory comprehension deficits in aphasia. This prediction turned out to be incorrect: a consistent finding was that syllable discrimination and word-level comprehension doubly dissociate, even when the comprehension task involved phonological foils (Baker, Blumstein, & Goodglass, 1981; Miceli et al.). Further, the patient group that tended to perform the worst on syllable discrimination consisted of nonfluent patients with good auditory comprehension (Basso et al.). Thus data from aphasia show that it is quite possible to use phonological information to access lexical-semantic information in a comprehension task, yet fail to discriminate syllables, and that it is also possible to be able to use phonological information to discriminate syllables, yet fail to comprehend words. See Hickok and Poeppel (2004) for further discussion of these data.

This double dissociation does not imply that there are distinct networks (or lexicons) of phonological representations, one involved in syllable discrimination and the other involved
In sum, disruption of the left superior temporal lobe does not lead to severe impairments in phonological processing during spoken word recognition. This observation has led to the claim that phonological processes are bilaterally organized in the superior temporal lobe (Hickok & Poeppel, 2000, 2004, 2007). This claim predicts that damage to the STG bilaterally should produce profound impairment in spoken word recognition, which in fact it does in the form of word deafness (Buchman, Garron, Trost-Cardamone, Wichter, & Schwartz, 1986).

The superior temporal sulcus is a critical site for phonological processing

Beyond the earliest stages of speech recognition there is accumulating evidence that portions of the STS are important for representing and/or processing phonological information (Binder et al., 2000; Hickok & Poeppel, 2004, 2007; Indefrey & Levelt, 2004; Liebenthal, Binder, Spitzer, Possing, & Medler, 2005; Price et al., 1996). The STS is activated by a range of tasks that require access to phonological information, including speech perception and production (Indefrey & Levelt, 2004), and the active short-term maintenance of phonemic information (Buchsbaum, Hickok, & Humphries, 2001; Hickok, Buchsbaum, Humphries, & Muftuler, 2003). Functional imaging studies that attempt to isolate phonological processes in perception by contrasting speech stimuli with complex nonspeech signals have found activation along the STS (Liebenthal et al., 2005; Narain et al., 2003; Obleser, Zimmermann, Van Meter, & Rauschecker, 2006; Scott, Blank, Rosen, & Wise, 2000; Spitsyna, Warren, Scott, Turkheimer, & Wise, 2006; Vouloumanos, Kiehl, Werker, & Liddle, 2001), as have studies that manipulate psycholinguistic variables that tap phonological networks (Okada & Hickok, 2006), such as phonological neighborhood density (the number of words that sound similar to a target word). Although many authors consider this system to be strongly left dominant, both lesion evidence and imaging evidence suggest a bilateral organization (Hickok & Poeppel, 2007).

One currently unresolved question is the relative contribution of anterior versus posterior STS regions in phonological processing. Lesion evidence indicates that damage to posterior temporal lobe areas is most predictive of auditory comprehension deficits (Bates et al., 2003); however, as noted earlier, comprehension deficits in aphasia result predominantly from postphonemic processing levels. A majority of functional imaging studies targeting phonological processing in perception have highlighted regions in the posterior half of the STS (Hickok & Poeppel, 2007). Other studies, however, have reported anterior STS activation in perceptual speech tasks (Mazoyer et al., 1993; Narain et al., 2003; Scott et al., 2000; Spitsyna et al., 2006). These studies involved sentence-level stimuli, raising the possibility that anterior STS regions may be responding to some other aspect of the stimuli such as their syntactic or prosodic organization (Friederici, Meyer, & von Cramon, 2000; Humphries, Binder, Medler, & Liebenthal, 2006; Humphries, Love, Swinney, & Hickok, 2005; Humphries, Willard, Buchsbaum, & Hickok, 2001; Vandenberghe, Nobre, & Price, 2002). The weight of the available evidence, therefore, suggests that the critical portion of the STS that is involved in phonological-level processes is bounded anteriorly by the anterolateralmost aspect of Heschl’s gyrus and posteriorly by the posteriormost extent of the Sylvian fissure (Hickok & Poeppel, 2007).

Phonological processing systems in speech recognition are bilateral but asymmetric

The claim that phonological processing is bilaterally organized for speech recognition tasks does not imply that the systems in both hemispheres are computationally identical. To the contrary, there is abundant evidence for differences in the way acoustic/speech information is processed in the two hemispheres (Abrams, Nicol, Zecker, & Kraus, 2008; Boemio, Fromm, Braun, & Poeppel, 2005; Giraud et al., 2007; Hickok & Poeppel, 2007; Zatorre, Belin, & Penhune, 2002). What is less clear is the computational nature of these differences. One view is that the difference turns on biases toward temporal (left-hemisphere) versus spectral (right-hemisphere) resolution (Zatorre et al., 2002). Another view holds that the two hemispheres differ in terms of their sampling rate, with the left hemisphere operating at a higher rate (25–50 Hz) and the right hemisphere at a lower rate (3–5 Hz) (Poeppel, 2003).2 Yet another proposal, more specific to phonological processing, is that the left hemisphere processes phonemic information in a categorical fashion, whereas the right hemisphere may treat such information in a more continuous fashion (Liebenthal et al., 2005). We will not resolve these questions here. For our purposes, it is important to note that computational differences exist between the two hemispheres in the way that speech signals are processed during speech recognition, but that both are involved in the process, and both are largely capable of processing phonological information sufficiently well to access lexical-semantic information (Hickok & Poeppel, 2004). This analysis indicates that spoken word recognition involves parallel pathways (multiple routes) in the mapping from sound to meaning (Hickok & Poeppel, 2007). Although this conclusion differs from standard models of speech recognition (Luce & Pisoni, 1998; Marslen-Wilson, 1987; McClelland & Elman, 1986), it agrees nicely with the fact that speech contains redundant cues to phonemic information, as well as with behavioral evidence suggesting that the speech system can take advantage of these different cues
lamina IV. . . . the intimate relationship and similar evolutionary status of Areas 44 and Tpt allows for a certain functional overlap” (Galaburda, 1982, p. 442).

Spt Activity Is Modulated by Motor Effector Manipulations

In monkey parietal cortex, sensorimotor integration areas are organized around motor effector systems (e.g., ocular versus manual actions in LIP and AIP; Andersen, 1997; Colby & Goldberg, 1999). Recent evidence suggests that Spt may be organized around the vocal tract effector system: Spt was less active when skilled pianists listened to and then imagined playing novel melodies than when they listened to and covertly hummed the same melodies (Pa & Hickok, 2008).

Spt Is Sensitive to Speech-Related Visual Stimuli

Many neurons in sensorimotor integration areas of the monkey parietal cortex are sensitive to inputs from more than one sensory modality (Andersen, 1997). The planum temporale, while often thought to be an auditory area, also activates in response to sensory input from other modalities. For example, silent lipreading has been shown to activate auditory cortex in the vicinity of the planum temporale (Calvert et al., 1997; Calvert & Campbell, 2003). Although these studies typically report the location as “auditory cortex” including primary regions, group-based localizations in this region can be unreliable. Indeed, a recent fMRI study using individual subject analyses has found that activation to visual speech and activation using the standard Spt-defining auditory-motor task (listen then covertly produce) are found in the same regions of the left posterior planum temporale (Okada & Hickok, 2009). Thus Spt appears to be sensitive also to visual input that is relevant to vocal tract actions.

In summary, Spt exhibits all the features of sensorimotor integration areas as identified in the parietal cortex of the monkey. This finding suggests that Spt is a sensorimotor integration area for vocal tract actions (Pa & Hickok, 2008), placing it in the context of a network of sensorimotor integration areas in the posterior parietal and temporal/parietal

“phonological store,” a dedicated buffer, and active maintenance is achieved by the “articulatory rehearsal” mechanism (Baddeley, 1992). The concept of a sensorimotor integration network, as outlined previously, provides an independently motivated neural circuit that may be the basis for verbal short-term memory (Buchsbaum, Olsen, Koch, & Berman, 2005; Hickok et al., 2003; Hickok & Poeppel, 2000; see also Aboitiz & García V., 1997; Jacquemot & Scott, 2006). Specifically, on the assumption that the proposed sensorimotor integration circuit is bidirectional (Hickok & Poeppel, 2000, 2004, 2007), one can equate the storage component of verbal short-term memory with sensory representations in the superior temporal lobe (the same STS regions that are involved in sensory/recognition processes), and one can equate the active maintenance component with frontal articulatory systems: the sensorimotor integration network (Spt) allows articulatory mechanisms to maintain verbal information in an active state (Hickok et al., 2003). In this sense, the basic architecture is similar to Baddeley’s, except that there is a proposed computational mechanism (sensorimotor transformations in Spt) mediating the relation between the storage and active maintenance components. This view differs from Baddeley’s, however, in that it assumes that the storage component is not a dedicated buffer but an active state of networks that are involved in perceptual recognition (Fuster, 1995; Ruchkin, Grafman, Cameron, & Berndt, 2003). Because our evidence suggests that the proposed sensorimotor integration network is not specific to phonological information (Hickok et al.), we also suggest that the verbal short-term memory circuit is not specific to phonological information, a position that is in line with recent behavioral work (Jones, Hughes, & Macken, 2007; Jones & Macken, 1996; Jones, Macken, & Nicholls, 2004). For a thorough discussion of these issues, see Buchsbaum and D’Esposito (2008).

A theoretical framework: The dual-stream model

The processing of phonological information in speech recognition, speech production, and short-term memory
cortex, which receive multisensory input and are organized involves partially overlapping, but also partially distinct,
around motor-effector systems (Andersen, 1997). Although neural circuits. Speech recognition relies primarily on neural
area Spt is not language specific, it counts sensorimotor circuits in the superior temporal lobes bilaterally, whereas
integration for phonological information as a prominent speech production and verbal short-term memory rely on a
function. frontoparietal/temporal circuit that is left-hemisphere domi-
nant. As noted earlier, this divergence of processing streams
Verbal short-term memory relies on auditory-motor is consistent with the fact that phonological information
integration networks plays a role in (1) accessing lexical-semantic representations
on the one hand and (2) driving motor-speech articulation
Verbal short-term memory is often held to comprise at least on the other. As lexical-semantic and motor-speech systems
two components: a storage component of some form and a involve very different types of representations and processing
mechanism for active maintenance of this information. In mechanisms, it stands to reason that divergent pathways
Baddeley’s model, for example, the storage mechanism is the underlie the interface with phonological networks.

Figure 52.2 The dual-stream model of speech processing. (A) Schematic diagram of the dual-stream model. The earliest stage of cortical speech processing involves some form of spectrotemporal analysis, which is carried out in auditory cortices bilaterally in the supratemporal plane. These spectrotemporal computations appear to differ between the two hemispheres. Phonological-level processing and representation involves the middle to posterior portions of superior temporal sulcus (STS) bilaterally, although there may be a weak left-hemisphere bias at this level of processing. Subsequently, the system diverges into two broad streams, a dorsal pathway (blue) that maps sensory or phonological representations onto articulatory motor representations, and a ventral pathway (red) that maps sensory or phonological representations onto lexical-conceptual representations. (B) Approximate anatomical locations of the dual-stream model components, specified as precisely as available evidence allows. Regions shaded green depict areas on the dorsal surface of the STG that are hypothesized to be involved in spectrotemporal analysis. Regions shaded yellow in the posterior half of the STS are implicated in phonological-level processes. Regions shaded red represent the ventral stream, which is bilaterally organized with a weak left-hemisphere bias. The more posterior regions of the ventral stream, the posterior middle and inferior portions of the temporal lobes, correspond to the lexical interface, which links phonological and semantic information, whereas the more anterior locations correspond to the hypothesized combinatorial network. Regions shaded blue represent the dorsal stream, which is strongly left-dominant. The posterior region of the dorsal stream corresponds to an area in the Sylvian fissure at the parietal-temporal boundary (area Spt), which is hypothesized to be a sensorimotor interface, whereas the more anterior locations in the frontal lobe, likely involving Broca’s region and a more dorsal premotor site, correspond to portions of the articulatory network. (Figure reproduced from Hickok & Poeppel, 2007.) (See color plate 64.)
that syllable discrimination relies to a greater extent on dorsal stream circuitry (Burton, Small, & Blumstein, 2000) (explaining the association with frontal lesions), whereas speech recognition tasks rely to a greater extent on ventral stream circuitry. The involvement of dorsal stream circuitry in syllable discrimination tasks makes sense given that discrimination of serially presented speech information requires some degree of verbal short-term memory. In addition, in contrast to the typical view that speech processing is mainly left-hemisphere dependent, the model suggests that the ventral stream is bilaterally organized (although with important computational differences between the two hemispheres); thus the ventral stream itself comprises parallel processing streams. This approach would explain the failure to find substantial speech recognition deficits following unilateral temporal lobe damage. The dorsal stream, however, is strongly left-dominant, explaining why production deficits are prominent sequelae of dorsal temporal and frontal lesions, as well as explaining why left-hemisphere injury can substantially impair performance on syllable discrimination tasks (Hickok & Poeppel, 2000, 2004, 2007).

On mirror neurons and motor theories of perception

Evidence we have reviewed suggests a tight connection between systems involved in speech perception and speech production, and the dual-stream model captures this association in the form of the dorsal processing stream that mediates this relation. The idea that perception and production systems in speech are functionally interrelated is not new, as it was an integral component of Wernicke’s language model of 1874 (Wernicke, 1874/1969). The motor theory of speech perception also highlighted important links between perception and production, but with quite a different spin. Whereas Wernicke emphasized the role of perceptual systems in guiding speech production, the motor theory proposed the reverse, that motor speech systems were the foundation for speech perception (Liberman & Mattingly, 1985). Although the motor theory had lost favor among most speech/language scientists, the discovery of “mirror neurons” (di Pellegrino, Fadiga, Fogassi, Gallese, & Rizzolatti, 1992; Gallese, Fadiga, Fogassi, & Rizzolatti, 1996) has triggered a resurgence of interest in motor theories of perception generally (Iacoboni et al., 2005; Rizzolatti, Fadiga, Gallese, & Fogassi, 1996), and the motor theory of speech perception in particular (Rizzolatti & Arbib, 1998). Mirror neurons are cells found in monkey frontal cortex that respond both during the execution of motor acts and during the perception of others performing similar motor acts. It has been suggested that mirror neurons are the basis for action understanding, including the perception/understanding of speech (Rizzolatti & Arbib, 1998).

Despite the current popularity of motor theories of speech perception, there is strong evidence that the theory is incorrect. Motor theories of speech perception make a clear prediction: disruption of the motor systems involved in speech production should produce a substantial disruption of speech recognition. This prediction is falsified by the common occurrence of patients with large left frontal lesions who have profound impairments in the ability to produce speech, yet have well-preserved ability to comprehend speech at the lexical level (i.e., severe Broca’s aphasics) (Goodglass, 1993; Goodglass et al., 2001). This finding demonstrates clearly that speech recognition can be achieved without the motor-speech system’s involvement. However, damage to sensory-related speech areas regularly produces deficits in speech production, such as the paraphasic errors found in the fluent speech of Wernicke and conduction aphasics (Goodglass, 1993; Goodglass et al., 2001). Thus the evidence confirms Wernicke’s conceptualization of the relation between sensory and motor speech systems, namely, that sensory systems are necessary for speech production, but motor-speech systems are not necessary for speech recognition (Hickok & Poeppel, 2000, 2004, 2007). Put differently, the relation between sensory and motor speech systems is better characterized by a perceptual theory of speech production than a motor theory of speech perception.

A strong version of a motor/mirror-neuron theory of speech perception is clearly untenable. At the same time, it is quite clear that motor knowledge can influence perception (Galantucci, Fowler, & Turvey, 2006), as the McGurk effect clearly demonstrates (McGurk & MacDonald, 1976). These effects do not imply, however, that speech perception requires the involvement of motor systems, only that motor knowledge can influence or constrain the acoustic analysis of speech, for example, by means of top-down, or predictive, coding mechanisms (van Wassenhove, Grant, & Poeppel, 2005). The proposed sensorimotor integration network provides a neural basis for this influence of motor knowledge on speech perception.

Summary

Phonological processing is a heterogeneous, task-dependent construct, and the neural systems that support phonological processing are similarly heterogeneous and task dependent. There is a fundamental distinction between the processes and neural circuits involved in tasks that involve motor-related systems compared with tasks that primarily involve lexical-semantic systems leading to task-related double dissociations within the context of “phonological processing.” There is also a substantial amount of interaction between sensory- and motor-related aspects of phonological processing, as well as evidence for shared resources, such as phonological systems in the STS. The neuroanatomical framework
the processing of syntactic and lexical information. Brain Lang., 74, 289–300.
Fuster, J. M. (1995). Memory in the cerebral cortex. Cambridge, MA: MIT Press.
Gainotti, G., Micelli, G., Silveri, M. C., & Villa, G. (1982). Some anatomo-clinical aspects of phonemic and semantic comprehension disorders in aphasia. Acta Neurol. Scand., 66, 652–665.
Galaburda, A. M. (1982). Histology, architectonics, and asymmetry of language areas. In M. A. Arbib, D. Caplan, & J. C. Marshall (Eds.), Neural models of language processes (pp. 435–445). San Diego: Academic Press.
Galantucci, B., Fowler, C. A., & Turvey, M. T. (2006). The motor theory of speech perception reviewed. Psychon. Bull. Rev., 13(3), 361–377.
Gallese, V., Fadiga, L., Fogassi, L., & Rizzolatti, G. (1996). Action recognition in the premotor cortex. Brain, 119(Pt. 2), 593–609.
Geschwind, N. (1965). Disconnexion syndromes in animals and man. Brain, 88, 237–294, 585–644.
Geschwind, N. (1971). Aphasia. N. Engl. J. Med., 284, 654–656.
Giraud, A. L., Kleinschmidt, A., Poeppel, D., Lund, T. E., Frackowiak, R. S., & Laufs, H. (2007). Endogenous cortical rhythms determine cerebral specialization for speech perception and production. Neuron, 56(6), 1127–1134.
Goodglass, H. (1992). Diagnosis of conduction aphasia. In S. E. Kohn (Ed.), Conduction aphasia (pp. 39–49). Hillsdale, NJ: Lawrence Erlbaum.
Goodglass, H. (1993). Understanding aphasia. San Diego: Academic Press.
Goodglass, H., Kaplan, E., & Barresi, B. (2001). The assessment of aphasia and related disorders (3rd ed.). Philadelphia: Lippincott Williams & Wilkins.
Graves, W. W., Grabowski, T. J., Mahta, S., & Gordon, J. K. (2007). A neural signature of phonological access: Distinguishing the effects of word frequency from familiarity and length in overt picture naming. J. Cogn. Neurosci., 19, 617–631.
Hickok, G. (2001). Functional anatomy of speech perception and speech production: Psycholinguistic implications. J. Psycholinguist. Res., 30, 225–234.
Hickok, G., Buchsbaum, B., Humphries, C., & Muftuler, T. (2003). Auditory-motor interaction revealed by fMRI: Speech, music, and working memory in area Spt. J. Cogn. Neurosci., 15, 673–682.
Hickok, G., Erhard, P., Kassubek, J., Helms-Tillery, A. K., Naeve-Velguth, S., Strupp, J. P., Strick, P. L., & Ugurbil, K. (2000). A functional magnetic resonance imaging study of the role of left posterior superior temporal gyrus in speech production: Implications for the explanation of conduction aphasia. Neurosci. Lett., 287, 156–160.
Hickok, G., Okada, K., Barr, W., Pa, J., Rogalsky, C., Donnelly, K., Barde, L., & Grant, A. (2008). Bilateral capacity for speech sound processing in auditory comprehension: Evidence from Wada procedures. Brain Lang., 107(3), 179–184.
Hickok, G., & Poeppel, D. (2000). Towards a functional neuroanatomy of speech perception. Trends Cogn. Sci., 4, 131–138.
Hickok, G., & Poeppel, D. (2004). Dorsal and ventral streams: A framework for understanding aspects of the functional anatomy of language. Cognition, 92, 67–99.
Hickok, G., & Poeppel, D. (2007). The cortical organization of speech processing. Nat. Rev. Neurosci., 8(5), 393–402.
Humphries, C., Binder, J. R., Medler, D. A., & Liebenthal, E. (2006). Syntactic and semantic modulation of neural activity during auditory sentence comprehension. J. Cogn. Neurosci., 18(4), 665–679.
Humphries, C., Love, T., Swinney, D., & Hickok, G. (2005). Response of anterior temporal cortex to syntactic and prosodic manipulations during sentence processing. Hum. Brain Mapp., 26, 128–138.
Humphries, C., Willard, K., Buchsbaum, B., & Hickok, G. (2001). Role of anterior temporal cortex in auditory sentence comprehension: An fMRI study. NeuroReport, 12, 1749–1752.
Iacoboni, M., Molnar-Szakacs, I., Gallese, V., Buccino, G., Mazziotta, J. C., & Rizzolatti, G. (2005). Grasping the intentions of others with one’s own mirror neuron system. PLoS Biol., 3(3), e79.
Indefrey, P., & Levelt, W. J. M. (2000). The neural correlates of language production. In M. S. Gazzaniga (Ed.), The new cognitive neurosciences (pp. 845–865). Cambridge, MA: MIT Press.
Indefrey, P., & Levelt, W. J. (2004). The spatial and temporal signatures of word production components. Cognition, 92(1–2), 101–144.
Jacquemot, C., & Scott, S. K. (2006). What is the relationship between phonological short-term memory and speech processing? Trends Cogn. Sci., 10, 480–486.
Jones, D. M., Hughes, R. W., & Macken, W. J. (2007). The phonological store abandoned. Q. J. Exp. Psychol., 60(4), 505–511.
Jones, D. M., & Macken, W. J. (1996). Irrelevant tones produce an irrelevant speech effect: Implications for phonological coding in working memory. J. Exp. Psychol. Learn. Mem. Cogn., 19, 369–381.
Jones, D. M., Macken, W. J., & Nicholls, A. P. (2004). The phonological store of working memory: Is it phonological and is it a store? J. Exp. Psychol. Learn. Mem. Cogn., 30(3), 656–674.
Levelt, W. J. M., Praamstra, P., Meyer, A. S., Helenius, P., & Salmelin, R. (1998). An MEG study of picture naming. J. Cogn. Neurosci., 10, 553–567.
Liberman, A. M., & Mattingly, I. G. (1985). The motor theory of speech perception revised. Cognition, 21, 1–36.
Lichtheim, L. (1885). On aphasia. Brain, 7, 433–484.
Liebenthal, E., Binder, J. R., Spitzer, S. M., Possing, E. T., & Medler, D. A. (2005). Neural substrates of phonemic perception. Cereb. Cortex, 15(10), 1621–1631.
Luce, P. A., & Pisoni, D. B. (1998). Recognizing spoken words: The neighborhood activation model. Ear Hear., 19, 1–36.
Luria, A. R. (1970). Traumatic aphasia. The Hague: Mouton.
Marslen-Wilson, W. D. (1987). Functional parallelism in spoken word-recognition. Cognition, 25, 71–102.
Mazoyer, B. M., Tzourio, N., Frak, V., Syrota, A., Murayama, N., Levrier, O., Salamon, G., Dehaene, S., Cohen, L., & Mehler, J. (1993). The cortical representation of speech. J. Cogn. Neurosci., 5, 467–479.
McClelland, J. L., & Elman, J. L. (1986). The TRACE model of speech perception. Cogn. Psychol., 18, 1–86.
McGlone, J. (1984). Speech comprehension after unilateral injection of sodium amytal. Brain Lang., 22, 150–157.
McGurk, H., & MacDonald, J. (1976). Hearing lips and seeing voices. Nature, 264, 746–748.
Miceli, G., Gainotti, G., Caltagirone, C., & Masullo, C. (1980). Some aspects of phonological impairment in aphasia. Brain Lang., 11, 159–169.
Milner, A. D., & Goodale, M. A. (1995). The visual brain in action. Oxford, UK: Oxford University Press.
Narain, C., Scott, S. K., Wise, R. J., Rosen, S., Leff, A., Iversen, S. D., & Matthews, P. M. (2003). Defining a left-lateralized
53 Morphological Processes in Language Production

Kevin A. Shapiro and Alfonso Caramazza

Kevin A. Shapiro: Department of Psychology, Harvard University, Cambridge; Department of Medicine, Children’s Hospital, Boston, Massachusetts
Alfonso Caramazza: Department of Psychology, Harvard University, Cambridge, Massachusetts; Center for Mind/Brain Sciences, University of Trento, Rovereto, Italy

Abstract. Morphology refers to the set of linguistic processes that govern the composition of words from stored units called morphemes, which encode information about meaning and grammatical properties. Neuropsychological studies suggest that morphological operations can be spared or impaired in the setting of acquired brain damage. Moreover, specific patterns of breakdown in morphology have revealed major principles underlying the neural architecture of language. Here we make the case that the language production system has at least three components with discrete neural substrates: one component that represents lexical concepts and is organized according to meaning; a second component that processes morphological information linked to grammatical function; and a third component that converts lexical and morphological representations into specific output forms.

The basic unit of meaning in language is the morpheme, a type of cognitive representation that corresponds either to a lexical concept (a root, like think), to an abstract modifier that can be used to generate new lexical concepts with distinct meanings (a derivational morpheme, like re- or -able), or to a property relevant to the grammatical rules of a language (a functional morpheme, like the preposition of or the past tense marker -ed). Morphology, the system of rules that governs the construction of words from individual morphemes, is the engine that drives much of language’s combinatorial productivity, bridging the gap between the conceptual, grammatical, and phonological levels of representation. Generative morphological rules are also extremely versatile, allowing speakers to express practically unlimited nuances of meaning (unthinkable, redirected, antidisestablishmentarians, etc.) using a fixed set of stored representational elements.

Languages differ widely in the way that morphological structure is realized in the phonological message. In English and many other languages, morphemes may be either phonologically unbound, in the sense that they can be produced separately from other morphemes (for example, prepositions), or bound, meaning that they cannot be produced in isolation (like the markers of plural number and past tense). Bound functional morphemes are called inflections. In Mandarin Chinese and other so-called isolating languages, functional morphemes are, as a rule, unbound; for example, the perfective aspect marker le in the sentence wŏ măi le sānbĕn shū (“I bought three books”) indicates that the action expressed by the verb măi (buying) has been completed.

This difference in phonological expression should not be taken to imply that English is morphologically “richer” than Chinese or “poorer” than a language like classical Hebrew, which marks verbs for both aspect and agreement with the subject (in qaniti šloša səfarim, the verb qaniti is a first-person singular perfective form). Rather, such variation provides rich fodder for the study of morphological processing, insofar as speakers of different languages make different kinds of errors with morphology under demanding experimental conditions (Dick, Bates, & Ferstl, 2003) and present with different morphological impairments in the setting of brain damage (Bates, Friederici, & Wulfeck, 1987; Menn & Obler, 1990; Wulfeck, Bates, & Capasso, 1991). For example, Mandarin-speaking aphasic patients tend to omit functional morphemes (Packard, 1990), while Hebrew-speaking patients tend to make substitution errors with bound morphemes and omission errors with unbound functional morphemes (Friedmann & Grodzinsky, 1997).

Moreover, some theories posit the existence, in all languages, of morphemes that have no phonological content at all. In the phrase two sheep, for example, the plural marker is thought to be phonologically null. This is an exception for English, which generally marks plurals with the inflectional suffix -s, but is perhaps the rule for Mandarin, which has no marker for plurals per se (shū can mean either “book” or “books,” depending on the context). In any given language, there may be very many grammatical features that are encoded by such zero morphemes (Pesetsky, 1995). On this view, nearly every word produced by a speaker or comprehended by a listener is, in fact, an agglomeration of lexical and functional morphemes, which convey various kinds of information crucial for encoding and decoding the meaning of that word in the context of an utterance. In other words,
circuits distinguishable from those that underlie other aspects of language, like phonology and syntax. This suggestion coincides both with dominant linguistic theories and with psycholinguistic models of sentence processing, which postulate the independence of morphology from phonology and syntax on the basis of evidence like the differential involvement of lexical and functional morphemes in slips of the tongue (e.g., it waits to pay) (Garrett, 1980; Levelt, 1989; Schwartz, 1987).

On the other hand, it is not obvious that all morphological errors in aphasia are actually attributable to deficits in morphological knowledge as such. It has been argued that the morphological errors made in reading by acquired dyslexic patients are not morphological in origin at all, but rather are actually semantic errors or visual errors (Badecker & Caramazza, 1987; Castles, Coltheart, Savage, Bates, & Reid, 1996; Funnell, 1987; Plaut & Shallice, 1993). This proposal has been notoriously difficult to refute, although there is evidence to suggest that such patients do not systematically produce morphological forms that are more frequent or more imageable (Rastle, Tyler, & Marslen-Wilson, 2006).

Similar doubts exist about the nature of morphological errors in naming, repetition, and spontaneous speech. In a study of repetition errors in 26 aphasic patients, Miceli and colleagues (2004) demonstrated that aphasic patients who make morphological errors invariably also make phonological errors, implying either that the neural circuits important for morphology are distinct but grossly inseparable from regions involved in phonological processing, or that what appear to be morphological errors are in fact errors of phonology.

In this case, there is some evidence to support both positions. Neuroimaging studies have found that the left inferior prefrontal cortex is recruited in a wide variety of linguistic and nonlinguistic tasks, including the processing of grammatical gender (Miceli et al., 2002), phonological processing (Heim & Friederici, 2003; Heim, Opitz, Muller, & Friederici, 2003; Indefrey & Levelt, 2000), and phonological working memory (Hickok & Poeppel, 2007; Paulesu, Frith, & Frackowiak, 1993). These diverse results suggest that the left anterior perisylvian region may contain populations of neurons that are heterogeneous in function. It may be that any brain lesion that is large enough or severe enough to disrupt morphological processes will also disrupt phonological processes, and perhaps other cognitive functions as well.

Alternatively, perhaps it is the case that functional morphemes are especially vulnerable to impairment because of the extra demands they place on the phonemic processor. Kean observed that agrammatic patients tend to omit functional morphemes that are phonologically less salient (Kean, 1978, 1979). Both children and some aphasic patients fare more poorly with inflections that are phonologically more complex (Berko Gleason, 1958; Shapiro & Caramazza, 2003a; Bird, Lambon-Ralph, Seidenberg, McClelland, & Patterson, 2003; Joanisse & Seidenberg, 1999; Patterson, Lambon-Ralph, Hodges, & McClelland, 2001).

An intermediate possibility is that morphology is neither an autonomous function nor a wholly owned subsidiary of phonology, but rather a confederation of processes that operate at the interstices of language, ensuring that abstract syntactic representations with particular grammatical and structural features can be matched with specific, contextually appropriate lexical representations, and that these lexical representations in turn can be converted into phonological strings. Morphological deficits in aphasia may arise when one of these interfaces is compromised by damage at a particular level of language processing. For instance, patients who have difficulty with lexical access may be prone to making paragrammatic substitution errors in either functional or derivational morphology (Caplan et al., 1972; Kohn & Melvold, 1999; Laine et al., 1995; Miceli & Caramazza, 1988; Semenza et al., 1990). Such patients may make relatively few phonological errors, especially if their errors in other language production tasks are not primarily phonological, as was true for patient HH described by Laine and colleagues, who produced paralexias involving both functional morphemes and root (or stem) morphemes (e.g., pesä+lla “on the base” was read as maila+sta “from the bat”). Interestingly, this patient’s lesion largely spared the left inferior prefrontal cortex, but may have involved subcortical connections between the left frontal lobe and posterior perisylvian areas that were also damaged.

Some patients appear simply to ignore morphemes that are not lexical roots, even when access to phonological information appears to be intact (Tyler, Behrens, Cobb, & Marslen-Wilson, 1990); in these cases, the deficit may also occur at the level of lexical retrieval. However, patients with postlexical processing deficits may have particular difficulty with functional morphemes, which are often unstressed and can require the resyllabification of words and phrases (Kean, 1978). Likewise in comprehension, patients of this type may have difficulty parsing functional affixes (Tyler & Cobb, 1987). Others still may present with morphological impairments that are linked to the ability to use particular kinds of syntactic information (Goodglass & Berko, 1960), such as information about tense (Friedmann & Grodzinsky, 1997; Miceli, Silveri, Romani, & Caramazza, 1989) or knowledge about a specific grammatical category (Laiacona & Caramazza, 2004; Shapiro & Caramazza, 2003a; Tsapkini, Jarema, & Kehayia, 2002).

The question of how morphology interacts with other subcomponents of language, as well as with domain-general mechanisms in cognitive processing, has proven to be a fruitful field of research in this otherwise relatively uncultivated domain of cognitive neuroscience. We will discuss two examples in the sections that follow. First, there is the

Figure 53.1 A neuropsychological dissociation in processing regular and irregular verb forms. (A) The approximate lesion sites of patient FCL (red area, left anterior perisylvian regions), who had symptoms of agrammatism, and patient JLU (green area, left temporoparietal region), who had symptoms of anomia. (B) Results of verb inflection tests showed that the agrammatic patient had more trouble inflecting regular verbs (lighter bars) than irregular verbs (darker bars), whereas the anomic patient had more trouble inflecting irregular verbs and overapplied the regular suffix to many of the irregulars (light green bar on top of dark green bar). The performance of age- and education-matched control subjects is shown in the gray bars. (Reprinted from Pinker & Ullman, 2002.) (See color plate 65.)
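
The logic of the dissociation described in this caption can be illustrated with a toy two-route sketch: irregular past-tense forms are retrieved from a stored list, and every other verb receives the regular suffix by rule. The code is only an illustration under that assumption, not the procedure or model used in the cited study. The names `IRREGULAR_PAST`, `regular_past`, and `inflect_past`, and the flags `lexicon_intact` and `rule_intact`, are hypothetical, and the spelling rule is deliberately simplified.

```python
# Toy two-route past-tense inflection (illustrative assumption only).

# Hypothetical store of memorized irregular forms.
IRREGULAR_PAST = {"go": "went", "sing": "sang", "take": "took"}


def regular_past(stem):
    """Simplified default rule: add -d after a final 'e', otherwise -ed."""
    return stem + "d" if stem.endswith("e") else stem + "ed"


def inflect_past(stem, lexicon_intact=True, rule_intact=True):
    """Return a past-tense form, or None if neither route can produce one.

    lexicon_intact=False mimics loss of access to stored irregular forms.
    rule_intact=False mimics loss of the default suffixation rule.
    """
    # Route 1: retrieve a stored irregular form if the store is available.
    if lexicon_intact and stem in IRREGULAR_PAST:
        return IRREGULAR_PAST[stem]
    # Route 2: fall back on the regular rule if it is available.
    if rule_intact:
        return regular_past(stem)
    return None


if __name__ == "__main__":
    for verb in ("walk", "go"):
        print(verb,
              inflect_past(verb),
              inflect_past(verb, lexicon_intact=False),
              inflect_past(verb, rule_intact=False))
```

With `lexicon_intact=False` the sketch overapplies the regular suffix to irregular verbs, loosely resembling the pattern reported for the anomic patient. With `rule_intact=False` it still retrieves irregular forms but produces nothing for regular verbs, loosely resembling the pattern reported for the agrammatic patient.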

prefrontal cortex is functionally heterogeneous. Not all patients with lesions in this area may be expected to have the same pattern of linguistic performance, and activation of this region in neuroimaging paradigms may be particularly sensitive to the demands of the task that is employed.

A second anatomical claim of the procedural/declarative hypothesis is that the basal ganglia, and specifically the striate nuclei (the caudate and putamen), are crucial for the processing of grammatical rules. Studies of patients with early Huntington’s disease, which first affects the caudate, have shown that these patients are indeed impaired in producing morphologically complex word forms (Gordon & Illes, 1987) and making rule-based judgments about such forms (Teichmann, Dupoux, Kouider, & Bachoud-Lévi, 2006). They may make more errors than control subjects in the production of regularly inflected words (Longworth, Keenan, Barker, Marslen-Wilson, & Tyler, 2005), though the latter finding appears to be subtle and task dependent. Some neuroimaging experiments corroborate the idea that the caudate nuclei are particularly active in the detection of syntactic anomalies, including anomalies signaled by morphological structure (Forkstam, Hagoort, Fernandez, Ingvar, & Petersson, 2006; Lieberman, Chang, Chiao, Bookheimer, & Knowlton, 2004; Moro et al., 2001). The evidence therefore seems, on balance, to support a role for the caudate in the application of linguistic rules. By contrast, patients with

Figure 53.2 Results of the rTMS experiment reported by Cappelletti et al. (2008), showing a selective disruption in verb processing following stimulation to the left anterior frontal gyrus. (A) The mean difference in reaction times to nouns and verbs with repetitive TMS compared to sham stimulation in three areas: the anterior middle frontal gyrus (aMFG), inferior frontal gyrus (IFG), and posterior middle frontal gyrus (pMFG). (B) The sites of stimulation to the IFG and pMFG. The remaining panels demonstrate the stereotactic application of TMS to the left pMFG (C) and left IFG (D). (Modified from Cappelletti, Fregni, Shapiro, Pascual-Leone, & Caramazza, 2008.) (See color plate 66.)
We propose that the left inferior frontal gyrus represents a common pathway for the production of words bearing functional morphemes that specify grammatical information relevant to one category or another. In other words, this area (perhaps along with the striate nuclei of the basal ganglia) may be important for the conversion of morphological elements into phonological segments. The process of selecting syntactically appropriate functional morphemes may be handled by different upstream regions, like the left anterior middle frontal gyrus for verbs. These morphosyntactic regions, in turn, must normally receive information from the lexicon, with the constraint that only words meeting certain requirements should be processed as nouns or verbs, allowing us to say, for example, that “he rose to smell” or “he smelled a rose,” but not “he has been rosing up the place all afternoon.”
The hypothesis of a neuroanatomical dissociation between grammatically based morphological processes and form-based morphological processes also has the virtue of accounting for certain striking phenomena that have hitherto been somewhat difficult to reconcile with other theories about the organization of language in the brain: namely, the finding that some aphasic patients exhibit grammatical category deficits that are restricted to either spoken or written output (Caramazza & Hillis, 1991; Hillis & Caramazza, 1995; Hillis, Tuffiash, & Caramazza, 2002; Hillis, Wityk, Barker, & Caramazza, 2003; Rapp & Caramazza, 2003). Perhaps the clearest example of this kind of modality-specific deficit is the case of patient KSR, who produced verbs better than nouns in speech, but nouns better than verbs in writing (Rapp & Caramazza, 2002). That such patients are able to produce the same stimuli in one modality but not in another strongly implies that the patients’ problems do not arise at the semantic level of representation. Instead, it has been proposed that the cortical regions responsible for storing and accessing lexical representations are segregated along lines of both modality and grammatical category, so that brain
[Figure 53.3, panels A–D.]
Figure 53.3 The area found by Tyler and colleagues (2004) to be more active for inflected verbs than inflected nouns in an fMRI semantic judgment paradigm, compared to the lesion sites of three aphasic patients with deficits in processing regularly inflected verb forms in a priming task. (A–C) T1-weighted MR images of three patients with an outline of the activation found in the verbs-nouns contrast superimposed on them. (D) A mean of the spatially normalized T1 images of the 12 subjects in the fMRI experiment overlaid with the lesion overlap of the three patients in A–C. Lesion overlap is shown in blue, the significant activation found in the verbs-nouns contrast is in yellow, and the overlap between common lesion volume of the three patients and the activation is in green. (Reprinted from Tyler, Bright, Fletcher, & Stamatakis, 2004.) (See color plate 67.)
damage might selectively affect access to orthographic verb representations, for example.
While this proposal is not logically impossible, it is somewhat difficult to reconcile with the fact that these patients’ lesions tend to be relatively large, and the areas implicated, like the left posterior inferior frontal and precentral gyri in two patients unable to write verbs (Hillis et al., 2003), are unlikely candidates for modality-specific lexical stores. However, if we suppose that morphosegmental processes (in phonology and orthography) are dissociable from lexical retrieval and morphosyntactic feature selection, an alternative explanation becomes available. It may be that modality-specific grammatical-class deficits are manifestations of disconnections between morphosyntactic processors, segregated by grammatical category, and morphosegmental processors, which may be segregated by modality.
Precisely what brain areas are important for category-specific morphosyntactic processes and for the representation of phonological and orthographic segments is, of course, a matter that requires much further investigation. With respect to morphosyntactic processing, the rTMS studies reviewed here suggest that the anterior portion of the left middle frontal gyrus may be crucial for verbs (Cappelletti et al., 2008; Shapiro et al., 2001). The data for nouns are even more severely limited: the lesion data implicate either the left inferior frontal lobe or the inferior parietal lobe
(Shapiro et al., 2000), although none of the frontal areas tested with rTMS was found to be crucial for nouns. What is clear, however, is that some components of the neural circuitry for language production are sensitive to information about grammatical category, while others are dedicated to the processing of particular kinds of output. We believe that this hypothesized division of labor, with the ultimate goal of constructing morphemes into producible and comprehensible words, may provide a productive framework for investigating the neurobiological mechanisms by which language operates.
NOTE
1. In this chapter we are concerned primarily with language production: in other words, how do speakers produce morphologically complex words? Of course, an analogous problem exists in the domain of comprehension: how do listeners access the meaning of morphologically complex words? We make the assumption here that the lexicon is unitary, that is, that the same kinds of lexical representations are accessed in production and comprehension. It follows that theories about morphological composition in the lexicon, even those based empirically on evidence from comprehension tasks, should also apply to language production.
REFERENCES
Badecker, W., & Caramazza, A. (1987). The analysis of morphological errors in a case of acquired dyslexia. Brain Lang., 32, 278–305.
Bates, E., Friederici, A., & Wulfeck, B. (1987). Grammatical morphology in aphasia: Evidence from three languages. Cortex, 23(4), 545–574.
Beretta, A., Campbell, C., Carr, T. H., Huang, J., Schmitt, L. M., Christianson, K., et al. (2003). An ER-fMRI investigation of morphological inflection in German reveals that the brain makes a distinction between regular and irregular forms. Brain Lang., 85, 67–92.
Berko Gleason, J. (1958). The child’s learning of English morphology. Word, 14, 150–177.
Bird, H., Lambon-Ralph, M. A., Seidenberg, M. S., McClelland, J. L., & Patterson, K. (2003). Deficits in phonology and past-tense morphology: What’s the connection? J. Mem. Lang., 48, 502–526.
Braber, N., Patterson, K., Ellis, K., & Ralph, M. A. L. (2005). The relationship between phonological and morphological deficits in Broca’s aphasia: Further evidence from errors in verb inflection. Brain Lang., 92(3), 278–287.
Buckingham, H. W., & Kertesz, A. (1974). A linguistic analysis of fluent aphasia. Brain Lang., 1(1), 43–61.
Butterworth, B. (1983). Lexical representation. In B. Butterworth (Ed.), Language production (Vol. 2). London: Academic Press.
Caplan, D., Kellar, L., & Locke, S. (1972). Inflection of neologisms in aphasia. Brain, 95(1), 169–172.
Cappelletti, M., Fregni, F., Shapiro, K., Pascual-Leone, A., & Caramazza, A. (2008). Processing nouns and verbs in the left frontal cortex: A TMS study. J. Cogn. Neurosci., 20(4), 707–720.
Caramazza, A., & Hillis, A. E. (1991). Lexical organization of nouns and verbs in the brain. Nature, 349, 788–790.
Caramazza, A., Miceli, G., Silveri, M. C., & Laudanna, A. (1985). Reading mechanisms and the organization of the lexicon: Evidence from acquired dyslexia. Cogn. Neuropsychol., 2, 81–114.
Castles, A., Coltheart, M., Savage, G., Bates, A., & Reid, L. (1996). Morphological processing and visual word recognition: Evidence from acquired dyslexia. Cogn. Neuropsychol., 13, 1041–1057.
Cutler, A. (1981). Degrees of transparency in word formation. Can. J. Ling., 26, 73–77.
Damasio, A. R., & Tranel, D. (1993). Nouns and verbs are retrieved with differently distributed neural systems. Proc. Natl. Acad. Sci. USA, 90(11), 4957–4960.
De Diego Balaguer, R., Costa, A., Gallés, N. S., Juncadella, M., & Caramazza, A. (2004). Regular and irregular morphology and its relation with agrammatism: Evidence from Spanish and Catalan. Cortex, 40(1), 157–158.
De Diego Balaguer, R., Rodríguez-Fornells, A., Rotte, M., Bahlmann, J., Heinze, H.-J., & Münte, T. F. (2006). Neural circuits subserving the retrieval of stems and grammatical features in regular and irregular verbs. Hum. Brain Mapp., 27, 874–888.
Desai, R., Conant, L. L., Waldron, E., & Binder, J. R. (2006). fMRI of past tense processing: The effects of phonological complexity and task difficulty. J. Cogn. Neurosci., 18(2), 278–297.
Dick, F., Bates, E., & Ferstl, E. C. (2003). Spectral and temporal degradation of speech as a simulation of morphosyntactic deficits in English and German. Brain Lang., 85(3), 535–542.
Druks, J. (2006). Morpho-syntactic and morpho-phonological deficits in the production of regularly and irregularly inflected verbs. Aphasiology, 20(9), 993–1017.
Forkstam, C., Hagoort, P., Fernandez, G., Ingvar, M., & Petersson, K. M. (2006). Neural correlates of artificial syntactic structure classification. NeuroImage, 32(2), 956–967.
Forster, K. I. (1976). Accessing the mental lexicon. In R. J. Wales & E. Walker (Eds.), New approaches to language mechanisms (pp. 257–287). Amsterdam: North Holland.
Friedmann, N. A., & Grodzinsky, Y. (1997). Tense and agreement in agrammatic production: Pruning the syntactic tree. Brain Lang., 56(3), 397–425.
Funnell, E. (1987). Morphological errors in acquired dyslexia: A case of mistaken identity. Q. J. Exp. Psychol. [A], 39, 497–539.
Garrett, M. F. (1980). Levels of processing in sentence production. In B. Butterworth (Ed.), Language production (Vol. 1, pp. 177–220). New York: Academic Press.
Goodglass, H., & Berko, J. (1960). Agrammatism and inflectional morphology in English. J. Speech Hear. Res., 3, 257–267.
Goodglass, H., Klein, B., Carey, P., & Jones, K. (1966). Specific semantic word categories in aphasia. Cortex, 2(1), 74–89.
Gordon, W. P., & Illes, J. (1987). Neurolinguistic characteristics of language production in Huntington’s disease: A preliminary report. Brain Lang., 31(1), 1–10.
Heim, S., & Friederici, A. D. (2003). Phonological processing in language production: Time course of brain activity. NeuroReport, 14(16), 2031–2033.
Heim, S., Opitz, B., Muller, K., & Friederici, A. (2003). Phonological processing during language production: fMRI evidence for a shared production-comprehension network. Brain Res. Cogn. Brain Res., 16, 285–296.
Hickok, G., & Poeppel, D. (2007). The cortical organization of speech processing. Nat. Rev. Neurosci., 8(5), 393–402.
Hillis, A. E., & Caramazza, A. (1995). Representation of grammatical knowledge in the brain. J. Cogn. Neurosci., 7, 396–407.
Reber, P. J., & Squire, L. R. (1999). Intact learning of artificial grammars and intact category learning by patients with Parkinson’s disease. Behav. Neurosci., 113(2), 235–242.
Sahin, N. T., Pinker, S., & Halgren, E. (2006). Abstract grammatical processing of nouns and verbs in Broca’s area: Evidence from fMRI. Cortex, 42, 540–562.
Schwartz, M. F. (1987). Patterns of speech production deficit within and across aphasia syndromes: Application of a psycholinguistic model. In M. Coltheart, G. Sartori, & R. Job (Eds.), The cognitive neuropsychology of language (pp. 163–199). Hove, Sussex, UK: LEA.
Semenza, C., Butterworth, B., Panzeri, M., & Ferreri, T. (1990). Word formation: New evidence from aphasia. Neuropsychologia, 28(5), 499–502.
Shapiro, K., & Caramazza, A. (2003a). Grammatical processing of nouns and verbs in left frontal cortex? Neuropsychologia, 41(9), 1189–1198.
Shapiro, K., & Caramazza, A. (2003b). Looming a loom: Evidence for independent access to grammatical and phonological properties in verb retrieval. J. Neurolinguistics, 16(2–3), 85–111.
Shapiro, K. A., Pascual-Leone, A., Mottaghy, F. M., Gangitano, M., & Caramazza, A. (2001). Grammatical distinctions in the left frontal cortex. J. Cogn. Neurosci., 13(6), 713–720.
Shapiro, K., Shelton, J., & Caramazza, A. (2000). Grammatical class in lexical production and morphological processing: Evidence from a case of fluent aphasia. Cogn. Neuropsychol., 17, 665–682.
Small, J. A., Lyons, K., & Kemper, S. (1997). Grammatical abilities in Parkinson’s disease: Evidence from written sentences. Neuropsychologia, 35(12), 1571–1576.
Taft, M. (1979). Recognition of affixed words and the word frequency effect. Mem. Cogn., 7, 263–272.
Taft, M., & Forster, K. I. (1975). Lexical storage and retrieval of prefixed words. J. Verb. Learn. Verb. Beh., 14, 638–647.
Teichmann, M., Dupoux, E., Kouider, S., & Bachoud-Lévi, A. C. (2006). The role of the striatum in processing language rules: Evidence from word perception in Huntington’s disease. J. Cogn. Neurosci., 18(9), 1555–1569.
Tsapkini, K., Jarema, G., & Kehayia, E. (2002). A morphological processing deficit in verbs but not in nouns: A case study in a highly inflected language. J. Neurolinguistics, 15(3), 265–288.
Tyler, L. K., Behrens, S., Cobb, H., & Marslen-Wilson, W. (1990). Processing distinctions between stems and affixes: Evidence from a non-fluent aphasic patient. Cognition, 36, 129–153.
Tyler, L. K., Bright, P., Fletcher, P., & Stamatakis, E. A. (2004). Neural processing of nouns and verbs: The role of inflectional morphology. Neuropsychologia, 42(4), 512–523.
Tyler, L. K., & Cobb, H. (1987). Processing bound grammatical morphemes in context: The case of an aphasic patient. Lang. Cogn. Process., 2(3), 245–262.
Tyler, L. K., deMornay-Davies, P., Anokhina, R., Longworth, C., Randall, B., & Marslen-Wilson, W. D. (2002). Dissociations in processing past tense morphology: Neuropathology and behavioral studies. J. Cogn. Neurosci., 14(1), 79–94.
Tyler, L. K., Randall, B., & Marslen-Wilson, W. D. (2002). Phonology and neuropsychology of the English past tense. Neuropsychologia, 40(8), 1154–1166.
Tyler, L. K., Stamatakis, E. A., Post, B., Randall, B., & Marslen-Wilson, W. (2005). Temporal and frontal systems in speech comprehension: An fMRI study of past tense processing. Neuropsychologia, 43(13), 1963–1974.
Ullman, M. T. (2001). The declarative/procedural model of lexicon and grammar. J. Psycholinguist. Res., 30, 37–69.
Ullman, M. T. (2004). Contributions of memory circuits to language: The declarative/procedural model. Cognition, 92, 231–270.
Ullman, M., Corkin, S., Coppola, M., Hickok, G., Growdon, J., Koroshetz, W., & Pinker, S. (1997). A neural dissociation within language: Evidence that the mental dictionary is part of declarative memory, and that grammatical rules are part of the procedural system. J. Cogn. Neurosci., 9, 266–276.
Ullman, M. T., Pancheva, R., Love, T., Yee, E., Swinney, D., & Hickok, G. (2005). Neural correlates of lexicon and grammar: Evidence from the production, reading, and judgment of inflection in aphasia. Brain Lang., 93(2), 185–238.
Vannest, J., Bertram, R., Järvikivi, J., & Niemi, J. (2002). Counterintuitive cross-linguistic differences: More morphological computation in English than in Finnish. J. Psycholinguist. Res., 31(2), 83–106.
Vannest, J., & Boland, J. (1999). Lexical morphology and lexical access. Brain Lang., 68(1–2), 324–332.
Witt, K., Nühsman, A., & Deuschl, G. (2002). Intact artificial grammar learning in patients with cerebellar degeneration and advanced Parkinson’s disease. Neuropsychologia, 40(9), 1534–1540.
Wulfeck, B., Bates, E., & Capasso, R. (1991). A crosslinguistic study of grammaticality judgments in Broca’s aphasia. Brain Lang., 41(2), 311–336.
Abstract The core component of expert reading is the fast and accurate perception of single words by the visual system, an ability that results from years of intensive learning. We propose an integrated view of the contributions of the ventral and dorsal streams to this process, associating brain imaging in normal subjects and studies of brain-damaged patients. Together, these two sources of data indicate that fluent reading results from a tight collaboration of both pathways. In the left occipitotemporal cortex, the Visual Word Form system allows for the fast, invariant, and parallel encoding of well-formed letter strings. The occipitoparietal pathway makes an important contribution to reading through attention orienting, word selection, and within-word serial decoding under nonoptimal reading conditions.
Laurent Cohen AP-HP, Hôpital de la Salpêtrière, Department of Neurology, Paris; Université Paris VI, Faculté de Médecine Pitié-Salpêtrière, Paris; INSERM UMRS 975, Centre de Recherche de l’ICM, Paris, France
Stanislas Dehaene INSERM, Cognitive Neuro-Imaging Unit, Gif sur Yvette; Collège de France, Paris, France
The acquisition of reading by children rests on a delicate tuning of the visual system and of the verbal system, and on the elaboration of novel interactions between these two preexisting domains. As a result of this long and effortful process, adult readers are able to scan pages of text in a fast and orderly manner, identifying a flow of words that are each fixated only for a fraction of a second, immediately accessing their sound and meaning, and building up at the same time an integrated interpretation of the text. The core component of this remarkable process is the fast and accurate perception of single words by the visual system. A prerequisite for access to a word’s sound and meaning is the identification of its component letters and of their order, an abstract representation that has been called the Visual Word Form (Besner, 1989; Paap, Newsome, & Noel, 1984; Warrington & Shallice, 1980).
In past years, research has concentrated on the contribution of the left ventral visual system to word-identification processes. However, like any complex visual task, reading is most likely achieved through a collaboration of the two components of the cerebral visual system, that is, the ventral occipitotemporal “what” stream and the dorsal occipitoparietal “where” stream (Ungerleider & Mishkin, 1982). In this chapter we propose an integrated view of the contributions of the ventral and dorsal streams to single-word reading. We systematically associate information from brain imaging in normal subjects and contributions from studies of brain-damaged patients with varieties of acquired “peripheral” dyslexias, that is, reading deficits resulting from impaired visual processing, as opposed to language-related “central” dyslexias. Together, these two sources of data indicate that fluent reading results from a tight collaboration of the ventral and dorsal visual pathways, with the occipitotemporal route dominating for expert reading of known words and the occipitoparietal pathway making an essential contribution to reading under dysfluent, unfamiliar, or degraded conditions.
Word processing in the ventral visual pathway
Word Perception as Object Perception
Over the last decades, studies in monkeys and, more recently, functional imaging in humans have shown that object recognition is achieved through neuronal hierarchies located in the ventral occipitotemporal pathway (figure 54.1). Moving from area V1 to inferotemporal (IT) cortex, converging neurons show an increasing invariance to position and scale, an increasing size of the receptive fields, and an increasing complexity of the neurons’ optimal stimuli (M. Booth & Rolls, 1998; Riesenhuber & Poggio, 1999; Rolls, 2000; Serre, Oliva, & Poggio, 2007; Ullman, 2007). Connections include bottom-up and top-down projections within the ventral stream (Felleman & Van Essen, 1991), as well as projections to and from more remote frontal and parietal regions subserving attentional control (Kastner & Ungerleider, 2000).
We proposed that the ability to read words stems from this general ability of the ventral stream to identify complex multipart objects. According to the local combination detector, or LCD, model (Dehaene, Cohen, Sigman, & Vinckier, 2005), words are encoded through a posterior-to-anterior hierarchy of neurons tuned to increasingly larger and more complex word fragments, such as visual features, single letters, bigrams, quadrigrams, and possibly whole words.
cohen and dehaene: ventral and dorsal contributions to word reading 789
[Figure 54.1 diagram labels: Lexico-semantic reading route; Phonological reading route; Visuo-spatial attention; Local bigrams (y = -56); OTS; IPS; -33 -60 48; Low-level visual processing; Local contours (letter fragments), V2; Oriented bars, V1.]
Figure 54.1 Synthetic schema of the reading system, merging propositions from Dehaene, Cohen, Sigman, and Vinckier (2005) and Cohen and colleagues (2003). Low-level processing is achieved in each hemisphere for the contralateral half of the visual field (yellow). Information converges on the left-hemispheric Visual Word Form system, where an invariant representation of letter strings is computed (red). The dorsal visual stream exerts a top-down attentional control on the hierarchy of ventral areas (blue). The ventral visual system then feeds the lexicosemantic and phonological reading routes (green). The proposed normalized coordinates for the lexicosemantic and phonological reading routes are from a meta-analysis of 35 PET and fMRI studies (Jobard, Crivello, & Tzourio-Mazoyer, 2003), and the coordinates of the visuospatial attention system are from Gitelman et al. (1999). IFG: inferior frontal gyrus; MTG: middle temporal gyrus; SMG: supramarginal gyrus; OTS: occipitotemporal sulcus; IPS: intraparietal sulcus. (See color plate 68.)
This system reaches its optimal level of expertise only after years of practice. Through perceptual learning mechanisms, neurons within the ventral pathway become progressively attuned to the regularities of the writing system at all hierarchical levels. This hierarchy must also take into account the need to interact with downstream codes for phonological, morphological, and lexical knowledge of words (Goswami & Ziegler, 2006). Eventually, the adult pattern of performance, that is, fast and invariant word recognition with little influence of the number of letters, is thought to reflect the parallel encoding of letter strings through a fast bottom-up hierarchy of converging detectors.
Early Visual Processing of Printed Words
Retinotopic processing. Letters are first processed in the hemisphere contralateral to their location in the visual field, probably in increasingly invariant format, through areas V1 to V4. Those areas, located approximately between Talairach coordinates (TC) y = −90 and y = −70, are modulated by physical parameters such as word length (Whiting et al., 2003), visual contrast (Mechelli, Humphreys, Mayall, Olson, & Price, 2000), stimulus degradation (Helenius, Tarkiainen, Cornelissen, Hansen, & Salmelin, 1999; Jernigan et al., 1998), and stimulus rate and duration (Price & Friston, 1997; Price, Moore, & Frackowiak, 1996). Accordingly, the P150 wave evoked by word reading is sensitive only to the physical repetition of stimuli in a masked priming paradigm (Petit, Midgley, Holcomb, & Grainger, 2006).
Perceptual asymmetry. It has long been recognized that words are read more easily when they are displayed in the right visual field (RVF) than in the left visual field (LVF) (for reviews see Ducrot & Grainger, 2007; Ellis, 2004). By continuously varying fixation point inside and outside words, Brysbaert, Vitu, and Schroyens (1996) showed that the RVF advantage is closely related to another behavioral asymmetry, namely, that in the optimal reading position, gaze position falls left of word center (Nazir, 2000; O’Regan, Levy-Schoen, Pynte, & Brugaillere, 1984), so that most of the word falls in the RVF. Thus the visual reading span of about 10 letters (Rayner & Bertera, 1979) is not distributed equally across both hemifields, as letter-identification performance decreases more slowly with eccentricity in the RVF than in the LVF (Nazir, Jacobs, & O’Regan, 1998).
In addition to higher accuracy and shorter latencies, the RVF advantage is characterized by parallel letter identification, as indexed by constant reading latencies irrespective of word length. The absence of a word-length effect is restricted to words displayed in the optimal viewing position, or fully within the sector of the RVF closest to the fovea. Outside of those conditions, a length effect emerges. Accordingly, when words extend across central fixation, only their left part induces a length effect (Lavidor & Ellis, 2002; Lavidor, Ellis, Shillcock, & Bland, 2001).
The RVF advantage is a complex phenomenon, for which several compatible mechanisms have been put forward: degradation of information resulting from right-to-left interhemispheric transfer of LVF letters; better perceptual learning in the most stimulated sector of the visual field (Nazir, 2000; Nazir, Ben-Boutayab, Decoppet, Deutsch, & Frost, 2004); and rightward attentional bias. As to the ultimate causes of such perceptual or attentional asymmetries, they may involve left-hemispheric lateralization of language (M. Kinsbourne, 1972), left-to-right reading habits (Deutsch & Rayner, 1999; Lavidor & Whitney, 2005; Mishkin & Forgays, 1952), and the fact that the beginning of words is more informative than their end and should therefore be kept close to fixation, as acuity drops steeply away from the fovea (e.g., O’Regan et al., 1984).
Nazir and colleagues (Nazir, 2000; Nazir et al., 2004) emphasized the role of perceptual learning in the genesis of the RVF advantage, as a result of the most frequent perception of words in this sector of the visual field. Along those lines, it is plausible that expert word perception, like other instances of overpracticed perceptual abilities, is restricted to the trained region of the visual field and results from increased activation in retinotopic cortex, with increasing reliance on its more posterior sectors (Sigman et al., 2005). Congruent with this view, Cohen and colleagues (2002) found a left extrastriate region (TC −24 −78 −12) only responsive to RVF stimuli, which showed stronger activation by alphabetic strings than by checkerboards, while no such difference was observed in corresponding right extrastriate areas. Moreover, transcranial magnetic stimulation (TMS) inhibition of the left (but not of the right) occipital cortex induces a length effect for words displayed in the RVF (Skarratt & Lavidor, 2006). This effect occurs when TMS is applied 80 ms after word presentation, supporting the localization of the interference to the posterior visual cortex.
Moreover, priming tasks with split-field stimuli suggest that alphabetic strings are encoded in a format less dependent on physical shape and case when they are viewed in the RVF than in the LVF (Burgund & Marsolek, 1997; Marsolek, Kosslyn, & Squire, 1992; Marsolek, Schacter, & Nicholas, 1996), possibly reflecting general processing asymmetries in the visual system (Burgund & Marsolek, 2000; Marsolek, 1995; Sawamura, Georgieva, Vogels, Vanduffel, & Orban, 2005). Accordingly, using a masked priming paradigm, Dehaene and colleagues (2001) demonstrated case-specific physical repetition priming in the right extrastriate cortex (though similar regions were also present in left extrastriate at a lower threshold) (for similar effects with object perception see Koutstaal et al., 2001).
Overall, such data support the idea that the posterior sector of the left ventral pathway develops superior
perceptual abilities for contralateral strings of letters (as indexed by measures of accuracy, speed, parallelism, and invariance), explaining at least the perceptual component of the RVF advantage.
Pathology: Reading with hemianopia or with apperceptive agnosia. The asymmetric role of posterior visual cortex in reading is supported by the pattern of reading impairments resulting from left versus right hemianopia. Reading is highly dependent on the integrity of the central visual field. As unilateral lesions affecting the retrochiasmatic visual tract up to primary visual cortex result in scotomas sparing at least half of the fovea, the ensuing reading impairments are relatively mild. Only right hemianopia without sparing of foveal vision induces noticeable reading difficulty (Zihl, 1995). First, the visual span of such patients is reduced, and they may require several fixations in order to perceive long words. Second, patients lose the reading advantage specific to the normal RVF. Accordingly, they show an influence of word length on reading latencies, as normal subjects do with words displayed in their LVF (Cohen et al., 2003). Third, perception in the right parafoveal field, in an area spanning about 15 letters (Rayner & McConkie, 1976), is important for preparing the accurate landing of the gaze on subsequent words (Sereno & Rayner, 2003). Therefore hemianopic patients make abnormally short and numerous saccades when reading word sequences (Leff et al., 2000; Zihl, 1995).
Finally, patients with so-called apperceptive agnosia (Humphreys & Riddoch, 1993; Lissauer, 1890) following (generally bilateral) lesions of intermediate visual areas such as V2 and V4 are impaired at word reading just as they are at identifying other types of shapes and objects (Heider, 2000; Michel, Henaff, & Bruckert, 1991; Rizzo, Nawrot, Blake, & Damasio, 1992).
Invariant Representation of Letters and the Visual Word Form Area
After percolating through retinotopic cortex, visual word information converges on the sector of ventral cortex anterior to V4, ranging approximately from TC y = −60 to y = −40, a region with larger receptive fields and greater capacity for invariance (figure 54.2). This region receives afferences from both visual hemifields (Tootell, Mendola, Hadjikhani, Liu, & Dale, 1998) and shows repetition suppression by object images across changes in size, position, and orientation (Grill-Spector et al., 1999), and across a change of exemplar within a category (Koutstaal et al., 2001). Accordingly, we proposed that, during reading, part of this region (which we labeled as the Visual Word Form Area, or VWFA) is responsible for the computation of an invariant representation of letter identities (Cohen et al., 2000). Both this proposed labeling and the functional properties of this region have given rise to enduring controversies (Price & Devlin, 2003; Wright et al., 2007), which we tried to clarify by applying to the VWFA the distinctive notions of reproducible localization, partial regional selectivity, and functional specialization (for review and discussion see Cohen & Dehaene, 2004).
Specialization within the ventral stream
1. Reproducible localization. Reading-related activations are reproducibly located within the occipitotemporal sulcus lateral to the left fusiform gyrus (VWFA), with only a few millimeters of intersubject variability (Cohen et al., 2002; Jobard, Crivello, & Tzourio-Mazoyer, 2003). The VWFA is activated by visual words irrespective of their position in the visual field (Cohen et al., 2000). An associated electrical or magnetic signature is detected about 170–200 ms after stimulation (e.g., Cohen et al., 2000; Marinkovic et al., 2003; Tarkiainen, Helenius, Hansen, Cornelissen, & Salmelin, 1999).
The remarkable topographical reproducibility of the VWFA may result from its optimal positioning within gradients biasing the a priori organization of the visual cortex, such as a posterior-to-anterior increase in perceptual invariance (Grill-Spector et al., 1998; Lerner, Hendler, Ben-Bashat, Harel, & Malach, 2001) and a mesial-to-lateral increase in preference for foveal versus peripheral stimuli (Hasson, Levy, Behrmann, Hendler, & Malach, 2002). A further reason for the localization of the VWFA, particularly for its usual left lateralization, may be the availability of more direct connections to other language-related sites involved in phonological or lexical processing (Cai, Lavidor, Brysbaert, Paulignan, & Nazir, 2008; Cohen, Jobert, Le Bihan, & Dehaene, 2004; Epelbaum et al., 2008; Mahon & Caramazza, 2009).
2. Partial regional selectivity. The VWFA is activated by alphabetic strings relative to fixation but often also relative to complex nonalphabetic stimuli such as faces or geometrical patterns (e.g., Cohen et al., 2002; Puce, Allison, Asgari, Gore, & McCarthy, 1996). However, the difference in activation between words and visual objects is variable across studies, and may even be inverted, depending on a number of experimental parameters (e.g., Wright et al., 2007). This lack of absolute regional selectivity may be taken as a sensible argument against the use of the VWFA label, as this region may well be involved in processing nonalphabetic visual objects. However, selectivity may be detectable only at a higher spatial resolution. Thus intracranial recordings occasionally showed P150 or N200 waves elicited exclusively by letter strings, as compared to a variety of control stimuli such as phase-scrambled strings, flowers, faces, or geometrical shapes (Allison, McCarthy, Nobre, Puce, & Belger, 1994; Allison, Puce, Spencer, & McCarthy, 1999). Moreover, some left inferotemporal lesions (see the subsection “Pathology: Pure alexia”) yield massive alexia
792 language
[Figure 54.2 graphic. Left panel: BOLD response (axis 0.8 to 1.6) for stimulus conditions FF, IL, FL, BG, QG, and W, with sample letter strings KZWYWK, QOADTQ, QUMBSS, AVONIL, and MOUTON. Right panel: reading latency (axis 0 to 4000) against number of letters (2 to 9).]
Figure 54.2 Word processing in the ventral pathway. Top panel: Activations induced by printed words relative to a fixation baseline in the left hemisphere (left) and in the bilateral ventral visual pathway (right). Left panel: The VWF system shows a linear increase of activation (top) by letter strings forming closer statistical approximations to orthographically legal strings (middle). This functional specialization increases progressively in more anterior regions within the VWF system (bottom). (Left panel adapted from Vinckier et al., 2007.) Right panel: Surgical lesion in the left ventral cortex responsible for pure alexia (top). Whereas before surgery word reading was fast and constant irrespective of word length, after surgery the patient showed slow letter-by-letter reading (middle). In the same patient, the 3D image shows the relative position of the VWFA (blue), of other category-dependent fMRI activation clusters before surgery, of the brain lesion (green), and of intracerebral electrodes (magenta). (Right panel adapted from Gaillard et al., 2006.) (See color plate 69.)
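The contrast drawn in the caption, between reading latencies that stay constant across word lengths and latencies that grow with each added letter, is commonly summarized as the slope of latency against number of letters. The following is a minimal sketch under stated assumptions: the latency values are hypothetical illustrations (not data from Gaillard et al., 2006), and the helper name `length_effect_slope` is introduced here only for the example.

```python
from statistics import mean

def length_effect_slope(lengths, latencies_ms):
    """Ordinary least-squares slope of latency on word length (ms per letter)."""
    if len(lengths) != len(latencies_ms) or len(lengths) < 2:
        raise ValueError("need two or more paired observations")
    mx, my = mean(lengths), mean(latencies_ms)
    sxx = sum((x - mx) ** 2 for x in lengths)
    if sxx == 0:
        raise ValueError("word lengths must vary")
    sxy = sum((x - mx) * (y - my) for x, y in zip(lengths, latencies_ms))
    return sxy / sxx

# Hypothetical latencies (ms) for words of 3 to 8 letters; illustrative only.
lengths = [3, 4, 5, 6, 7, 8]
parallel_reading = [600, 605, 598, 602, 601, 604]        # roughly flat profile
letter_by_letter = [1200, 1500, 1800, 2100, 2400, 2700]  # 300 ms per added letter

flat_slope = length_effect_slope(lengths, parallel_reading)
serial_slope = length_effect_slope(lengths, letter_by_letter)
```

A slope near zero corresponds to the flat profile, whereas a slope of several hundred milliseconds per letter corresponds to the serial, letter-by-letter profile.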
cohen and dehaene: ventral and dorsal contributions to word reading 793
affecting even single letters, contrasting with the spared recognition of complex multipart objects, faces, or digit strings, demonstrating that the VWFA, even if activated by a wide range of stimuli, may evolve to be necessary only for word recognition.

3. Functional specialization. The issue of selectivity is independent of the hypothesis of a functional specialization of the VWFA (figure 54.2). On top of their preexisting object coding properties, neurons in the VWFA develop elaborate functional specialization as they get attuned to arbitrary features of the subject’s script. As the clearest instance of functional specialization, activation of the VWFA is stronger when the script is familiar than when it is unfamiliar (e.g., Hebrew versus alphabetic strings; Baker et al., 2007) or created de novo (Price, Wise, & Frackowiak, 1996). Moreover, using masked repetition priming, it was shown that the VWFA represents words in a format invariant for the upper- versus lowercase distinction (e.g., radio versus RADIO), another arbitrary culture-dependent feature of writing systems (Dehaene et al., 2001, 2004). Finally, within the subjects’ familiar script, the VWFA is activated more strongly by letter strings forming closer statistical approximations to orthographically legal strings (including real words), showing that the VWFA incorporates constraints on letter combinations, which are specific to the familiar language (Binder, Medler, Westbury, Liebenthal, & Buchanan, 2006; Cohen et al., 2002; Vinckier et al., 2007).

4. Internal structure of the Visual Word Form system. According to the LCD model, the anteroposterior extension of the VWFA (about 20 mm) should reflect its heterogeneous and hierarchically organized structure. Dehaene and colleagues (2004), using a subliminal priming design, showed that the type of prime-target similarity that causes fMRI priming varies according to the anterior-posterior location in left occipitotemporal cortex, with an increasing invariance for position and case change, and probably greater reliance on larger-size units such as bigrams or quadrigrams. More recently, Vinckier and colleagues (2007) tested whether a hierarchy of detectors of increasingly larger word fragments is present in the left occipitotemporal cortex. The frequency of letters, bigrams, and quadrigrams was manipulated, yielding a range of stimuli with an increasing structural similarity to real words. The more anterior an area was within the Visual Word Form region, the more sensitive it was to the frequency of complex components, revealing a gradient-like spatial organization within the VWFA (see Grainger and Holcomb, in press, for a review of ERP data relevant to the fragmentation of orthographic processing).

Pathology: Pure alexia

Impairments affecting the Visual Word Form system correspond to the syndrome of pure alexia, as described in the 19th century (Binder & Mohr, 1992; Damasio & Damasio, 1983; Dejerine, 1892; see figure 54.2). Pure alexia is an acquired and selective reading deficit occurring in previously literate patients. Patients typically have entirely preserved production and comprehension of oral language, and they can write normally either spontaneously or to dictation. However, they show various degrees of impairment of word reading. The critical cortical lesions generating pure alexia overlap with the VWFA as defined with functional imaging (Cohen et al., 2003; Gaillard et al., 2006). Pure alexia may also follow deafferentation of an intact VWFA following left-hemispheric white matter lesions (Cohen, Henry, et al., 2004; Epelbaum et al., 2008). Posterior callosal lesions cause a selective deafferentation of the VWFA from the right occipital cortex, yielding alexia restricted to the LVF (Cohen et al., 2000, 2003; Molko et al., 2002; Suzuki et al., 1998).

In the most severe cases, known as global alexia, patients cannot identify single letters, let alone whole words (Dalmas & Dansilio, 2000; Dejerine, 1892). Such patients may or may not have access to abstract letter identities, as tested for instance in a cross-case letter-matching task (Miozzo & Caramazza, 1998; Mycroft, Hanley, & Kay, 2002). More often, patients show relatively preserved letter identification abilities and develop letter-by-letter reading strategies, as if only the most finely tuned mechanisms of word perception were affected, those allowing for rapid and parallel identification of letter strings. As an indication of this effortful reading strategy, patients show a large increase in the number and the duration of fixations per word relative to normals and even to patients with hemianopic dyslexia (Behrmann, Shomstein, Black, & Barton, 2001). There is some evidence that in letter-by-letter readers, residual letter identification can be subtended by right-hemispheric regions symmetrical to the VWFA or by spared patches of left-hemispheric ventral cortex (Bartolomeo, Bachoud-Levi, Degos, & Boller, 1998; Cohen, Henry, et al., 2004; Gaillard et al., 2006).

Finally, some patients show better-than-chance performance in purely implicit reading tasks such as lexical or semantic decision, contrasting with the apparent inability to identify printed words (Coslett & Saffran, 1989; Coslett, Saffran, Greenbaum, & Schwartz, 1993). Implicit reading has been most clearly evidenced with Arabic numerals, which can be compared accurately even when explicit reading is grossly impaired (Cohen & Dehaene, 1995, 2000), probably revealing effective right-hemispheric identification processes.

Contribution of the dorsal pathway

The operation of the ventral stream during word reading is modulated by attentional influences, originating from parietal regions, that may impinge on all processing levels from striate cortex (Chawla, Rees, & Friston, 1999; Somers, Dale,
Seiffert, & Tootell, 1999) to ventral occipitotemporal areas (Kastner, De Weerd, Desimone, & Ungerleider, 1998; see figures 54.1 and 54.3). In order to make sense of the variety of reading impairments that may follow parietal lesions, we will distinguish somewhat artificially three contributions of attentional control to single-word reading: orienting to the region of space where the target word is displayed, filtering out irrelevant words present in the vicinity of the target, and serially attending to letters or word fragments whenever letters cannot be effectively processed in parallel over the whole string.

Orientation of Attention

Spatial attention modulates the efficiency of the visual processing of alphabetic stimuli. Thus words are better recognized when they appear in a region of the visual field to which attention has been directed by a previous cue (McCann, Folk, & Johnston, 1992), and subliminal letters have a priming effect on subsequent
Figure 54.3 Contribution of the dorsal pathway to word reading. Top panel: Activations induced by printed words relative to a fixation baseline in the left hemisphere (left) and in the bilateral dorsal visual pathway (right). Left panel: The bilateral intraparietal cortex shows a nonlinear increase of activation with word degradation, correlated with reaction times (top). For instance, activations increased steeply for words rotated by more than 45° (bottom). (Left panel adapted from Cohen, Dehaene, Vinckier, Jobert, & Montavont, 2008.) Right panel: In a patient with bilateral parietal atrophy and spared ventral cortex (top), there was a severe reading impairment above a similar threshold of rotation angle, demonstrating the role of parietal cortex whenever display degradation exceeds the range of invariance in the ventral cortex. (Right panel adapted from Vinckier et al., 2006.) (See color plate 70.)
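The nonlinear pattern described in the caption, little change below a degradation threshold and a steep rise above it, can be caricatured with a hinge function. This is only a sketch under stated assumptions: the 45° threshold comes from the caption, whereas the baseline, the slope, the arbitrary response units, and the function name `hinge_response` are hypothetical placeholders rather than fitted values from Cohen et al. (2008).

```python
def hinge_response(angle_deg, threshold_deg=45.0, baseline=0.0, slope_per_deg=1.0):
    """Stay at `baseline` up to the threshold, then rise linearly beyond it."""
    excess = max(0.0, abs(angle_deg) - threshold_deg)
    return baseline + slope_per_deg * excess

# Rotation angles in degrees; responses are in arbitrary units.
angles = [0, 15, 30, 45, 60, 75, 90]
responses = [hinge_response(a) for a in angles]
```

With these placeholder parameters the response is flat for rotations up to 45° and grows in proportion to the excess rotation beyond that point, which is the qualitative shape the caption reports.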
targets only when they are displayed at an attended location (Marzouki, Grainger, & Theeuwes, 2007). As mentioned before, the RVF advantage may partly result from a rightward bias of attention. Ducrot and Grainger (2007) showed that exogenous spatial cuing has no impact on the (asymmetrical) reading performance for words displayed only slightly off fixation, suggesting that in the central field, the RVF advantage is mostly perceptual. In contrast, cuing was very effective for more peripheral words and tended to reduce the RVF advantage. In a study of lateralized word reading, Cohen and colleagues (2002) found larger activations for RVF than for LVF words in the left precuneus and thalamus, with no activations for the opposite contrast, likely reflecting the attentional component of the RVF advantage.

Pathology: Neglect dyslexia

The defining feature of neglect dyslexia is the existence of a left-right spatial gradient in the rate of reading errors far exceeding the normal RVF advantage (for an overview and references see Riddoch, 1990). Following the general pattern of hemispatial neglect, it is much more common to observe left than right neglect dyslexia, although a number of right-sided cases have been reported. Neglect dyslexia is generally associated with signs of neglect outside the domain of reading, although patients with seemingly isolated neglect dyslexia have been reported. Neglect is thought to result from associated impairments of both nonlateralized and lateralized components of attentional/spatial processing (Husain & Rorden, 2003). The latter may depend on saliency maps of the opposite hemispace subtended by each posterior parietal lobe (Medendorp, Goltz, Vilis, & Crawford, 2003; M. Sereno, 2001). Assuming that those lateralized maps contribute to the top-down modulation of the ventral visual stream, one may expect that distinct varieties of neglect dyslexia may arise, depending on the side of the lesion, the affected parietal structure, the ventral regions that are deprived of attentional modulation, and so on. Indeed, there are numerous clinical observations to illustrate this fractionation of neglect dyslexia (Riddoch, 1990).

Neglect errors typically affect the leftmost letters when patients read single words, and the leftmost side of the page when they read connected text. However, those two types of errors can be to some extent doubly dissociated, suggesting that neglect dyslexia is not a homogeneous syndrome (Costello & Warrington, 1987; Kartsounis & Warrington, 1989). This fractionation is best illustrated by the case of patient JR, who suffered from bilateral occipitoparietal lesions (Humphreys, 1998). When presented with words scattered on a page, he omitted the rightmost words, but his reading errors affected the leftmost letters of the words that he picked out. Likewise, he showed left neglect when he was asked to read single words, while he showed right neglect when trying to name the component letters of the same stimuli. This pattern suggests that JR’s left lesion yielded right neglect in situations of competition between objects, while his right lesion yielded left neglect in situations of competition between the parts of an object.

A clarifying framework was proposed by Hillis and Caramazza (1995), who suggested that the varieties of neglect dyslexia may be attributed to spatial attentional biases acting on one or more of progressively more abstract word representations derived from Marr’s theory of object perception (Marr, 1982): a peripheral retinocentric feature representation, a stimulus-centered letter-shape level, and a word-centered graphemic representation akin to the Visual Word Form (for a review of supportive data see Haywood & Coltheart, 2000). Thus, in a deficit at the retinocentric level, error rate for a given letter should depend on its position in the visual field relative to central fixation and not on its rank within the target word. In contrast, in a deficit at the stimulus-centered level, error rate should depend on the distance from the center of the word irrespective of the position of the word in the visual field. Naturally, both parameters may be relevant in some if not in the majority of patients. More remote from neglect in its usual sense, neglect at the graphemic level yields errors affecting one end of words irrespective of their spatial position or orientation. Thus patient NG made errors with the last letters (e.g., hound → house) when reading standard words, but also vertical words and mirror-reversed words, as well as when naming orally spelled words and when performing other lexical tasks such as spelling (Caramazza & Hillis, 1990). Note, however, that there are alternative accounts of word-centered neglect dyslexia, in frameworks that refute the existence of object-centered neural representations (Deneve & Pouget, 2003; Mozer, 2002).

Finally, letter strings that are neglected in explicit reading tasks may nevertheless be processed to higher representation levels. This possibility is suggested by preserved performance in lexical decision (Arduino, Burani, & Vallar, 2003), by the fact that erroneous responses often tend to have the same length as the actual targets (M. Kinsbourne & Warrington, 1962), or by higher error rates observed with nonwords than with real words (Sieroff, Pollatsek, & Posner, 1988). The interpretation of such findings is still debated (Riddoch, 1990), but it is plausible that neglected words can be partially processed in the ventral visual pathway in the absence of conscious awareness, as has also been shown in normal subjects (Dehaene et al., 2001; Devlin et al., 2003) and with other types of visual stimuli such as faces or houses in neglect patients (Rees et al., 2000).

Selection of One Single Word

For optimal reading, not only should the attention window encompass the target word, but it should also be narrow enough to exclude other neighboring words. In normal subjects it is possible to force a spread of attention over two words, by briefly
presenting two words side by side, and specifying only afterward which of the two should be reported (Davis & Bowers, 2004; Treisman & Souther, 1986). This procedure degrades performance and induces reading errors that are analogous to those observed in the pathological condition known as attentional dyslexia (for qualifications to this analogy see Davis & Bowers, 2004).

Pathology: Attentional dyslexia

The hallmark of attentional dyslexia is the contrast between preserved reading of isolated words and high error rates when the target is surrounded by other words (for a review see Davis & Coltheart, 2002). It is generally attributed to an impaired attentional selection of one among several concurrent stimuli (Shallice, 1988). This induces (1) an inaccurate processing of the target (substitutions, additions, or deletions of letters) as a result of the competition by surrounding words and (2) intrusion of distracters into later stages of processing (letter migrations from the flanking words into the response to the target).

Such ideas are in good agreement with imaging data in normals, showing that when multiple objects are presented simultaneously, they exert mutual inhibition, resulting in decreased ventral visual activations (Kastner et al., 1998). Directing attention toward one of the stimuli compensates for this reduction of activity. Moreover, the activation induced by distracters in areas V4 and TEO is reduced in proportion to the attention that is paid to the target, and it is inversely correlated with frontoparietal activations (Pinsk, Doniger, & Kastner, 2004). It is thus plausible that in attentional dyslexics, impaired selection abilities, which are unmasked in the presence of flanker words, cause both visual errors due to a weakened representation of the target and letter migrations due to an excessive activation of distracters.

The phenomenon of flanker interference also prevails when patients are asked to read single letters surrounded by other letters. This finding leads to the paradoxical observation that patients may be good at reading isolated words but not at naming their component letters. More generally, interference seems to occur only between items of the same category. In their seminal article Shallice and Warrington (1977) showed that flanking letters but not flanking digits interfered with letter identification. Similarly, there is no mutual interference between letters and whole words (E. K. Warrington, Cipolotti, & McNeil, 1993). One may note that in some patients the interference between letters is the same whether the target and flankers are printed in the same case or not (Shallice & Warrington, 1977; E. K. Warrington et al., 1993), suggesting that the impairment impinges on visual areas that already show high-level invariance, such as the VWFA. Still, the irrelevance of case changes for attentional selection is not absolute. Indeed, letter migrations between words may be reduced by using different typographic cases (Saffran & Coslett, 1996), suggesting that low-level visual features may help to focus the attention on the target word and to discard distracters.

In brief, attentional dyslexia may be due to insufficient attentional focusing on one among several concurrent letters or letter strings represented in the Visual Word Form system. Note that the few cases of attentional dyslexia with sufficient lesion data consistently point to a left parietal involvement (Friedmann & Gvion, 2001; Mayall & Humphreys, 2002; Shallice & Warrington, 1977; E. K. Warrington et al., 1993). Such asymmetry may relate to a left-hemispheric bias for object-oriented attention (Egly, Driver, & Rafal, 1994), or more generally to the left dominance for language.

Attending to Parts of Words and Serial Decoding

As an outcome of perceptual learning, in expert readers the ventral visual pathway gets attuned to the perception of normal print: horizontally aligned words presented in the foveal region in a usual font are identified in a fast and parallel manner. There are, however, a number of circumstances in which this optimal encoding is either unavailable or inappropriate to the task at hand, as revealed by slower reading speed and by the emergence of a linear increase of reading latencies with word length. We suggest that this length effect reflects a failure of parallel letter processing in the ventral pathway and indicates the deployment of serial attention to letters or groups of letters (for an alternative account see Whitney, 2001; Whitney & Lavidor, 2004). Serial reading would involve parietal structures driving spatial-attentional processes (Gitelman et al., 1999; Husain & Rorden, 2003; Kanwisher & Wojciulik, 2000; Mesulam, 1999) and a modulation by this top-down attention of ventral occipitotemporal structures coding for word fragments (Chawla et al., 1999; Kastner et al., 1998; Somers et al., 1999).

Departure from parallel reading as indexed by the emergence of a length effect occurs in many conditions: (1) in children whose reading expertise is still incompletely developed, with an effect of word length persisting until about the age of 10 (Aghababian & Nazir, 2000); (2) in pure alexic patients who develop letter-by-letter reading following left ventral lesions, a strategy that is associated with parietal activations (Gaillard et al., 2006); (3) in normal subjects attempting to read words degraded by means of contrast reduction (Legge, Ahn, Klitz, & Luebker, 1997), of mIxEd case printing (Lavidor, 2002; Mayall, Humphreys, Mechelli, Olson, & Price, 2001), of vertical display (Bub & Lewine, 1988), and of lateral display in the LVF (Lavidor & Ellis, 2002); and (4) in normal subjects reading aloud pseudowords, which probably requires the serial left-to-right conversion of graphemes into phonemes (Weekes, 1997). Interestingly, patients with semantic dementia who suffer from a progressive dissolution of lexical knowledge show a length effect even when reading real words (Cumming,
Patterson, Verfaellie, & Graham, 2006). This abnormal length effect is due to reduced top-down lexical support for word identification, compelling patients to process real words as pseudowords.

We recently studied the mechanisms involved in reading degraded words (Cohen, Dehaene, Vinckier, Jobert, & Montavont, 2008; see figure 54.3). We presented adult readers with words that were progressively degraded in three different ways (word rotation, letter spacing, and displacement to the visual periphery). Behaviorally, we identified degradation thresholds above which reading difficulty increased nonlinearly, with the concomitant emergence of a length effect. Functional MRI activations were correlated with reading difficulty in bilateral occipitotemporal and parietal regions, reflecting the strategies required to identify degraded words. A core region of the intraparietal cortex was engaged in all modes of degradation. Supporting the current interpretation, the same region is also activated, and its interactions with other parts of the reading network increase, when subjects are required to pay attention to letters within nondegraded words (Bitan et al., 2005; J. Booth et al., 2002). Furthermore, in the ventral pathway, word degradation led to an amplification of activation in the posterior Visual Word Form Area at a level thought to encode single letters. We also found an effect of word length restricted to highly degraded words in bilateral occipitoparietal regions.

Pathology: Spatial dyslexia and Balint’s syndrome

Balint’s syndrome, a consequence of bilateral dorsal parietal lesions, includes simultanagnosia, which prevents the binding of objects with a stable localization in space and the computation of their relative positions, and ocular apraxia, which precludes an accurate control of saccades toward peripheral targets (Rizzo & Vecera, 2002). The most salient impact of this disorder on reading is an inability to read connected text as a result of chaotic scanning of the display. The patients’ gaze wanders randomly from word to word, and the relative position of words cannot be appreciated. However, patients can read accurately each of the disconnected words on which they land.

While the identification of optimally printed words is not substantially affected, patients may have major difficulties reading words presented in unusual formats, such as vertically arrayed or widely spaced letters. Such formats disrupt the automatic binding of letters into single visual objects, and therefore require a scanning of component letters, which Balint patients cannot do. Due to impaired scanning, patients may also be unable to report one letter out of a string, even with optimally displayed real words (Baylis, Driver, Baylis, & Rafal, 1994). A similar account explains why Balint patients are impaired at reading pseudowords, for which grapheme-to-phoneme conversion requires the sequential inspection of graphemes. For instance, a patient could read accurately 29 out of 30 briefly presented words, while she identified only 4 out of 30 pseudowords (Coslett & Saffran, 1991).

We recently studied a simultanagnosic patient with bilateral parietal atrophy (Vinckier et al., 2006; see figure 54.3). She was excellent at reading normally printed foveal words, but she was severely impaired at reading words that were mirror reversed, or rotated by angles larger than 50°, or whose letters were separated by at least two blank spaces, or words displayed in her left hemifield. According to the present hypothesis, above those critical thresholds (that is, when stimulus degradation exceeds the perceptual tolerance of the ventral system), reading normally requires the intervention of the parietal lobes to pilot the attention-driven exploration of stimuli (for a congruent observation see Hall, Humphreys, & Cooper, 2001). Parietal lesions did not allow the patient to resort to such a strategy. This study was congruent with an imaging study reviewed before (Cohen et al., 2008): overlapping parietal regions were activated in normal subjects and lesioned in the patient, and the same degree of word degradation boosted parietal activations in normals and caused a drop in the patient’s performance.

Because of her parietal lesions, this patient also presented with orientation agnosia (e.g., Priftis, Rusconi, Umilta, & Zorzi, 2003). She was thus unable to discriminate normally oriented words or pictures of objects from the same rotated stimuli. However, while she was unable to discriminate pictures of objects from their mirror-reversed images, she could do so easily with reversible pseudowords. For instance, “boup” and “quod” appeared to her as distinct items, although they are mirror images of each other. The ventral pathway builds up a mirror-invariant representation of common objects (Logothetis & Pauls, 1995; Rollenhagen & Olson, 2000), which requires the intervention of explicit orientation analysis dependent on parietal cortex in order to discriminate mirror images. In contrast, the default invariance for mirror symmetry is “unlearned” by the ventral pathway in the particular case of reading, since reading requires the accurate discrimination of mirror-symmetric shapes (e.g., “p” versus “q”).

Interfacing with the verbal system

As the result of a collaboration between ventral and dorsal routes, detailed visual information about letter strings is ultimately conveyed to downstream language areas. In this section, we briefly point out some open issues pertaining to the relationships of the visual system with the language-related components of word processing, including phonology and the lexicon.
Multiple Outlets from the Ventral Stream

Assuming that word fragments of various sizes are identified in the ventral stream, one may expect that rich direct and indirect projections should exist toward areas involved in lexical, semantic, motor, or phonological processes. However, the pathways leading from the VWFA to all components of the reading network are not precisely defined. The macaque equivalent of the VWFA putatively falls within the IT complex, which projects to the inferior parietal lobule and the anterior temporal lobe, in addition to occipital and interhemispheric connections (Schmahmann & Pandya, 2006). Moreover, there may be a specifically human development of projections from the inferior temporal cortex to language-related superior temporal, parietal, and frontal regions, through the arcuate fasciculus (Catani, Jones, & ffytche, 2005; Epelbaum et al., 2008) and the inferior fronto-occipital fasciculus (Catani, Howard, Pajevic, & Jones, 2002), respectively.

Following the observation of alexia with agraphia, Dejerine (1892) suggested that the next step following visual word processing should be the angular gyrus, which he postulated to be the “visual center of letters.” Indeed, the angular gyrus is among the regions that are modulated during reading tasks, even if it often remains below the baseline level of activation (Binder et al., 2003; Binder, Medler, Desai, Conant, & Liebenthal, 2005), and there is functional connectivity between the angular gyrus and the left fusiform gyrus at coordinates matching the VWFA (Horwitz, Rumsey, & Donohue, 1998). There is also correlated activity in the VWFA and in left inferior frontal areas (Bokde, Tagamets, Friedman, & Horwitz, 2001). A further potential output pathway is to temporal regions anterior to the VWFA. These regions, which have been difficult to image with functional MRI because of magnetic susceptibility artifacts, are probably involved in supramodal semantic processing (for a review see Giraud & Price, 2001; Kreiman, Koch, & Fried, 2000; Lambon Ralph, McClelland, Patterson, Galton, & Hodges, 2001).

Finally, it is possible that different segments of the Visual Word Form system feed distinct language-related processes by projecting to distinct areas. Thus Mechelli and colleagues (2005) found that during reading the posterior fusiform cortex, which codes for single letters according to the LCD model, was coupled with the superior premotor cortex, possibly in relation to letter-to-articulation transcoding, while the anterior fusiform cortex, presumably coding for large word fragments, was coupled with Broca’s pars triangularis, possibly in relation to lexicosemantic access. Accordingly, the former coupling increased during pseudoword reading, whereas the latter increased during exception word reading. In a similar vein, Grainger proposed on the basis of behavioral data that two types of orthographic code are computed:

and a finer-grained code used to access phonology from orthography (Grainger & Holcomb, in press).

Phonological Impact on Visual Representations

One potential shortcoming of the LCD model is that it focuses primarily on the acquisition of visual expertise in reading, that is, how the ventral visual system eventually incorporates orthographic regularities (see figure 54.1). However, it is likely that word phonology also influences orthographic representations in the visual system. Early letter-to-sound mapping is thought to be crucial for reading acquisition, which may constrain the eventual structure of the orthographic code in adults (Goswami & Ziegler, 2006; Ziegler & Goswami, 2005).

The impact of phonology on visual processing emerges from the comparison between scripts that differ in terms of orthographic transparency, that is, the regularity of grapheme-phoneme conversion rules. According to the LCD model, transparency should be reflected in the size of the units encoded by occipitotemporal neurons. In “transparent” writing systems such as Italian or the Japanese kana script, the letter and bigram levels should suffice for grapheme-phoneme conversion. In an “opaque” script, however, such as English or kanji, a larger-size visual unit, more anterior along the visual hierarchy, should be used. Compatible with this idea, stronger and more anterior activation is observed in the left occipitotemporal region in English than in Italian readers (Paulesu et al., 2000), and, at a slightly more mesial location, during kanji than during kana reading in Japanese readers (Ha Duy Thuy et al., 2004; Nakamura, Dehaene, Jobert, Le Bihan, & Kouider, 2005).

However, evidence of an influence of phonology on visual processing within a given writing system is less clear. There are numerous behavioral demonstrations of an impact of phonology on the processing of printed words, as well as of cross-modal word activations in parietal and superior or lateral temporal regions (e.g., J. Booth et al., 2002; Cohen, Jobert, Le Bihan, & Dehaene, 2005; van Atteveldt, Formisano, Goebel, & Blomert, 2004). Still there is little evidence that some of those effects reflect the operation of the visual system per se, rather than of later speech-related processes. For instance, Grainger, Kiyonaga, and Holcomb (2006) showed that by 225 ms after the presentation of a target word preceded by a masked prime, ERPs distinguished homophone pseudoword primes, as compared to nonhomophone controls (e.g., bakon-BACON versus bafon-BACON). Although this time window is roughly compatible with processing in the Visual Word Form system, the anterior topography of this effect does not support an occipitotemporal source. The contribution of phonological structure to word encoding in the visual system is thus largely open to
a coarse code used to rapidly access semantic information empirical research.
Conclusion

The present review emphasizes that fluent reading results from an intimate collaboration of multiple areas forming a distributed network. Although the VWFA clearly plays an essential role in expert reading, the recent literature has tended to forget that the dorsal spatial-attentional system also makes a major contribution through attention orienting, word selection, and within-word serial decoding. Adult readers probably rely on serial attentive reading under relatively rare conditions; but we speculate that young readers, in whom the word length effect is particularly large, rely heavily on the dorsal route early during the laying down of the grapheme-phoneme decoding stage. Although phonological sources of developmental reading impairments have received vast attention, our analysis suggests that occipitoparietal impairments are also very likely to have an impact on developmental dyslexia, as indeed suggested by recent research (Bosse, Tainturier, & Valdois, 2007; Lassus-Sangosse, N’Guyen-Morel, & Valdois, 2008; Valdois, Bosse, & Tainturier, 2004). In the future, developmental neuroimaging paradigms should be developed to directly image the ventral and dorsal routes as children learn to read.

REFERENCES

Aghababian, V., & Nazir, T. A. (2000). Developing normal reading skills: Aspects of the visual processes underlying word recognition. J. Exp. Child Psychol., 76(2), 123–150.
Allison, T., McCarthy, G., Nobre, A., Puce, A., & Belger, A. (1994). Human extrastriate visual cortex and the perception of faces, words, numbers, and colors. Cereb. Cortex, 4(5), 544–554.
Allison, T., Puce, A., Spencer, D. D., & McCarthy, G. (1999). Electrophysiological studies of human face perception. I: Potentials generated in occipitotemporal cortex by face and non-face stimuli. Cereb. Cortex, 9(5), 415–430.
Arduino, L. S., Burani, C., & Vallar, G. (2003). Reading aloud and lexical decision in neglect dyslexia patients: A dissociation. Neuropsychologia, 41(8), 877–885.
Baker, C. I., Liu, J., Wald, L. L., Kwong, K. K., Benner, T., & Kanwisher, N. (2007). Visual word processing and experiential origins of functional selectivity in human extrastriate cortex. Proc. Natl. Acad. Sci. USA, 104(21), 9087–9092.
Bartolomeo, P., Bachoud-Levi, A. C., Degos, J. D., & Boller, F. (1998). Disruption of residual reading capacity in a pure alexic patient after a mirror-image right-hemispheric lesion. Neurology, 50(1), 286–288.
Baylis, G. C., Driver, J., Baylis, L. L., & Rafal, R. D. (1994). Reading of letters and words in a patient with Balint’s syndrome. Neuropsychologia, 32(10), 1273–1286.
Behrmann, M., Shomstein, S. S., Black, S. E., & Barton, J. J. (2001). The eye movements of pure alexic patients during reading and nonreading tasks. Neuropsychologia, 39(9), 983–1002.
Besner, D. (1989). On the role of outline shape and word-specific visual pattern in the identification of function words: None. Q. J. Exp. Psychol. [A], 41, 91–105.
Binder, J. R., McKiernan, K. A., Parsons, M. E., Westbury, C. F., Possing, E. T., Kaufman, J. N., et al. (2003). Neural correlates of lexical access during visual word recognition. J. Cogn. Neurosci., 15(3), 372–393.
Binder, J. R., Medler, D. A., Desai, R., Conant, L. L., & Liebenthal, E. (2005). Some neurophysiological constraints on models of word naming. NeuroImage, 27(3), 677–693.
Binder, J. R., Medler, D. A., Westbury, C. F., Liebenthal, E., & Buchanan, L. (2006). Tuning of the human left fusiform gyrus to sublexical orthographic structure. NeuroImage, 33(2), 739–748.
Binder, J. R., & Mohr, J. P. (1992). The topography of callosal reading pathways. A case-control analysis. Brain, 115, 1807–1826.
Bitan, T., Booth, J. R., Choy, J., Burman, D. D., Gitelman, D. R., & Mesulam, M. M. (2005). Shifts of effective connectivity within a language network during rhyming and spelling. J. Neurosci., 25(22), 5397–5403.
Bokde, A. L., Tagamets, M. A., Friedman, R. B., & Horwitz, B. (2001). Functional interactions of the inferior frontal cortex during the processing of words and word-like stimuli. Neuron, 30(2), 609–617.
Booth, J. R., Burman, D. D., Meyer, J. R., Gitelman, D. R., Parrish, T. B., & Mesulam, M. M. (2002). Functional anatomy of intra- and cross-modal lexical tasks. NeuroImage, 16(1), 7–22.
Booth, M. C., & Rolls, E. T. (1998). View-invariant representations of familiar objects by neurons in the inferior temporal visual cortex. Cereb. Cortex, 8(6), 510–523.
Bosse, M. L., Tainturier, M. J., & Valdois, S. (2007). Developmental dyslexia: The visual attention span deficit hypothesis. Cognition, 104(2), 198–230.
Brysbaert, M., Vitu, F., & Schroyens, W. (1996). The right visual field advantage and the optimal viewing position effect: On the relation between foveal and parafoveal word recognition. Neuropsychology, 10, 385–395.
Bub, D. N., & Lewine, J. (1988). Different modes of word recognition in the left and right visual fields. Brain Lang., 33(1), 161–188.
Burgund, E. D., & Marsolek, C. J. (1997). Letter-case-specific priming in the right cerebral hemisphere with a form-specific perceptual identification task. Brain Cogn., 35(2), 239–258.
Burgund, E. D., & Marsolek, C. J. (2000). Viewpoint-invariant and viewpoint-dependent object recognition in dissociable neural subsystems. Psychon. Bull. Rev., 7(3), 480–489.
Cai, Q., Lavidor, M., Brysbaert, M., Paulignan, Y., & Nazir, T. A. (2008). Cerebral lateralization of frontal lobe language processes and lateralization of the posterior visual word processing system. J. Cogn. Neurosci., 20(4), 672–681.
Caramazza, A., & Hillis, A. E. (1990). Spatial representation of words in the brain implied by studies of a unilateral neglect patient. Nature, 346, 267–269.
Catani, M., Howard, R. J., Pajevic, S., & Jones, D. K. (2002). Virtual in vivo interactive dissection of white matter fasciculi in the human brain. NeuroImage, 17(1), 77–94.
Catani, M., Jones, D. K., & ffytche, D. H. (2005). Perisylvian language networks of the human brain. Ann. Neurol., 57(1), 8–16.
Chawla, D., Rees, G., & Friston, K. J. (1999). The physiological basis of attentional modulation in extrastriate visual areas. Nat. Neurosci., 2(7), 671–676.
Cohen, L., & Dehaene, S. (1995). Number processing in pure alexia: The effect of hemispheric asymmetries and task demands. Neurocase, 1, 121–137.
Cohen, L., & Dehaene, S. (2000). Calculating without reading: Unsuspected residual abilities in pure alexia. Cogn. Neuropsychol., 17, 563–583.
Cohen, L., & Dehaene, S. (2004). Specialization within the ventral stream: The case for the Visual Word Form Area. NeuroImage, 22, 466–476.
Cohen, L., Dehaene, S., Naccache, L., Lehéricy, S., Dehaene-Lambertz, G., Hénaff, M. A., et al. (2000). The Visual Word Form Area: Spatial and temporal characterization of an initial stage of reading in normal subjects and posterior split-brain patients. Brain, 123, 291–307.
Cohen, L., Dehaene, S., Vinckier, F., Jobert, A., & Montavont, A. (2008). Reading normal and degraded words: Contribution of the dorsal and ventral visual pathways. NeuroImage, 40(1), 353–366.
Cohen, L., Henry, C., Dehaene, S., Molko, N., Lehéricy, S., Martinaud, O., et al. (2004). The pathophysiology of letter-by-letter reading. Neuropsychologia, 42, 1768–1780.
Cohen, L., Jobert, A., Le Bihan, D., & Dehaene, S. (2004). Distinct unimodal and multimodal regions for word processing in the left temporal cortex. NeuroImage, 23(4), 1256–1270.
Cohen, L., Jobert, A., Le Bihan, D., & Dehaene, S. (2005). Distinct unimodal and crossmodal regions for word processing in the left temporal cortex. NeuroImage, 23, 1256–1270.
Cohen, L., Lehéricy, S., Chochon, F., Lemer, C., Rivaud, S., & Dehaene, S. (2002). Language-specific tuning of visual cortex? Functional properties of the Visual Word Form Area. Brain, 125(Pt. 5), 1054–1069.
Cohen, L., Martinaud, O., Lemer, C., Lehéricy, S., Samson, Y., Obadia, M., et al. (2003). Visual word recognition in the left and right hemispheres: Anatomical and functional correlates of peripheral alexias. Cereb. Cortex, 13(12), 1313–1333.
Coslett, H. B., & Saffran, E. (1991). Simultanagnosia: To see but not two see. Brain, 114, 1523–1545.
Coslett, H. B., & Saffran, E. M. (1989). Evidence for preserved reading in “pure alexia.” Brain, 112, 327–359.
Coslett, H. B., Saffran, E. M., Greenbaum, S., & Schwartz, H. (1993). Reading in pure alexia: The effect of strategy. Brain, 116, 21–37.
Costello, A., & Warrington, E. K. (1987). The dissociation of visuospatial neglect and neglect dyslexia. J. Neurol. Neurosurg. Psychiatry, 50, 1110–1116.
Cumming, T. B., Patterson, K., Verfaellie, M., & Graham, K. S. (2006). One bird with two stones: Abnormal word length effect in pure alexia and semantic dementia. Cogn. Neuropsychol., 23, 1130–1161.
Dalmas, J. F., & Dansilio, S. (2000). Visuographemic alexia: A new form of a peripheral acquired dyslexia. Brain Lang., 75(1), 1–16.
Damasio, A. R., & Damasio, H. (1983). The anatomic basis of pure alexia. Neurology, 33, 1573–1583.
Davis, C. J., & Bowers, J. S. (2004). What do letter migration errors reveal about letter position coding in visual word recognition? J. Exp. Psychol. Hum. Percept. Perform., 30(5), 923–941.
Davis, C., & Coltheart, M. (2002). Paying attention to reading errors in acquired dyslexia. Trends Cogn. Sci., 6(9), 359.
Dehaene, S., Cohen, L., Sigman, M., & Vinckier, F. (2005). The neural code for written words: A proposal. Trends Cogn. Sci., 9, 335–341.
Dehaene, S., Jobert, A., Naccache, L., Ciuciu, P., Poline, J. B., Le Bihan, D., et al. (2004). Letter binding and invariant recognition of masked words. Psychol. Sci., 15(5), 307–313.
Dehaene, S., Naccache, L., Cohen, L., Bihan, D. L., Mangin, J. F., Poline, J. B., et al. (2001). Cerebral mechanisms of word masking and unconscious repetition priming. Nat. Neurosci., 4(7), 752–758.
Dejerine, J. (1892). Contribution à l’étude anatomo-pathologique et clinique des différentes variétés de cécité verbale. Mémoires de la Société de Biologie, 4, 61–90.
Deneve, S., & Pouget, A. (2003). Basis functions for object-centered representations. Neuron, 37, 347–359.
Deutsch, A., & Rayner, K. (1999). Initial fixation location effects in reading Hebrew words. Lang. Cogn. Process., 14, 393–421.
Devlin, A. M., Cross, J. H., Harkness, W., Chong, W. K., Harding, B., Vargha-Khadem, F., et al. (2003). Clinical outcomes of hemispherectomy for epilepsy in childhood and adolescence. Brain, 126(Pt. 3), 556–566.
Ducrot, S., & Grainger, J. (2007). Deployment of spatial attention to words in central and peripheral vision. Percept. Psychophys., 69(4), 578–590.
Egly, R., Driver, J., & Rafal, R. D. (1994). Shifting visual attention between objects and locations: Evidence from normal and parietal lesion subjects. J. Exp. Psychol. Gen., 123(2), 161–177.
Ellis, A. W. (2004). Length, formats, neighbours, hemispheres, and the processing of words presented laterally or at fixation. Brain Lang., 88(3), 355–366.
Epelbaum, S., Pinel, P., Gaillard, R., Delmaire, C., Perrin, M., Dupont, S., Dehaene, S., & Cohen, L. (2008). Pure alexia as a disconnection syndrome: New diffusion imaging evidence for an old concept. Cortex, 44, 962–974.
Felleman, D. J., & Van Essen, D. C. (1991). Distributed hierarchical processing in the primate cerebral cortex. Cereb. Cortex, 1(1), 1–47.
Friedmann, N., & Gvion, A. (2001). Letter position dyslexia. Cogn. Neuropsychol., 18, 673–696.
Gaillard, R., Naccache, L., Pinel, P., Clemenceau, S., Volle, E., Hasboun, D., et al. (2006). Direct intracranial, fMRI and lesion evidence for the causal role of left inferotemporal cortex in reading. Neuron, 50, 191–204.
Giraud, A. L., & Price, C. J. (2001). The constraints functional neuroanatomy places on classical models of auditory word processing. J. Cogn. Neurosci., 13, 754–765.
Gitelman, D. R., Nobre, A. C., Parrish, T. B., LaBar, K. S., Kim, Y. H., Meyer, J. R., et al. (1999). A large-scale distributed network for covert spatial attention: Further anatomical delineation based on stringent behavioural and cognitive controls. Brain, 122(Pt. 6), 1093–1106.
Goswami, U., & Ziegler, J. C. (2006). A developmental perspective on the neural code for written words. Trends Cogn. Sci., 10(4), 142–143.
Grainger, J., & Holcomb, P. J. (in press). Neural constraints on a functional architecture for word recognition. In P. L. Cornelissen, P. C. Hansen, M. L. Kringelbach, & K. Pugh (Eds.), The neural basis of reading. Oxford, UK: Oxford University Press.
Grainger, J., Kiyonaga, K., & Holcomb, P. J. (2006). The time course of orthographic and phonological code activation. Psychol. Sci., 17(12), 1021–1026.
Grill-Spector, K., Kushnir, T., Edelman, S., Avidan, G., Itzchak, Y., & Malach, R. (1999). Differential processing of objects under various viewing conditions in the human lateral occipital complex. Neuron, 24(1), 187–203.
Grill-Spector, K., Kushnir, T., Hendler, T., Edelman, S., Itzchak, Y., & Malach, R. (1998). A sequence of object-processing stages revealed by fMRI in the human occipital lobe. Hum. Brain Mapp., 6(4), 316–328.
Ha Duy Thuy, D., Matsuo, K., Nakamura, K., Toma, K., Oga, T., Nakai, T., et al. (2004). Implicit and explicit processing of kanji and kana words and non-words studied with fMRI. NeuroImage, 23(3), 878–889.
Hall, D. A., Humphreys, G. W., & Cooper, A. G. C. (2001). Neuropsychological evidence for case-specific reading: Multi-letter units in visual word recognition. Q. J. Exp. Psychol. [A], 54, 439–467.
Hasson, U., Levy, I., Behrmann, M., Hendler, T., & Malach, R. (2002). Eccentricity bias as an organizing principle for human high-order object areas. Neuron, 34(3), 479–490.
Haywood, M., & Coltheart, M. (2000). Neglect dyslexia and the early stages of visual word recognition. Neurocase, 6, 33–44.
Heider, B. (2000). Visual form agnosia: Neural mechanisms and anatomical foundations. Neurocase, 6, 1–12.
Helenius, P., Tarkiainen, A., Cornelissen, P., Hansen, P. C., & Salmelin, R. (1999). Dissociation of normal feature analysis and deficient processing of letter-strings in dyslexic adults. Cereb. Cortex, 9(5), 476–483.
Hillis, A. E., & Caramazza, A. (1995). A framework for interpreting distinct patterns of hemispatial neglect. Neurocase, 1, 189–207.
Horwitz, B., Rumsey, J. M., & Donohue, B. C. (1998). Functional connectivity of the angular gyrus in normal reading and dyslexia. Proc. Natl. Acad. Sci. USA, 95(15), 8939–8944.
Humphreys, G. W. (1998). Neural representation of objects in space: A dual coding account. Philos. Trans. R. Soc. Lond. B Biol. Sci., 353(1373), 1341–1351.
Humphreys, G. W., & Riddoch, M. J. (1993). Object agnosias. In C. Kennard (Ed.), Visual perceptual defects (pp. 339–359). London: Baillière Tindall.
Husain, M., & Rorden, C. (2003). Non-spatially lateralized mechanisms in hemispatial neglect. Nat. Rev. Neurosci., 4, 26–36.
Jernigan, T. L., Ostergaard, A. L., Law, I., Svarer, C., Gerlach, C., & Paulson, O. B. (1998). Brain activation during word identification and word recognition. NeuroImage, 8(1), 93–105.
Jobard, G., Crivello, F., & Tzourio-Mazoyer, N. (2003). Evaluation of the dual route theory of reading: A metanalysis of 35 neuroimaging studies. NeuroImage, 20, 693–712.
Kanwisher, N., & Wojciulik, E. (2000). Visual attention: Insights from brain imaging. Nat. Rev. Neurosci., 1(2), 91–100.
Kartsounis, L. D., & Warrington, E. K. (1989). Unilateral neglect overcome by cues implicit in stimulus displays. J. Neurol. Neurosurg. Psychiatry, 52, 1253–1259.
Kastner, S., De Weerd, P., Desimone, R., & Ungerleider, L. G. (1998). Mechanisms of directed attention in the human extrastriate cortex as revealed by functional MRI. Science, 282(5386), 108–111.
Kastner, S., & Ungerleider, L. G. (2000). Mechanisms of visual attention in the human cortex. Annu. Rev. Neurosci., 23, 315–341.
Kinsbourne, K., & Warrington, E. K. (1962). A variety of reading disability associated with right hemisphere lesions. J. Neurol., 25, 339–344.
Kinsbourne, M. (1972). Eye and head turning indicates cerebral lateralization. Science, 176(34), 539–541.
Koutstaal, W., Wagner, A. D., Rotte, M., Maril, A., Buckner, R. L., & Schacter, D. L. (2001). Perceptual specificity in visual object priming: Functional magnetic resonance imaging evidence for a laterality difference in fusiform cortex. Neuropsychologia, 39(2), 184–199.
Kreiman, G., Koch, C., & Fried, I. (2000). Category-specific visual responses of single neurons in the human medial temporal lobe. Nat. Neurosci., 3(9), 946–953.
Lambon Ralph, M. A., McClelland, J. L., Patterson, K., Galton, C. J., & Hodges, J. R. (2001). No right to speak? The relationship between object naming and semantic impairment: Neuropsychological evidence and a computational model. J. Cogn. Neurosci., 13(3), 341–356.
Lassus-Sangosse, D., N’Guyen-Morel, M. A., & Valdois, S. (2008). Sequential or simultaneous visual processing deficit in developmental dyslexia? Vis. Res., 48, 979–988.
Lavidor, M. (2002). An examination of the lateralized abstractive/form specific model using MiXeD-CaSe primes. Brain Cogn., 48(2–3), 413–417.
Lavidor, M., & Ellis, A. W. (2002). Word length and orthographic neighborhood size effects in the left and right cerebral hemispheres. Brain Lang., 80(1), 45–62.
Lavidor, M., Ellis, A. W., Shillcock, R., & Bland, T. (2001). Evaluating a split processing model of visual word recognition: Effects of word length. Brain Res. Cogn. Brain Res., 12(2), 265–272.
Lavidor, M., & Whitney, C. (2005). Word length effects in Hebrew. Brain Res. Cogn. Brain Res., 24(1), 127–132.
Leff, A. P., Scott, S. K., Crewes, H., Hodgson, T. L., Cowey, A., Howard, D., et al. (2000). Impaired reading in patients with right hemianopia. Ann. Neurol., 47(2), 171–178.
Legge, G. E., Ahn, S. J., Klitz, T. S., & Luebker, A. (1997). Psychophysics of reading. XVI. The visual span in normal and low vision. Vis. Res., 37(14), 1999–2010.
Lerner, Y., Hendler, T., Ben-Bashat, D., Harel, M., & Malach, R. (2001). A hierarchical axis of object processing stages in the human visual cortex. Cereb. Cortex, 11(4), 287–297.
Lissauer, H. (1890). Ein Fall von Seelenblindheit nebst einem Beitrage zur Theorie derselben. Arch. Psychiatr. Nervenkr., 21, 222–270.
Logothetis, N. K., & Pauls, J. (1995). Psychophysical and physiological evidence for viewer-centered object representations in the primate. Cereb. Cortex, 5(3), 270–288.
Mahon, B. Z., & Caramazza, A. (2009). Concepts and categories: A cognitive neuropsychological perspective. Annu. Rev. Psychol., 60, 27–51.
Marinkovic, K., Dhond, R. P., Dale, A. M., Glessner, M., Carr, V., & Halgren, E. (2003). Spatiotemporal dynamics of modality-specific and supramodal word processing. Neuron, 38(3), 487–497.
Marr, D. (1982). Vision: A computational investigation into the human representation and processing of visual information. New York: W. H. Freeman.
Marsolek, C. J. (1995). Abstract visual-form representations in the left cerebral hemisphere. J. Exp. Psychol. Hum. Percept. Perform., 21(2), 375–386.
Marsolek, C. J., Kosslyn, S. M., & Squire, L. R. (1992). Form-specific visual priming in the right cerebral hemisphere. J. Exp. Psychol. Learn. Mem. Cogn., 18(3), 492–508.
Marsolek, C. J., Schacter, D. L., & Nicholas, C. D. (1996). Form-specific visual priming for new associations in the right cerebral hemisphere. Mem. Cogn., 24(5), 539–556.
Marzouki, Y., Grainger, J., & Theeuwes, J. (2007). Exogenous spatial cueing modulates subliminal masked priming. Acta Psychol. (Amst.), 126(1), 34–45.
Mayall, K., & Humphreys, G. W. (2002). Presentation and task effects on migration errors in attentional dyslexia. Neuropsychologia, 40(8), 1506–1515.
Mayall, K., Humphreys, G. W., Mechelli, A., Olson, A., & Price, C. J. (2001). The effects of case mixing on word recognition: Evidence from a PET study. J. Cogn. Neurosci., 13(6), 844–853.
McCann, R. S., Folk, C. L., & Johnston, J. C. (1992). The role of spatial attention in visual word processing. J. Exp. Psychol. Hum. Percept. Perform., 18(4), 1015–1029.
Mechelli, A., Crinion, J. T., Long, S., Friston, K. J., Lambon Ralph, M. A., Patterson, K., et al. (2005). Dissociating reading processes on the basis of neuronal interactions. J. Cogn. Neurosci., 17(11), 1753–1765.
Mechelli, A., Humphreys, G. W., Mayall, K., Olson, A., & Price, C. J. (2000). Differential effects of word length and visual contrast in the fusiform and lingual gyri during reading. Proc. R. Soc. Lond. B Biol. Sci., 267(1455), 1909–1913.
Medendorp, W. P., Goltz, H. C., Vilis, T., & Crawford, J. D. (2003). Gaze-centered updating of visual space in human parietal cortex. J. Neurosci., 23, 6209–6214.
Mesulam, M. M. (1999). Spatial attention and neglect: Parietal, frontal and cingulate contributions to the mental representation and attentional targeting of salient extrapersonal events. Philos. Trans. R. Soc. Lond. B Biol. Sci., 354(1387), 1325–1346.
Michel, F., Hénaff, M. A., & Bruckert, R. (1991). Unmasking of visual deficits following unilateral prestriate lesions in man. NeuroReport, 2(6), 341–344.
Miozzo, M., & Caramazza, A. (1998). Varieties of pure alexia: The case of failure to access graphemic representations. Cogn. Neuropsychol., 15, 203–238.
Mishkin, M., & Forgays, D. G. (1952). Word recognition as a function of retinal locus. J. Exp. Psychol., 43, 43–48.
Molko, N., Cohen, L., Mangin, J. F., Chochon, F., Lehéricy, S., Le Bihan, D., et al. (2002). Visualizing the neural bases of a disconnection syndrome with diffusion tensor imaging. J. Cogn. Neurosci., 14, 629–636.
Mozer, M. C. (2002). Frames of reference in unilateral neglect and visual perception: A computational perspective. Psychol. Rev., 109, 156–185.
Mycroft, R., Hanley, J. R., & Kay, J. (2002). Preserved access to abstract letter identities despite abolished letter naming in a case of pure alexia. J. Neurolinguistics, 15, 99–108.
Nakamura, K., Dehaene, S., Jobert, A., Le Bihan, D., & Kouider, S. (2005). Subliminal convergence of kanji and kana words: Further evidence for functional parcellation of the posterior temporal cortex in visual word perception. J. Cogn. Neurosci., 17, 954–968.
Nazir, T. A. (2000). Traces of print along the visual pathway. In A. Kennedy, R. Radach, D. Heller, & J. Pynte (Eds.), Reading as a perceptual process (pp. 3–22). Amsterdam: Elsevier.
Nazir, T. A., Ben-Boutayab, N., Decoppet, N., Deutsch, A., & Frost, R. (2004). Reading habits, perceptual learning, and recognition of printed words. Brain Lang., 88(3), 294–311.
Nazir, T. A., Jacobs, A. M., & O’Regan, J. K. (1998). Letter legibility and visual word recognition. Mem. Cogn., 26, 810–821.
O’Regan, J. K., Levy-Schoen, A., Pynte, J., & Brugaillere, B. (1984). Convenient fixation location within isolated words of different length and structure. J. Exp. Psychol. Hum. Percept. Perform., 10(2), 250–257.
Paap, K. R., Newsome, S. L., & Noel, R. W. (1984). Word shape’s in poor shape for the race to the lexicon. J. Exp. Psychol. Hum. Percept. Perform., 10(3), 413–428.
Paulesu, E., McCrory, E., Fazio, F., Menoncello, L., Brunswick, N., Cappa, S. F., et al. (2000). A cultural effect on brain function. Nat. Neurosci., 3(1), 91–96.
Petit, J. P., Midgley, K. J., Holcomb, P. J., & Grainger, J. (2006). On the time course of letter perception: A masked priming ERP investigation. Psychon. Bull. Rev., 13(4), 674–681.
Pinsk, M. A., Doniger, G. M., & Kastner, S. (2004). Push-pull mechanism of selective attention in human extrastriate cortex. J. Neurophysiol., 92(1), 622–629.
Price, C. J., & Devlin, J. T. (2003). The myth of the visual word form area. NeuroImage, 19(3), 473–481.
Price, C. J., & Friston, K. J. (1997). The temporal dynamics of reading: A PET study. Proc. R. Soc. Lond. B Biol. Sci., 264(1389), 1785–1791.
Price, C. J., Moore, C. J., & Frackowiak, R. S. (1996). The effect of varying stimulus rate and duration on brain activity during reading. NeuroImage, 3(1), 40–52.
Price, C. J., Wise, R. J. S., & Frackowiak, R. S. J. (1996). Demonstrating the implicit processing of visually presented words and pseudowords. Cereb. Cortex, 6, 62–70.
Priftis, K., Rusconi, E., Umilta, C., & Zorzi, M. (2003). Pure agnosia for mirror stimuli after right inferior parietal lesion. Brain, 126(Pt. 4), 908–919.
Puce, A., Allison, T., Asgari, M., Gore, J. C., & McCarthy, G. (1996). Differential sensitivity of human visual cortex to faces, letterstrings, and textures: A functional magnetic resonance imaging study. J. Neurosci., 16, 5205–5215.
Rayner, K., & Bertera, J. H. (1979). Reading without a fovea. Science, 206(4417), 468–469.
Rayner, K., & McConkie, G. W. (1976). What guides a reader’s eye movements? Vis. Res., 16(8), 829–837.
Rees, G., Wojciulik, E., Clarke, K., Husain, M., Frith, C., & Driver, J. (2000). Unconscious activation of visual cortex in the damaged right hemisphere of a parietal patient with extinction. Brain, 123(Pt. 8), 1624–1633.
Riddoch, J. (1990). Neglect and the peripheral dyslexias. Cogn. Neuropsychol., 7, 369–389.
Riesenhuber, M., & Poggio, T. (1999). Hierarchical models of object recognition in cortex. Nat. Neurosci., 2, 1019–1025.
Rizzo, M., Nawrot, M., Blake, R., & Damasio, A. (1992). A human visual disorder resembling area V4 dysfunction in the monkey. Neurology, 42, 1175–1180.
Rizzo, M., & Vecera, S. P. (2002). Psychoanatomical substrates of Balint’s syndrome. J. Neurol. Neurosurg. Psychiatry, 72(2), 162–178.
Rollenhagen, J. E., & Olson, C. R. (2000). Mirror-image confusion in single neurons of the macaque inferotemporal cortex. Science, 287, 1506–1508.
Rolls, E. T. (2000). Functions of the primate temporal lobe cortical visual areas in invariant visual object and face recognition. Neuron, 27(2), 205–218.
Saffran, E. M., & Coslett, H. B. (1996). “Attentional dyslexia” in Alzheimer’s disease: A case study. Cogn. Neuropsychol., 13, 205–228.
Sawamura, H., Georgieva, S., Vogels, R., Vanduffel, W., & Orban, G. A. (2005). Using functional magnetic resonance imaging to assess adaptation and size invariance of shape processing by humans and monkeys. J. Neurosci., 25(17), 4294–4306.
Schmahmann, J. D., & Pandya, D. N. (2006). Fiber pathways of the brain. Oxford, UK: Oxford University Press.
Sereno, M. I. (2001). Mapping of contralateral space in retinotopic coordinates by a parietal cortical area in humans. Science, 294, 1350–1354.
Sereno, S. C., & Rayner, K. (2003). Measuring word recognition in reading: Eye movements and event-related potentials. Trends Cogn. Sci., 7(11), 489–493.
Serre, T., Oliva, A., & Poggio, T. (2007). A feedforward architecture accounts for rapid categorization. Proc. Natl. Acad. Sci. USA, 104(15), 6424–6429.
Shallice, T. (1988). From neuropsychology to mental structure. Cambridge, UK: Cambridge University Press.
Shallice, T., & Warrington, E. K. (1977). The possible role of selective attention in acquired dyslexia. Neuropsychologia, 15(1), 31–41.
Sieroff, E., Pollatsek, A., & Posner, M. I. (1988). Recognition of visual letter strings following injury to the posterior visual spatial attention system. Cogn. Neuropsychol., 5, 427–449.
Sigman, M., Pan, H., Yang, Y., Stern, E., Silbersweig, D., & Gilbert, C. D. (2005). Top-down reorganization of activity in the visual pathway after learning a shape identification task. Neuron, 46(5), 823–835.
Skarratt, P. A., & Lavidor, M. (2006). Magnetic stimulation of the left visual cortex impairs expert word recognition. J. Cogn. Neurosci., 18(10), 1749–1758.
Somers, D. C., Dale, A. M., Seiffert, A. E., & Tootell, R. B. H. (1999). Functional MRI reveals spatially specific attentional modulation in human primary visual cortex. Proc. Natl. Acad. Sci. USA, 96, 1663–1668.
Suzuki, K., Yamadori, A., Endo, K., Fujii, T., Ezura, M., & Takahashi, A. (1998). Dissociation of letter and picture naming resulting from callosal disconnection. Neurology, 51, 1390–1394.
Tarkiainen, A., Helenius, P., Hansen, P. C., Cornelissen, P. L., & Salmelin, R. (1999). Dynamics of letter string perception in the human occipitotemporal cortex. Brain, 122(Pt. 11), 2119–2132.
Tootell, R. B., Mendola, J. D., Hadjikhani, N. K., Liu, A. K., & Dale, A. M. (1998). The representation of the ipsilateral visual field in human cerebral cortex. Proc. Natl. Acad. Sci. USA, 95(3), 818–824.
Treisman, A., & Souther, J. (1986). Illusory words: The roles of attention and of top-down constraints in conjoining letters to form words. J. Exp. Psychol. Hum. Percept. Perform., 12(1), 3–17.
Ullman, S. (2007). Object recognition and segmentation by a fragment-based hierarchy. Trends Cogn. Sci., 11(2), 58–64.
Ungerleider, L. G., & Mishkin, M. (1982). Two cortical visual systems. In D. J. Ingle, M. A. Goodale, & R. J. Mansfield (Eds.), Analysis of visual behavior (pp. 549–586). Cambridge, MA: MIT Press.
Valdois, S., Bosse, M. L., & Tainturier, M. J. (2004). The cognitive deficits responsible for developmental dyslexia: Review of evidence for a selective visual attentional disorder. Dyslexia, 10(4), 339–363.
van Atteveldt, N., Formisano, E., Goebel, R., & Blomert, L. (2004). Integration of letters and speech sounds in the human brain. Neuron, 43(2), 271–282.
Vinckier, F., Dehaene, S., Jobert, A., Dubus, J. P., Sigman, M., & Cohen, L. (2007). Hierarchical coding of letter strings in the ventral stream: Dissecting the inner organization of the visual word-form system. Neuron, 55(1), 143–156.
Vinckier, F., Naccache, L., Papeix, C., Forget, J., Hahn-Barma, V., Dehaene, S., et al. (2006). “What” and “Where” in word reading: Ventral coding of written words revealed by parietal atrophy. J. Cogn. Neurosci., 18, 1998–2012.
Warrington, E. K., Cipolotti, L., & McNeil, J. (1993). Attentional dyslexia: A single case study. Neuropsychologia, 34, 871–885.
Warrington, E. K., & Shallice, T. (1980). Word-form dyslexia. Brain, 103(1), 99–112.
Weekes, B. S. (1997). Differential effects of number of letters on word and nonword naming latency. Q. J. Exp. Psychol. [A], 50, 439–456.
Whiting, W. L., Madden, D. J., Langley, L. K., Denny, L. L., Turkington, T. G., Provenzale, J. M., et al. (2003). Lexical and sublexical components of age-related changes in neural activation during visual word identification. J. Cogn. Neurosci., 15(3), 475–487.
Whitney, C. (2001). How the brain encodes the order of letters in a printed word: The SERIOL model and selective literature review. Psychon. Bull. Rev., 8(2), 221–243.
Whitney, C., & Lavidor, M. (2004). Why word length only matters in the left visual field. Neuropsychologia, 42(12), 1680–1688.
Wright, N. D., Mechelli, A., Noppeney, U., Veltman, D. J., Rombouts, S. A., Glensman, J., et al. (2007). Selective activation around the left occipito-temporal sulcus for words relative to pictures: Individual variability or false positives? Hum. Brain Mapp., 29(8), 986–1000.
Ziegler, J. C., & Goswami, U. (2005). Reading acquisition, developmental dyslexia, and skilled reading across languages: A psycholinguistic grain size theory. Psychol. Bull., 131(1), 3–29.
Zihl, J. (1995). Eye movement patterns in hemianopic dyslexia. Brain, 118(Pt. 4), 891–912.
55 The Neural Basis of Syntactic Processing
David Caplan
Department of Neurology, Massachusetts General Hospital, Boston, Massachusetts
ABSTRACT Syntactic structures are unique mental representations that relate the meanings of words to one another. Understanding of the neural basis for syntax consists mostly of information about the areas in which these structures are assigned and used to determine meaning in the process of comprehension, together with electrophysiological correlates of these processes. This chapter briefly reviews deficit-lesion correlations and neurovascular studies that are relevant to the first of these topics. Both these sources of data suggest that the brain does not support syntactic processing in an abstract fashion but as part of performing the task that is the purpose of the comprehension process and that these task-related syntactic operations are supported by multiple brain areas.
Marr (1982) articulated a useful framework for describing cognitive functions. In this system, a cognitive function is described at three levels: a level at which the representations of the information in the cognitive domain are described (the representational level), a level at which the operations that compute these representations are described (the algorithmic level), and a level at which the neural mechanisms that support the storage of the representations and the activity of the operations that compute them are described (the neural level). In this chapter, I briefly review syntactic processing using this framework to organize the presentation. Readers may find that more space is devoted to the representational and algorithmic levels than is the case in other chapters. If so, this emphasis reflects my sense that these levels are less well understood by neurologically oriented cognitive neuroscientists in this domain than may be the case in other cognitive areas.
Syntactic representations and their processing
Sentences convey information beyond that which is conveyed by words alone. This information, collectively known as the propositional content of a sentence, includes who is initiating and receiving an action (thematic roles), which adjectives are assigned to which nouns (attribution of modification), which words refer to the same items (co-reference), and other similar information, mostly relating to the relationships between items, actions, and properties referred to by the words in a sentence. Propositions are the source of much of the information that is stored in semantic memory. In addition, because propositions can be true or false, they can be used to reason, including making inferences, and to plan actions. Without propositions, language would consist of designating items, actions, and properties of items and actions—a significant functional capacity, to be sure, but far less rich and useful than that which language affords because it includes propositions.
For sequences of words to convey propositional relationships in a flexible manner—one that allows unlikely or impossible relationships to be expressed—it is necessary that the meanings of words be combinable into propositions in some way that does not correspond to likely events. That is, combinatory possibilities have to be available to allow the sequence of words “man dog bite” to be associated with the proposition that a man is biting a dog, and not vice versa. Humans use the ability to refer to unlikely and false events when they lie, when they consider hypothetical situations, and in other circumstances. The principles that allow these functions are the syntactic structures of language.
Syntactic structures need not be complex to permit unlikely propositions to be expressed: a simple active form (“The man is biting the dog”) would suffice for this basic purpose. But syntactic structures are much more complex than this one requirement imposes, and the complexity adds to the semantic information they allow language to convey. Features of syntax such as embedding allow propositions, not merely words, to be related to one another. The sentence “The man who chased the girl fell down” expresses a relationship between two propositions—the man chased the girl, and the man fell down. The syntactic structure known as a relative clause allows these two propositions to apply to the same man. Similarly, complement structures allow us to express propositional attitudes: “John believed/disagreed/expected/feared that it would rain” expresses a variety of states of mind that John is in vis-à-vis the proposition that it will rain. Syntactic structures are needed to allow these sorts of relations between propositions to be conveyed.
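As an informal illustration of the representational level, the propositional content described above can be sketched as a small data structure. The sketch below is written in Python, and the names it uses (Entity, Proposition, agent, theme, share_referent) are hypothetical labels chosen for illustration, not terms drawn from any model discussed in this chapter. It is meant to show only two points: that thematic roles are fixed by structure rather than by plausibility, and that a relative clause lets two propositions apply to the same referent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entity:
    """A discourse referent (hypothetical name, for illustration only)."""
    label: str


@dataclass(frozen=True)
class Proposition:
    """A predicate with its thematic roles (assumed minimal role set)."""
    predicate: str
    agent: Entity
    theme: Optional[Entity] = None


man, dog, girl = Entity("man"), Entity("dog"), Entity("girl")

# "The man is biting the dog": the roles follow the syntactic structure,
# even though the reverse assignment describes the more likely event.
unlikely = Proposition("bite", agent=man, theme=dog)
likely = Proposition("bite", agent=dog, theme=man)

# "The man who chased the girl fell down": two propositions, one referent.
chased = Proposition("chase", agent=man, theme=girl)
fell = Proposition("fall", agent=man)


def share_referent(p: Proposition, q: Proposition) -> bool:
    """True when both propositions assign the agent role to one entity."""
    return p.agent is q.agent
```

On this sketch, the same three words yield two distinct propositions depending on role assignment, and `share_referent(chased, fell)` holds because the relative clause ties both propositions to a single entity. Nothing here is a claim about how the brain stores such structures; it only makes the bookkeeping explicit.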
structures that differ in fairly important ways from those postulated in these other domains. Certainly the contribution of syntactic relations between words to meaning differs from the contribution of the rules relating elements in these other domains to meaning, because of the differences in the meanings conveyed by sentences and these other representational systems.
Most models of sentence comprehension maintain that “parsing” and “interpretive” operations assign syntactic structures and use them to determine aspects of meaning in the process of understanding a sentence. These models have articulated principles whereby parsing rules apply. For instance, in the sentence “The boy wanted to go to the game yesterday,” “yesterday” preferentially modifies “to go,” not “wanted,” suggesting a general principle of attaching new phrases to the last incomplete phrase in a developing syntactic structure. Sentence interpretation relies on more than just syntactic structure and word meanings; information about the frequency with which constructions appear, the plausibility of the meaning of a sentence, and other factors affect the ease of syntactic analysis and comprehension (MacDonald, Pearlmutter, & Seidenberg, 1994). For instance, “While the man ate the hot dog burned in the fire” is harder to structure and interpret than “While the man ate the wood burned in the fire,” because “the hot dog,” but not “the wood,” is a plausible theme of “ate,” which reinforces an ultimately incorrect structure. The principles affecting structure building and these other factors interact online (as sentences are analyzed syntactically and assigned meaning). Recent studies provide evidence that features of the nonlinguistic environment in which a sentence is uttered (e.g., the nature of items visible to a listener) enter into these interactions (Tanenhaus, Spivey-Knowlton, Eberhard, & Sedivy, 1995).
This chapter will review studies that provide data relevant to the neural basis of parsing and interpretive operations. Only results of deficit-lesion correlations in patients with focal lesions and neurovascular activation studies in normal subjects will be covered. Some other potential sources of data (intraoperative stimulation, subdural electrode placement, transcranial magnetic stimulation, magnetoencephalography, intraoperative and subdural recordings, optical imaging) have not been extensively used for these studies; for review of electrophysiological studies, see Hagoort, Baggio, and Willems (chapter 56 in this volume). Only studies of comprehension will be reviewed, as there is more work in this area than in production.
Deficit-lesion correlation studies of syntactic processing
The logic underlying the use of deficits to explore the neural basis of syntactic processing is that, if a patient’s performance can be analyzed as being due to a deficit in a syntactic operation (plus residual abilities and any compensatory strategies that apply), the deficit is due to the lesion the patient has sustained. The corollary of this statement is that the integrity of the lesioned area/neural process is necessary for the operation to take place. To apply deficit-lesion analyses to the problem of localization of syntactic operations, it is therefore necessary to characterize the deficits in patients, their lesions, and the relations between the two.
Two basic views of deficits affecting syntactically based comprehension have been articulated. The first is that individual parsing or interpretive operations are selectively affected by brain damage. The second is that patients lose the ability to apply what have been called “resources” to the task of assigning and interpreting syntactic structure. The first of these deficits may be likened to a student not being able to calculate π to eight decimal places in his/her head because s/he does not know the formula for calculating π. The second may be likened to a student knowing the formula but not being able to hold the intermediate products of computation in mind. Exactly what prevents the application of such knowledge is unclear, but most models of cognitive processes include limitations of this sort.
The hallmark of a deficit affecting syntactic operations is the combination of abnormally low (or chance) performance in understanding sentences that require a syntactic analysis to be understood—semantically reversible sentences that cannot be understood by the application of simple heuristics such as the assignment of thematic roles to nouns following a simple pattern (e.g., “The boy who the girl pushed is tall”)—and the retained ability to understand “semantically irreversible” sentences with the same syntactic structures, that is, sentences in which the meaning can simply be inferred from the meanings of the words and knowledge about likely relations between them (e.g., “The book that the girl read is long”) (Caramazza & Zurif, 1976). Researchers who advocate “specific deficit” accounts of aphasic disturbances in this area have claimed that this pattern occurs for representations and processes specified in linguistics and psycholinguistic models. Some of these deficits are said to be very specific. For instance, the “trace deletion hypothesis” (Grodzinsky, 2000) maintains that individual patients cannot process sentences that Chomsky’s theory maintains contain a certain type of moved items (the term “trace deletion hypothesis” refers to an earlier version of Chomsky’s theory in which these items were moved and left a “trace,” not copied). The claim is that some patients have lost the ability to connect certain moved (or, now, copied) noun phrases to their “traces” in sentences such as those mentioned previously (relative clauses, questions, indirect questions, passive), with the consequence that these noun phrases are not assigned thematic roles. At the other end of the spectrum, some researchers have suggested that certain aphasics have deficits that apply to a large set of related operations, such as the operations that map all syntactic structures
through the use of speeded presentation (Miyake, Carpenter, & Just, 1994), concurrent tasks (King & Just, 1991), and other methods mimic aphasic performance.
These arguments are also not ironclad. The argument that some patients can understand sentences that contain certain structures or operations in isolation but not sentences that contain combinations of those structures and operations suffers from the same limitations of the database that we discussed earlier: it is based on a single performance measure (accuracy) in a single task (enactment). Testing the second result—that as patients’ performances deteriorate, more complex sentence types are affected more than less complex ones—risks circularity unless the effects of resource reduction are modeled and measured separately from comprehension on sentences of the sort that are used to test the effects of complexity. Three studies have addressed this issue in different ways: two (Caplan et al., 1985; Caplan et al., 2007a) have found this pattern; the third, a smaller study, did not (Dick et al., 2001). The data regarding interference effects in normal subjects are complex. Interactions of load and syntactic complexity, and of these factors with subject groups that differ in processing resource capacity, are critical pieces of evidence that would support this model, but these interactions only occur under special circumstances (see Caplan & Waters, 1999; Caplan et al., 2006, for reviews), reducing the strength of this argument. The finding that first factors on which all sentence types load account for the majority of the variance is an extremely robust finding, regardless of the task over which factors are extracted or whether they are extracted over several tasks (DeDe Caplan, 2006; Caplan et al., 2007a). The hypothesis that reductions of processing capacity are sources of aphasic syntactic comprehension deficits fares better, in my view, than the hypothesis that individual patients have specific deficits.2
Accepting the view that what is to be correlated with lesion parameters is either some measure of performance that captures a deficit a patient has with a particular syntactic structure or operation, or some measure of performance that captures the “amount” of resources available to a patient, what do studies of deficit-lesions correlation show about the way the brain is organized to support parsing and interpretation? Four models of brain organization for syntactic processing have been suggested, based on data of this sort. Localizationist models are represented by Grodzinsky (2000), who claims that Chomskian traces are coindexed in Broca’s area; variable localization models by ourselves (Caplan, 1994; Caplan et al., 2007b); invariant evenly distributed models by Dick and colleagues (2001) and Damasio and Damasio (1992); and invariant unevenly distributed models by Mesulam (1990, 1998). Most of these models have been articulated as applying to specific operations, but the evidence can often be interpreted in terms of reductions in the resource system that underlies parsing and interpretation. For instance, unless the proper control studies are done (discussed previously), performances that are interpreted as failures to “co-index traces” can be seen as due to reductions in processing resources that lead to failures to comprehend sentences that contain “traces.”
We must begin with a major caveat about lesion-deficit studies: the vast majority of these studies do not examine lesions quantitatively. Many are based on the assumption that Broca’s or nonfluent patients have “anterior” lesions and Wernicke’s, fluent, conduction, and anomic patients have “posterior” lesions, whereas the reality is far more complicated (Mohr et al., 1978; Vanier & Caplan, 1989). A number of studies summarize radiological reports and/or display lesions, usually on a single transverse section of the brain imaged with computer tomography or magnetic resonance, and emphasize the area in which lesions in patients with certain types of performances (analyzed as deficits of particular types) overlap. Such analyses do not investigate many questions. For instance, neither the most direct prediction made by distributed models (that lesion size correlates with performance level) nor the claim that the insertion of traces into syntactic structures occurs in Broca’s area has ever been tested on the basis of radiological data by advocates of these models (Mesulam, 1990; Damasio & Damasio, 1992; Dick et al., 2001; Grodzinsky, 2000).
To my knowledge, there are only six studies in the literature in which radiological images have been analyzed and related to sentence comprehension in aphasics. Three are based on instruments that do not examine syntactic processing: Karbe and colleagues (1989), who used the Western Aphasia Battery, which does not characterize deficits in a linguistically or psycholinguistically specific way; Kempler, Curtiss, Metter, Jackson, and Hanson (1991), who used the Token Test, which confounds syntactic processing with short-term memory requirements; and Dronkers, Wilkin, Van Valin, Redfern, and Jaeger (2004), who used the CYCLE, which does not separate lexical from syntactic errors. Two studies used appropriate measures to test syntactic processing but had other limitations. Tramo, Baynes, and Volpe (1988) presented reversible sentences in a sentence-picture matching task, but studied only one contrast (active and passive sentences) and only reported three cases. Caplan and colleagues (1996) studied 25 sentence types testing many aspects of syntactic processing, but only 18 patients were studied, lesions were identified subjectively, and scans were normalized along a single linear dimension in the anterior-posterior plane only, likely leading to significant inaccuracies in the estimates of percents of regions of interest (ROIs) that were lesioned. Other problems are found in many of the studies that also used inappropriate test instruments (see Caplan et al., 2007b, for discussion). In all these studies, analyses were limited to examining the effect of lesions in individual locations; only Caplan and colleagues
1. I helped the girl_i that Mary saw [t_i] in the park.
2. I told Mary that the girl ran in the park.
BOLD signal increased in left inferior frontal gyrus (IFG) and bilateral superior temporal sulcus (STS) in sentence 1 compared to sentence 2. Ben-Shachar, Palti, and Grodzinsky (2004) found increased BOLD signal in the left IFG, in the left ventral precentral sulcus, and bilaterally in superior temporal gyrus (STG) (marginally on the right) in the contrast of embedded wh-questions (sentences 3 and 4) against yes/no questions (sentence 5) in a verification task:
3. The waiter asked which tourist_i [t_i] ordered the alcoholic drink in the morning.
4. The waiter asked which alcoholic drink_i the tourist ordered [t_i] in the morning.
5. The waiter asked if the tourist ordered the alcoholic drink in the morning.
These studies are the only studies to date that contrast sentences with and without the co-indexation of a “trace,” and they yield activation in more than one region.
Bornkessel and Schlesewsky (2006) have published a major position paper on a neurologically based model of parsing and sentence interpretation. They argue that the linear order of noun phrases is mapped onto thematic roles in Broca’s area and that deviations from the usual mapping lead to increased activity in this area. The mapping is determined by both general features of language and cognition (e.g., animate nouns are more likely to be agents than inanimate nouns) and language-specific grammatical features (e.g., case marking is more important in determining thematic roles in a language that has a great deal of visible case marking, such as German, than in a language that does not, such as English). The model is groundbreaking in that it is the first detailed model of aspects of sentence comprehension that is based primarily upon neurological data, mostly event-related potential studies. It is worth noting, however, that the model only deals with a small aspect of sentence processing—assigning the two most basic thematic roles (prototypical agents and themes) in the simplest syntactic structures (single sentences). As noted, the model mostly deals with ERP data, but the anatomical hypotheses are partially based on fMRI studies; I shall briefly review these here.
As noted, the model focuses on the role of Broca’s area in mapping noun phrases onto thematic roles. The first critical study of this topic was Bornkessel, Zysset, Friederici, von Cramon, and Schlesewsky (2005), in which participants verified the meaning of German sentences with verb-final complement clauses in which word order, morphological case ambiguity, and verb class (transitive/dative object experiencer) were varied. Bornkessel and colleagues found that, for sentences with subject-before-object word order, sentences with dative-object-experiencer verbs produced greater BOLD signal in left pars opercularis than sentences with transitive verbs and vice versa for sentences with object-before-subject word order. This finding is consistent with the conclusion that this part of Broca’s area is involved in mapping the linear order of thematic roles onto the hierarchy of thematic roles. However, the picture is quite complicated. Grewe and colleagues (2005) found that an increase in BOLD signal in left IFG associated with a less common word order (object before subject) was not present when the first noun phrase was a pronoun. Since pronouns preferentially occur in first position in the German middle field, the authors interpreted this result as evidence that particular language-specific syntactic features override the usual effect of word order. However, the result could also indicate that the object-before-subject word order does not always lead to activation in left IFG. Grewe and colleagues (2007) also failed to find that noun animacy had the effects that the theory predicts: an increase in BOLD signal that occurred with object-before-subject order was not greater when the object was inanimate. This finding contradicts the hypothesis in Grewe and colleagues (2006) and Bornkessel and Schlesewsky (2006) that left IFG is activated by sentences that require mapping noun phrases (NPs) that violate the animacy principle onto thematic roles. Another issue is that, as in the studies of BOLD signal associated with processing “traces,” brain areas other than left IFG have been activated in these studies. In the Bornkessel and colleagues (2005) study, the left STS and inferior parietal sulcus (IPS) showed the same pattern as left IFG, for morphologically unambiguous sentences. Thus the hypothesis that left IFG supports a general function of “decoding the prominence relations between arguments,” and that it is the only brain area to do so, is not well established.
The processing resource system underlying parsing and interpretation has been studied in functional neuroimaging studies that examine the difference between object- and subject-extracted structures (e.g., “The boy who the girl chased fell”; “The boy who chased the girl fell”). Object extraction is more demanding than subject extraction and is thought to require more “resources” than subject extraction, for many reasons. These studies use a variety of sentences—cleft sentences, relative clauses, conjoined sentences, wh-questions, complement clauses, main clauses, topicalization, and dative shifts in English, German, Dutch, Japanese, and Hebrew. There is great variability in the areas activated in these studies. In Just, Carpenter, Keller, Eddy, and Thulborn (1996), this contrast activated frontal and temporal perisylvian cortex bilaterally. In Stromswold, Caplan, Alpert, and Rauch (1996), left IFG was activated. Cooke and colleagues (2001) found increased BOLD signal in bilateral inferior temporal lobe in this contrast. In other studies of ours (Caplan, Alpert, & Waters, 1998, 1999; Waters, Caplan, Stanzcak, & Alpert, 2003), activation was seen variably in
These considerations suggest that, once the effects of strategies and task-sentence type interactions are eliminated, specific parsing and interpretation operations may yet be supported by a limited number of brain areas, perhaps only one. Other data show that the picture is more complicated, however (Caplan & Waters, 2007). We used the same correct sentences used in the plausibility and nonword-detection tasks, containing only real words, in a third task—font-change detection. Participants saw these sentences and foils consisting of grammatically correct, meaningful sentences containing only real words, one of which appeared in a slightly different font from the others, and were required to indicate whether a sentence had a font change. Analyses of the behavioral data again showed that subjects processed sentences as sentences. For sentences containing only words without font changes—the same grammatical, plausible sentences that were analyzed in the plausibility-judgment task and in the nonword-detection task—there was an increase in BOLD signal, but it was located in the left supramarginal gyrus (left BA 39), not left IFG. There were no areas activated in both nonword detection and font-change detection, and no functional connectivity between the areas activated in the two tasks. These results indicate that different parsing and interpretive operations were applied in the two tasks—a result that was confirmed by the finding that the effects of the position of a nonword or a word with a font change on detection response times differed. Thus, although individuals do assign and interpret syntactic structures even when these structures are completely irrelevant to task performance, the task they are performing still affects which operations they deploy. To identify the neural basis of particular parsing operations thus requires knowing what parsing and interpretive operations are applied in a task, as well as knowing that neurovascular effects are not due to strategies or task-sentence type interactions.
Concluding comments
The past 15 years have seen great changes in models of parsing and sentence interpretation. For close to three decades (roughly 1965–1995), heavily influenced by Chomsky’s views regarding the domain specificity of syntactic representations and Fodor’s (1972) concept of modular cognitive processes, researchers studying syntactic processing made the assumption that parsing and interpretive operations were task independent, and that the use of the products of the interpretive process to perform tasks occurred independently of the assignment of syntactic structure and propositional meaning. Correspondingly, deficit-lesion correlations were interpreted as providing evidence for the location of neural tissue that supports task-independent syntactic operations. Most functional neuroimaging studies of syntactic processing have been interpreted within the same framework.
In the past decade or so, evidence has accrued that this view is inaccurate in important respects. It may be that there are task-independent parsing and interpretive operations, but there is very strong evidence that the assignment of syntactic structure interacts at the earliest possible moment with other types of information in the process of assigning sentence structure and meaning, such as the assessment of how plausible certain meanings are, the activation of meanings based upon nongrammatical heuristics, the assignment of structure and meaning based upon the frequency of occurrence of particular constructions or sequences of words, and so on (MacDonald et al., 1994). Though all these operations could be regarded as part of a larger, integrated process that assigns sentence meaning, the problem still remains of isolating the operations that assign the grammatically licensed syntactic structure of a sentence and use it to determine the meaning of the sentence (recall that it is this structure that allows sentences to convey unlikely information, to express complex relations between items and propositions, and to convey both propositional and discourse-level information). Even more unexpectedly from a “modular” point of view, task demands appear to influence parsing and interpretation online; for instance, how one attaches the prepositional phrase in a sentence such as “Put the toy on the rug . . .” depends upon how many toys are in an array that is being inspected and where these toys are located (Tanenhaus, Spivey-Knowlton, Eberhard, & Sedivy, 1995). The application of parsing and interpretive operations thus may differ in different tasks, and, more seriously from the point of view of identifying the neural basis of these operations, differences in how a task is performed as a function of the sentences that are being presented may be responsible for differences in neural activity associated with sentence contrasts and for deficits that affect performance on one sentence type in a given task. Finally, many tasks involve strategic use of cognitive operations, such as subvocal rehearsal, that are applied to a greater extent when an individual is presented with more complex sentences. These ancillary cognitive operations must also be eliminated from consideration if the neural basis of parsing and interpretation is to be identified. The results of recent studies of patients and neurovascular responses to syntactic contrasts have led to some findings that suggest that these questions are important to consider.
With respect to task dependency of deficits and activation, recent lesion studies have shown that deficits affecting particular parsing operations or the resource system that supports them are affected by task, and the same is true of activation associated with sentence contrasts. These results indicate that most of the data obtained thus far regarding parsing and interpretation may identify brain regions that support a combination of parsing and interpretation and performance of particular tasks. Areas of the brain that are always activated by a syntactic contrast regardless of task are
interpretation, but none have been definitely shown to play this role. Deficits in a short-term semantic memory system are thought to lead to quite specific, different disturbances in comprehension (Martin & He, 2004).

REFERENCES

Baddeley, A. D. (1986). Working memory. Oxford, UK: Clarendon Press.
Ben-Shachar, M., Hendler, T., Kahn, I., Ben-Bashat, D., & Grodzinsky, Y. (2003). The neural reality of syntactic transformations: Evidence from fMRI. Psychol. Sci., 14, 433–440.
Ben-Shachar, M., Palti, D., & Grodzinsky, Y. (2004). The neural correlates of syntactic movement: Converging evidence from two fMRI experiments. Neuroimage, 21, 1320–1336.
Berndt, R., Mitchum, C., & Haendiges, A. (1996). Comprehension of reversible sentences in “agrammatism”: A meta-analysis. Cognition, 58, 289–308.
Blumstein, S., Byma, G., Kurowski, K., Hourihan, J., Brown, T., & Hutchinson, A. (1998). On-line processing of filler-gap constructions in aphasia. Brain Lang., 61(2), 149–169.
Boland, J. E., Tanenhaus, M. K., Garnsey, S. M., & Carlson, G. N. (1995). Verb argument structure in parsing and interpretation: Evidence from wh-questions. J. Mem. Lang., 34, 774–806.
Bornkessel, I., & Schlesewsky, M. (2006). The extended argument dependency model: A neurocognitive approach to sentence comprehension across languages. Psychol. Rev., 113, 787–821.
Bornkessel, I., Zysset, S., Friederici, A. D., von Cramon, D. Y., & Schlesewsky, M. (2005). Who did what to whom? The neural basis of argument hierarchies during language comprehension. NeuroImage, 26, 221–233.
Caplan, D. (1994). The cognitive neuroscience of syntactic processing. In M. Gazzaniga (Ed.), The cognitive neurosciences (pp. 871–879). Cambridge, MA: MIT Press.
Caplan, D. (1995). Issues arising in contemporary studies of disorders of syntactic processing in sentence comprehension in agrammatic patients. Brain Lang., 50, 325–338.
Caplan, D. (2001a). The measurement of chance performance in aphasia, with specific reference to the comprehension of semantically reversible passive sentences: A note on issues raised by Caramazza, Capitani, Rey and Berndt (2000) and Drai, Grodzinsky and Zurif (2000). Brain Lang., 76, 193–201.
Caplan, D. (2001b). Points regarding the functional neuroanatomy of syntactic processing: A response to Zurif (2001). Brain Lang., 79, 329–332.
Caplan, D. (2006). fMRI studies of syntactic processing. Curr. Med. Imaging Rev., 2, 443–451.
Caplan, D. (2007). Functional neuroimaging studies of syntactic processing in sentence comprehension: A critical selective review. Lang. Linguistics Compass, 1, 32–47.
Caplan, D., Alpert, N., & Waters, G. (1998). Effects of syntactic structure and propositional number of patterns of regional cerebral blood flow. J. Cogn. Neurosci., 10, 541–552.
Caplan, D., Alpert, N., & Waters, G. (1999). PET studies of sentence processing with auditory sentence presentation. NeuroImage, 9, 343–351.
Caplan, D., Baker, C., & Dehaut, F. (1985). Syntactic determinants of sentence comprehension in aphasia. Cognition, 21, 117–175.
Caplan, D., Chen, E., & Waters, G. (2008). Task-dependent and task-independent neurovascular responses to syntactic processing. Cortex, 44(3), 257–275.
Caplan, D., DeDe, G., & Michaud, J. (2006). Task-independent and task-specific syntactic deficits in aphasic comprehension. Aphasiology, 20, 893–920.
Caplan, D., & Hildebrandt, N. (1988). Disorders of syntactic comprehension. Cambridge, MA: MIT Press (Bradford Books).
Caplan, D., Hildebrandt, N., & Makris, N. (1996). Location of lesions in stroke patients with deficits in syntactic processing in sentence comprehension. Brain, 119, 933–949.
Caplan, D., & Waters, G. (1990). Short-term memory and language comprehension: A critical review of the neuropsychological literature. In T. Shallice & G. Vallar (Eds.), The neuropsychology of short-term memory (pp. 337–389). Cambridge, UK: Cambridge University Press.
Caplan, D., & Waters, G. S. (1999). Verbal working memory capacity and language comprehension. Behav. Brain Sci., 22, 114–126.
Caplan, D., & Waters, G. S. (2003). On-line syntactic processing in aphasia: Studies with auditory moving windows presentation. Brain Lang., 84(2), 222–249.
Caplan, D., & Waters, G. (2007). BOLD signal response to implicit syntactic processing. Long Beach, CA: Psychonomic Society.
Caplan, D., Waters, G., & Hildebrandt, N. (1997). Syntactic determinants of sentence comprehension in aphasic patients in sentence-picture matching and enactment tasks. J. Speech Hear. Res., 40, 542–555.
Caplan, D., Waters, G., Kennedy, D., Alpert, A., Makris, N., DeDe, G., Michaud, J., & Reddy, A. (2007a). A study of syntactic processing in aphasia. I. Psycholinguistic aspects. Brain Lang., 101, 103–150.
Caplan, D., Waters, G., Kennedy, D., Alpert, A., Makris, N., DeDe, G., Michaud, J., & Reddy, A. (2007b). A study of syntactic processing in aphasia II. Neurological aspects. Brain Lang., 101, 151–177.
Caramazza, A., Basili, A. G., Koller, J. J., & Berndt, R. S. (1980). An investigation of repetition and language processing in a case of conduction aphasia. Brain Lang., 14, 235–271.
Caramazza, A., Capitani, E., Rey, A., & Berndt, R. S. (2001). Agrammatic Broca’s aphasia is not associated with a single pattern of comprehension performance. Brain Lang., 76, 158–184.
Caramazza, A., & Zurif, E. R. (1976). Dissociation of algorithmic and heuristic processes in language comprehension: Evidence from aphasia. Brain Lang., 3, 572–582.
Chomsky, N. (1995). The minimalist program. Cambridge, MA: MIT Press.
Cooke, A., Zurif, E. B., DeVita, C., Alsop, D., Koenig, P., Detre, J., Gee, J., Piñango, M., Balogh, J., & Grossman, M. (2001). Neural basis for sentence comprehension: Grammatical and short-term memory components. Hum. Brain Mapp., 15, 80–94.
Cupples, L., & Inglis, A. L. (1993). When task demands induce “asyntactic” comprehension: A study of sentence interpretation in aphasia. Cogn. Neuropsychol., 10, 201–234.
Damasio, A. R., & Damasio, H. (1992). Brain and language. Sci. Am., September, 89–95.
DeDe, G., & Caplan, D. (2006). Factor analysis of syntactic deficits in aphasic comprehension. Aphasiology, 20, 123–135.
Dell, G. S., Schwartz, M. F., Martin, N., Saffran, E. M., & Gagnon, D. A. (1997). Lexical access in aphasic and nonaphasic speakers. Psychol. Rev., 104, 801–838.
Vanier, M., & Caplan, D. (1989). CT-Scan correlates of agrammatism. In L. Menn & L. K. Obler (Eds.), Agrammatic aphasia (pp. 37–114). New York: J. Benjamin.
Waters, G. S., Caplan, D., Stanzcak, L., & Alpert, N. (2003). Individual differences in rCBF correlates of syntactic processing in sentence comprehension: Effects of working memory and speed of processing. NeuroImage, 19, 101–112.
Zurif, E., Swinney, D., Prather, P., Solomon, J., & Bushell, C. (1993). An on-line analysis of syntactic processing in Broca’s and Wernicke’s aphasia. Brain Lang., 45, 448–464.
abstract Language and communication are about the exchange of meaning. A key feature of understanding and producing language is the construction of complex meaning from more elementary semantic building blocks. The functional characteristics of this semantic unification process are revealed by studies using event-related brain potentials. These studies have found that word meaning is assembled into compound meaning in not more than 500 ms. World knowledge, information about the speaker, co-occurring visual input, and discourse all have an immediate impact on semantic unification and trigger electrophysiological responses that are similar to those triggered by sentence-internal semantic information. Neuroimaging studies show that a network of brain areas, including the left inferior frontal gyrus, the left superior/middle temporal cortex, the left inferior parietal cortex, and, to a lesser extent, their right-hemisphere homologues are recruited to perform semantic unification.

Ultimately, language is the vehicle for the exchange of meaning between speaker and listener, between writer and reader. The unique feature of this vehicle is that it enables the assembly of complex expressions from simpler ones. The cognitive architecture necessary to realize this expressive power is tripartite in nature, with levels of form (sound, graphemes, manual gestures in sign language), syntax, and meaning as the core components of our language faculty (Jackendoff, 1999, 2002; Levelt, 1999). The principle of compositionality is often invoked to characterize the expressive power of language at the level of meaning. The most strict account of compositionality states that the meaning of an expression is a function of the meanings of its parts and the way they are syntactically combined (Fodor & Lepore, 2002; Heim & Kratzer, 1998; Partee, 1984). In this account, complex meanings are assembled bottom-up from the meanings of the lexical building blocks by means of the combinatorial machinery of syntax. This process is sometimes referred to as simple composition (Jackendoff, 1997). That this is not without problems can be seen in adjective-noun constructions such as “flat tire,” “flat beer,” “flat note,” and so on (Keenan, 1979). In all these cases, the meaning of “flat” is quite different and strongly context dependent. For this and other reasons, simple composition seems not to hold across all possible expressions in the language (for a discussion of this and other issues related to compositionality, see Baggio, van Lambalgen, & Hagoort, in press). One of the challenges for a cognitive neuroscience of language is to account for the functional and neuroanatomical underpinnings of online meaning composition.

In linking the requirements of the language system as instantiated in the finite and real-time machinery of the human brain to the broader domain of cognitive neuroscience, three functional components are considered to be the core of language processing (Hagoort, 2005). The first is the memory component, which refers to the different types of language information stored in long-term memory (the mental lexicon) and to how this information is retrieved (lexical access). The unification component refers to the integration of lexically retrieved information into a representation of multiword utterances, as well as the integration of meaning extracted from nonlinguistic modalities; this component is at the heart of the combinatorial nature of language. Finally, the control component relates language to action, and is invoked, for instance, when the correct target language has to be selected (in the case of bilingualism) or for handling turn taking during conversation. In principle, this MUC (memory, unification, control) framework applies to both language production and language comprehension, although details of their functional anatomy within each component will be different. The focus of this chapter is on the unification component.

Classically, psycholinguistic studies of unification have focused on syntactic analysis. However, as we saw, unification operations take place not only at the syntactic processing level. Combinatoriality is a hallmark of language across representational domains (cf. Jackendoff, 2002). Thus, also at the semantic and phonological levels, lexical elements are combined and integrated into larger structures (cf. Hagoort, 2005). In the remainder of this chapter, we will discuss semantic unification. Semantic unification refers to the integration of word meaning into an unfolding representation of the preceding context. This is more than the concatenation of individual word meanings, as is clear from the adjective-noun examples given earlier. In the interaction with the preceding sentence or discourse context, the appropriate meaning is selected or constructed, so that a coherent interpretation results.

peter hagoort Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen; Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
giosuè baggio and roel m. willems Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, Nijmegen, The Netherlands
Figure 56.1 Participants read the sentences as in the example in a visual-half-field presentation design. Context words were presented at central fixation, whereas sentence-final target words (e.g., “oranges”) were presented to the left or right of fixation. As illustrated, words presented to the left visual field (LVF) travel initially to the right hemisphere (RH) and vice versa. ERPs are shown here from a representative (right medial central) site as indicated. The response to target words presented to the RVF (left hemisphere) (shown on right), yielded the same pattern as that observed with central fixation: expected exemplars (solid line) elicited smaller N400s than did violations of either type, but within-category violations (dashed line) also elicited smaller N400s than between-category violations (dotted line). This pattern is indicative of a “predictive” strategy, in which semantic information associated with the expected item is preactivated in the course of processing the context information. The response to targets presented to the LVF/RH (shown on left), however, was qualitatively different: expected exemplars again elicited smaller N400s than violations, but the response to the two types of violations did not differ. This pattern is more consistent with a plausibility-based integrative strategy. Taken together, the results indicate that the hemispheres differ in how they use context to process semantic information in online language processing. (Reprinted with permission from Kutas & Federmeier, 2000.)
Figure 56.2 (A) Grand average ERPs for a representative electrode site (Cz) for correct condition (black line), world-knowledge violation (blue dotted line), and semantic violation (red dashed line). ERPs are time locked to the presentation of the critical words (underlined). Spline-interpolated isovoltage maps display the topographic distributions of the mean differences from 300 to 550 ms between semantic violation and control (left), and between world knowledge violation and control (right). (B) The common activation for semantic and world-knowledge violations compared to the correct condition, based on the results of a minimum-T-field conjunction analysis. Both violations resulted in a single common activation (P = 0.043, corrected) in the left inferior frontal gyrus. The crosshairs indicate the voxel of maximal activation. (Reprinted with permission from Hagoort, Hald, Bastiaansen, & Petersson, 2004.) (See color plate 71.)
integrating locative nouns when the aspect of the main verb is imperfective and the denoted location is a prototypical one given the verb’s semantics. In sentences with an imperfective, such as “The diver was snorkeling in the ocean/pond,” a larger N400 was evoked by pond than by ocean. This N400 effect was reduced if the aspect was perfective, as in “The diver had snorkeled in the ocean/pond.” Describing an event as ongoing using the imperfective aspect leads readers to construct a situation model in which locations and other dimensions of the action become relevant, while such dimensions are ignored if the action is viewed perfectively.
The imperfective leads also to expectations concerning the outcome of the event described. Baggio, van Lambalgen, and Hagoort (2008) investigated whether, in sentences like “The girl was writing a letter when her friend spilled coffee on the tablecloth/paper,” the goal state (a complete letter)
Figure 56.3 (A) Grand-average topographies displaying the mean amplitude difference between the ERPs evoked by the sentence-final verb when it terminated versus when it did not terminate the accomplishments in the progressive. Circles represent electrodes in a significant (P < 0.05) cluster. (B) Grand-average ERP waveforms from a representative site (F3) time-locked to the onset (0 ms) of the verb in terminated versus nonterminated accomplishments. Negative values are plotted upward. (C) Scatter plot displaying the correlation between the amplitude of the sustained anterior negativity elicited by terminated accomplishments and the frequency of negative responses in a button-press, probe-selection task (r = −0.415, T(22) = −2.140, P = 0.043). The mean difference of negative responses between terminated and nonterminated accomplishments is plotted on the abscissa. The mean amplitude difference at frontopolar and frontal electrodes between terminated and nonterminated accomplishments in the 500–700-ms interval following the onset of the sentence-final verb is plotted on the ordinate. (After Baggio, van Lambalgen, & Hagoort, 2008.) (See color plate 72.)
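The correlation statistics quoted in the caption of figure 56.3 can be cross-checked by hand: a Pearson correlation r over n observations corresponds to a test statistic t = r·sqrt(n − 2)/sqrt(1 − r²) with n − 2 degrees of freedom. The following Python sketch assumes that T(22) denotes a t statistic with 22 degrees of freedom; the function name `t_from_r` is purely illustrative and does not come from the original study.

```python
import math


def t_from_r(r: float, df: int) -> float:
    """Convert a Pearson correlation r into a t statistic with df = n - 2."""
    return r * math.sqrt(df) / math.sqrt(1.0 - r * r)


# r is taken from the figure caption; df = 22 is an assumed reading of "T(22)".
t_value = t_from_r(-0.415, 22)
print(f"t(22) = {t_value:.3f}")
```

Under these assumptions the result lands close to the reported value of −2.140; the small residual difference is what one would expect from r having been rounded to three decimals.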
formed (see figure 56.5). The semantics of meal and devour suggest a plausible thematic role assignment to meal: a theme instead of an agent as the syntax implies. In this case, semantic plausibility overrides syntactic constraints, and the verb devouring is presumably perceived as a morphosyntactic violation indexed by the P600. Conflicts between syntactic and semantic constraints might result in N400 or P600 effects depending on whether, respectively, the semantic or the syntactic constraints are the weakest. In cases where the input is anomalous because of a conflict between semantic and syntactic cues, the modus operandi of the system seems to obey a “loser takes all” principle. That is, if the semantic cues are stronger than the syntactic cues, the effect will appear at the level of syntactic unification (P600). Kuperberg (2007) argues that there are at least two neural routes subserving language comprehension: (1) a semantic, memory-based stream that provides elementary meanings as well as conceptual, categorical, and thematic relations between them; (2) a combinatorial stream that provides analyses based on morphosyntactic constraints and thematic roles as given in the input. The P600 reported by Kim and Osterhout (2005), for example, might be taken to suggest that semantic associations between words are the strongest constraints, for instance, because in this case they are taken into account earlier than the syntactic cues.

Conclusion
In general, ERP research on semantic processing has found that word meaning is very rapidly assembled into compound meaning. This statement holds for individual word meanings in the context of single words, sentences, or discourse. But it also holds for meaning that is extracted from pictures, co-speech gestures, or stereotypes inferred from speaker characteristics (Willems, Özyürek, & Hagoort, 2007, 2008; van Berkum et al., 2008). The effects of semantic processing are most often observed as modulations of the N400 amplitude. The topographic distribution of the N400 differs slightly for different stimulus types. It is more evenly distributed for auditory than for the visual N400. Pictures and co-speech gestures elicit a more frontal N400 than sentences without concomitant nonlinguistic information. This finding suggests that the set of neural generators contributing to the scalp-recorded N400 is not fully overlapping for the different types of meaningful stimuli. This result is consistent with the results from fMRI studies, showing both overlapping and distinct activations in connection to the various types of meaningful input (see the next section). Intracranial recordings and MEG studies indicate that the scalp-recorded N400 is caused by coordinated activity in a number of different brain areas, including the anterior inferotemporal cortex (McCarthy, Nobre, Bentin, & Spencer, 1995), the superior temporal cortex
Figure 56.5 At the interface between syntax and semantics. Grand-average ERPs recorded at three midline sites and six medial-lateral sites. All sentences are syntactically correct. (A) ERPs to passive control verbs (solid line) and thematic violation verbs (dashed line). (B) ERPs to active control verbs (solid line) and thematic violation verbs (dashed line). In both cases the inconsistency between grammatical roles and thematic role biases resulted in robust P600 effects. Onset of the critical verbs is indicated by the vertical bar. Each hash mark represents 100 ms. Positive voltage is plotted down. (Kim & Osterhout, 2005; reprinted with permission.)
(Dale et al., 2000; Helenius, Salmelin, Service, & Connolly, 1998; Halgren et al., 2002), and the left inferior frontal cortex (Halgren et al., 1994, 2002; Guillem, Rougier, & Claverie, 1999). Other ERP effects (e.g., anterior negativities) have also been observed in response to aspects of postlexical semantic processing. How they differ from the N400 effects in their functional characterization is an issue for further research.

The semantic unification network

In recent years a series of fMRI studies were aimed at identifying the semantic unification network. These studies either compared sentences containing semantic/pragmatic anomalies with their correct counterparts (Hagoort et al., 2004; Newman, Pancheva, Ozawa, Neville, & Ullman, 2001; Kuperberg et al., 2000, 2003; Kuperberg, Sitnikova, & Lakshmanan, 2008; Ni et al., 2000; Baumgaertner, Weiller, & Buchel, 2002; Kiehl, Laurens, & Liddle, 2002; Friederici, Ruschemeyer, Hahne, & Fiebach, 2003; Ruschemeyer, Zysset, & Friederici, 2006) or compared sentences with and without semantic ambiguities (Hoenig & Scheef, 2005; Rodd, Davis, & Johnsrude, 2005; Zempleni, Renken, Hoeks, Hoogduin, & Stowe, 2007; Davis et al., 2007). The most consistent finding across all these studies is the activation of the left inferior frontal cortex (LIFC), more particularly BA 47 and BA 45. In addition, the left superior and middle temporal cortex is often found to be activated (see figure 56.6 for an overview), as well as left inferior parietal cortex. For instance, Rodd and colleagues had subjects listen to English sentences such as “There were dates and pears in the fruit bowl” and compared the BOLD response of these sentences to the BOLD response of sentences such as “There was beer and cider on the kitchen shelf.” The crucial difference between these sentences is that the former contains two homophones, “dates” and “pears,” which, when presented auditorily, have more than one meaning. This is not the case for the words in the second sentence. The sentences with the lexical ambiguities led to increased activations in LIFC and in the left posterior middle/inferior temporal gyrus. In this experiment all materials were well-formed English sentences in which the ambiguity usually goes unnoticed. Nevertheless, the results were very similar to those obtained in experiments that used semantic anomalies. Areas involved in semantic unification were found to be sensitive to the increase in semantic unification load that resulted from the ambiguous words.
In short, the semantic unification network seems to include at least LIFC, left superior/middle temporal cortex, and the (left) inferior parietal cortex. To some degree, the right hemisphere homologues of these areas are also found to be activated (see figure 56.6). In the following subsections we will discuss the possible contributions of these regions to semantic unification.

The Multimodal Nature of Semantic Unification
An indication for the respective functional roles of the left frontal and temporal cortices in semantic unification comes from a few studies investigating semantic unification of multimodal information with language. Using fMRI, Willems and colleagues assessed the neural integration of semantic information from spoken words and from co-speech gestures into a preceding sentence context (Willems et al., 2007). Spoken sentences were presented in which a critical word was accompanied by a co-speech gesture. Either the word or the gesture could be semantically incongruous with respect to the previous sentence context. Both an incongruous word and an incongruous gesture led to increased activation in LIFC as compared to congruous words and gestures (see Willems et al., 2008, for a similar finding with pictures of objects). Interestingly, the activation of the left posterior STS was increased by an incongruous spoken word but not by an incongruous hand gesture. The latter resulted in a specific increase in dorsal premotor cortex (Willems et al., 2007). This finding suggests that activation increases in left posterior temporal cortex are triggered most strongly by processes involving the retrieval of lexical-semantic information. LIFC, however, is a key node in the semantic unification network, unifying semantic information from different modalities.
From these findings it seems that semantic unification is realized in a dynamic interplay between LIFC as a multimodal unification site on the one hand, and modality-specific areas on the other hand.

Semantic Unification Beyond the Sentence Level
Recently a few studies have set out to investigate the neural networks involved in semantic processing at the level of multisentence utterances, such as short stories. Besides the network that is also activated to semantic unification at the sentence level, story comprehension involves activation of dorsomedial prefrontal cortex and, presumably, right inferior frontal cortex. In a recent meta-analysis, Ferstl and colleagues report the consistent involvement of medial prefrontal cortex, left STS/MTG, and LIFC when participants process coherent text as compared to sentences that do not form a coherent story or as compared to word lists (Ferstl, Neumann, Bogler, & von Cramon, 2008). In a variant of this line of research, Kuperberg, Lakshmanan, Caplan, and Holcomb (2006) presented participants with sentence quartets in which the relation of the last
Table 56.1
Involvement of the inferior frontal cortex in fMRI studies of sentence comprehension employing semantic anomalies or semantic ambiguities.
The table shows the studies that were used for the overview in figure 56.6, a brief description of the contrast that was employed in
each of the studies, the reported coordinates of the local maxima in inferior frontal cortex in MNI space, and a verbal description of the
location of the local maxima. When necessary, Talairach coordinates were converted to MNI space using the transformation suggested
by Brett (http://imaging.mrc-cbu.cam.ac.uk/imaging/MniTalairach). Note that in computing the mean coordinates the findings from
Kuperberg and colleagues (2003) and Ni and colleagues (2000) were not taken into consideration, since no coordinates were reported in
these studies.
Coordinates
Study Comparison x y z (MNI) Region
Baumgaertner et al., 2002 Sem. incongruent > congruent −51 36 −6 Left IFG
Davis et al., 2007 High ambiguity > low ambiguity −40 24 18 Left IFG
−48 6 34
−40 18 24
46 36 18 Right IFG
Friederici et al., 2003 Sem. incongruent > congruent No activation —
Hagoort et al., 2004 Sem. incongruent > congruent ∩ World knowledge incongruent > congruent −44 30 8 Left IFG
Hoenig & Scheef, 2005 Sem. incongruent > congruent −50 18 −14 Left IFG
−50 43 11
Kiehl et al., 2002 Sem. incongruent > congruent −48 32 4 Left IFG/ant. temporal
36 32 −16 Right IFG/ant. temporal
Kuperberg et al., 2000 Sem. incongruent > congruent No activation —
Pragm. incongruent > congruent No activation —
Kuperberg et al., 2003 Pragm. incongruent > congruent (No coordinates) Left IFG
Kuperberg et al., 2008 Pragmatic incongruent > congruent −43 25 −10 Left IFG
Sem. incongruent > congruent −49 4 10 Left IFG
29 19 5 Right IFG
Newman et al., 2001 Sem. incongruent > congruent −50 34 5 Left IFG
Ni et al., 2000 Sem. incongruence detection > tone pitch discrimination (No coordinates) Left IFG
Right IFG
Oddball paradigm with semantically incongruent sentences (No coordinates) Left IFG
(No coordinates) Right IFG
Rodd et al., 2005 High ambiguity > low ambiguity −50 30 20 Left IFG
−56 16 22 Left IFG
−42 14 32 Left IFS
36 26 4 Right IFG
50 36 16 Right IFG
Rueschemeyer et al., 2006 Sem. incongruent > synt. incongruent −50 30 15 Left ant. IFG
Willems et al., 2007 Sem. incongruent > congruent −43 11 27 Left IFS
Willems et al., 2008 Sem. incongruent > congruent −45 14 27 Left IFS
Zempleni et al., 2007 Subordinate meaning > dominant meaning −48 26 20 Left IFG
−52 16 26 Left IFG
34 20 −10 Right IFG
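The caption of table 56.1 mentions converting Talairach coordinates to MNI space with the transformation suggested by Brett. A minimal Python sketch of such a piecewise affine conversion is given below. The numeric constants are an assumption: they are the values commonly circulated for Brett's approximation (one matrix for points at or above the AC plane, another below it) and should be verified against the cited page before any real use. The function names are illustrative, not part of any library.

```python
# Piecewise affine approximation attributed to Brett.
# ASSUMPTION: constants as commonly circulated; verify against the cited source.
X_SCALE = 0.9900
UPPER = (0.9688, 0.0460, -0.0485, 0.9189)  # applied when z >= 0
LOWER = (0.9688, 0.0420, -0.0485, 0.8390)  # applied when z < 0


def mni_to_talairach(x, y, z):
    """Forward approximation: MNI coordinates to Talairach coordinates."""
    a, b, c, d = UPPER if z >= 0 else LOWER
    return (X_SCALE * x, a * y + b * z, c * y + d * z)


def talairach_to_mni(x, y, z):
    """Inverse approximation: invert the 2x2 (y, z) block analytically."""
    a, b, c, d = UPPER if z >= 0 else LOWER
    det = a * d - b * c
    return (x / X_SCALE, (d * y - b * z) / det, (a * z - c * y) / det)


# Round trip on an arbitrary example point well above the AC plane.
mni_point = talairach_to_mni(-44.0, 30.0, 8.0)
round_trip = mni_to_talairach(*mni_point)
print(mni_point, round_trip)
```

The round trip recovers the input only when both directions select the same branch, which holds for points clearly above or clearly below z = 0.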
Table 56.2
Coordinates
Study Comparison x y z (MNI) Region
Baumgaertner et al., 2002 Sem. incongruent > congruent No activation —
Davis et al., 2007 High ambiguity > low ambiguity −50 −44 −12 Left ITG
−54 −60 −2
Friederici et al., 2003 Sem. incongruent > congruent −60 −42 20 Left STG
63 −40 20 Right STG
58 −24 13 Right STG
Hagoort et al., 2004 Sem. incongruent > congruent No activation —
Hoenig & Scheef, 2005 Sem. incongruent > congruent No activation —
Kiehl et al., 2002 Sem. incongruent > congruent No activation —
Kuperberg et al., 2000 Sem. incongruent > congruent 43 −11 −7 Right MTG
49 −17 4 Right STG
Pragm. incongruent > congruent −49 −31 9 Left STG
Kuperberg et al., 2003 Pragm. incongruent > congruent (No coordinates) Left STS
Kuperberg et al., 2008 Pragm. violations > correct sentences −27 −28 −19 Left ant. med. temporal cortex
Sem. incongruent > congruent −53 −20 −1 Left STG
58 −19 3 Right STG
Newman et al., 2001 Sem. incongruent > congruent 70 −36 −15 Right MTG
Ni et al., 2000 Sem. incongruence detection > tone pitch discrimination (No coordinates) Left STG/MTG
(No coordinates) Right STG/MTG
Oddball paradigm with semantically incongruent sentences (No coordinates) Left pSTG
Rodd et al., 2005 High ambiguity > low ambiguity −52 −50 −10 Left pITG
−58 −8 −6 Left STG
Rueschemeyer et al., 2006 Sem. incongruent > synt. incongruent — —
Willems et al., 2007 Sem. incongruent > congruent −53 −52 2 Left STS
Willems et al., 2008 Sem. incongruent > congruent −53 −35 −3 Left STS
Zempleni et al., 2007 Subordinate meaning > dominant meaning −50 −48 −12 Left ITG/MTG
56 −34 −16 Right ITG/MTG
Table 56.3
Summary of the activations in the studies used for the overview in figure 56.6.
The coordinates from tables 56.1 and 56.2 were used. Table 56.3 specifies the mean coordinates for left and right inferior frontal and temporal cortices, the standard deviation in the x, y, and z directions in millimeters, the mean Euclidean distance of the local maxima to the mean coordinates, the number of maxima that were reported, and the number of studies that report maxima in that region. Note that the number of maxima is higher than the number of studies, since several studies report more than one maximum. Note that the findings from Kuperberg and colleagues (2003) and Ni and colleagues (2000) were not used in computing the mean coordinates, since no coordinates were reported in these studies.
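The summary measures named in the caption of table 56.3 (mean coordinate, per-axis standard deviation, and mean Euclidean distance of the maxima to the mean) can be sketched in a few lines of Python. The four left inferior frontal maxima below are taken from table 56.1 purely as an illustrative subset, so the output is not expected to match the published table. Using the sample standard deviation is also an assumption, since the chapter does not say which estimator was used.

```python
import math
import statistics

# Illustrative subset of left IFG maxima from table 56.1 (MNI, in mm).
left_ifg_maxima = [
    (-51, 36, -6),   # Baumgaertner et al., 2002
    (-44, 30, 8),    # Hagoort et al., 2004
    (-50, 34, 5),    # Newman et al., 2001
    (-43, 11, 27),   # Willems et al., 2007
]


def summarize(maxima):
    """Return (mean xyz, per-axis sample SD, mean distance to the mean)."""
    axes = list(zip(*maxima))  # regroup as x values, y values, z values
    mean = tuple(statistics.fmean(axis) for axis in axes)
    sd = tuple(statistics.stdev(axis) for axis in axes)
    mean_dist = statistics.fmean(math.dist(p, mean) for p in maxima)
    return mean, sd, mean_dist


mean_xyz, sd_xyz, mean_distance = summarize(left_ifg_maxima)
print("mean:", mean_xyz)
print("SD:", sd_xyz)
print("mean Euclidean distance:", mean_distance)
```

The mean distance to the centroid gives a single spread figure per region, which complements the per-axis standard deviations when maxima scatter obliquely to the coordinate axes.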
sentence to the previous story context was manipulated. The less related sentences required an extra causal inference in order to make sense of the story. It was found that less related sentences (which evoked more inferencing) led to stronger activations in left and right IFC, left MTG, left middle frontal gyrus, and bilateral medial prefrontal cortex (Kuperberg et al.; see Hasson, Nusbaum, & Small, 2007, for a related result). These and other studies (e.g., St George, Kutas, Martinez, & Sereno, 1999; Xu, Kemeny, Park, Frattali, & Braun, 2005; Sieborger, Ferstl, & von Cramon, 2007) suggest that LIFC and left superior/middle temporal cortex are also important for unification of information beyond the sentence level. It is interesting to note that the medial prefrontal cortex, which is found activated for discourse but not for sentence-level processing, has been implicated in so-called mentalizing tasks, requiring the observer to take the perspective of someone else (Buckner, Andrews-Hanna, & Schacter, 2008; Frith & Frith, 2006). According to Mason and Just, this domain-general area is recruited in discourse processing for the sake of interpreting a protagonist’s or agent’s perspective (Mason & Just, 2006). In addition, right-hemisphere regions are sometimes but not consistently reported in the context of discourse processing (Maguire, Frith, & Morris, 1999; St George et al., 1999; Ferstl et al., 2008) (see Ferstl et al., 2008; Mason & Just, 2006, for extensive reviews). Some studies find that the temporal poles may be related to successful integration during story comprehension (Fletcher et al., 1995; Maguire et al.). The studies that report these activations are mostly done using PET. It is hard to assess the consistency of temporal pole activation during story/text comprehension because of the susceptibility to artifacts that these regions often suffer from in fMRI studies (but see Xu et al.; Ferstl et al.).

Controlled Processing and Selection Accounts for LIFC
Although LIFC (including Broca’s area) has traditionally been construed as a language area, there is a wealth of recent neuroimaging data suggesting that its role extends beyond the language domain. Several authors have therefore argued that LIFC function is best characterized as “controlled retrieval” or “(semantic) selection” (Thompson-Schill, D’Esposito, Aguirre, & Farah, 1997; Wagner, Pare-Blagoev, Clark, & Poldrack, 2001; Badre, Poldrack, Pare-Blagoev, Insler, & Wagner, 2005; Gold, Balota, Kirchoff, & Buckner, 2005; Moss et al., 2005; Thompson-Schill, Bedny, & Goldberg, 2005). For instance, Thompson-Schill and colleagues showed that LIFC was more strongly activated in a verb-generation task when the noun that served as the cue allowed for many different verb responses, as opposed to nouns that are reliably related to only one or a few verbs (Thompson-Schill et al., 1997). In response to the noun cue “scissors,” for example, most participants generate the verb “to cut,” whereas the noun “wheel” triggers a more diverse set of responses. On the basis of these and other findings, it was argued that LIFC guides semantic selection among competing alternatives, with higher activation when there are more competitors.
How does the selection account of LIFC function relate to the unification account? As is discussed in more detail elsewhere, unification often implies selection (Hagoort, 2005). For instance, in the study by Rodd and colleagues described earlier, increased activation in LIFC is most likely due to increased selection demands in reaction to sentences with ambiguous words. Selection is often, but not always, a prerequisite for unification. Unification with or without selection is a core feature of language processing. During natural language comprehension, information has to be kept in working memory for a certain period of time, and incoming
semantic unification operations are under top-down control of left, and in the case of discourse, also right inferior frontal cortex. This contribution modulates activations of lexical information in memory as represented by the left superior and middle temporal cortex, presumably with additional support for unification operations in left inferior parietal areas (e.g., angular gyrus). A more precise account of the individual contributions of these core nodes in the unification network awaits further research.
Acknowledgments We thank Jos van Berkum, Karl-Magnus Petersson, and the Neurocognition of Language Ph.D.s for their comments on an earlier version of this chapter.
REFERENCES
Badre, D., Poldrack, R. A., Pare-Blagoev, E. J., Insler, R. Z., & Wagner, A. D. (2005). Dissociable controlled retrieval and generalized selection mechanisms in ventrolateral prefrontal cortex. Neuron, 47, 907–918.
Baggio, G., & van Lambalgen, M. (2007). The processing consequences of the imperfective paradox. J. Semantics, 24, 307–330.
Baggio, G., van Lambalgen, M., & Hagoort, P. (2008). Computing and recomputing discourse models: An ERP study. J. Mem. Lang., 59, 36–53.
Baggio, G., van Lambalgen, M., & Hagoort, P. (in press). The processing consequences of compositionality. In W. Hinzen, E. Machery, & M. Werning (Eds.), The Oxford handbook of compositionality. Oxford, UK: Oxford University Press.
Baumgaertner, C., Weiller, C., & Buchel, C. (2002). Event-related fMRI reveals cortical sites involved in contextual sentence integration. NeuroImage, 16, 736–745.
Beauchamp, M. S., Lee, K. E., Argall, B. D., & Martin, A. (2004). Integration of auditory and visual information about objects in superior temporal sulcus. Neuron, 41, 809–823.
Bookheimer, S. (2002). Functional MRI of language: New approaches to understanding the cortical organization of semantic processing. Annu. Rev. Neurosci., 25, 151–188.
Brown, C., & Hagoort, P. (1993). The processing nature of the N400: Evidence from masked priming. J. Cogn. Neurosci., 5, 34–44.
Buckner, R. L., Andrews-Hanna, J. R., & Schacter, D. L. (2008). The brain’s default network: Anatomy, function, and relevance to disease. Ann. NY Acad. Sci., 1124, 1–38.
Calvert, G. A., Campbell, R., & Brammer, M. J. (2000). Evidence from functional magnetic resonance imaging of crossmodal binding in the human heteromodal cortex. Curr. Biol., 10, 649–657.
Carreiras, M., Garnham, A., Oakhill, J., & Cain, K. (1996). The use of stereotypical gender information in constructing a mental model: Evidence from English and Spanish. Q. J. Exp. Psychol. [A], 49, 639–663.
Coulson, S., King, J. W., & Kutas, M. (1998). Expect the unexpected: Event-related brain response to morphosyntactic violations. Lang. Cogn. Process., 13, 21–58.
Culicover, P., & Jackendoff, R. (2005). Simpler syntax. Oxford, UK: Oxford University Press.
Cutler, A., & Clifton, C. E. (1999). Comprehending spoken language: A blueprint of the listener. In C. M. Brown and P. Hagoort (Eds.), The neurocognition of language (pp. 123–166). Oxford, UK: Oxford University Press.
Dale, A. M., Liu, A. K., Fischl, B. R., Buckner, R. L., Belliveau, J. W., Lewine, J. D., & Halgren, E. (2000). Dynamic statistical parametric mapping: Combining fMRI and MEG for high-resolution imaging of cortical activity. Neuron, 26, 55–67.
Davis, M. H., Coleman, M. R., Absalom, A. R., Rodd, J. M., Johnsrude, I. S., Matta, B. F., Owen, A. M., & Menon, D. K. (2007). Dissociating speech perception and comprehension at reduced levels of awareness. Proc. Natl. Acad. Sci. USA, 104, 16032–16037.
DeLong, K. A., Urbach, T. P., & Kutas, M. (2005). Probabilistic word pre-activation during language comprehension inferred from electrical brain activity. Nat. Neurosci., 8, 1117–1121.
Federmeier, K. D. (2007). Thinking ahead: The role and roots of prediction in language comprehension. Psychophysiology, 44, 491–505.
Federmeier, K. D., & Kutas, M. (1999). A rose by any other name: Long-term memory structure and sentence processing. J. Mem. Lang., 41, 469–495.
Ferretti, T. R., Kutas, M., & McRae, K. (2007). Verb aspect and the activation of event knowledge. J. Exp. Psychol. Learn. Mem. Cogn., 33, 182–196.
Ferstl, E. C., Neumann, J., Bogler, C., & von Cramon, D. Y. (2008). The extended language network: A meta-analysis of neuroimaging studies on text comprehension. Hum. Brain Mapp., 29, 581–593.
Fletcher, P. C., Happe, F., Frith, U., Baker, S. C., Dolan, R. J., Frackowiak, R. S., & Frith, C. D. (1995). Other minds in the brain: A functional imaging study of “theory of mind” in story comprehension. Cognition, 57, 109–128.
Fodor, J., & Lepore, E. (2002). The compositionality papers. Oxford, UK: Oxford University Press.
Frege, G. (1892). Über Sinn und Bedeutung. Zeitschrift für Philosophie und philosophische Kritik, 100, 25–50.
Friederici, A. D., Ruschemeyer, S. A., Hahne, A., & Fiebach, C. J. (2003). The role of left inferior frontal and superior temporal cortex in sentence comprehension: Localizing syntactic and semantic processes. Cereb. Cortex, 13, 170–177.
Frith, C. D., & Frith, U. (2006). The neural basis of mentalizing. Neuron, 50, 531–534.
Gold, B. T., Balota, D. A., Kirchhoff, B. A., & Buckner, R. L. (2005). Common and dissociable activation patterns associated with controlled semantic and phonological processing: Evidence from fMRI adaptation. Cereb. Cortex, 15, 1438–1450.
Guillem, F., Rougier, A., & Claverie, B. (1999). Short- and long-delay intracranial ERP repetition effects dissociate memory systems in the human brain. J. Cogn. Neurosci., 11, 437–458.
Hagoort, P. (2005). On Broca, brain, and binding: A new framework. Trends Cogn. Sci., 9, 416–423.
Hagoort, P., & Brown, C. (1994). Brain responses to lexical ambiguity resolution and parsing. In C. Clifton, Jr., L. Frazier, & K. Rayner (Eds.), Perspectives on sentence processing (pp. 45–81). Hillsdale, NJ: Lawrence Erlbaum.
Hagoort, P., Brown, C., & Osterhout, L. (1999). The neurocognition of syntactic processing. In C. M. Brown and P. Hagoort (Eds.), The neurocognition of language (pp. 273–317). Oxford, UK: Oxford University Press.
Hagoort, P., Hald, L., Bastiaansen, M., & Petersson, K. M. (2004). Integration of word meaning and world knowledge in language comprehension. Science, 304, 438–441.
Halgren, E., Baudena, P., Heit, G., Clarke, J. M., Marinkovic, K., & Chauvel, P. (1994). Spatio-temporal stages in face and word processing. 2. Depth-recorded potentials in the
Partee, B. H. (1984). Compositionality. In F. Veltman & F. Landman (Eds.), Varieties of formal semantics. Dordrecht: Foris.
Partee, B. H., Ter Meulen, A., & Wall, R. E. (1990). Mathematical methods in linguistics. Dordrecht: Kluwer.
Pylkkänen, L., & McElree, B. (2007). An MEG study of silent meaning. J. Cogn. Neurosci., 19, 1905–1921.
Rodd, J. M., Davis, M. H., & Johnsrude, I. S. (2005). The neural mechanism of speech comprehension: fMRI studies of semantic ambiguity. Cereb. Cortex, 15, 1261–1269.
Ruschemeyer, S. A., Fiebach, C. J., Kempe, V., & Friederici, A. D. (2005). Processing lexical semantic and syntactic information in first and second language: fMRI evidence from German and Russian. Hum. Brain Mapp., 25, 266–286.
Ruschemeyer, S. A., Zysset, S., & Friederici, A. D. (2006). Native and non-native reading of sentences: An fMRI experiment. NeuroImage, 31, 354–365.
Seuren, P. A. M. (1998). Western linguistics: An historical introduction. Oxford, UK: Blackwell.
Sieborger, F. T., Ferstl, E. C., & von Cramon, D. Y. (2007). Making sense of nonsense: An fMRI study of task induced inference processes during discourse comprehension. Brain Res., 1166, 77–91.
St George, M., Kutas, M., Martinez, A., & Sereno, M. I. (1999). Semantic integration in reading: Engagement of the right hemisphere during discourse processing. Brain, 122, 1317–1325.
Sturt, P. (2007). Semantic re-interpretation and garden path recovery. Cognition, 105, 477–488.
Tesink, C. M. J. Y., Petersson, K. M., van Berkum, J. J. A., van den Brink, D., Buitelaar, J. K., & Hagoort, P. (in press). Unification of speaker and meaning in language comprehension: An fMRI study. J. Cogn. Neurosci.
Thompson-Schill, S. L., Bedny, M., & Goldberg, R. F. (2005). The frontal lobes and the regulation of mental activity. Curr. Opin. Neurobiol., 15, 219–224.
Thompson-Schill, S. L., D’Esposito, M., Aguirre, G. K., & Farah, M. J. (1997). Role of left inferior prefrontal cortex in retrieval of semantic knowledge: A reevaluation. Proc. Natl. Acad. Sci. USA, 94, 14792–14797.
Traxler, M., McElree, B., Williams, R., & Pickering, M. (2005). Context effects in coercion: Evidence from eye movements. J. Mem. Lang., 53, 1–25.
Traxler, M., Pickering, M., & McElree, B. (2002). Coercion in sentence processing: Evidence from eye movements and self-paced reading. J. Mem. Lang., 47, 530–547.
Van Atteveldt, N. M., Formisano, E., Blomert, L., & Goebel, R. (2007). The effect of temporal asynchrony on the multisensory integration of letters and speech sounds. Cereb. Cortex, 17, 962–974.
Van Atteveldt, N., Formisano, E., Goebel, R., & Blomert, L. (2004). Integration of letters and speech sounds in the human brain. Neuron, 43, 271–282.
van Berkum, J. J. A., Hagoort, P., & Brown, C. M. (1999). Semantic integration in sentences and discourse: Evidence from the N400. J. Cogn. Neurosci., 11, 657–671.
van Berkum, J. J. A., van den Brink, D., Tesink, C. M. J. Y., Kos, M., & Hagoort, P. (2008). The neural integration of speaker and message. J. Cogn. Neurosci., 20, 580–591.
Vigneau, M., Beaucousin, V., Herve, P. Y., Duffau, H., Crivello, F., Houde, O., Mazoyer, B., & Tzourio-Mazoyer, N. (2006). Meta-analyzing left hemisphere language areas: Phonology, semantics, and sentence processing. NeuroImage, 30, 1414–1432.
Wagner, A. D., Pare-Blagoev, E. J., Clark, J., & Poldrack, R. A. (2001). Recovering meaning: Left prefrontal cortex guides controlled semantic retrieval. Neuron, 31, 329–338.
Willems, R. M., Özyürek, A., & Hagoort, P. (2007). When language meets action: The neural integration of gesture and speech. Cereb. Cortex, 17, 2322–2333.
Willems, R. M., Özyürek, A., & Hagoort, P. (2008). Seeing and hearing meaning: Event-related potential and functional magnetic resonance imaging evidence of word versus picture integration into a sentence context. J. Cogn. Neurosci., 20, 1235–1249.
Xu, J., Kemeny, S., Park, G., Frattali, C., & Braun, A. (2005). Language in context: Emergent features of word, sentence, and narrative comprehension. NeuroImage, 25, 1002–1015.
Zempleni, M. Z., Renken, R., Hoeks, J. C., Hoogduin, J. M., & Stowe, L. A. (2007). Semantic ambiguity processing in sentence context: Evidence from event-related fMRI. NeuroImage, 34, 1270–1279.
Abstract Infants learn language(s) with apparent ease, and the tools of modern neuroscience are providing valuable information about the mechanisms that underlie this capacity. Noninvasive, safe brain technologies have now been proven feasible for use with children starting at birth, and studies in the past decade at the phonetic, word, and sentence levels have produced an explosion in neuroscience research examining young children’s language processing. At all levels of language, the neural signatures of learning can be documented at remarkably early points in development. Importantly, both for theory and for the eventual application of this work to the diagnosis and treatment of developmental disabilities, early brain measures of infants’ responses to phonetic differences are reflected in infants’ language abilities in the second and third years of life. Developmental neuroscience studies using language are beginning to answer questions about the origins and development of the human language faculty.
Patricia K. Kuhl, Institute for Learning and Brain Sciences, University of Washington, Seattle, Washington
Infants begin life with the capacity to detect phonetic distinctions across all languages, and they develop a language-specific phonetic capacity and acquire early words before the end of the first year (Jusczyk, 1997; Kuhl, Conboy, Padden, Rivera-Gaxiola, & Nelson, 2008; Werker & Curtin, 2005). A major question remains, however: Do infants’ initial capacities and their ability to learn effortlessly from exposure to language reflect domain-specific mechanisms that operate exclusively on linguistic data, or more general learning mechanisms? In a classic debate, a nativist and a learning theorist took very different positions regarding the innate state and the nature of language learning. Noam Chomsky (1959) argued that infants’ innate capacities and the manner in which language was acquired were unique to language and to humans, while B. F. Skinner (1957) asserted that neither the initial state nor the manner in which language was learned was unique.
The tools of modern developmental neuroscience are bringing us closer to addressing these issues and may one day help resolve the classic debate about the interaction between biology and culture that produces the human capacity for language. Neuroscientific studies will also provide valuable information that may allow us to diagnose developmental disabilities at a stage in development when interventions are more likely to improve children’s lives.
Remarkable progress has been made in the last decade in scientists’ abilities to examine the young infant brain while its owner processes language, reacts to social stimuli such as faces, listens to music, or hears the voice of the child’s mother. This review focuses on the new techniques and what they are teaching us about the earliest phases of language acquisition.
Neuroscientific studies on infants and young children now extend from phonemes to words to sentences. These studies fuel the hope that an understanding of development in typically developing children and in children with developmental disabilities will be achieved. Studies show that exposure to language in the first year of life begins to set the neural architecture in a way that vaults the infant forward in the acquisition of language. The goal in this chapter is to explore what we have learned about the neural mechanisms that underlie language in typically developing children, and how they differ in children with developmental disabilities that involve language, such as autism.
Neuroscience techniques measure language processing in the young brain
Rapid advances have been made in the development of noninvasive techniques to examine language processing in infants and young children (figure 57.1). These methods include electroencephalography (EEG)/event-related potentials (ERPs), magnetoencephalography (MEG), functional magnetic resonance imaging (fMRI), and near-infrared spectroscopy (NIRS). ERPs have been widely used to study speech and language processing in infants and young children (for reviews see Conboy, Rivera-Gaxiola, Silva-Pereyra, & Kuhl, 2008; Friederici, 2005; Kuhl, 2004; Kuhl
& Rivera-Gaxiola, 2008). Event-related potentials (ERPs), a part of the EEG, reflect electrical activity that is time-locked to the presentation of a specific sensory stimulus (e.g., syllables, words) or a cognitive process (recognition of a semantic violation within a sentence or phrase). Sensors placed on a child’s scalp measure the activity of neural networks firing in a coordinated and synchronous fashion in open field configurations, and they detect voltage changes occurring as a function of cortical neural activity. ERPs provide precise time resolution (milliseconds), making them well suited for studying the high-speed and temporally ordered structure of human speech. ERP experiments can also be carried out in populations who, because of age or cognitive impairment, cannot provide overt responses. Spatial resolution of the source of brain activation is, however, limited.
Magnetoencephalography (MEG) is another brain-imaging technique that tracks activity in the brain with exquisite temporal resolution. MEG (as well as EEG) techniques are safe and noiseless, allowing data collection while infants listen to language in a quiet environment. The SQUID (superconducting quantum interference device) sensors located within the MEG helmet measure the minute magnetic fields associated with electrical currents that are produced by the brain when it is performing sensory, motor, or cognitive tasks. MEG allows precise localization of the neural currents responsible for the sources of the magnetic fields, and it has been used to test phonetic discrimination in adults (Kujala, Alho, Service, Ilmoniemi, & Connolly, 2004).
Recently a genuine advance was documented by the first MEG studies testing awake infants in the first year of life (Bosseler et al., 2008; Cheour et al., 2004; Imada et al., 2006, 2008). In these studies, the use of sophisticated head-tracking software and hardware allows correction for infants’ head movements, so infants are free to move comfortably during the tests. MEG studies allow whole-brain imaging during speech discrimination, which is now providing data on the location and timing of brain activation in critical regions (Broca’s and Wernicke’s) involved in language acquisition (see Bosseler et al.; Imada et al., 2006, 2008).
MEG and/or EEG can be combined with magnetic resonance imaging (MRI), a technique that provides static structural/anatomical pictures of the brain. Using mathematical modeling methods, the specific brain regions that produce the magnetic or electrical signals can be identified in the human brain with high spatial resolution (millimeter). Structural MRIs allow measurement of anatomical changes in white and gray matter in specific brain regions across the life span. MRIs can be superimposed on the physiological activity detected by MEG or EEG to refine the spatial localization of brain activities for individual participants.
Functional magnetic resonance imaging (fMRI) is now considered a standard method of neuroimaging in adults because it provides high-spatial-resolution maps of neural activity across the entire brain (e.g., Gernsbacher & Kachak, 2003). However, unlike EEG and MEG, fMRI does not directly detect neural activity, but rather the changes in blood oxygenation that occur in response to neural activation/firing. Neural events happen in milliseconds, while the blood-oxygenation changes that they induce are spread out over several seconds, thereby severely limiting fMRI’s temporal resolution. Adult studies are employing new fMRI data-analysis methods for speech stimuli and correlating the fMRI data to behavioral data. For example, Raizada, Tsao, Liu, and Kuhl (2009), using a multivariate pattern classifier, showed that English, but not Japanese, speakers exhibited distinct neural activity patterns for /ra/ and /la/ in primary auditory cortex. Subjects who behaviorally distinguished the sounds most accurately also had the most distinct neural activity patterns.
Functional MRI techniques would be very valuable with infants, but few studies have attempted fMRI with infants (Dehaene-Lambertz, Dehaene, & Hertz-Pannier, 2002; Dehaene-Lambertz, Hertz-Pannier, Dubois, Meriaux, & Roche, 2006). The technique requires subjects to be perfectly still, and the MRI device produces loud sounds, making it necessary to shield infants’ ears while delivering language stimuli.
Near-infrared spectroscopy (NIRS) also measures cerebral hemodynamic responses in relation to neural activity, but employs the absorption of light, which is sensitive to the concentration of hemoglobin, to measure activation (Aslin & Mehler, 2005). NIRS utilizes near-infrared light to measure changes in blood oxy- and deoxyhemoglobin concentrations in the brain as well as total blood volume changes in various regions of the cerebral cortex. The NIRS system can determine where and how active the specific regions of the brain are by continuously monitoring blood hemoglobin levels, and reports have begun to appear on infants in the first two years of life (Bortfeld, Wruck, & Boas, 2007; Homae, Watanabe, Nakano, Asakawa, & Taga, 2006; Pena et al., 2003; Taga & Asakawa, 2007). Homae and colleagues, for example, provided data using NIRS that suggest that sleeping 3-month-old infants process the prosodic information in sentences in the right temporoparietal region. As with other techniques relying on hemodynamic changes such as fMRI, NIRS does not provide good temporal resolution. One of the most important potential advantages of this technique is that coregistration with other testing techniques such as EEG and MEG may be possible.
The use of these techniques with infants and young children has produced an explosion of neuroscience studies using stimuli that tap all levels of language: phoneme, word, and sentence. In the next sections, examples of recent findings will be described to give a sense of the promise of neuroscience for the study of language acquisition in children.
Neural signatures of phonetic learning in typically developing children
Perception of the basic units of speech (the vowels and consonants that make up words) is one of the most widely studied behaviors in infancy and adulthood, and studies using ERPs have advanced our knowledge of development and learning.
Behavioral studies demonstrated that at birth young infants exhibit a universal capacity to detect differences between phonetic contrasts used in the world’s languages (Eimas, Siqueland, Jusczyk, & Vigorito, 1971). We have referred to this as Phase 1 in development (Kuhl et al., 2008). This universal capacity is dramatically altered by language experience starting as early as 6 months for vowels and by 10 months for consonants: over time, native language
Figure 57.2 (A) A 7.5-month-old infant wearing an ERP electrocap. (B) Infant ERP waveforms at one sensor location (CZ) for one infant are shown in response to native (English) and nonnative (Mandarin) phonetic contrasts at 7.5 months. The mismatch negativity (MMN) is obtained by subtracting the standard waveform (black) from the deviant waveform (gray). This infant’s response suggests that native-language learning has begun because the MMN in response to the native English contrast is considerably stronger (more negative) than that to the nonnative contrast. (C) Hierarchical linear growth modeling of vocabulary growth between 14 and 30 months is shown for two groups of children, those whose MMN values at 7.5 months indicated better discrimination (−1 SD) and those whose MMN values indicated poorer discrimination (+1 SD). Vocabulary growth was significantly faster for infants with better MMN phonetic discrimination for the native contrast at 7.5 months of age (C, left). In contrast, infants with better discrimination for the nonnative contrasts (−1 SD) as indicated by MMN at 7.5 months showed slower vocabulary growth (C, right). Both contrasts predict vocabulary growth, but the effects of better discrimination are reversed for the native and nonnative contrasts. (From Kuhl & Rivera-Gaxiola, 2008.)
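The subtraction described in the figure caption (deviant waveform minus standard waveform) can be sketched in a few lines. This is only an illustration, not the analysis pipeline used in the studies cited: the waveform values are made up, and the helper names `difference_wave` and `mean_amplitude` and the sample indices of the analysis window are hypothetical.

```python
from statistics import mean


def difference_wave(deviant, standard):
    """Return the deviant-minus-standard difference wave, sample by sample."""
    if len(deviant) != len(standard):
        raise ValueError("waveforms must have the same number of samples")
    return [d - s for d, s in zip(deviant, standard)]


def mean_amplitude(wave, start, stop):
    """Mean amplitude of a waveform over the samples [start, stop)."""
    return mean(wave[start:stop])


# Hypothetical averaged ERPs (microvolts) at one sensor; made-up numbers.
standard = [0.0, 1.0, 2.0, 1.5, 1.0, 0.5]
deviant_native = [0.0, 1.0, 0.0, -1.5, -1.0, 0.5]
deviant_nonnative = [0.0, 1.0, 1.5, 0.5, 0.5, 0.5]

mmn_native = difference_wave(deviant_native, standard)
mmn_nonnative = difference_wave(deviant_nonnative, standard)

# Hypothetical analysis window, given as sample indices (an assumption).
window = (2, 5)
native_amp = mean_amplitude(mmn_native, *window)
nonnative_amp = mean_amplitude(mmn_nonnative, *window)

# A more negative mean amplitude corresponds to a stronger MMN.
print(native_amp, nonnative_amp)
```

With these invented values the native difference wave is more negative than the nonnative one, which is the pattern the caption describes for this infant.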
Two patterns of ERP response were observed: an early positive-going wave (P150–250) and a later negative-going wave (N250–550) (Rivera-Gaxiola, Silva-Pereyra, et al., 2005). Further work examined the patterns of the same auditory ERP positive-negative complexes in a larger sample of 11-month-old monolingual American infants using the same contrasts used in the developmental study, and found that infants’ response to the nonnative contrast predicted the number of words produced at 18, 22, 25, 27, and 30 months of age (Rivera-Gaxiola, Klarman, et al., 2005). Infants showing an N250–550 to the foreign contrast at 11 months of age (indexing better neural discrimination) produced significantly fewer words at all ages when compared to infants showing a less negative response. Scalp distribution analyses on 7-, 11-, 15-, and 20-month-old infants revealed that the P150–250 and the N250–550 components differ in distribution (Rivera-Gaxiola et al., 2007). Thus in both Kuhl and colleagues (2008) and Rivera-Gaxiola, Klarman, and colleagues (2005), an enhanced negativity in response to the nonnative contrast is associated with slower language development.
The continuity in language development documented in these studies using infants’ early phonetic skills to predict later language (Kuhl, Conboy, et al., 2005; Kuhl et al., 2008; Rivera-Gaxiola, Klarman, et al., 2005; Tsao, Liu, & Kuhl, 2004) is also seen in studies that use infants’ early pattern detection skills for speech to predict later language (Newman, Ratner, Jusczyk, Jusczyk, & Dow, 2006), as well as in studies that use infants’ early processing efficiency for words to predict later language (Fernald, Perfors, & Marchman, 2006). Taken as a whole, these studies form bridges between the early precursors to language in infancy and measures of language competencies in early childhood, bridges that are important to theory building as well as to clinical populations with developmental disabilities that involve language.
ERP studies at the phonetic level suggest that the young brain’s response to the elementary building blocks of language matters and that initial native language phonetic learning is a pathway to language (Kuhl, 2008). The data also suggest that discriminating nonnative phonetic contrasts for a longer period of time in early development (reflecting infants’ initial, more immature state) can be linked to slower language development. In infants exposed to a single language, the ability to attend to changes in the phonetic contrasts that are relevant to the culture’s language, while at the same time reducing attention to phonetic contrasts from other languages that are discriminable but irrelevant to the language of their culture, appears to be an important first step toward the acquisition of language. What neuroscience tools may allow us to do in the future is to understand this process and its relation to the “critical period” for language development (see Kuhl, Conboy, et al., 2005, for discussion).
name at 4.5 months (Mandel, Jusczyk, & Pisoni, 1995). At 6 months, infants use their own names or the word Mommy in an utterance to identify word boundaries (Bortfeld, Morgan, Golinkoff, & Rathbun, 2005) and look appropriately to pictures of their mother or father when hearing Mommy or Daddy (Tincoff & Jusczyk, 1999). By 7 months, infants listen longer to passages containing words they previously heard rather than passages containing words they have not heard (Jusczyk & Hohne, 1997), and by 11 months infants prefer to listen to words that are highly frequent in language input over infrequent words (Halle & de Boysson-Bardies, 1994).
Behavioral studies indicate that infants learn words using both “statistical learning” strategies in which the transitional probabilities between syllables are exploited to identify likely words (Newport & Aslin, 2004; Saffran, 2003; Saffran, Aslin, & Newport, 1996) and pattern detection strategies in which infants use the typical pattern of metric stress that characterizes ambient language to segment running speech into likely words (Cutler & Norris, 1988; Höhle, Bijeljac-Babic, Herold, Weissenborn, & Nazzi, 2009; Johnson & Jusczyk, 2001; Nazzi, Iakimova, Bertoncini, Frédonie, & Alcantara, 2006).
How is word recognition evidenced in the brain? ERPs in response to words index word familiarity as early as 9 months of age and word meaning by 13–17 months of age. ERP studies have shown differences in amplitude and scalp distributions for components that are related to words that are known versus unknown to the child (Mills, Coffey-Corina, & Neville, 1993, 1997; Mills, Plunkett, Prat, & Schafer, 2005; Molfese, 1990; Molfese, Morse, & Peters, 1990; Molfese, Wetzel, & Gill, 1993; Thierry, Vihman, & Roberts, 2003).
As early as 9 months of age, ERPs indicate word familiarity, and by 13–17 months of age, studies show ERP components that reliably signal the brain’s coding of words that are known versus unknown by the child (Mills et al., 1993, 1997, 2005; Molfese et al., 1990, 1993; Thierry et al., 2003). Toddlers with larger vocabularies tend to have a more focalized and larger N200 for known words (they show an enhanced negativity to known versus unknown words only at left temporal and parietal electrode sites), whereas children with smaller vocabularies show more broadly distributed effects (Mills et al., 1993), features that also distinguish typically developing preschool children from preschool children with autism (Coffey-Corina, Padden, Kuhl, & Dawson, 2007).
Processing efficiency for phonemes and words can be seen as well in the relative focalization and duration of brain activation in adult MEG studies (Zhang, Kuhl, Imada, Kotani, & Tohkura, 2005), indicating that these features index language experience and proficiency not only in children (Conboy, Rivera-Gaxiola, et al., 2008; Friederici, 2005), but also over the life span. Individual differences in the response latency to a familiar word at the age of 2 are related to both lexical and grammatical measures collected between 15 and 25 months, providing more evidence that processing speed is associated with greater language facility (Fernald et al., 2006).
Mills and colleagues (2005) used ERPs in 20-month-old toddlers to examine new word learning. The children listened to known and unknown words, and to nonwords that were phonotactically legal in English. ERPs were recorded as the children were presented with novel objects paired with the nonwords. After the learning period, ERPs to the nonwords that had been paired with novel objects were shown to be similar to those of previously known words, suggesting that new words may be encoded in the same neural regions as previously learned words.
ERP studies on German infants reveal the development of word-segmentation strategies based on the typical stress patterns of German words. When presented with bisyllabic strings with either a trochaic (typical in German) or iambic pattern, infants who heard a trochaic pattern embedded in an iambic string showed the N200 ERP component, similar to that elicited in response to a known word, whereas infants presented with the iambic bisyllable embedded in the trochaic pattern showed no response (Weber, Hahne, Friedrich, & Friederici, 2004). The data suggest that German infants at this age are applying a metric segmentation strategy, consistent with the behavioral data of Höhle and colleagues (2009).
Infants’ early lexicons
There is evidence suggesting that young children’s word representations are phonetically underspecified. Children’s growing lexicons must code words in a way that distinguishes words from one another, and, given that by the end of the first year infants’ phonetic skills are language specific (Best & McRoberts, 2003; Kuhl et al., 2006; Werker & Tees, 1984), it was assumed that children’s early word representations were phonetically detailed. However, studies suggest that learning new words taxes young children’s capacities, and that as a result, new word representations are not phonetically complete.
Reactions to mispronunciations (for example, the age at which children no longer accept tup for cup or bog for dog) provide information about phonological specificity. Studies across languages suggest that by one year of age mispronunciations of common words (Fennel & Werker, 2003; Jusczyk & Aslin, 1995), words in stressed syllables (Vihman, Nakai, DePaolis, & Halle, 2004), or monosyllabic words (Swingley, 2005) are not accepted as target words, indicating well-specified representations. Other studies using visual fixation of two targets (e.g., apple and ball) while one is named (Where’s the ball?)
Figure 57.3 ERP responses to normal sentences and sentences with either semantic or syntactic anomalies show distinct distribution and
polarity differences in adults. (From Kuhl & Rivera-Gaxiola, 2008.)
& Friederici, 2005; Silva-Pereyra, Conboy, Klarman, & Kuhl, 2007; Silva-Pereyra, Klarman, Lin, & Kuhl, 2005; Silva-Pereyra, Rivera-Gaxiola, & Kuhl, 2005). Holcomb, Coffey, and Neville (1992) reported the N400 in response to semantic anomaly in children from 5 years of age to adolescence; the latency of the effect was shown to decline systematically with age (see also Hahne, Eckstein, & Friederici, 2004; Neville, Coffey, Holcomb, & Tallal, 1993). Studies also show that syntactically anomalous sentences elicit the P600 in children between 7 and 13 years of age (Hahne et al., 2004).
Recent studies have examined these ERP components in preschool children. Harris (2001) reported an N400-like effect in 36–38-month-old children, which was largest over posterior regions of both hemispheres, unlike the adult scalp distribution. Friedrich and Friederici (2005) observed an N400-like wave to semantic anomalies in 19- and 24-month-old German-speaking children.
Silva-Pereyra, Rivera-Gaxiola, and Kuhl (2005) recorded ERPs in children between 36 and 48 months of age in response to semantic and syntactic anomalies. In both cases the ERP effects in children were more broadly distributed and elicited at later latencies than in adults. In work with even younger infants (30-month-olds), Silva-Pereyra, Klarman, and colleagues (2005) used the same stimuli and observed late positivities distributed broadly posteriorly in response to syntactic anomalies and anterior negativities in response to semantically anomalous sentences, though in each case with longer latencies than seen in the older children and in adults (figure 57.4), a pattern seen repeatedly and attributed to the immaturities and inefficiencies of the developing processing mechanisms.
Syntactic processing of sentences with semantic content information removed—“jabberwocky sentences”—has also been tested using ERP measures with children. Silva-Pereyra and colleagues (2007) recorded ERPs to phrase structure violations in 36-month-old children using sentences in which the content words were replaced with pseudowords while leaving grammatical function words intact. The ERP components elicited to the jabberwocky phrase-structure violations differed from the same violations in real sentences. Two negative components, one from 750 to 900 ms and the other from 950 to 1050 ms, rather than the positivities seen in response to phrase structure violations in real sentences in the same children, were observed. Jabberwocky studies with adults (Canseco-Gonzalez, 2000; Hahne & Jeschenick, 2001; Munte, Matzke, & Johanes, 1997) have also reported negative-going waves for jabberwocky sentences, though at much shorter latencies.
ERP measures of early language processing in children with autism spectrum disorder (ASD)
Scientific discoveries on the progression toward language by typically developing children are now providing new insights into the language deficit shown by children with autism spectrum disorder (ASD). Neural measures of language processing in children with autism, involving both phonemes and words, when coupled with measures of ASD children’s social interest in speech, are revealing a tight coupling
speech versus nonspeech signals may provide clues to their aversion to the highly intonated speech signals typical of motherese.
Recent studies extend the findings on children with autism to word processing using ERP measures (Coffey-Corina, Padden, Kuhl, & Dawson, 2008). In this study, 24 toddlers with autism spectrum disorders between 18 and 31 months of age were separated into high-functioning and low-functioning subgroups defined by the severity of their social symptoms. ERP measures were recorded in response to known words, unknown words, and words played backward. They were compared to ERPs elicited from a group of 20 typically developing toddlers between the ages of 20 and 31 months of age.
The results for typically developing toddlers showed a highly localized response to the difference between known and unknown words at a left temporal electrode site (T3) in the 200–500-ms and 500–700-ms windows (figure 57.5A). These data replicate previous data on typically developing children published by Mills and colleagues (1993) and indicate that highly focalized responses are a marker of increasing developmental sophistication in the processing of words in typically developing children. It was therefore of interest to observe that toddlers with ASD showed a very diffuse response to known and unknown words. Known words showed a greater negativity than unknown words across all electrode sites, and at a later latency than age-matched typically developing children (figure 57.5B). Both the more diffuse pattern of brain activation and responses with longer latencies are patterns observed in younger typically developing children (Mills et al., 1997).
Replicating the pattern seen in the studies of phonetic perception in children with autism, the word-processing results for children with ASD differed markedly depending on the children’s social skills. High-functioning toddlers with ASD produced ERP responses that were similar to those of typically developing children—they exhibited a localized left-hemisphere response to known and unknown words. Significant word-type effects were observed only at the left parietal electrode site (P3) in the 200–500-ms time window (figure 57.5C). In contrast, ERP waveforms of low-functioning toddlers with ASD exhibited a diffuse response to words. Known words were significantly more negative than unknown words at multiple electrode sites and in all measurement windows (figure 57.5D).
The idea that ERP measures in response to syllables and words may allow us to predict future language outcomes in young children with ASD is exciting. Toward that end, we note that children with ASD exhibited highly significant correlations between their ERP components at the initial test time and their verbal IQ scores measured one year after ERP data collection (figure 57.6).
In new studies with the siblings of children with autism spectrum disorder, we are now exploring whether these early brain and behavioral responses to syllables, and listening preferences for speech, are diagnostic markers for autism. The interest in these measures is that they can be used reliably in infants as early as 6 months of age, an age at which intervention measures might be more effective in changing the course of development for children at risk for autism.
Mirror neurons and shared brain systems
Neuroscience studies that focus on shared neural systems for perception and action have a long tradition in speech research (Fowler, 2006; Liberman & Mattingly, 1985). The discovery of mirror neurons for social cognition (Gallese, 2003; Meltzoff & Decety, 2003; Pulvermuller, 2005; Rizzolatti, 2005; Rizzolatti & Craighero, 2004) has reinvigorated this tradition. Neuroscience studies using speech and whole-brain imaging techniques have the capacity to examine the origins of shared brain systems in infants from birth (Bosseler et al., 2008; Imada et al., 2006).
In speech, the theoretical linkage between perception and action came in the form of the original motor theory (Liberman, Cooper, Shankweiler, & Studdert-Kennedy, 1967) and in a different formulation of the direct perception of gestures, named direct realism (Fowler, 1986). Both posited close interaction between speech perception and speech production. The perception-action link for speech was viewed as innate by the original motor theorists (Liberman & Mattingly, 1985). Alternatively, it was viewed as forged early in development through experience with speech motor movements and their auditory consequences (Kuhl & Meltzoff, 1982, 1996). Two new infant studies have shed some light on the developmental issue.
Imada and colleagues (2006) used magnetoencephalography (MEG), studying newborns, 6-month-old infants, and 12-month-old infants while they listened to nonspeech signals, harmonics, and syllables (figure 57.7). Dehaene-Lambertz and colleagues (2006) used fMRI to scan 3-month-old infants while they listened to sentences. Both studies show activation in brain areas responsible for speech production (the inferior frontal, Broca’s area) in response to auditorily presented speech. Imada and colleagues reported synchronized activation in response to speech in auditory and motor areas at 6 and 12 months, and Dehaene-Lambertz and colleagues reported activation in motor speech areas in response to sentences in 3-month-olds.
Is activation of Broca’s area to the pure perception of speech present at birth? Newborns tested by Imada and colleagues showed no activation in motor speech areas for any signals, whereas auditory areas responded robustly to all signals, suggesting the possibility that perception-action linkages for speech develop by 3 months of age as infants produce vowellike sounds. But further work must be done to answer the question. How the binding of perception and action takes place, and whether it requires experience, is one of the exciting questions that can now be addressed with infants from birth using the tools of modern neuroscience.
We now know a great deal about the linkages and the circuitry underlying language processing in adults (Kuhl & Damasio, in press). What is unknown, but waiting to be discovered, is the state of this circuitry at birth and how refined connections are forged in early infancy as perception and action are jointly experienced.
Bilingual infants: two languages, one brain
One of the most interesting questions is how infants map two distinct languages in the brain. From phonemes to words, and then to sentences, how do infants simultaneously bathed in two languages develop the neural networks necessary to respond in a nativelike manner to two different codes?
Figure 57.6 Predictive correlations for children with ASD between the mean amplitude of ERPs to known words at the left parietal electrode site (P3) and verbal IQ measured one year later. A more negative response predicted significantly higher verbal IQ (r = −.521, p = .013). (From Coffey-Corina, Padden, Kuhl, & Dawson, 2008.)
Bilingual language experience could potentially have an impact on development—because the learning process requires the development of two codes and because it could take a longer period of time for sufficient data from both languages to be experienced than in the monolingual case. Infants learning two first languages simultaneously might reach the developmental change in perception at a later point in development than infants learning either language monolingually. This difference could depend on such factors as the number of people in the infants’ environment producing the two languages in speech directed toward the child and the amount of input they provide. These factors could change the rate of development in bilingual infants.
There are very few studies that address this question thus far, and the data that do exist provide somewhat mixed results. Some studies suggest that infants exposed to two languages show later acquisition of phonetic skills in the two languages when compared to monolingual infants (Bosch & Sebastian-Galles, 2003a, 2003b). This is especially the case when infants are tested on contrasts that are phonemic in only one of the two languages; this has been shown both for vowels (Bosch & Sebastian-Galles, 2003b) and consonants (Bosch & Sebastian-Galles, 2003a). However, other studies report no change in the timing of the developmental transition in phonetic skills in the two languages of bilingual infants (Burns, Yoshida, Hill, & Werker, 2007; Sundara, Polka, & Molnar, 2008). For example, Sundara and colleagues, testing monolingual English and monolingual French as well as bilingual French-English infants, examined discrimination of dental (French) and alveolar (English) consonants. They demonstrated that at 6–8 months, infants in all three language groups succeeded; at 10–12 months, monolingual English infants and French-English bilingual infants, but not monolingual French infants, distinguished the English contrast. Thus bilingual infants performed on par with their English monolingual peers and better than their French monolingual peers. Moreover, data from an ERP study of Spanish-English bilingual infants show that, at both 6–9 and 9–12 months of age, bilingual infants show MMN responses to both Spanish and English phonetic contrasts (Rivera-Gaxiola & Romo, 2006), distinguishing them from English-learning monolingual infants who fail to respond to the Spanish contrast at the later age (Rivera-Gaxiola, Silva-Pereyra, et al., 2005).
ERP studies on word development in bilingual children have just begun to appear. Conboy and Mills (2006) recorded ERPs to known and unknown English and Spanish words in bilingual children at 19–22 months. Expressive vocabulary sizes were obtained in both English and Spanish, and were used to determine language dominance for each child. A conceptual vocabulary score was calculated by summing the total number of words in both languages and then subtracting the number of times a pair of conceptually equivalent words (e.g., “water” and “agua”) occurred in the two languages.
ERP differences to known and unknown words in the dominant language occurred as early as 200–400 and 400–600 ms in these 19–22-month-old infants, and were broadly distributed over the left and right hemispheres, resembling patterns observed in younger (13–17-month-old) monolingual children (Mills et al., 1997). In the nondominant language of the same children, these differences were not apparent until late in the waveform, from 600 to 900 ms. Moreover, children with high versus low conceptual vocabulary scores produced greater responses to known words in the left hemisphere, particularly for the dominant language (Conboy & Mills, 2006).
Research has just begun to explore the nature of the bilingual brain, and it is one of the areas in which neuroscience techniques will be of strong interest. Using whole-brain imaging, we may be able to understand whether learning a second language at different ages—in infancy as opposed to adulthood—recruits different brain structures. These kinds of data may play a role in our eventual understanding of the “critical period” for language learning.
Conclusions
Knowledge of infant language acquisition is now beginning to reap benefits from information obtained by experiments that directly examine the human brain’s response to linguistic material as a function of experience. EEG, MEG, fMRI, and NIRS technologies—all safe, noninvasive, and proven feasible—are now being used in studies with very young infants, including newborns, as they listen to the phonetic units, words, and sentences of a specific language. Brain measures now document the neural signatures of learning as early as 7 months for native-language phonemes, 9 months for familiar words, and 30 months for semantic and syntactic anomalies in sentences. Studies show continuity from the earliest phases of language learning in infancy to the complex processing evidenced at the age of three when all typically developing children show the ability to carry on a sophisticated conversation. Individual variation in language-specific processing at the phonetic level—at the cusp of the transition from Phase 1, in which all phonetic contrasts are discriminated, to Phase 2, in which infants focus on the distinctions relevant to their native language—is strongly linked to infants’ abilities to process words and sentences two years later. This finding is important theoretically but is also vital to the eventual use of these early speech precursors to diagnose children with developmental disabilities that involve language. In fact, new studies suggest the possibility that early measures of the brain’s responses to speech may provide a diagnostic marker for autism spectrum disorder. The fact that language experience affects brain processing of both the signals being learned (native patterns) and the signals to which the infant is not exposed (nonnative patterns) may play a role in our understanding of the brain mechanisms underlying the critical period. At the phonetic level, the data suggest that learning itself, not merely time, may contribute to the critical-period phenomenon. Whole-brain imaging now allows us to examine multiple brain areas during speech processing, including both auditory and motor brain regions, revealing the possible existence of a shared brain system (a “mirror” system) for speech. Research has also begun to use these measures to understand how the bilingual brain maps two distinct languages. Answers to the classic questions about the unique human capacity to acquire language will be enriched by studies that utilize the tools of modern neuroscience to peer into the infant brain.
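As a brief illustration of the conceptual vocabulary score described for the Conboy and Mills (2006) study of bilingual toddlers, the arithmetic can be sketched in a few lines of Python. This is only a sketch under stated assumptions: the function name, the input format, and the example word lists are hypothetical and are not taken from the study, which derived its counts from expressive vocabulary measures in each language.

```python
def conceptual_vocabulary(english_words, spanish_words, translation_pairs):
    """Words known in both languages combined, minus conceptually
    equivalent pairs known in both (so each concept is counted once).

    All inputs are hypothetical illustrations, not study data.
    """
    english = set(english_words)
    spanish = set(spanish_words)
    total_words = len(english) + len(spanish)
    # A pair counts as a doublet only if the child produces both members.
    doublets = sum(
        1 for en, es in translation_pairs if en in english and es in spanish
    )
    return total_words - doublets


# Hypothetical child: three English words, three Spanish words.
score = conceptual_vocabulary(
    ["water", "dog", "ball"],
    ["agua", "perro", "leche"],
    [("water", "agua"), ("dog", "perro"), ("milk", "leche")],
)
print(score)  # 6 total words minus 2 doublets
```

In this made-up case the child produces "water"/"agua" and "dog"/"perro" in both languages, so those two concepts are each counted once, whereas "leche" counts in full because its English equivalent is not produced.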
Acknowledgments The author is supported by the National Science Foundation’s Science of Learning Center grant to the University of Washington LIFE Center (SBE 0354453), by grants from the National Institutes of Health (HD 37954; MH066399; HD34565; HD55782), by core grants (P30 HD02274; P30 DC04661), and by a grant from the Cure Autism Now Foundation. This chapter updates the information in Kuhl, Conboy, Padden, Rivera-Gaxiola, and Nelson, Philosophical Transactions of the Royal Society of London [Biology] (2008) and Kuhl and Rivera-Gaxiola, Annual Review of Neuroscience (2008).
REFERENCES
Aslin, R. N., & Mehler, J. (2005). Near-infrared spectroscopy for functional studies of brain activity in human infants: Promise, prospects, and challenges. J. Biomed. Opt., 10, 011009.
Bailey, T. M., & Plunkett, K. (2002). Phonological specificity in early words. Cogn. Dev., 17, 1265–1282.
Ballem, K. D., & Plunkett, K. (2005). Phonological specificity in children at 1;2. J. Child Lang., 32, 159–173.
Best, C. C., & McRoberts, G. W. (2003). Infant perception of non-native consonant contrasts that adults assimilate in different ways. Lang. Speech, 46, 183–216.
Bialystok, E. (1999). Cognitive complexity and attentional control in the bilingual mind. Child Dev., 70, 636–644.
Bortfeld, H., Morgan, J. L., Golinkoff, R. M., & Rathbun, K. (2005). Mommy and me: Familiar names help launch babies into speech-stream segmentation. Psychol. Sci., 16, 298–304.
Bortfeld, H., Wruck, E., & Boas, D. A. (2007). Assessing infants’ cortical response to speech using near-infrared spectroscopy. NeuroImage, 34, 407–415.
Bosch, L., & Sebastian-Galles, N. (2003a). Language experience and the perception of a voicing contrast in fricatives: Infant and adult data. Paper presented at the Proceedings of the International Congress of Phonological Sciences, Barcelona.
Bosch, L., & Sebastian-Galles, N. (2003b). Simultaneous bilingualism and the perception of a language-specific vowel contrast in the first year of life. Lang. Speech, 46, 217–243.
Bosseler, A. N., Imada, T., Pihko, E., Mäkelä, J., Taulu, S., Ahonen, A., et al. (2008). Neural correlates of speech and nonspeech processing: Role of language experience in brain activation. J. Acoust. Soc. Am., 123, 3333.
Brainard, M. S., & Knudsen, E. I. (1998). Sensitive periods for visual calibration of the auditory space map in the barn owl optic tectum. J. Neurosci., 18, 3929–3942.
Brooks, R., & Meltzoff, A. N. (2008). Gaze following and pointing predicts accelerated vocabulary growth through two years of age: A longitudinal growth curve modeling study. J. Child Lang., 35, 207–220.
Burnham, D., Kitamura, C., & Vollmer-Conna, U. (2002). What’s new, pussycat? On talking to babies and animals. Science, 296, 1435.
Burns, T. C., Yoshida, K. A., Hill, K., & Werker, J. F. (2007). The development of phonetic representation in bilingual and monolingual infants. Appl. Psycholinguist., 28, 455–474.
Canseco-Gonzalez, E. (2000). Using the recording of event-related brain potentials in the study of sentence processing. In Language and the brain: Representation and processing (pp. 229–266). San Diego: Academic Press.
Carlson, S. M., & Meltzoff, A. N. (2008). Bilingual experience and executive functioning in young children. Dev. Sci., 11, 282–298.
Cheour, M., Ceponiene, R., Lehtokoski, A., Luuk, A., Allik, J., Alho, K., et al. (1998). Development of language-specific phoneme representations in the infant brain. Nat. Neurosci., 1, 351–353.
Cheour, M., Imada, T., Taulu, S., Ahonen, A., Salonen, J., & Kuhl, P. (2004). Magnetoencephalography is feasible for infant assessment of auditory discrimination. Exp. Neurol., 190, 44–51.
Chomsky, N. (1959). Review of Verbal behavior by B. F. Skinner. Language, 35, 26–58.
Coffey-Corina, S., Padden, D., Kuhl, P. K., & Dawson, G. (2007). Electrophysiological processing of single words in toddlers and school-age children with autism spectrum disorder. Paper presented at the Annual Meeting of the Cognitive Neuroscience Society, New York.
Coffey-Corina, S., Padden, D., Kuhl, P. K., & Dawson, G. (2008). ERPs to words correlate with behavioral measures in children with autism spectrum disorder. J. Acoust. Soc. Am., 123, 3742.
Conboy, B. T., Brooks, R., Taylor, M., Meltzoff, A. N., & Kuhl, P. K. (2008). Joint engagement with language tutors predicts brain and behavioral responses to second-language phonetic stimuli. Paper presented at the XVIth Biennial International Conference on Infant Studies, Vancouver, BC.
Conboy, B. T., & Kuhl, P. K. (2007). ERP mismatch negativity effects in 11-month-old infants after exposure to Spanish. Paper presented at the Society for Research in Child Development, Boston.
Conboy, B. T., & Mills, D. L. (2006). Two languages, one developing brain: Event-related potentials to words in bilingual toddlers. Dev. Sci., 9, F1–F12.
Conboy, B. T., Rivera-Gaxiola, M., Silva-Pereyra, J., & Kuhl, P. K. (2008). Event-related potential studies of early language processing at the phoneme, word, and sentence levels. In A. D. Friederici & G. Thierry (Eds.), Early language development, Vol. 5: Bridging brain and behavior: Trends in language acquisition research (pp. 23–64). Amsterdam: John Benjamins.
Conboy, B. T., Sommerville, J., & Kuhl, P. K. (2008). Cognitive control skills and speech perception after short-term second language experience during infancy. J. Acoust. Soc. Am., 123, 3581.
Cutler, A., & Norris, D. (1988). The role of strong syllables in segmentation for lexical access. J. Exp. Psychol. Hum. Percept. Perform., 14, 113–121.
Dehaene-Lambertz, G., Dehaene, S., & Hertz-Pannier, L. (2002). Functional neuroimaging of speech perception in infants. Science, 298, 2013–2015.
Dehaene-Lambertz, G., Hertz-Pannier, L., Dubois, J., Meriaux, S., & Roche, A. (2006). Functional organization of perisylvian activation during presentation of sentences in preverbal infants. Proc. Natl. Acad. Sci. USA, 103, 14240–14245.
Eimas, P. D., Siqueland, E. R., Jusczyk, P., & Vigorito, J. (1971). Speech perception in infants. Science, 171, 303–306.
Englund, K. T. (2005). Voice onset time in infant directed speech over the first six months. First Lang., 25, 219–234.
Fennell, C. T., & Werker, J. F. (2003). Early word learners’ ability to access phonetic detail in well-known words. Lang. Speech, 46, 245–264.
Fenson, L., Dale, P., Reznick, J. S., Thal, D., Bates, E., Hartung, J., et al. (1993). MacArthur communicative development inventories: User’s guide and technical manual. San Diego: Singular Publishing Group.
Fernald, A., Perfors, A., & Marchman, V. A. (2006). Picking up speed in understanding: Speech processing efficiency and vocabulary growth across the 2nd year. Dev. Psychol., 42, 98–116.
Liberman, A. M., Cooper, F. S., Shankweiler, D. P., & Studdert-Kennedy, M. (1967). Perception of the speech code. Psychol. Rev., 74, 431–461.
Liberman, A. M., & Mattingly, I. G. (1985). The motor theory of speech perception revised. Cognition, 21, 1–36.
Liu, H.-M., Kuhl, P. K., & Tsao, F.-M. (2003). An association between mothers’ speech clarity and infants’ speech discrimination skills. Dev. Sci., 6, F1–F10.
Liu, H.-M., Tsao, F.-M., & Kuhl, P. K. (2007). Acoustic analysis of lexical tone in Mandarin infant-directed speech. Dev. Psychol., 43, 912–917.
Mandel, D. R., Jusczyk, P. W., & Pisoni, D. (1995). Infants’ recognition of the sound patterns of their own names. Psychol. Sci., 6, 314–317.
Meltzoff, A. N., & Decety, J. (2003). What imitation tells us about social cognition: A rapprochement between developmental psychology and cognitive neuroscience. Philos. Trans. R. Soc. Lond. B Biol. Sci., 358, 491–500.
Mills, D. L., Coffey-Corina, S. A., & Neville, H. J. (1993). Language acquisition and cerebral specialization in 20-month-old infants. J. Cogn. Neurosci., 5, 317–334.
Mills, D. L., Coffey-Corina, S. A., & Neville, H. J. (1997). Language comprehension and cerebral specialization from 13–20 months. Dev. Neuropsychol., 13, 233–237.
Mills, D. L., Plunkett, K., Prat, C., & Schafer, G. (2005). Watching the infant brain learn words: Effects of vocabulary size and experience. Cogn. Dev., 20, 19–31.
Mills, D. L., Prat, C., Zangl, R., Stager, C. L., Neville, H. J., & Werker, J. F. (2004). Language experience and the organization of brain activity to phonetically similar words: ERP evidence from 14- and 20-month-olds. J. Cogn. Neurosci., 16, 1452–1464.
Molfese, D. L. (1990). Auditory evoked responses recorded from 16-month-old human infants to words they did and did not know. Brain Lang., 38, 345–363.
Molfese, D. L., Morse, P. A., & Peters, C. J. (1990). Auditory evoked responses to names for different objects: Cross-modal processing as a basis for infant language acquisition. Dev. Psychol., 26, 780–795.
Molfese, D. L., Wetzel, W., & Gill, L. A. (1993). Known versus unknown word discriminations in 12-month-old human infants: Electrophysiological correlates. Dev. Neuropsychol., 9, 241–258.
Munte, T. F., Matzke, M., & Johanes, S. (1997). Brain activity associated with syntactic incongruencies in words and pseudowords. J. Cogn. Neurosci., 9, 318–329.
Naatanen, R., Lehtokoski, A., Lennes, M., Cheour, M., Huotilainen, M., Iivonen, A., et al. (1997). Language-specific phoneme representations revealed by electric and magnetic brain responses. Nature, 385, 432–434.
Nazzi, T., Iakimova, G., Bertoncini, J., Frédonie, S., & Alcantara, C. (2006). Early segmentation of fluent speech by infants acquiring French: Emerging evidence for crosslinguistic differences. J. Mem. Lang., 54, 283–299.
Neville, H. J., Coffey, S. A., Holcomb, P. J., & Tallal, P. (1993). The neurobiology of sensory and language processing in language-impaired children. J. Cogn. Neurosci., 5, 235–253.
Newman, R., Ratner, N. B., Jusczyk, A. M., Jusczyk, P. W., & Dow, K. A. (2006). Infants’ early ability to segment the conversational speech signal predicts later language development: A retrospective analysis. Dev. Psychol., 42, 643–655.
Newport, E. L., & Aslin, R. N. (2004). Learning at a distance. I. Statistical learning of non-adjacent dependencies. Cogn. Psychol., 48, 127–162.
Oberecker, R., & Friederici, A. D. (2006). Syntactic event-related potential components in 24-month-olds’ sentence comprehension. NeuroReport, 17, 1017–1021.
Oberecker, R., Friedrich, M., & Friederici, A. D. (2005). Neural correlates of syntactic processing in two-year-olds. J. Cogn. Neurosci., 17, 1667–1678.
Pena, M., Maki, A., Kovacic, D., Dehaene-Lambertz, G., Koizumi, H., Bouquet, F., et al. (2003). Sounds and silence: An optical topography study of language recognition at birth. Proc. Natl. Acad. Sci. USA, 100, 11702–11705.
Pulvermuller, F. (2005). The neuroscience of language: On brain circuits of words and serial order. Cambridge, UK: Medical Research Council, Cambridge University Press.
Raizada, R. D. S., Tsao, F.-M., Liu, H.-M., & Kuhl, P. K. (2009). Quantifying the adequacy of neural representations for a cross-language phonetic discrimination task: Prediction of individual differences. Cereb. Cortex. [Epub ahead of print. doi:10.1093/cercor/bhp076.]
Raudenbush, S. W., Bryk, A. S., Cheong, Y. F., & Congdon, R. (2005). HLM-6: Hierarchical Linear and Nonlinear Modeling. Lincolnwood, IL: Scientific Software International.
Rivera-Gaxiola, M., Klarman, L., Garcia-Sierra, A., & Kuhl, P. K. (2005). Neural patterns to speech and vocabulary growth in American infants. NeuroReport, 16, 495–498.
Rivera-Gaxiola, M., & Romo, H. (2006). Infant head-start learners: Brain and behavioral measures and family assessments. Paper presented at the From Synapse to Schoolroom: The Science of Learning, NSF Science of Learning Centers Satellite Symposium, Society for Neuroscience Annual Meeting, Atlanta.
Rivera-Gaxiola, M., Silva-Pereyra, J., Klarman, L., Garcia-Sierra, A., Lara-Ayala, L., Cadena-Salazar, C., et al. (2007). Principal component analyses and scalp distribution of the auditory P150–250 and N250–550 to speech contrasts in Mexican and American infants. Dev. Neuropsychol., 31, 363–378.
Rivera-Gaxiola, M., Silva-Pereyra, J., & Kuhl, P. K. (2005). Brain potentials to native and non-native speech contrasts in 7- and 11-month-old American infants. Dev. Sci., 8, 162–172.
Rizzolatti, G. (2005). The mirror neuron system and its function in humans. Anat. Embryol., 210, 419–421.
Rizzolatti, G., & Craighero, L. (2004). The mirror-neuron system. Annu. Rev. Neurosci., 27, 169–192.
Saffran, J. R. (2003). Statistical language learning: Mechanisms and constraints. Curr. Dir. Psychol. Sci., 12, 110–114.
Saffran, J. R., Aslin, R. N., & Newport, E. L. (1996). Statistical learning by 8-month-old infants. Science, 274, 1926–1928.
Silva-Pereyra, J., Conboy, B. T., Klarman, L., & Kuhl, P. K. (2007). Grammatical processing without semantics? An event-related brain potential study of preschoolers using Jabberwocky sentences. J. Cogn. Neurosci., 19, 1050–1065.
Silva-Pereyra, J., Klarman, L., Lin, L. J.-F., & Kuhl, P. K. (2005). Sentence processing in 30-month-old children: An ERP study. NeuroReport, 16, 645–648.
Silva-Pereyra, J., Rivera-Gaxiola, M., & Kuhl, P. K. (2005). An event-related brain potential study of sentence comprehension in preschoolers: Semantic and morphosyntactic processing. Cogn. Brain Res., 23, 247–258.
Skinner, B. F. (1957). Verbal behavior. New York: Appleton-Century-Crofts.
Stager, C. L., & Werker, J. F. (1997). Infants listen for more phonetic detail in speech perception than in word-learning tasks. Nature, 388, 381–382.
58 Genetics of Language
Franck Ramus and Simon E. Fisher
abstract It has long been hypothesized that the human faculty are only beginning to be systematically searched, and the
to acquire a language is in some way encoded in our genetic many differences that are found are not straightforwardly
program. However, only recently has genetic evidence been avail- identifiable as associated with language (Fisher & Marcus,
able to begin to substantiate the presumed genetic basis of lan-
guage. Here we review the first data from molecular genetic studies
2006). However, part of the answer will likely come from
showing association between gene variants and language disorders addressing a related but different question: What human
(specific language impairment, speech sound disorder, develop- genetic variations are associated with variations in the ability
mental dyslexia), we discuss the biological function of these genes, to learn a language? Indeed, most genetic methods rely on
and we further speculate on the more general question of how the detecting correlations between variations in the genotype
human genome builds a brain that can learn a language.
and variations in the phenotype. The capacity to acquire
spoken language is usually treated as a universal character-
istic of our species. Nevertheless, like many other traits, the
Since the beginning of the cognitive revolution, it has been language abilities that are observed in the human population
hypothesized that the human faculty to acquire a language vary along a normal distribution. Cases in the lower end of
is “innate,” that is, part of our species’ biological makeup the distribution (“disorders”) are typically the most informa-
and, therefore, encoded in some way in our genetic program tive, as they may highlight causal relationships between
(Chomsky, 1959). Over the years, a wide variety of argu- genes, brain, and cognition that are often not readily appar-
ments have been advanced in support of this view: the uni- ent in normal development. Indeed, disorders of language
versality of some properties of human languages (Chomsky, acquisition have so far provided almost all the available data
1957), the “poverty of the stimulus” available for language on language genetics. Furthermore, developmental language
acquisition (Chomsky, 1965), the spontaneous emergence of disorders are diverse, affecting different aspects of language,
languages (Bickerton, 1984; Goldin-Meadow & Mylander, therefore promising to illuminate putative genetic influences
1998), biological adaptations such as that of the vocal tract on particular components of language (phonology, morphol-
(Lenneberg, 1967), the existence of inherited disorders that ogy, syntax, articulation . . .). Accordingly, this chapter
may specifically affect language (Gopnik & Crago, 1991), the reviews the genetic data gathered on the various types of
heritability of language abilities and disorders (Stromswold, language-related disorders (specific language impairment,
2001), the adaptiveness of language as a communication speech sound disorder, developmental dyslexia . . .) and
system (Pinker & Bloom, 1990), and the plausibility of a reflects on what they teach us about the genetic basis of
gradual evolution of the language faculty (Jackendoff, 1999) language.
(on the special topic of language evolution, see chapter 59
in this volume by W. Tecumseh Fitch). Evidence for genetic influences on language
Although the evidence gathered in the last decades in favor
of a biological basis of language looks convincing to many Historically, the first hint at a genetic influence on language
scientists, until recently genetic evidence has remained rela- abilities came from the observation that language-related
tively indirect, in the sense that it has not addressed the fun- disorders tend to run in families (Hallgren, 1950; Morley,
damental questions: If there is a genetic basis for language, 1967; Stephenson, 1907; Tallal et al., 2001): when one
then what exactly is there in the human genome that is dif- person has language problems, the risk in first-degree rela-
ferent from other species and that gives us language? How tives is around 50%, far above the normal population preva-
does it build a brain that can learn a human language? lence. Although the inheritance pattern in many families
There is no easy way to obtain a direct answer to these may appear consistent with autosomal dominant transmis-
fascinating questions. Genetic differences between species sion (e.g., the transmission of a dominant gene variant
carried by a nonsexual chromosome), this observation is not
sufficient to prove genetic involvement, as members of a
franck ramus Laboratoire de Sciences Cognitives et
Psycholinguistique, EHESS, CNRS, DEC-ENS, Paris, France family share not only genes but also a linguistic environment.
simon e. fisher Wellcome Trust Centre for Human Genetics, It is conceivable that parents with a language disorder would
University of Oxford, Oxford, United Kingdom constitute a less favorable environment for the acquisition of
(at least in mammals). They can then analyze the similarity between the sequences in the various species and attempt to reconstitute the evolutionary history of the specific gene variants that have appeared in the human lineage. Moreover, prior knowledge of the gene’s function in other species can give the first clues to its role in humans.
• Expression studies investigate the expression pattern of the candidate gene (where and when the protein is synthesized) as another important clue to its function.
• Many other approaches may be used to further investigate the function of a candidate gene: detection of familiar parts in the sequence and comparison with other, similar genes, algorithmic predictions of the shape of the protein, in vitro experiments to study the mechanisms of action of the target protein and its interactions with other molecules, in vivo experiments to study the effects of disrupting its expression, particularly on brain development and function, and so on.

We now turn to the specific results obtained on the different forms of language disorders.

Developmental dyslexia

Developmental dyslexia is by definition a disorder of reading and spelling acquisition, despite adequate intelligence and opportunity, and in the absence of obvious sensory, neurological, or psychiatric disorder. Nevertheless, it has been well established over the last three decades that most cases of dyslexia can be attributed to a subtle disorder of oral language (the “phonological deficit”),2 whose symptoms happen to surface most prominently in reading acquisition (Lyon, Shaywitz, & Shaywitz, 2003; Ramus, 2003; Snowling, 2000). Therefore dyslexia is expected to ultimately reveal something about genetic factors implicated in language, in particular in phonology. However, both the exact nature of the phonological deficit and its underlying cognitive/neural causes remain unclear.

Indeed, the main symptoms of the “phonological deficit in dyslexia” are poor phonological awareness (the ability to pay attention to and explicitly manipulate speech sounds), poor verbal short-term memory, and slow lexical retrieval (evidenced in rapid naming tasks where subjects must name series of objects, colors, or digits in quick succession). This diversity of impairments has led many researchers to hypothesize that dyslexics’ phonological representations are somewhat degraded, fuzzy, or noisy, lacking either in temporal or spectral resolution, or insufficiently attuned to the categories of the native language. This degradation is assumed either to be specific to the speech-processing system (Adlard & Hazan, 1998; Serniclaes, Van Heghe, Mousty, Carré, & Sprenger-Charolles, 2004; Snowling, 2000) or to follow from a lower-level auditory deficit (Goswami et al., 2002; Tallal, 1980). The latter view has been much challenged in recent years (Ramus, 2003; S. Rosen, 2003; S. White, Frith, et al., 2006; S. White, Milne, et al., 2006). As will become apparent later, the neurobiological and genetic data are consistent with the view that an auditory disorder is not necessary to engender a phonological deficit in people with dyslexia (Ramus, 2004). An alternative view is that phonological representations in dyslexia are intrinsically normal and that the observed difficulties in certain (but not all) phonological tasks arise from a deficit in the access to these representations that is particularly recruited for short-term memory and conscious manipulations (Marshall, Harcourt-Brown, Ramus, & van der Lely, submitted; Ramus & Szenkovits, 2008; Szenkovits, Darma, Darcy, & Ramus, submitted). The elucidation of the precise nature of the phonological deficit will therefore determine whether dyslexia can inform us on the links between genes and phonology per se, or rather between genes and some cognitive processes operating on phonological representations.

In the 1970s, Galaburda and colleagues began to dissect human brains whose medical records indicated a diagnosis of developmental dyslexia (Galaburda & Kemper, 1979). After dissecting four consecutive brains and finding evidence for abnormalities of neuronal migration in all four, they hypothesized that this was unlikely to occur by chance and that such brain development aberrations might provide an explanation of dyslexia (Galaburda, Sherman, Rosen, Aboitiz, & Geschwind, 1985). Most interestingly, neuronal migration disruptions were found predominantly in left perisylvian areas traditionally associated with language. More specifically, these areas are the left inferior frontal, posterior superior temporal, and supramarginal and angular gyri. Galaburda and colleagues subsequently confirmed these findings in three more brains (Humphreys, Kaufmann, & Galaburda, 1990), as well as the rarity of such abnormalities in control brains (Kaufmann & Galaburda, 1989). Unfortunately, no attempt at an independent replication was ever published, so the dyslexia research community came to consider these findings as intriguing but inconclusive. Nevertheless, brain-imaging studies have largely confirmed structural and functional abnormalities in dyslexics’ left perisylvian areas, although at a different level of description. Findings from MRI studies typically consist of reduced gray matter density, reduced anisotropy of the underlying white matter, and hypo- or hyperactivations (Démonet, Taylor, & Chaix, 2004; Eckert, 2004; Temple, 2002). At the moment it is impossible to establish their relationship with putative perturbations of neuronal migration, which are not visible in MRI scans. Quite strikingly, new results emerging from genetic studies suggest a reappraisal of the old neuronal migration hypothesis.

Until recently, linkage studies had provided at least six reliable chromosomal loci suspected to harbor genes
fest most remarkably during the acquisition of written language, which recruits those abilities particularly intensively (Galaburda, LoTurco, Ramus, Fitch, & Rosen, 2006; Ramus, 2004). There may be alternative neurogenetic pathways that lead to dyslexia and that remain to be uncovered. However, the convergence of data from multiple lines of investigation makes this neuronal migration model particularly compelling as at least one highly testable account of dyslexia etiology.

Specific language impairment

Specific language impairment (SLI) is a disorder of language acquisition that can be attributed neither to mental retardation, nor to other known pathologies (autism, brain lesion, epilepsy, deafness . . .), nor to environmental deprivation or disadvantage. Children with SLI show heterogeneous profiles, but typically have their language development delayed, with reduced vocabulary, reduced expression and/or comprehension abilities, reduced verbal short-term memory, and persistent production of ungrammatical patterns affecting both syntax (sentence structure) and morphology (e.g., verb inflections, gender, plural or case marking) (Leonard, 1998).

At a cognitive level, the most straightforward hypothesis is that children with SLI have deficits in one or several components of language, including syntax, morphology, phonology, the lexicon, and their interfaces (van der Lely, 2005). The precise combination of deficits in a given child, plus the interaction between different language abilities throughout development, would produce the particular cognitive profile presented by the child. An alternative view is that linguistic deficits arise either from a perceptual (auditory) deficit (Tallal & Gaab, 2006; Tallal & Piercy, 1973) or from a more general cognitive deficit (Leonard, 1998; Tomblin & Pandich, 1999). Again, this debate is quite controversial and goes well beyond the present chapter, so we refer the reader to the appropriate literature (Bishop, Adams, Nation, & Rosen, 2005; Ramus, 2004; S. Rosen, 2003; Tallal, 2004; Tallal & Gaab, 2006; van der Lely, 2005; van der Lely, Rosen, & Adlard, 2004; van der Lely, Rosen, & McClelland, 1998). For the purpose of the present discussion, while leaving the precise nature of impairments open, we assume that deficits can have differential impacts on aspects of language. As we will see, this view is at least consistent with the available neurobiological and genetic data.

The overall picture provided by neurobiological data, although far from being clear and consistent, is that loosely defined language-related brain areas are disrupted or differently organized in children with SLI. The most frequent MRI findings have concerned asymmetries between left and right perisylvian areas. The inferior frontal gyrus (IFG: Broca’s area) and the planum temporale, generally found to be larger on the left than on the right, show a reduced or reversed asymmetry in people with SLI (De Fossé et al., 2004; Gauger, Lombardino, & Leonard, 1997; Plante, Swisher, Vance, & Rapcsak, 1991). An extra sulcus in the left IFG has also been reported in some individuals with SLI (Clark & Plante, 1998). In addition, it has been suggested that children with SLI present a broader pattern of deviant asymmetries, again in favor of the right hemisphere on average (Herbert et al., 2005). Affected children have also been shown to have a larger total brain volume, as a result of a substantial increase in white matter volume, while the cerebral cortex and the caudate nucleus are relatively smaller (Herbert et al., 2003). Finally, it should be noted that in Galaburda’s dissection studies, three to four of the seven patients showed, on top of dyslexia, some form of language delay or disorder (Galaburda et al., 1985; Humphreys et al., 1990). Therefore it is not impossible that the same set of neuronal migration disruptions, perhaps located slightly differently, might lie at the heart of SLI as well as of dyslexia (Ramus, 2004, 2006b). However, there is no direct evidence for that in the case of SLI.

At the genetic level, thus far the search for genes associated with SLI has been less successful than for dyslexia. Nevertheless, there are quite a few interesting results to mention. Familial transmission of language disorders is widely reported, and one study has also reported that atypical perisylvian asymmetry patterns can be found in the relatives of children with SLI (Plante, 1991), suggesting that the transmission of neuroanatomical phenotypes underlies that of behavioral phenotypes. Twin studies also have applications beyond simple heritability estimations. Analyzing correlations between the performance of one twin in a given test and the other twin in a different test allows one to estimate whether the same sources of genetic variance underlie both capacities. One study thus found that syntactic and morphological abilities (typically measured, in English, by the ability to form the past tense of verbs) share some of their genetic variance, but phonological short-term memory and morphological abilities do not (Bishop et al., 2006). This finding suggests that some genetic factors may have differential effects on distinct aspects of language. In a similar vein, another study of children with SLI found that deficits in phonological tests (nonword repetition) are highly heritable, while impairments on a popular auditory processing test do not show significant evidence of genetic influence (Bishop et al., 1999). This finding casts further doubt on the idea that language and phonological deficits necessarily originate from low-level perception.

Finally, genomewide linkage studies of SLI have converged on three main linkage sites: one named SLI1 on chromosome 16, another named SLI2 on chromosome 19 (SLI Consortium, 2002, 2004), and a third one on chromosome 13 (Bartlett et al., 2003, 2002). So far no candidate gene
researchers noted that the most profound problems were impaired speech articulation reminiscent of DVD (Hurst et al., 1990). Indeed, subsequent reports showed that word and nonword repetition tasks provided the most robust diagnostic marker of the disorder (Vargha-Khadem et al., 1998). Consistent with a diagnosis of DVD, the deficits of affected members are already evident when repeating shorter utterances, but become more dramatic with increases in syllable number and complexity (Watkins, Dronkers, & Vargha-Khadem, 2002). Tests of nonspeech praxis in the KE family indicate reduced performance when making simultaneous and sequential oral movements on command (Alcock, Passingham, Watkins, & Vargha-Khadem, 2000; Vargha-Khadem et al.). This is again reminiscent of other cases of DVD, which (as noted earlier) often show evidence of oral dyspraxia affecting nonspeech movements. Notably, affected members of the KE family are not significantly impaired in making single simple oral movements or in limb praxis, and they do not show gross oromotor dysfunction, for example, in feeding or swallowing (Alcock et al.).

The speech difficulties of the KE family are accompanied by linguistic impairments that are not confined to spoken language or to the expressive domain. For example, affected members perform worse than unaffected members on written tests of verbal fluency and nonword spelling, as well as in lexical decision tasks assessing receptive vocabulary, and they display significant deficits in reception and production of grammar (Watkins, Dronkers, et al., 2002), albeit not as selectively as proposed in initial linguistic studies (Gopnik, 1990). They show difficulties in generating word inflections and derivations, but tests of past-tense production indicate similar levels of deficits for both regular and irregular words, and their receptive impairments extend to syntax at the word-order level (Gopnik & Crago, 1991; Watkins et al.). The relationship between the motoric and linguistic aspects of the disorder in the KE family is the subject of continuing debate. One hypothesis is that a primary deficit in articulation could lead to more general impoverishment in language representation at many other levels (Watkins et al.). However, it is not clear why accurate speech articulation would be necessary to acquire all the other dimensions of language, and indeed it has been shown that it is not (Fourcin, 1975a, 1975b; Lenneberg, 1962; Ramus, Pidgeon, & Frith, 2003). A plausible alternative is that multiple components of language (articulation, phonology, the lexicon, morphology, and syntax) are concurrently affected, without one deficit being responsible for all the others.

The brains of affected people from the KE family appear overtly normal in structure on standard evaluation of MRI scans (Vargha-Khadem et al., 1998). However, statistical comparisons to unaffected members using voxel-based morphometry revealed subtle anomalies affecting multiple brain regions (Belton, Salmond, Watkins, Vargha-Khadem, & Gadian, 2003; Vargha-Khadem et al.; Watkins, Vargha-Khadem, et al., 2002). These include putative abnormalities in cortical language-related regions, with decreased gray matter density in the inferior frontal gyrus (containing Broca’s area) and increased density in the posterior portion of the superior temporal gyrus (Wernicke’s area). Notably, the sites of pathology suggested by such analyses were not limited to the cerebral cortex, but extended to the cerebellum and the striatum, where there were significant reductions in gray matter density in the caudate nucleus accompanied by increases in the putamen. Functional neuroimaging of the KE family during language tasks identified abnormal patterns of neural activation in the affected members, even under covert (silent) conditions when there was no requirement for spoken output (Liegeois et al., 2003). Broca’s area, other cortical language-related regions, and the putamen were significantly underactivated in affected individuals, who showed a more posterior and bilateral pattern of activation than unaffected members of the family. Sites of abnormalities include both areas associated with motor control and areas associated with language, mirroring the co-occurrence of motor and linguistic symptoms at the cognitive level. It has been suggested that abnormalities in development and function of distributed frontostriatal and/or frontocerebellar circuits are responsible for the DVD and accompanying linguistic impairments of the family (Vargha-Khadem, Gadian, Copp, & Mishkin, 2005).

Genomewide scanning of the KE family identified a region of chromosome 7q31 showing highly significant linkage to the disorder (Fisher et al., 1998), which was found to contain at least 70 genes (Lai et al., 2000). The search was cut short by the serendipitous discovery of another child affected with DVD (unrelated to the KE family) who had a gross chromosomal abnormality mapping within the region of interest (Lai et al., 2000, 2001). The child, known as CS, carried a balanced translocation involving exchange of material between chromosomes 5 and 7, with a breakage in the 7q31 band. It was shown that the chromosome 7 breakpoint of this child directly interrupted a novel gene, known as FOXP2 (Lai et al., 2001). Analysis of the gene in the KE family uncovered a heterozygous single-base change in all 15 affected members, which was not found in any unaffected members or in several hundred independent controls (Lai et al., 2001). This mutation was predicted to disrupt the function of the protein encoded by FOXP2, a hypothesis that has since been robustly confirmed (Groszer et al., 2008; Vernes et al., 2006).

FOXP2 encodes a protein belonging to the “Forkhead bOX” (or FOX) family of transcription factors, which act to regulate the expression of suites of genes during embryogenesis and development and in adulthood (Carlsson & Mahlapuu, 2002). The single-base missense mutation in the FOXP2 gene of affected KE family members alters one
samples from Neanderthals indicate that they also carried the human amino-acid substitutions, which would suggest a more ancient origin (at least 300,000–400,000 years) for the changes (Krause et al., 2007). At the moment, nothing is known about the functional consequences of these two amino-acid changes, but this finding raises the possibility that FOXP2 might have acquired new functional roles in humans.

In summary, FOXP2 may simultaneously contribute to human language pathways by at least two routes. The first route is through an evolutionarily conserved role related to motor sequencing and vocal learning, as observed in nonlinguistic species (studies of birds and mice). Deficits in these processes are likely to mediate parts of the DVD phenotype associated with FOXP2 disruption. Second, the human version may have putative novel functions that remain to be understood but that might conceivably contribute to more human-specific aspects of language.

genes, including two affecting different disorders. And indeed none of the genes associated with dyslexia has been associated with SLI or SSD so far. Second, there is no hint as yet of any overlap between dyslexia and SLI linkage sites, a fact that may seem puzzling. However, it is not all that surprising, given the statistical power of most linkage analyses (Marlow et al., 2003), and this gap may well be bridged sooner or later.
• Genetic linkage sites also overlap between SLI and autism. Furthermore, the CNTNAP2 gene, identified as a downstream target of FOXP2, also appears to be associated with common cases of SLI (Vernes et al., 2008), as well as with autistic spectrum disorder (Arking et al., 2008; Bakkaloglu et al., 2008). One study further suggested the association between CNTNAP2 and language abilities in autism, as measured by age at first word (Alarcon et al., 2008). This finding suggests etiological overlaps between SLI and autism.
development and that certain of their variations may alter the development and/or function of particular brain areas, which in turn are useful for some aspects of language acquisition. Thus these genes are necessary for normal language acquisition, but they are of course not sufficient, and furthermore they have not necessarily evolved for the purpose of language acquisition. Some of them (like FOXP2) have indeed undergone some human-specific modifications, apparently under selection pressure, and within a time frame that is compatible with the evolution of language in the human lineage. In such a case it is possible that these changes were one of the steps that made it possible for humans to develop language. Other known genes associated with language disorders also differ slightly between humans and other mammals, but so far there is no evidence that these differences are functionally significant and may have played a role in language evolution (Fisher & Francks, 2006). Nevertheless, this lack of evidence does not make those genes uninteresting.

The language faculty is very unlikely to be an entirely new organ that has appeared from scratch in the human brain (Fisher & Marcus, 2006). Rather, it should be seen as a product of “descent with modification,” that is, a new combination of old and possibly new cognitive ingredients (Marcus, 2006). Old ingredients may include auditory perception, primate vocalization, long-term, short-term, and working memory, sequence processing, a conceptual system, and many more. Of course each of these components must have to some extent evolved in human-specific ways in order to be harnessed for linguistic purposes, a fact that implies that some of the genes that were already implicated in the construction of the corresponding brain areas either have undergone some functional changes or have been triggered in new ways by upstream transcription factors and other regulatory elements. Thus even a human gene identical to an ancestral primate version could nowadays be important for language, if for instance it is involved in the construction of a relevant brain area in virtue of being expressed in new ways by a transcription factor such as FOXP2. As for new cognitive ingredients, it is not yet entirely settled what (if anything) should fall into that category. An influential and controversial proposal is that a capacity for recursion is the unique new cognitive ingredient required for language, together with an adaptation of “interfaces” between this new component and the old ones (Fitch, Hauser, & Chomsky, 2005; Hauser, Chomsky, & Fitch, 2002; but see Jackendoff & Pinker, 2005; Pinker & Jackendoff, 2005).

Taking this as a working hypothesis, it is unlikely that such a new cognitive capacity could have evolved overnight thanks to a single mutation. Even if it is truly new in a cognitive sense, it is likely to be much less novel in biological terms. For instance, a change in a single gene producing a signaling molecule (or a receptor, channel, etc.), could lead to creating new connections between two existing brain areas. Even an altogether new brain area could evolve relatively simply by having a modified transcription factor prenatally define new boundaries on the cortex, push around previously existing areas, and create the molecular conditions for a novel form of cortex in Brodmann’s sense: still the basic six layers, but with different relative importance, different patterns of internal and external connectivity, and different distributions of types of neurons across the layers. This would essentially be a new quantitative variation within a very general construction plan, requiring little new in terms of genetic material, but this area could nevertheless present novel input/output properties that, together with the adequate input and output connections, might perform an entirely novel information-processing function of great importance to language. Even if the ultimate form of that brain area turns out to require many genetic changes, there is no necessity that all the changes coevolved simultaneously. Once the area is delineated, further genetic changes could progressively shift its boundaries and refine its cellular makeup and thus its information-processing capabilities. Thus even the creation of a new neuroanatomical and cognitive module is not as unlikely as one might imagine and does not require improbable assumptions about dramatic genetic changes. Dramatic effects can be obtained by small changes in the way the construction plan is laid out.

In a nutshell, there is no need of a “gene for language” to explain the genetic basis of language. Having said that, it is now known that some human genes (perhaps 150 to 300) really are human specific, in the sense that they are entirely new concatenations of bits of other genes that have no equivalent in other species (Bailey et al., 2002; Nahon, 2003). Very little is known about those genes, but it is of course possible that one or more of them could have been important in the evolution of the neural bases for language. The point is that even if this is not the case, more standard genetic changes in ancestral genes would still be adequate to explain the emergence of a new cognitive ability such as language.

Perspectives. The picture laid out in this chapter is of course very incomplete. Many more genes associated with language disorders remain to be found, and genes associated with normal variations in language abilities remain even to be searched for. Nevertheless, the data that we have discussed are probably a reasonable illustration of what can be expected in the future. We can expect more genes involved in aspects of brain development (neuronal migration being just one possibility), as well as more transcription factors and other genes with a restricted cortical expression that may affect the development of more specific brain areas. Genes involved in neurotransmission, however, are currently out of the picture (although implicated in other disorders such as ADHD), but there is of course no guarantee that they will remain so.
Belton, E., Salmond, C. H., Watkins, K. E., Vargha-Khadem, F., & Gadian, D. G. (2003). Bilateral brain abnormalities associated with dominantly inherited verbal and orofacial dyspraxia. Hum. Brain Mapp., 18(3), 194–200.
Bickerton, D. (1984). The language bioprogram hypothesis. Behav. Brain Sci., 7, 173–221.
Bishop, D. V. M., & Adams, C. (1990). A prospective study of the relationship between specific language impairment, phonological disorders and reading retardation. J. Child Psychol. Psychiatry, 31(7), 1027–1050.
Bishop, D. V. M., Adams, C. V., Nation, K., & Rosen, S. (2005). Perception of transient nonspeech stimuli is normal in specific language impairment: Evidence from glide discrimination. Appl. Psycholinguist., 26, 175–194.
Bishop, D. V. M., Adams, C. V., & Norbury, C. F. (2006). Distinct genetic influences on grammar and phonological short-term memory deficits: Evidence from 6-year-old twins. Genes Brain Behav., 5(2), 158–169.
Bishop, D. V. M., Bishop, S. J., Bright, P., James, C., Delaney, T., & Tallal, P. (1999). Different origin of auditory and phonological processing problems in children with language impairment: Evidence from a twin study. J. Speech Lang. Hear. Res., 42(1), 155–168.
Bishop, D. V. M., & Snowling, M. J. (2004). Developmental dyslexia and specific language impairment: Same or different? Psychol. Bull., 130(6), 858–886.
Bonkowsky, J. L., & Chien, C. B. (2005). Molecular cloning and developmental expression of foxP2 in zebrafish. Dev. Dyn., 234(3), 740–746.
Booth, J. R., Wood, L., Lu, D., Houk, J. C., & Bitan, T. (2007). The role of the basal ganglia and cerebellum in language processing. Brain Res., 1133(1), 136–144.
Brkanac, Z., Chapman, N. H., Matsushita, M. M., Chun, L., Nielsen, K., Cochrane, E., et al. (2007). Evaluation of candidate genes for DYX1 and DYX2 in families with dyslexia. Am. J. Med. Genet. B Neuropsychiatr. Genet., 144B(4), 556–560.
Burbridge, T. J., Wang, Y., Volz, A. J., Peschansky, V. J., Lisann, L., Galaburda, A. M., et al. (2008). Postnatal analysis of the effect of embryonic knockdown and overexpression of candidate dyslexia susceptibility gene homolog Dcdc2 in the rat. Neuroscience, 152(3), 723–733.
Carlsson, P., & Mahlapuu, M. (2002). Forkhead transcription factors: Key players in development and metabolism. Dev. Biol., 250(1), 1–23.
Caspi, A., McClay, J., Moffitt, T. E., Mill, J., Martin, J., Craig, I. W., et al. (2002). Role of genotype in the cycle of violence in maltreated children. Science, 297(5582), 851–854.
Caspi, A., Sugden, K., Moffitt, T. E., Taylor, A., Craig, I. W., Harrington, H., et al. (2003). Influence of life stress on depression: Moderation by a polymorphism in the 5-HTT gene. Science, 301(5631), 386–389.
Chomsky, N. (1957). Syntactic structures. The Hague: Mouton.
Chomsky, N. (1959). A review of B. F. Skinner’s Verbal behavior. Language, 35, 26–58.
Chomsky, N. (1965). Aspects of the theory of syntax. Cambridge, MA: MIT Press.
Clark, M. M., & Plante, E. (1998). Morphology of the inferior frontal gyrus in developmentally language-disordered adults. Brain Lang., 61(2), 288–303.
Colledge, E., Bishop, D. V., Koeppen-Schomerus, G., Price, T. S., Happe, F. G., Eley, T. C., et al. (2002). The structure of language abilities at 4 years: A twin study. Dev. Psychol., 38(5), 749–757.
Cope, N., Harold, D., Hill, G., Moskvina, V., Stevenson, J., Holmans, P., et al. (2005). Strong evidence that KIAA0319 on chromosome 6p is a susceptibility gene for developmental dyslexia. Am. J. Hum. Genet., 76(4), 581–591.
De Fossé, L., Hodge, S. M., Makris, N., Kennedy, D. N., Caviness, V. S., Jr., McGrath, L., et al. (2004). Language-association cortex asymmetry in autism and specific language impairment. Ann. Neurol., 56(6), 757–766.
Dean, C., & Dresbach, T. (2006). Neuroligins and neurexins: Linking cell adhesion, synapse formation and cognitive function. Trends Neurosci., 29(1), 21–29.
Démonet, J.-F., Taylor, M. J., & Chaix, Y. (2004). Developmental dyslexia. Lancet, 363(9419), 1451–1460.
Eckert, M. (2004). Neuroanatomical markers for dyslexia: A review of dyslexia structural imaging studies. Neuroscientist, 10(4), 362–371.
Enard, W., Przeworski, M., Fisher, S. E., Lai, C. S., Wiebe, V., Kitano, T., et al. (2002). Molecular evolution of FOXP2, a gene involved in speech and language. Nature, 418(6900), 869–872.
Felsenfeld, S., & Plomin, R. (1997). Epidemiological and offspring analyses of developmental speech disorders using data from the Colorado Adoption Project. J. Speech Lang. Hear. Res., 40(4), 778–791.
Feuk, L., Kalervo, A., Lipsanen-Nyman, M., Skaug, J., Nakabayashi, K., Finucane, B., et al. (2006). Absence of a paternally inherited FOXP2 gene in developmental verbal dyspraxia. Am. J. Hum. Genet., 79(5), 965–972.
Fisher, S. E. (2006). Tangled webs: Tracing the connections between genes and cognition. Cognition, 101(2), 270–297.
Fisher, S. E., & DeFries, J. C. (2002). Developmental dyslexia: Genetic dissection of a complex cognitive trait. Nat. Rev. Neurosci., 3, 767–780.
Fisher, S. E., & Francks, C. (2006). Genes, cognition and dyslexia: Learning to read the genome. Trends Cogni. Sci., 10(6), 250–257.
Fisher, S. E., & Marcus, G. F. (2006). The eloquent ape: Genes, brains and the evolution of language. Nat. Rev. Genet., 7(1), 9–20.
Fisher, S. E., Vargha-Khadem, F., Watkins, K. E., Monaco, A. P., & Pembrey, M. E. (1998). Localisation of a gene implicated in a severe speech and language disorder. Nat. Genet., 18(2), 168–170.
Fitch, W. T., Hauser, M. D., & Chomsky, N. (2005). The evolution of the language faculty: Clarifications and implications. Cognition, 97(2), 179–210.
Flax, J. F., Realpe-Bonilla, T., Hirsch, L. S., Brzustowicz, L. M., Bartlett, C. W., & Tallal, P. (2003). Specific language impairment in families: Evidence for co-occurrence with reading impairments. J. Speech Lang. Hear. Res., 46(3), 530–543.
Fourcin, A. J. (1975a). Language development in the absence of expressive speech. In E. H. Lenneberg & E. Lenneberg (Eds.), Foundations of language development (Vol. 2, pp. 263–268). New York: Academic Press.
Fourcin, A. J. (1975b). Speech perception in the absence of speech productive ability. In N. O’Connor (Ed.), Language, cognitive deficits and retardation (pp. 33–43). London: Butterworths.
Friederici, A. D., & Kotz, S. A. (2003). The brain basis of syntactic processes: Functional imaging and lesion studies. NeuroImage, 20 (Suppl 1), S8–17.
Galaburda, A. M., & Kemper, T. L. (1979). Cytoarchitectonic abnormalities in developmental dyslexia: A case study. Ann. Neurol., 6(2), 94–100.
MacDermot, K. D., Bonora, E., Sykes, N., Coupe, A. M., Lai, C. S., Vernes, S. C., et al. (2005). Identification of FOXP2 truncation as a novel cause of developmental speech and language deficits. Am. J. Hum. Genet., 76(6), 1074–1080.
Marcus, G. F. (2006). Cognitive architecture and descent with modification. Cognition, 101(2), 443–465.
Marien, P., Engelborghs, S., Fabbro, F., & De Deyn, P. P. (2001). The lateralized linguistic cerebellum: A review and a new hypothesis. Brain Lang., 79(3), 580–600.
Marino, C., Giorda, R., Luisa Lorusso, M., Vanzin, L., Salandi, N., Nobile, M., et al. (2005). A family-based association study does not support DYX1C1 on 15q21.3 as a candidate gene in developmental dyslexia. Eur. J. Hum. Genet., 13(4), 491–499.
Marlow, A. J., Fisher, S. E., Francks, C., MacPhie, I. L., Cherny, S. S., Richardson, A. J., et al. (2003). Use of multivariate linkage analysis for dissection of a complex cognitive trait. Am. J. Hum. Genet., 72(3), 561–570.
Marshall, C. R., Harcourt-Brown, S., Ramus, F., & van der Lely, H. K. J. (submitted). Investigating phonological grammar in children with SLI and/or dyslexia: Is there compensation for place assimilation?
Marshall, C. R., Harcourt-Brown, S., Ramus, F., & van der Lely, H. K. J. (in press). The link between prosody and language skills in children with SLI and/or dyslexia. Int. J. Lang. Commun. Disord.
McArthur, G. M., Hogben, J. H., Edwards, V. T., Heath, S. M., & Mengler, E. D. (2000). On the “specifics” of specific reading disability and specific language impairment. J. Child Psychol. Psychiatry, 41(7), 869–874.
McBride, M. C., & Kemper, T. L. (1982). Pathogenesis of four-layered microgyric cortex in man. Acta Neuropathol., 57(2–3), 93–98.
Meng, H., Smith, S. D., Hager, K., Held, M., Liu, J., Olson, R. K., et al. (2005). DCDC2 is associated with reading disability and modulates neuronal development in the brain. Proc. Natl. Acad. Sci. USA, 102, 17053–17058.
Morley, M. E. (1967). The development and disorders of speech in childhood. Baltimore: Williams & Wilkins.
Nahon, J.-L. (2003). Birth of “human-specific” genes during primate evolution. Genetica, 118(2–3), 193–208.
Newbury, D. F., Bonora, E., Lamb, J. A., Fisher, S. E., Lai, C. S., Baird, G., et al. (2002). FOXP2 is not a major susceptibility gene for autism or specific language impairment. Am. J. Hum. Genet., 70(5), 1318–1327.
Oliver, B. R., & Plomin, R. (2007). Twins’ Early Development Study (TEDS): A multivariate, longitudinal genetic investigation of language, cognition and behavior problems from childhood through adolescence. Twin Res. Hum. Genet., 10(1), 96–105.
Paracchini, S., Scerri, T., & Monaco, A. P. (2007). The genetic lexicon of dyslexia. Annu. Rev. Genomics Hum. Genet., 8, 57–79.
Paracchini, S., Thomas, A., Castro, S., Lai, C., Paramasivam, M., Wang, Y., et al. (2006). The chromosome 6p22 haplotype associated with dyslexia reduces the expression of KIAA0319, a novel gene involved in neuronal migration. Hum. Mol. Genet., 15(10), 1659–1666.
Pinker, S., & Bloom, P. (1990). Natural language and natural selection. Behav. Brain Sci., 13(4), 707–784.
Pinker, S., & Jackendoff, R. (2005). The faculty of language: What’s special about it? Cognition, 95(2), 201–236.
Plante, E. (1991). MRI findings in the parents and siblings of specifically language-impaired boys. Brain Lang., 41(1), 67–80.
Plante, E., Swisher, L., Vance, R., & Rapcsak, S. (1991). MRI findings in boys with specific language impairment. Brain Lang., 41(1), 52–66.
Ramus, F. (2003). Developmental dyslexia: Specific phonological deficit or general sensorimotor dysfunction? Curr. Opin. Neurobiol., 13(2), 212–218.
Ramus, F. (2004). Neurobiology of dyslexia: A reinterpretation of the data. Trends Neurosci., 27(12), 720–726.
Ramus, F. (2006a). Genes, brain, and cognition: A roadmap for the cognitive scientist. Cognition, 101(2), 247–269.
Ramus, F. (2006b). A neurological model of dyslexia and other domain-specific developmental disorders with an associated sensorimotor syndrome. In G. D. Rosen (Ed.), The dyslexic brain: New pathways in neuroscience discovery (pp. 75–101). Mahwah, NJ: Lawrence Erlbaum.
Ramus, F., Peperkamp, S., Christophe, A., Jacquemot, C., Kouider, S., & Dupoux, E. (in press). A psycholinguistic perspective on the acquisition of phonology. In C. Fougeron, B. Kühnert, & E. Delais-Roussarie (Eds.), Papers in laboratory phonology X.
Ramus, F., Pidgeon, E., & Frith, U. (2003). The relationship between motor control and phonology in dyslexic children. J. Child Psychol. Psychiatry, 44(5), 712–722.
Ramus, F., & Szenkovits, G. (2008). What phonological deficit? Q. J. Exp. Psychol., 61(1), 129–141.
Redon, R., Ishikawa, S., Fitch, K. R., Feuk, L., Perry, G. H., Andrews, T. D., et al. (2006). Global variation in copy number in the human genome. Nature, 444(7118), 444–454.
Rosen, G. D., Bai, J., Wang, Y., Fiondella, C. G., Threlkeld, S. W., LoTurco, J. J., et al. (2007). Disruption of neuronal migration by RNAi of Dyx1c1 results in neocortical and hippocampal malformations. Cereb. Cortex, 17(11), 2562–2572.
Rosen, S. (2003). Auditory processing in dyslexia and specific language impairment: Is there a deficit? What is its nature? Does it explain anything? J. Phonetics, 31, 509–527.
Scerri, T. S., Fisher, S. E., Francks, C., MacPhie, I. L., Paracchini, S., Richardson, A. J., et al. (2004). Putative functional alleles of DYX1C1 are not associated with dyslexia susceptibility in a large sample of sibling pairs from the UK. J. Med. Genet., 41(11), 853–857.
Schumacher, J., Anthoni, H., Dahdouh, F., König, I. R., Hillmer, A. M., Kluck, N., et al. (2005). Strong genetic evidence for DCDC2 as a susceptibility gene for dyslexia. Am. J. Hum. Genet., 78, 52–62.
Serniclaes, W., Van Heghe, S., Mousty, P., Carré, R., & Sprenger-Charolles, L. (2004). Allophonic mode of speech perception in dyslexia. J. Exp. Child Psychol., 87, 336–361.
Shriberg, L. D., Ballard, K. J., Tomblin, J. B., Duffy, J. R., Odell, K. H., & Williams, C. A. (2006). Speech, prosody, and voice characteristics of a mother and daughter with a 7;13 translocation affecting FOXP2. J. Speech Lang. Hear. Res., 49(3), 500–525.
Shriberg, L. D., Tomblin, J. B., & McSweeny, J. L. (1999). Prevalence of speech delay in 6-year-old children and comorbidity with language impairment. J. Speech Lang. Hear. Res., 42(6), 1461–1481.
Shu, W., Cho, J. Y., Jiang, Y., Zhang, M., Weisz, D., Elder, G. A., et al. (2005). Altered ultrasonic vocalization in mice with a disruption in the Foxp2 gene. Proc. Natl. Acad. Sci. USA, 102(27), 9643–9648.
SLI Consortium. (2002). A genomewide scan identifies two novel loci involved in specific language impairment. Am. J. Hum. Genet., 70(2), 384–398.
White, S., Milne, E., Rosen, S., Hansen, P. C., Swettenham, J., Frith, U., et al. (2006). The role of sensorimotor impairments in dyslexia: A multiple case study of dyslexic children. Dev. Sci., 9(3), 237–255.
White, S. A., Fisher, S. E., Geschwind, D. H., Scharff, C., & Holy, T. E. (2006). Singing mice, songbirds, and more: Models for FOXP2 function and dysfunction in human speech and language. J. Neurosci., 26(41), 10376–10379.
Wigg, K. G., Couto, J. M., Feng, Y., Barr, C. L., Anderson, B., Cate-Carter, T. D., et al. (2004). Support for EKN1 as the susceptibility locus for dyslexia on 15q21. Mol. Psychiatry, 9(12), 1111–1121.
Zeesman, S., Nowaczyk, M. J., Teshima, I., Roberts, W., Cardy, J. O., Brian, J., et al. (2006). Speech and language impairment and oromotor dyspraxia due to deletion of 7q31 that involves FOXP2. Am. J. Med. Genet. A, 140(5), 509–514.
Zhang, J., Webb, D. M., & Podlaha, O. (2002). Accelerated protein evolution and origins of human-specific features: FOXP2 as an example. Genetics, 162(4), 1825–1835.
W. Tecumseh Fitch, School of Life Sciences, University of Vienna, Vienna, Austria

Abstract  The last decade has seen rapid and impressive progress in understanding the biology and evolution of complex “innovative” traits (e.g., insect wings or vertebrate eyes), and the fruits of this understanding are beginning to have an impact on our understanding of that most innovative of human traits: language. Although language, as a whole, is unique to Homo sapiens, many of the neural and cognitive mechanisms supporting language are shared with other species. An empirically based, mechanistic understanding of the evolution of language therefore requires research on both unique aspects of language (such as complex syntax) and broadly shared features. Evolutionary developmental biology (“evo-devo”) has added a new twist to this distinction, with the discovery that traits shared due to convergent evolution (such as vocal learning in humans and birds) may nonetheless be based on homologous genes and developmental pathways. Such “deep homologies” may involve convergence at the phenotypic level and homology at the genotypic level, and illustrate the need to rethink traditional ideas about homology. Studies of eyes, limbs, and body plans have revealed deep homologies in all these systems. Here, I suggest that language is also likely to have its share of deep homologies, and that this possibility provides a powerful rationale for investigations of convergently evolved traits in widely separated species. I illustrate the potential of this new approach with an exploration of the neural and genetic basis of vocal learning in humans and birds. I conclude that neuroethological investigations of diverse vertebrate species, from fish to birds to mice, will powerfully augment more traditional work on primates in the search for the neural mechanisms underlying language.

Humans (like most species) are unique in many ways, but language is the jewel in our cognitive crown. Language makes possible the greatest human cultural achievements, ranging from quantum physics to the novel to the Internet, because knowledge can be conveyed from mind to mind, across generations, with progressive refinements and elaboration. Without language and the community of minds that it creates, our species would be little more than an unusually clever bipedal ape. It is clear that human language rests on a unique, recently evolved, biological basis: our nearest living relatives, the chimpanzees, are unable to acquire language past the level of a young child. Sometime in our recent evolution, in the last 5–7 million years since we diverged from our last common ancestor with chimpanzees, a suite of important innovations have occurred, which together comprise the human capacity to acquire language. The nature of, and biological basis for, this capacity is a core focus of contemporary research on the biology of language.

Languages themselves, like English or Chinese, are obviously not inborn. We each acquire the language of our local community through an experience-dependent process of language acquisition. In Darwin’s words “language is an art . . . not a true instinct, for every language has to be learnt. It differs, however, widely from all ordinary arts, for man has an instinctive tendency to speak, as we see in the babble of our young children; whilst no child has an instinctive tendency to brew, bake, or write” (Darwin, 1871, p. 55). Today there is wide agreement that the language acquisition process has a strong biological basis and represents an “instinct to learn” that is part of every normal child’s genetic heritage. Although this human capacity is unique when considered as a whole (sometimes termed the “faculty of language in a broad sense,” or FLB), most of the component mechanisms underlying language are not unique to our species. Factors shared with chimpanzees and other primates (e.g., mechanisms underlying lexical acquisition) are traditionally believed to be homologies, traits that were present in our shared common ancestor. Other traits are shared with more distant biological relatives but not with other primates (e.g., mechanisms underlying imitative vocal learning). Such traits are traditionally considered to represent convergent evolution, analogy, or “homoplasy.”

This latter category is of particular interest to neuroscientists, because animal species sharing linguistically relevant traits like vocal learning (e.g., songbirds) are more amenable to experimental analysis than are chimpanzees or other nonhuman primates. It is becoming increasingly clear that traits that have evolved independently (“mere” analogies) may nonetheless be based on shared genetic developmental
Thus an adequate neural model will also need to incorporate a rich theory of neural epigenesis: the interactions between gene expression in independent cells as influenced by their local neural processing environment within the brain and the external sensory world. This understanding also appears far off, though recent progress driven by molecular techniques is cause for hope. It will require the abandonment of scala naturae models of brain evolution (cf. Striedter, 2004), and accepting the broad and deep similarities among all vertebrate brains, from fish to humans, while simultaneously allowing for the equally important differences among the brains of even closely related species.

The Linguistic Challenge  For the (psycho)linguist, the cognitive revolution in psychology and the associated generative revolution in linguistics have led to both considerable progress and a confusing profusion of theoretical frameworks and perspectives. While there is widespread agreement that the capacity for language has a strong biological basis, unique when considered as a whole to our species, there is little consensus about the detailed nature of this capacity. Plausible hypotheses include a continuum from a broad and general “capacity for culture” (Tomasello, 1999; Tomasello, Carpenter, Call, Behne, & Moll, 2005) to a detailed computational system specific to language and to humans (Pinker & Jackendoff, 2005). Intermediate positions include the possibility that language is built on a broadly shared cognitive foundation, with a few powerful but novel computational operations knitting shared mechanisms together (Hauser, Chomsky, & Fitch, 2002; Fitch, Hauser, & Chomsky, 2005). We can refer to all of the cognitive/neural mechanisms involved in language production, perception, or processing as the “faculty of language in a broad sense,” or FLB; the subset of processing mechanisms specific to language and to our species can then be referred to as the “faculty of language in a narrow sense,” or FLN (Hauser et al.). Because the nature of this latter, more specific, subset remains highly controversial (Fitch et al.; Pinker & Jackendoff, 2005), I conservatively take the FLB as a whole as the explanandum here, emphasizing that this is a multicomponent system. The FLB includes the mechanisms underlying phonetics, phonology, syntax, semantics, and pragmatics, without regard to whether each component is specific to humans or not. Many components of the FLB will be shared with nonhuman animals and thus can be studied comparatively, and such studies are required to determine the contents of FLN.

The Evolutionary Challenge  For the evolutionary biologist, language evolution raises a number of challenges as well. One is that the cultural transmission of language and the resultant process of language change (“glossogeny”) raise important issues beyond those raised by the interaction of ontogeny and phylogeny for all traits. Because language changes, the linguistic target of the learner has been filtered through the minds of previous humans, and this process leads to an “evolutionary” dynamic of its own (Darwin, 1871; Fitch, 2007b; Kirby, Smith, & Brighton, 2004; Pagel, Atkinson, & Meade, 2007). The importance of this interposition of an additional form of glossogenetic change, with a time scale between that of ontogeny and phylogeny, is increasingly recognized (Deacon, 1997; Hurford, 1990; Kirby et al.; Nettle, 1999), but scientists are only beginning to resolve some of the problems in gene/culture coevolution that this raises (Richerson & Boyd, 2005). Fortunately, we have other biological examples of cultural change for comparison (e.g., bird or whale “song”).

The same is not true for the central feature of semantics: the ability to express arbitrary thoughts. Current understanding of animal communication strongly suggests that humans are unique in this ability: if there are nonhuman species with open-ended semantics, they are remarkably clever at hiding these abilities from generations of dedicated ethologists (Bradbury & Vehrencamp, 1998; Hauser, 1996). Efforts at training nonhuman species with languagelike systems reveal both commonalities and significant differences (Kako, 1999; Tomasello & Call, 1997). One of the sharp differences is the apparent drive in our species to express our thoughts and feelings to others. This drive poses significant evolutionary problems, for the evolution of such apparently “altruistic” behavior is not predicted by standard models of the evolution of communication and cooperation (Axelrod & Hamilton, 1981; Dawkins & Krebs, 1978; Trivers, 1971; Zahavi, 1993), while the kin-selection route to cooperative communication (Hamilton, 1964; Maynard Smith & Harper, 2003) appears confounded by the fact that humans do not communicate exclusively with kin (cf. Fitch, 2004). Thus language evolution still poses deep evolutionary puzzles, if not the “embarrassment to evolutionary theory” once proclaimed by Premack (1986).

This brief survey should convince any skeptics that the biology and evolution of language involve a profusion of challenging scientific problems. For many years the topic of language evolution was neglected as a result. However, a series of methodological advances have combined with theoretical progress in a number of disciplines to reawaken hope that “biolinguistics,” as this field is sometimes called, can become a productive, empirical scientific discipline. This hope is exciting, because language is such a central aspect of human nature that a failure to understand it entails a fundamentally incomplete understanding of ourselves. I optimistically believe that the challenges I have sketched, though daunting, are within the realm of scientific inquiry and will eventually yield to concerted empirical research. Indeed, I think that with collaborative interdisciplinary effort
with lens and retina, of vertebrates and cephalopods (squid and octopus) is a textbook case of convergent evolution. But we now know that Pax6 controls eye development in both. We must thus hypothesize that some simple eyespots in the Ur-bilaterian were, in some sense, ancestral to modern complex eyes, but this hypothesis greatly stretches the traditional morphological definition of homology.

The solution to this problem has been to recognize a new possibility, that developmental programs, down to the detailed level of gene sequences and expression patterns during development, may be shared by virtue of common descent (and thus homologous in one sense) while the structures that they build are not. From a structural viewpoint, squid and human eyes are convergently evolved analogues, but from a developmental viewpoint the genetic tools utilized to build them are homologues. This superficially paradoxical situation results from deep homology, and it demands wholesale rethinking of traditional notions of homology (Rutishauser & Moline, 2005; Shubin et al., 1997). Given increasing evidence that developmental pathways are highly conserved, across all metazoan phyla, deep homology may be common and indeed may be the rule rather than the exception in development. This evidence raises the possibility that shared homologous developmental pathways underlying the evolution of such innovative traits as eyes, wings, limblessness in reptiles, or echolocation in bats might have implications for debates in the biology and evolution of language.

Deep homology and evolutionary innovation in cognition and language

A persistent debate in cognitive science, inherited by cognitive neuroscience, may be characterized as the “specialist/generalist” debate. At the “specialist” end of the continuum, often typified by neuroethologists, organisms are seen as supremely adapted to their particular way of life: echolocating bats have evolved specializations of hearing and vocal production, electroreceptive fish have evolved innovative electrical field production and perception mechanisms, and food-caching birds have evolved prodigious memories (Camhi, 1984; Schnitzler, Menne, Kober, & Heblich, 1983). However, this perspective has not gone unchallenged by researchers who point out that underlying mechanisms may be shared across superficially different cognitive domains (cf. Bolhuis & Macphail, 2001). Similarly, a dominant contemporary paradigm in human evolutionary psychology favors a view of the human mind/brain as a “Swiss army knife”: a series of domain-specific adaptive modules (Barkow, Cosmides, & Tooby, 1992). Opponents of this nativist/modularist view emphasize the deep similarities in cognitive performance both across cognitive domains and between species (for a balanced review see Laland & Brown, 2002). We might characterize these different viewpoints as emphasizing “modularist specialization” versus “generalist universality.” Both viewpoints have been applied to human language.

As cogently observed by Heyes (2003), these different viewpoints about cognitive evolution need not be in opposition. A multicomponent perspective on language suggests that some components (e.g., long-term memory for lexical items) might be broadly shared between species and cognitive domains, while others (e.g., mechanisms underlying recursive syntax) might be unique to our species (Hauser, Chomsky, et al., 2002). Such a mixed bag is expected from Darwinian evolution, and a theoretical framework for understanding language evolution must fully encompass both possibilities if we are to empirically resolve the issue. Arguments that “language” is monolithically modular or domain-general both oversimplify the situation to the detriment of empirical progress. Instead, humans have multiple mechanisms (biologically based predilections, biases, and constraints) crucial to language. Once we have accurately specified the particular mechanism of interest (e.g., vocal learning, syntax comprehension, lexical acquisition, “theory of mind,” etc.), it becomes an empirical question whether the components underlying such traits constitute widely shared mechanisms making up a general vertebrate or mammalian “cognitive tool kit” or highly specific components uniquely tuned to human language. Probably, some will be shared, and some will be unique to our species. We expect even “unique” mechanisms to function in a context of a suite of shared cognitive mechanisms that both predated them in evolutionary time and are shared with nonhuman species. A broad comparative approach is a logical prerequisite for addressing such questions, because no valid claim of “human uniqueness” can be made without a search for similar mechanisms in other animals.

The discovery of deep homology raises the fascinating possibility that even “unique” innovations, evolved during recent human evolution and isolated to our small branch of the primate lineage, might derive from more widely shared developmental processes. If so, the developmental pathways involved may be expected to impose certain constraints (or biases) on the system thus evolved, constraints that can be understood by examining the nature of the developmental process in nonhuman species. This approach provides an exciting empirical possibility: that the nature of the neural developmental processes that give humans our unique capacity for language can be probed, at a detailed molecular genetic level, by examining analogous processes in other vertebrates. Indeed, such inquiry could actually aid in gene discovery and thus help solve the “needle in a haystack” problem discussed earlier. To the extent that nature repeatedly uses the same developmental tool kit to solve similar evolutionary problems, generating deep homology, we can expect investigations of widely separated organisms, from honeybees to birds, to offer valuable cues to the genes and
do not typically develop species-typical song. The subsong phenomenon in birds provides a striking parallel with human babbling, which appears to play a similar functional role in human vocal learning (Doupe & Kuhl, 1999; Locke & Pearson, 1990). This, then, is a shared behavioral mechanism, similar to the neural mechanisms just discussed, that underpins the convergence of vocal learning in humans and birds.

Finally, recent genetic studies of birdsong learning demonstrate further similarities. Recent experimental work in the laboratory of Constance Scharff has now clearly documented a role in avian vocal learning of a gene that was originally discovered in the context of human vocal motor control and learning: forkhead-box P2, or FOXP2 (cf. Ramus and Fisher, chapter 58 in this volume). Like the HOX and PAX genes discussed earlier, FOX genes code for a transcription factor: a protein that binds to DNA and enhances or inhibits the expression of other genes. Also like HOX and PAX, FOX genes are members of a large and highly conserved family of transcription factors. Perhaps surprisingly, given this conservatism, a human-specific mutation in this gene leads to a specific deficit in oral and vocal motor control, first discovered in a family living in England (Vargha-Khadem, Gadian, Copp, & Mishkin, 2005; Vargha-Khadem & Passingham, 1990; Vargha-Khadem et al., 1998). The discovery of this gene (Fisher, Vargha-Khadem, Watkins, Monaco, & Pembrey, 1998) was groundbreaking in that it uncovered the first, presumably of many, genes involved in human cognition and language. A decade later FOXP2 remains the clearest example of a gene involved in spoken language, shared by all nonclinical human populations, and different from chimpanzees and other primates. In a striking new demonstration of deep homology, Scharff and her colleagues have shown that FOXP2 (and other closely related FOX genes, including FOXP1) is expressed in similar brain regions in birds and humans and plays a role in vocal learning (Haesler et al., 2004; Scharff & Haesler, 2005). In the most direct evidence of this role, Haesler and colleagues (2007) showed that a novel lentivirus-mediated knockdown of FOXP2 expression, via RNA interference,

Although there are presumably many genes involved in vocal motor control, and many more involved in language more broadly, the example of FOXP2 shows that deep homology is not a phenomenon restricted to peripheral morphology. Furthermore, it illustrates the power of a model system like birdsong to illuminate our understanding of human vocal control. Experiments like those just discussed in birds are impossible in humans for ethical reasons and in primates for practical reasons: primates are incapable of complex vocal learning. While FOXP2 knockout mice have been created that show various motor deficits (Shu et al., 2005) and knockin mice with human versions of FOXP2 have been engineered (Groszer et al., 2008), there is no evidence of vocal learning in mice, and thus interpretation of these results will remain problematic. Indeed, the developmental processes in which FOXP2 is involved almost certainly require a suite of other coevolved mechanisms that are not present in most mammals. In contrast, in species like songbirds with a fully developed vocal learning ability, the discovery of one of the genes involved opens the door to targeted search for other genes in the system. The use of large-scale gene-expression assays, targeted on genes known to be up-regulated during vocal learning in birds (Wada et al., 2006), will play an important role in the discovery of such genes. Thus the undisputed fact that birdsong and human speech evolved independently may turn out to be quite irrelevant to the question of the genetic mechanisms involved, which may well be largely homologous.

Conclusions and prospects

In this chapter I have argued that the discovery of deep homology is relevant to cognitive neuroscience, and in the case of FOXP2 to spoken language. However, I fully appreciate that speech is not language (Fitch, 2000) and constitutes just one component of a set of diverse mechanisms necessary for human language. What of these other mechanisms? In particular, what of semantics and syntax, which most scholars agree are more central to human language than is speech (though see Lieberman, 1998, 2000)? At present we know
decreased the quality and quantity of vocal learning in zebra far less about the neural and genetic mechanisms underlying
finches. The effect occurred only with injections in brain semantics or syntax than those underlying speech, but a
regions specifically evolved in vocal learning (Area X, combination of brain imaging, gene expression profiling,
homologous to basal ganglia in humans) and not injections and exploitation of the comparative method gives reasons
in nonsong areas, providing strong evidence for a key role for optimism concerning these components of language. I
of FOXP2 in vocal learning in birds. Although the ability will thus end by listing some open questions concerning
for vocal learning evolved separately in birds and humans, these additional factors.
the behavioral and neural mechanisms involved show that
there are fundamental similarities at the computational and Semantics A central challenge language poses for
circuit levels, and that the genetic mechanisms involved are evolutionary theory is the readiness humans exhibit to share
identical. Thus FOXP2 constitutes a deep homology: a con- information with other, unrelated, individuals. This drive is
served homologous developmental pathway underlying a striking in its absence in most animals, even in language-
convergently evolved trait. trained chimpanzees who have the machinery for transmitting
in an exciting new world, both for cognitive neuroscientists interested in tying complex cognition to the underlying neural architecture and for evolutionary biologists interested in uncovering the phylogenetic trajectory that led to human language. In this new era, the identification of deep homologies may play a central role. This is excellent news for comparative biologists, because it suggests that a far broader range of vertebrates, and even nonchordates, may offer valuable windows into the genetic basis of that most human of traits, language.
REFERENCES
Axelrod, R., & Hamilton, W. D. (1981). The evolution of cooperation. Science, 211, 1390–1396.
Barkow, J., Cosmides, L., & Tooby, J. (Eds.). (1992). The adapted mind. Oxford, UK: Oxford University Press.
Bolhuis, J. J., & Macphail, E. M. (2001). A critique of the neuroecology of learning and memory. Trends Cogn. Sci., 5(10), 426–433.
Bradbury, J. W., & Vehrencamp, S. L. (1998). Principles of animal communication. Sunderland, MA: Sinauer Associates.
Bugnyar, T., Stöwe, M., & Heinrich, B. (2004). Ravens, Corvus corax, follow gaze direction of humans around obstacles. Proc. R. Soc. Lond. B Biol. Sci., 271(1546), 1331–1336.
Call, J., & Tomasello, M. (2007). The gestural communication of apes and monkeys. London: Lawrence Erlbaum.
Camhi, J. M. (1984). Neuroethology: Nerve cells and the natural behavior of animals. Sunderland, MA: Sinauer Associates.
Carroll, S. B. (2003). Genetics and the making of Homo sapiens. Nature, 422(6934), 849–857.
Carroll, S. B. (2005). Endless forms most beautiful. New York: W. W. Norton.
Carroll, S. B. (2006). The making of the fittest: DNA and the ultimate forensic record of evolution. New York: W. W. Norton.
Carroll, S. B., Grenier, J. K., & Weatherbee, S. D. (2005). From DNA to diversity: Molecular genetics and the evolution of animal design (2nd ed.). Malden, MA: Blackwell.
Chen, F.-C., & Li, W.-H. (2001). Genomic divergences between humans and other hominoids and the effective population size of the common ancestor of humans and chimpanzees. Am. J. Hum. Genet., 68(2), 444–456.
Chimpanzee Sequencing and Analysis Consortium. (2005). Initial sequence of the chimpanzee genome and comparison with the human genome. Nature, 437, 69–87.
Christiansen, M., & Kirby, S. (Eds.). (2003). Language evolution. Oxford, UK: Oxford University Press.
Darwin, C. (1871). The descent of man and selection in relation to sex. London: John Murray.
Dawkins, R., & Krebs, J. R. (1978). Animal signals: Information or manipulation? In J. R. Krebs & N. B. Davies (Eds.), Behavioural ecology (pp. 282–309). Oxford, UK: Blackwell.
De Robertis, E. M., & Sasai, Y. (1996). A common plan for dorsoventral patterning in Bilateria. Nature, 380, 37–40.
Deacon, T. W. (1997). The symbolic species: The co-evolution of language and the brain. New York: W. W. Norton.
Doupe, A. J., & Kuhl, P. K. (1999). Birdsong and human speech: Common themes and mechanisms. Annu. Rev. Neurosci., 22, 567–631.
Duboule, D. (1994). Temporal colinearity and the phylotypic progression: A basis for the stability of a vertebrate Bauplan and the evolution of morphologies through heterochrony. Development (Suppl.), 1994, 135–142.
Duboule, D. (2007). The rise and fall of Hox gene clusters. Development, 134(14), 2549–2560.
Enard, W., Khaitovich, P., Klose, J., & Pääbo, S. (2002). Intra- and interspecific variation in primate gene expression patterns. Science, 296, 340–343.
Fay, J. C., Wyckoff, G. J., & Wu, C.-I. (2001). Positive and negative selection on the human genome. Genetics, 158, 1227–1234.
Fernald, R. D. (2000). Evolution of eyes. Curr. Opin. Neurobiol., 10, 444–450.
Fisher, S. E., Vargha-Khadem, F., Watkins, K. E., Monaco, A. P., & Pembrey, M. E. (1998). Localisation of a gene implicated in a severe speech and language disorder. Nat. Genet., 18(2), 168–170.
Fitch, W. T. (2000). The evolution of speech: A comparative review. Trends Cogn. Sci., 4(7), 258–267.
Fitch, W. T. (2004). Kin selection and “mother tongues”: A neglected component in language evolution. In D. K. Oller & U. Griebel (Eds.), Evolution of communication systems: A comparative approach (pp. 275–296). Cambridge, MA: MIT Press.
Fitch, W. T. (2007a). Evolving meaning: The roles of kin selection, allomothering and paternal care in language evolution. In C. Lyon, C. Nehaniv, & A. Cangelosi (Eds.), Emergence of communication and language (pp. 29–51). New York: Springer.
Fitch, W. T. (2007b). Linguistics: An invisible hand. Nature, 449, 665–667.
Fitch, W. T., & Hauser, M. D. (2004). Computational constraints on syntactic processing in a nonhuman primate. Science, 303, 377–380.
Fitch, W. T., Hauser, M. D., & Chomsky, N. (2005). The evolution of the language faculty: Clarifications and implications. Cognition, 97(2), 179–210.
Foster, K. R., Wenseleers, T., & Ratnieks, F. L. W. (2006). Kin selection is the key to altruism. Trends Ecol. Evol., 21(2), 57–60.
Friederici, A. D., Bahlmann, J., Heim, S., Schubotz, R. I., & Anwander, A. (2006). The brain differentiates human and non-human grammars: Functional localization and structural connectivity. Proc. Natl. Acad. Sci. USA, 103(7), 2458–2463.
Gehring, W. J., & Ikeo, K. (1999). Pax 6: Mastering eye morphogenesis and eye evolution. Trends Genet., 15, 371–377.
Gentner, T. Q., Fenn, K. M., Margoliash, D., & Nusbaum, H. C. (2006). Recursive syntactic pattern learning by songbirds. Nature, 440, 1204–1207.
Gómez, R. L., & Gerken, L. (1999). Artificial grammar learning by 1-year-olds leads to specific and abstract knowledge. Cognition, 70(2), 109–135.
Grice, H. P. (1975). Logic and conversation. In D. Davidson & G. Harman (Eds.), The logic of grammar (pp. 64–153). Encino, CA: Dickenson.
Groszer, M., Keays, D., Deacon, R., de Bono, J., Prasad-Mulcare, S., Gaub, S., et al. (2008). Impaired synaptic plasticity and motor learning in mice with a point mutation implicated in human speech deficits. Curr. Biol., 18, 354–362.
Haesler, S., Rochefort, C., Georgi, B., Licznerski, P., Osten, P., & Scharff, C. (2007). Incomplete and inaccurate vocal imitation after knockdown of FoxP2 in songbird basal ganglia nucleus Area X. PLoS Biol, 5, e321.
Haesler, S., Wada, K., Nshdejan, A., Morrisey, E. E., Lints, T., Jarvis, E. D., et al. (2004). FoxP2 expression in avian vocal learners and non-learners. J. Neurosci., 24, 3164–3175.
Schlaug, G., Jäncke, L., Huang, Y., & Steinmetz, H. (1995). In vivo evidence of structural brain asymmetry in musicians. Science, 267, 699–701.
Schnitzler, H.-U., Menne, D., Kober, R., & Heblich, K. (1983). The acoustical image of fluttering insects in echolocating bats. In F. Huber & H. Markl (Eds.), Neuroethology and behavioral physiology: Roots and growing pains (pp. 235–251). Berlin: Springer-Verlag.
Shu, W., Cho, J. Y., Jiang, Y., Zhang, M., Weisz, D., Elder, G. A., et al. (2005). Altered ultrasonic vocalization in mice with a disruption in the Foxp2 gene. Proc. Natl. Acad. Sci. USA, 102(27), 9643–9648.
Shubin, N. (2008). Your inner fish: A journey into the 3.5 billion-year history of the human body. London: Penguin Books.
Shubin, N., Tabin, C., & Carroll, S. (1997). Fossils, genes and the evolution of animal limbs. Nature, 388, 639–648.
Sperber, D., & Wilson, D. (1986). Relevance: Communication and cognition. Oxford, UK: Blackwell.
Striedter, G. F. (2004). Principles of brain evolution. Sunderland, MA: Sinauer.
Tomarev, S. I., Callaerts, P., Kos, L., Zinovieva, R., Halder, G., Gehring, W., et al. (1997). Squid Pax-6 and eye development. Proc. Natl. Acad. Sci. USA, 94(6), 2421–2426.
Tomasello, M. (1999). The cultural origins of human cognition. Cambridge, MA: Harvard University Press.
Tomasello, M. (2001). Cultural transmission: A view from chimpanzees and human infants. J. Cross Cult. Psychol., 32(2), 135–146.
Tomasello, M. (2007). If they’re so good at grammar, then why don’t they talk? Hints from apes’ and humans’ use of gestures. Lang. Learn. Dev., 3(2), 133–156.
Tomasello, M., & Call, J. (1997). Primate cognition. Oxford, UK: Oxford University Press.
Tomasello, M., Carpenter, M., Call, J., Behne, T., & Moll, H. (2005). Understanding and sharing intentions: The origins of cultural cognition. Behav. Brain Sci., 28, 675–735.
Toro, J. M., & Trobalón, J. B. (2005). Statistical computations over a speech stream in a rodent. Percept. Psychophys., 67(5), 867–875.
Trivers, R. L. (1971). The evolution of reciprocal altruism. Q. Rev. Biol., 46, 35–57.
van Heyningen, V., & Williamson, K. A. (2002). PAX6 in sensory development. Hum. Mol. Genet., 11(10), 1161–1167.
Vargha-Khadem, F., Gadian, D. G., Copp, A., & Mishkin, M. (2005). FOXP2 and the neuroanatomy of speech and language. Nat. Rev. Neurosci., 6(2), 131–138.
Vargha-Khadem, F., & Passingham, R. (1990). Speech and language deficits. Nature, 346, 226.
Vargha-Khadem, F., Watkins, K., Price, C. J., Ashburner, J., Alcock, K., Connelly, A., et al. (1998). Neural basis of an inherited speech and language disorder. Proc. Natl. Acad. Sci. USA, 95, 12695–12700.
von Melchner, L., Pallas, S. L., & Sur, M. (2000). Visual behaviour mediated by retinal projections directed to the auditory pathway. Nature, 404(6780), 871–876.
Wada, K., Howard, J. T., McConnell, P., Whitney, O., Lints, T., Rivas, M., et al. (2006). A molecular neuroethological approach for identifying and characterizing a cascade of behaviorally regulated genes. Proc. Natl. Acad. Sci. USA, 103(41), 15212–15217.
Wild, J. M. (1993). The avian nucleus retroambigualis: A nucleus for breathing, singing and calling. Brain Res., 606, 119–124.
Wilkins, A. S. (2002). The evolution of developmental pathways. Sunderland, MA: Sinauer.
Zahavi, A. (1993). The fallacy of conventional signalling. Proc. R. Soc. Lond. B Biol. Sci., 340, 227–230.
61 ledoux, schiller, and cain 905
64 hariri 945
65 mitchell and heatherton 953
66 beer 961
68 greene 987
Introduction
todd f. heatherton and
joseph e. ledoux
abstract To support attachment to the caregiver, altricial infants, including humans and rats, must identify, learn, and remember their caregiver. The early attachment process in the rat is distinguished by its behavior and underlying neural circuitry, which are both exquisitely suited to promoting the infant-caregiver relationship. Foremost, infants have the enhanced ability to acquire learned preferences, and this behavior is supported by the hyperfunctioning locus coeruleus and experience-induced changes in the olfactory bulb and anterior piriform cortex. But of equal importance, infants have a decreased ability to acquire learned aversions and fear, and this behavior is facilitated through attenuated amygdala activity. Presumably, this attachment circuitry constrains the infant to form only preferences for the caretaker regardless of the quality of the care received. With maturation and the end of the infant-caregiver attachment learning period, the developing rat’s social behavior and underlying circuitry transition to accommodate life outside the nest. However, early-life environmental and physiological stressors can alter the dynamic nature of this circuitry, particularly in respect to the amygdala. Such changes likely provide a framework for the lasting effects of early stress on emotional and cognitive outcome.
regina m. sullivan, stephanie moriceau, and charlis raineki Emotional Brain Institute, Nathan Kline Institute and Child and Adolescent Psychiatry, New York University Langone Medical Center, Orangeburg, New York
tania l. roth Department of Neurobiology and the Evelyn F. McKnight Brain Institute, University of Alabama at Birmingham, Birmingham, Alabama
The altricial infant’s social world revolves around the caregiver, and as evolution would have it, the infant’s emotional and social behaviors have been well crafted to form and maintain the infant-caregiver relationship. Infants of many altricial species must learn to recognize their caregiver as the target of their social behavior and continue to express proximity-seeking behaviors toward their caregiver to receive the food, protection, and warmth necessary for survival. This learning about the caregiver and the emergence of social behavior directed toward the caregiver are referred to as attachment, and this process has wide phylogenetic representation, including chicks, rodents, nonhuman primates, and humans.
The social environment of the developing altricial animal is very different at birth and weaning. For example, social behavior in the infant rat following birth is limited to proximity-seeking of the caregiver. And though the complex social behavior of the developing and preweanling rat pup still involves proximity-seeking of the caretaker, it now must also facilitate interactions with peers as well as the unfamiliar social world outside the nest. Thus the rapid maturation of most altricial mammals and the ultimate transition to adult social behavior require dynamic neural circuitry that is capable of responding to these contrasting environments.
In this chapter, we will review the literature on infant attachment learning and the underlying neural circuitry that mediate early infant-caregiver social interactions, the transitioning role of this behavior and circuitry during development, and the enduring effects of stress on both the attachment circuitry and adult behavior.
Early-life social behavior: Attachment learning
Altricial infants of many species, including the human and rat, must learn to identify, orient and approach, and prefer their own mother (Bowlby, 1969; Polan & Hofer, 1998; Shair, Masmela, Brunelli, & Hofer, 1997). This attachment learning begins during fetal life and continues after birth. For example, human infants recognize, orient toward, and prefer their own mother’s voice when tested within hours of birth (DeCasper & Fifer, 1980). Furthermore, two-day-old newborns will increase suckling at the sound of their own mother’s voice versus any other human voice, indicative of a learned preference for maternal voice (Fifer & Moon, 1995). This recognition is also true regarding maternal odor. At birth, a human infant who is placed on the mother’s ventrum will slowly approach a breast scented with amniotic fluid in preference to an untreated breast (Varendi, Porter, & Winberg, 1996), and a change in maternal diet, which will alter the odor of the amniotic fluid, directly influences this preferential response (Lecanuet & Schaal, 1996; Mennella, Johnson, & Beauchamp, 1995; Schaal, Marlier, & Soussignan, 1995). This early odor preference appears to be learned and modulates interaction with
sullivan, moriceau, raineki, and roth: infant fear and amygdala 889
the mother (Schaal, Marlier, & Soussignan, 1995; Sullivan et al., 1991).
Odor learning about the mother for infant attachment appears phylogenetically widespread. Similar learning controls early social behavior in rats (Alberts & May, 1984; Blass & Teicher, 1980; Polan & Hofer, 1998; Risser & Slotnick, 1987; Teicher, Flaum, Williams, Eckhert, & Lumia, 1978), rabbits (Distel & Hudson, 1985; Hudson, 1985; Hudson & Distel, 1983), and mice (Armstrong, DeVito, & Cleland, 2006; Coppola, Coltrane, & Arsov, 1994; M. B. Hennessy, Li, & Levine, 1980; Moles, Kieffer, & D’Amato, 2004). In these species, an infant’s social world after birth is the nest; therefore social behavior is mostly directed toward the mother. Indeed, during this time of dependency upon the mother, behavior is centered on maintaining contact with the mother, and this behavior is guided and controlled by the presence of maternal odor (Galef & Kaner, 1980; Leon, 1992). Specifically, maternal odor drives an infant to approach the mother and induces nipple attachment, while chemical removal of the natural maternal odor disrupts these behaviors (Hofer, Shair, & Singh, 1976; Teicher & Blass, 1977).
Infant rats learn their mother’s odor naturally within the nest (Brunjes & Alberts, 1979; Campbell, 1984; Galef & Kaner, 1980; Leon, 1975; Miller, Jagielo, & Spear, 1989; Pedersen, Williams, & Blass, 1982; Rudy & Cheatle, 1977; Sullivan, Brake, Hofer, & Williams, 1986; Sullivan, Hofer, & Brake, 1986; Sullivan, Wilson, Wong, Correa, & Leon, 1990; Terry & Johanson, 1996). However, this learning can be mimicked in classical conditioning experiments outside the nest (Camp & Rudy, 1988; Haroutunian & Campbell, 1979; Moriceau & Sullivan, 2006; Roth & Sullivan, 2005; Spear, 1978; Sullivan, Hofer, & Brake, 1986; Sullivan, Landers, Yeaman, & Wilson, 2000). Specifically, paired presentations of odor and reward are sufficient to produce both learned odor preferences (demonstrated by an approach to the odor) and nipple attachment. Furthermore, as illustrated in figure 60.1, a broad range of stimuli have been shown to function as a reward capable of producing learned odor preferences in infant rats outside the nest (Alberts & May, 1984; Brake, 1981; Galef & Sherry, 1973; Johanson & Hall, 1979; Johanson & Teicher, 1980; Leon, 1975; McLean, Darby-King, Sullivan, & King, 1993; Pedersen et al., 1982; Sullivan, Brake, et al., 1986; Sullivan, Hofer, & Brake, 1986; Weldon, Travis, & Kennedy, 1991; Wilson & Sullivan, 1994).
Though it is well established that the infant rat shows excellent learning and memory ability, particularly for learned odor preferences, we still understand very little of the neural framework that is responsible for this early behavior. Indeed, the neural structures that are well documented to support learned behavior in adult rats (e.g., hippocampus, frontal cortex, and amygdala) are not yet fully functional in infants. This suggests that the neural circuitry that mediates attachment learning and memory in the developing rat might differ from that in the adult. Our work as well as that of others has shown that indeed this is the case. Together, data implicate a unique neural framework in the infant that is responsible for the olfactory-based attachment learning.
Attachment learning circuitry
Both anatomical and physiological changes within the olfactory bulb have been documented to support odor preference learning and memory in infant rats (Fillion & Blass, 1986; Fleming, O’Day, & Kraemer, 1999; Johnson, Woo, Duong, Nguyen, & Leon, 1995; Moore, Jordan, & Wong, 1996; Sullivan & Wilson, 1991; Wilson, Sullivan, & Leon, 1987; Woo, Coopersmith, & Leon, 1987; Yuan, Harley, Darby-King, Neve, & McLean, 2003; Zhang, Okutani, Inoure, & Kaba, 2003). These changes occur not only in response to odors experienced in the nest (Sullivan et al., 1990), but also in controlled learning experiments outside the nest (Sullivan & Leon, 1986; Johnson et al., 1995; Moriceau & Sullivan,
Figure 60.1 This graph illustrates pup preference learning from stroking (mimicking mother licking) and shock (mimicking pain received from mother) and developmental changes. During a sensitive period, eight-day-old rat pups readily form a learned odor preference to contiguous presentations of odor and stroking or tail- or foot-shock (0.5 mA). With the close of the sensitive period, twelve-day-old pups no longer show learned-odor associations with stroking and, in contrast to younger pups, show learned aversions to odor-shock presentations.
Sullivan, Landers, et al., 2000). Finally, nonhuman primate and human infants exhibit strong proximity-seeking behavior toward an abusive mother (Harlow & Harlow, 1965; Maestripieri, Tomaszycki, & Carroll, 1999; Sanchez, Ladd, & Plotsky, 2001; Suomi, 2003).
Certain types of inhibitory learning, including fear of predators, cued- and contextual-fear conditioning, inhibitory conditioning, and passive avoidance, do not emerge until after postnatal days 10–11 (Blozovski & Cudennec, 1980; Camp & Rudy, 1988; Collier, Mast, Meyer, & Jacobs, 1979; Goldman & Tobach, 1967; Haroutunian & Campbell, 1979; Myslivecek, 1997; Stehouwer & Campbell, 1978; Sullivan, Landers, et al., 2000). Indeed, aversive stimuli such as moderate shock (as shown in figure 60.1) and tailpinch elicit learned odor preferences in infant rats (Camp & Rudy, 1988; Haroutunian & Campbell, 1979; Moriceau & Sullivan, 2006; Moriceau et al., 2006; Roth & Sullivan, 2005; Spear, 1978; Sullivan, Hofer, & Brake, 1986; Sullivan, Landers, et al., 2000), despite an apparent pain response (Barr, 1995; Collier & Bolles, 1980; Emerich, Scalzo, Enters, Spear, & Spear, 1985; Fitzgerald, 2005; Shair, Masmela, Brunelli, & Hofer, 1997; Stehouwer & Campbell, 1978).
What could explain this paradoxical preference learning to aversive stimuli? Evidence suggests that the lack of amygdala plasticity may play a leading role. Indeed, the limitations on fear learning, passive avoidance, active avoidance, and inhibitory conditioning during the sensitive period correspond to the period during development when the amygdala does not participate in the learning process (Blozovski & Cudennec, 1980; Collier et al., 1979; Myslivecek, 1997). Specifically, the amygdala is not evoked during infant learning in classical fear-conditioning or natural fear paradigms (Moriceau, Roth, Okotoghaide, & Sullivan, 2004; Moriceau & Sullivan, 2006; Moriceau et al., 2006; Roth & Sullivan, 2005; Wiedenmayer & Barr, 2001). On the contrary, in other animals ranging from Caenorhabditis elegans to rodents and humans, the amygdala is a brain area that is readily evoked by aversive stimuli in classical conditioning and natural fear paradigms (Blair, Schafe, Bauer, Rodrigues, & LeDoux, 2001; Davis, 1997; Fanselow & Gale, 2003; Fanselow & LeDoux, 1999; Herzog & Otto, 1997; Maren, 2003; McGaugh, Roozendaal, & Cahill, 1999; Pape & Stork, 2003; Pare, Quirk, & LeDoux, 2004; Rosenkranz & Grace, 2002; Sananes & Campbell, 1989; Schettino & Otto, 2001; Sevelinges, Gervais, Messaoudi, Granjon, & Mouly, 2004; Sigurdsson, Doyere, Cain, & LeDoux, 2007).
One contributing factor for the apparent lack of amygdala plasticity in early-life learning may be functional amygdala immaturity. The development of the amygdala is considered protracted and extends into adolescence, though peak neurogenesis and nuclei subdivision occur as early as the first week of life (Bayer, 1980; Berdel & Morys, 2000a, 2000b; Berdel, Morys, & Maciejewska, 1997; Bouwmeester, Smits, & Van Ree, 2002; Cunningham, Bhattacharyya, & Benes, 2002; Morys, Berdel, Jagalska-Majewska, & Luczynska, 1999; Nair & Gonzalez-Lima, 1999). Synaptic development begins to appear around PN5, with a dramatic increase between PN10 and PN20, but adult levels are not reached until early adolescence (Mizukawa, Tseng, & Otsuka, 1989). Furthermore, the typical long-term synaptic plasticity (LTP) that is inducible in the adult basolateral amygdala does not emerge until the end of the attachment learning period (Thompson, Sullivan, & Wilson, 2008).
Thus far in this review, we have discussed the literature that presents the case that it is difficult for infants to learn aversions. Sadly, attachment occurs regardless of inadequate caregiving. Specifically, children tolerate considerable abuse while remaining strongly attached to an abusive caretaker (Helfer, Kempe, & Krugman, 1997; Pollak, 2003). Moreover, attachment despite abuse is spread throughout the animal kingdom (Camp & Rudy, 1988; Maestripieri et al., 1999; Rajecki et al., 1978; Salzen, 1970; Sullivan, Landers, et al., 2000). An evolutionary explanation that we have provided for this paradoxical attachment is that it is better for an altricial infant to have a bad caretaker than no caretaker, as an altricial infant is dependent upon access to the mother’s milk, warmth, and protection (Hofer & Sullivan, 2001).
However, it is important to discuss the data that demonstrate that infants can learn aversions under some circumstances. Infant rats are able to learn to avoid odors if these are paired with malaise, such as that produced by a LiCl injection or 1.0-mA shock (Abate, Spear, & Molina, 2001; Alleva & Calamandrei, 1986; Campbell, 1984; Coopersmith, Lee, & Leon, 1986; Gruest, Richer, & Hars, 2004; Haroutunian & Campbell, 1979; J. W. Hennessy, Smotherman, & Levine, 1976; Hoffmann, Hunt, & Spear, 1990; Hoffmann, Molina, Kucharski, & Spear, 1987; Hunt, Molina, Rajachandran, Spear, & Spear, 1993; Hunt, Spear, & Spear, 1991; Miller, Molina, & Spear, 1990; Molina, Hoffmann, & Spear, 1986; Richardson & McNally, 2003; Rudy & Cheatle, 1983; Shionoya et al., 2006; Smotherman, 1982; Smotherman, Hennessy, & Levine, 1976; Smotherman & Robinson, 1985, 1990; Spear, 1978; Spear & Rudy, 1991; Stickrod, Kimble, & Smotherman, 1982). Interestingly, while in adult and preweaning rats, the amygdala responds to odor-malaise conditioning (Bermudez-Rattoni, Grijalva, Kiefer, & Garcia, 1986; Gale et al., 2004; LeDoux, 2000; Touzani & Sclafani, 2005), in infants, odor-malaise uses a nonamygdala neural circuit for odor aversion learning that includes the olfactory bulb (Raineki et al., 2009; Shionoya et al., 2006). Another remarkable constraint exists on aversion learning during infancy: If neonatal rats are nursing during odor-LiCl conditioning, this prevents a learned odor aversion and instead produces a learned odor preference (Gubernick & Alberts, 1984;
Martin & Alberts, 1979; Melcer, Alberts, & Gubernick, 1985; Shionoya et al., 2006).
Together, data indicate that aversions are not readily learned by infants, and we attribute this to unique neural circuitry optimized to facilitate attachment to the caregiver, regardless of the quality of care provided. In figure 60.2, we provide a model of our current understanding of this early social attachment circuit and how this circuitry changes to transition the developing animal from attachment learning to learning that can accommodate both learned preferences and avoidances.
Role of corticosterone in early life
As was discussed in the previous section, the developmental emergence of fear learning parallels amygdala plasticity and maturation (Berdel & Morys, 2000b; Berdel et al., 1997; Bouwmeester et al., 2002; Cunningham et al., 2002; Morys et al., 1999; Nair & Gonzalez-Lima, 1999; Schwob, Haberly, & Price, 1984; Schwob & Price, 1984b; Thompson et al., 2008; Wilson, Best, & Sullivan, 2004). Pharmacological manipulations of corticosterone (CORT) levels in the infant have allowed us to further define the early social circuit, and furthermore have provided us a platform to assess how changes in the early environment can affect the developing brain and subsequent transition to adultlike behavior.
In infant rats, CORT levels are relatively low (Henning, 1978; Walker, Sapolsky, Meaney, Vale, & Rivier, 1986), and the ability of most stressful stimuli, that is, restraint or shock (Grino, Paulmyer-Lacroix, Faudon, Renard, & Anglade, 1994; Levine, 1962, 2001; Rosenfeld, Suchecki, & Levine, 1992), to evoke CORT secretion is greatly reduced in comparison to that in older animals (Butte, Kakihana, Farnham, & Noble, 1973; Cate & Yasumura, 1975; Guillet & Michaelson, 1978; Guillet, Saffran, & Michaelson, 1980; Levine, 1967). This period of reduced hypothalamic-pituitary-adrenal (HPA) axis responsiveness during neonatal development has been termed the stress hyporesponsive period (SHRP). Sensory stimulation provided by the mother during nursing and grooming seems to control the pups’ low CORT levels (Levine, 1962; Van Oers, Kloet, Whelan, & Levine, 1998). In fact, prolonged maternal separation (∼24 hours), which deprives pups of maternal sensory stimulation, increases pups’ CORT levels (Levine, 2001), while the replacement of maternal sensory stimulation or maternal presence is able to reinstate the low level of CORT (Stanton & Levine, 1990; Stanton, Wallstrom, & Levine, 1987; Suchecki, Rosenfeld, & Levine, 1993). This reduced stress reactivity experienced by neonates is hypothesized to protect the developing organism from the negative influences of stress hormones (Sapolsky & Meaney, 1986). Indeed, high doses of CORT administered to the neonatal rat cause decreased mitosis, myelination, and altered granule cell genesis (Bohn, 1980). Furthermore, animals treated during infancy with CORT show reduced DNA content and brain size as well as impaired adult behavior (Bohn, 1984) and neuroendocrine function (Erkine, Geller, & Yuwiler, 1979). But it is important to note that moderate exposure to CORT during the developmental stage may be beneficial. For example, juvenile rats who were exposed to CORT via the dam’s milk show superior performance on the Morris water maze task, a test of spatial memory (McCornick et al., 2001).
In adolescents and adults, while stress is generally considered to be detrimental to social behavior, in moderation it has an adaptive role and facilitates social interactions, learning, and the expression of learned social behavior (DeVries, 2002; McEwen, 2002). Indeed, social stimuli directly influence the CORT response. Specifically, maternal presence in adolescent guinea pigs, peers in nonhuman primates, and mate presence in voles reduce CORT (Carter & Keverne, 2002; DeVries, Glasper, & Detillion, 2003; M. B. Hennessy, Maken, & Graves, 2002; M. B. Hennessy, Nigh, Sims, & Long, 1995), while social affiliation in humans blocks stress-induced CORT release (Kirschbaum, Klauer, Filipp, & Hellhammer, 1995). Higher stress levels can produce a defensive/offensive system under perceived danger that is
sullivan, moriceau, raineki, and roth: infant fear and amygdala 893
controlled, at least in part, by the amygdala (Korte, 2001). Thus an adult’s ability to balance the stress response determines whether social interactions occur or are inhibited by a fear/anxiety response.
Like the role of NE, the role of CORT in mediating learned behavior changes with maturation. While CORT is considered to play a modulatory role in adult fear conditioning (Corodimas, LeDoux, Gold, & Schulkin, 1994; Hui et al., 2004; Pugh, Tremblay, Fleshner, & Rudy, 1997; Roozendaal, Carmi, & McGaugh, 1996; Roozendaal, Quirarte, & McGaugh, 2002; Thompson, Erickson, Schulkin, & Rosen, 2004), CORT is able to switch whether infants learn an aversion or a preference. Specifically, increasing CORT by systemic injections or by intra-amygdala infusions during 0.5-mA odor-shock conditioning or presentation of naturally aversive stimuli is sufficient to elicit both a fear response (learned or unlearned fear) and amygdala participation in the infant (Moriceau et al., 2004; Moriceau & Sullivan, 2004, 2006; Takahashi, 1994). Maternal presence in older animals will lower CORT levels following stressful stimuli such as shock (Stanton et al., 1987; Suchecki et al., 1993), block fear learning, reinstate the attachment learning (preference), and prevent the participation of the amygdala in learning (Moriceau & Sullivan, 2006). After PN15, only fear will be learned during an odor-shock conditioning (Upton et al., in prep). Furthermore, we have verified the causal relationship between maternal presence and suppression of a shock-induced CORT release in pups’ odor aversion learning by systemic and intra-amygdala CORT infusions, which then permit pups to learn odor aversions even in the presence of the mother.
To summarize, data indicate that during the attachment period, the mother maintains low infant CORT levels and attenuates amygdala activation, preventing infants from responding to fear/aversive stimuli. Furthermore, through manipulation of CORT levels, we have highlighted a transition period of co-occurrence between the infant attachment learning system and the amygdala-dependent fear learning system (figure 60.3). This suggests that environmentally induced alterations of CORT levels and amygdala activity have the potential to disrupt the learning transition and underlying neural circuitry.
Consequences of early-life alterations in CORT and amygdala activity
The importance of the early environment in the regulation of behavior throughout the life span has been long recognized in both clinical and experimental studies. Indeed, it suffices to say that adult behavior is dependent on the caregiver and the quality of the caregiving environment. In particular, early-life experiences in the context of early social attachment have the most profound impact on adolescent and adult emotion and cognition in rodents, nonhuman primates, and humans (Bell & Denenberg, 1962; Denenberg, 1963; Harlow & Harlow, 1965; Levine, 1962; Rosenzweig et al., 1969; Schore, 2001). For example, the learned attachment odor in rodents is retained and preferred well into adulthood (Coopersmith & Leon, 1986; Fillion & Blass, 1986; Moore et al., 1996; Sevelinges et al., 2007; Shah, Oxley, Lovic, & Fleming, 2002; Woo & Leon, 1988), although the role of the odor in modifying behavior changes from that used during infancy (attachment to the mother) to that used in adulthood (reproduction). Specifically, following odor-stroke attachment learning in infancy, adult male rats exhibit enhanced sexual performance when exposed to the same odors that they experienced in infancy (Fillion & Blass, 1986; Moore et al., 1996).
These results are consistent with observations in other species on the influence of early experiences on adult mate preferences (Slagsvold, Hansen, Johannessen, & Lifjeld, 2002). Infant-learned attachment odors also continue to elicit both enhanced neural responses of the olfactory bulb and attenuated amygdala activation in the adult (Sevelinges et al., 2007). In particular, an odor that is paired with pain to produce the learned attachment odor attenuates adult fear conditioning as well as attenuating amygdala neural activity supporting the learning (Sevelinges et al., 2007).
Figure 60.3 This schematic represents pups’ developmental learning transitions with odor–0.5-mA shock conditioning. Our previous work suggests that PN10 is a transitional age for the onset of amygdala-dependent fear conditioning, although this can be advanced or retarded by increasing or decreasing CORT either pharmacologically or naturally (maternal presence lowers CORT).
behavior, ovulation, and sperm production (Gomes, Frantz, Sanvitto, Anselmo-Franci, & Lucion, 1999; Gomes et al., 2005; Mazaro & Lamano-Carvalho, 2006; Padoin, Cadore, Gomes, Barros, & Lucion, 2001; Raineki et al., 2008).
Summary and implications
The clinical literature has clearly shown that early-life adverse experiences (physical and/or emotional) can compromise adult mental health and social behavior (Gunnar & Quevedo, 2007; Teicher et al., 2003). The infant’s primary environment is the caregiver, and while the environment expands as the child becomes more mobile and independent, the child’s primary environmental force remains the caregiver. Clinical literature suggests that the infant’s relationship with the caregiver is of the utmost importance in shaping the child’s behavior (Gunnar & Quevedo, 2007; Schore, 2001; Teicher et al., 2003). For example, a child with a healthy and secure attachment is likely to mature into a mentally healthy adult, while a child in an abusive situation has a greater probability of experiencing adult mental dysfunction and physical health problems. Indeed, abusive relationships inside and outside of the attachment dyad have different clinical outcomes, with greater vulnerability to later mental health problems when the abuse occurs within the attachment system (Zeanah, Keyes, & Settles, 2003). The neurobiological effects of abuse within versus outside the attachment system remain elusive, especially with respect to the specific physical mechanism that causes such differential effects. Most of the clinical work suggests that both mental and physical health are compromised, with effects expressed during childhood and continuing through adolescence into adulthood (Bremner, 2003; Nemeroff, 2004).
The importance of these clinical studies has recently been highlighted with brain imaging research showing that these early adverse events are correlated with aberrant adult brain functioning, most notably in the limbic system, frontal cortex, and cerebellum (Bremner, 2003; Kaufman, Plotsky, Nemeroff, & Charney, 2000; Nemeroff, 2004; Teicher et al., 2003). Presumably, these changes arise through maltreatment-induced compromises in the trajectory of brain development (Stien & Kendall, 2004). However, owing to ethical and practical issues, functional imaging of the immature human brain is not feasible under most circumstances. These procedures generally require the child to remain motionless, and therefore anesthesia is required. Thus we are left guessing when a particular brain area’s function emerges on the basis of anatomical, neurotransmitter, and synaptic development. This problem is potentiated by difficulty in assessing connectivity within and between brain areas as well as deciding whether a child’s brain area has the same function as, or a different function from, that area in the adult.
Our rodent animal model of attachment enables us to assess some factors potentially associated with the clinical outcome. This model of attachment accommodates both abusive and pleasant attachment, yields an experimental paradigm in which the effects of both endogenous and exogenous pharmacological insults to the developing brain and behavior can be assessed, and allows us to identify the basic neural circuitry for early social behavior (attachment learning).
In this review, we have outlined the neural circuitry that underlies the infant rat’s attachment to the mother, highlighting its predisposition to support proximity-seeking behaviors. We suggest that the infant rat’s attachment circuit is due not simply to the absence or immaturity of brain structures but rather to the brain having unique characteristics (LC hyperfunctioning and amygdala hypofunctioning) that enable the infant to survive in the environment unique to infancy. More importantly, we have discussed how temporal characteristics of attachment can be manipulated by both environmental and physiological factors and how these factors may render the animal vulnerable to maladaptive brain development.
While human children show behavior within the attachment system (proximity seeking, tolerance of pain) remarkably similar to that of other species (rat, dog, and nonhuman primate), it is unclear whether this attachment circuitry exists in human infants. However, this similarity does suggest that the human infant’s brain is likely organized to ensure rapid, robust attachment to the caregiver. This further suggests that environmental and physiological factors may likewise alter attachment and adult emotional and cognitive well-being through disruption of the brain areas involved in the early attachment process.
acknowledgment This work was supported by grants NICHD-HD33402, NIMH H80603, and NSF-IBN0117234.
REFERENCES
Abate, P., Spear, N. E., & Molina, J. C. (2001). Fetal and infantile alcohol-mediated associative learning in the rat. Alcohol. Clin. Exp. Res., 25(7), 989–998.
Alberts, J. R., & May, B. (1984). Nonnutritive, thermotactile induction of filial huddling in rat pups. Dev. Psychobiol., 17(2), 161–181.
Alleva, E., & Calamandrei, G. (1986). Odor-aversion learning and retention span in neonatal mouse pups. Behav. Neural Biol., 46(3), 348–357.
Andersen, S. L., Lyss, P. J., Dumount, N. L., & Teicher, M. H. (1999). Enduring neurochemical effects of early maternal separation on limbic structures. Ann. NY Acad. Sci., 877, 756–759.
Armstrong, C. M., DeVito, L. M., & Cleland, T. A. (2006). One-trial associative odor learning in neonatal mice. Chem. Senses, 31(4), 343–349.
Avishai-Eliner, S., Gilles, E., Eghbal-Ahmadi, M., Bar-El, Y., & Baram, T. (2001). Altered regulation of gene and protein
catecholamine depletion on shock-precipitated wall climbing of infant rat pups. Dev. Psychobiol., 18(3), 215–227.
Erkine, M. S., Geller, E., & Yuwiler, A. (1979). Effects of neonatal hydrocortisone treatment on pituitary and adrenocortical responses to stress in young rats. Neuroendocrinology, 29, 191–199.
Fanselow, M. S., & Gale, G. D. (2003). The amygdala, fear, and memory. Ann. NY Acad. Sci., 985, 125–134.
Fanselow, M. S., & LeDoux, J. E. (1999). Why we think plasticity underlying Pavlovian fear conditioning occurs in the basolateral amygdala. Neuron, 23(2), 229–232.
Fenoglio, K. A., Brunson, K. L., Avishai-Eliner, S., Chen, Y., & Baram, T. Z. (2004). Region-specific onset of handling-induced changes in corticotropin-releasing factor and glucocorticoid receptor expression. Endocrinology, 145(6), 2702–2706.
Fenoglio, K. A., Brunson, K. L., & Baram, T. Z. (2006). Hippocampal neuroplasticity induced by early-life stress: Functional and molecular aspects. Front. Neuroendocrinol., 27(2), 180–192.
Ferry, B., & McGaugh, J. L. (2000). Role of amygdala norepinephrine in mediating stress hormone regulation in memory storage. Acta Pharmacol. Sin., 21(6), 481–493.
Fifer, W., & Moon, C. (1995). The effects of fetal experience with sound. In J. Lecanuet, W. Fifer, N. Krasnegor, & W. Smotherman (Eds.), Fetal development: A psychobiological perspective (pp. 351–368). Hillsdale, NJ: Lawrence Erlbaum.
Fillion, T., & Blass, E. (1986). Infantile experience with suckling odors determines adult sexual behavior in male rats. Science, 231(4739), 729–731.
Fitzgerald, M. (2005). The development of nociceptive circuits. Nat. Rev. Neurosci., 6, 507–520.
Fleming, A. S., O’Day, D. H., & Kraemer, G. W. (1999). Neurobiology of mother-infant interactions: Experience and central nervous system plasticity across development and generations. Neurosci. Biobehav. Rev., 23(5), 673–685.
Foote, S. L., Aston-Jones, G., & Bloom, F. E. (1980). Impulse activity of locus coeruleus neurons in awake rats and monkeys is a function of sensory stimulation and arousal. Proc. Natl. Acad. Sci. USA, 77(5), 3033–3037.
Gale, G. D., Anagnostaras, S. G., Godsil, B. P., Mitchell, S., Nozawa, T., Sage, J. R., et al. (2004). Role of the basolateral amygdala in the storage of fear memories across the adult lifetime of rats. J. Neurosci., 24(15), 3810–3815.
Galef, B. G., Jr., & Kaner, H. C. (1980). Establishment and maintenance of preference for natural and artificial olfactory stimuli in juvenile rats. J. Comp. Physiol. Psychol., 94(4), 588–595.
Galef, B. G., Jr., & Sherry, D. F. (1973). Mother’s milk: A medium for transmission of cues reflecting the flavor of mother’s diet. J. Comp. Physiol. Psychol., 83(3), 374–378.
Gilles, E. E., Schultz, L., & Baram, T. Z. (1996). Abnormal corticosterone regulation in an immature rat model of continuous chronic stress. Pediatr. Neurol., 15(2), 114–119.
Goldman, P. S., & Tobach, E. (1967). Behaviour modification in infant rats. Anim. Behav., 15(4), 559–562.
Gomes, C. M., Frantz, P. J., Sanvitto, G. L., Anselmo-Franci, J. A., & Lucion, A. B. (1999). Neonatal handling induces anovulatory estrous cycles in rats. Braz. J. Med. Biol. Res., 32(10), 1239–1242.
Gomes, C. M., Raineki, C., Ramos de Paula, P., Severino, G. S., Helena, C. V. V., Anselmo-Franci, J. A., et al. (2005). Neonatal handling and reproductive function in female rats. Endocrinology, 184, 435–445.
Grino, M., Paulmyer-Lacroix, O., Faudon, M., Renard, M., & Anglade, G. (1994). Blockade of alpha 2-adrenoceptors stimulates basal and stress-induced adrenocorticotropin secretion in the developing rat through a central mechanism independent from corticotropin-releasing factor and arginine vasopressin. Endocrinology, 135(6), 2549–2557.
Gruest, N., Richer, P., & Hars, B. (2004). Emergence of long-term memory for conditioned aversion in the rat fetus. Dev. Psychobiol., 44(3), 189–198.
Gubernick, D. J., & Alberts, J. R. (1984). A specialization of taste aversion learning during suckling and its weaning-associated transformation. Dev. Psychobiol., 17(6), 613–628.
Guillet, R., & Michaelson, S. M. (1978). Corticotropin responsiveness in the neonatal rat. Neuroendocrinology, 27(3–4), 119–125.
Guillet, R., Saffran, M., & Michaelson, S. M. (1980). Pituitary-adrenal response in neonatal rats. Endocrinology, 106(3), 991–994.
Gunnar, M., & Quevedo, K. (2007). The neurobiology of stress and development. Annu. Rev. Psychol., 58, 145–173.
Haberly, L. B. (2001). Parallel-distributed processing in olfactory cortex: New insights from morphological and physiological analysis of neuronal circuitry. Chem. Senses, 26(5), 551–576.
Hall, F. S., Wilkinson, L. S., Humby, T., & Robbins, T. W. (1999). Maternal deprivation of neonatal rats produces enduring changes in dopamine function. Synapse, 32(1), 37–43.
Harley, C. W., & Sara, S. J. (1992). Locus coeruleus bursts induced by glutamate trigger delayed perforant path spike amplitude potentiation in dentate gyrus. Exp. Brain Res., 89(3), 581–587.
Harlow, H., & Harlow, M. (1965). The affectional system. In A. Schrier, H. Harlow, & F. Stollnitz (Eds.), Behavior of nonhuman primates (Vol. 2). New York: Academic Press.
Haroutunian, V., & Campbell, B. A. (1979). Emergence of interoceptive and exteroceptive control of behavior in rats. Science, 205(4409), 927–929.
Hatalski, C. G., Guirquis, C., & Baram, T. Z. (1998). Corticotropin releasing factor mRNA expression in the hypothalamic paraventricular nucleus and the central nucleus of the amygdala is modulated by repeated acute stress in the immature rat. J. Neuroendocrinol., 10(9), 663–669.
Helfer, M. E., Kempe, R. S., & Krugman, R. D. (1997). The battered child. Chicago: University of Chicago Press.
Hennessy, J. W., Smotherman, W. P., & Levine, S. (1976). Conditioned taste aversion and the pituitary-adrenal system. Behav. Biol., 16(4), 413–424.
Hennessy, M. B., Li, J., & Levine, S. (1980). Infant responsiveness to maternal cues in mice of 2 inbred lines. Dev. Psychobiol., 13(1), 77–84.
Hennessy, M. B., Maken, D. S., & Graves, F. C. (2002). Presence of mother and unfamiliar female alters levels of testosterone, progesterone, cortisol, adrenocorticotropin, and behavior in maturing guinea pigs. Horm. Behav., 42(1), 42–52.
Hennessy, M. B., Nigh, C. K., Sims, M. L., & Long, S. J. (1995). Plasma cortisol and vocalization responses of postweaning age guinea pigs to maternal and sibling separation: Evidence for filial attachment after weaning. Dev. Psychobiol., 28(2), 103–115.
Henning, S. J. (1978). Plasma concentrations of total and free corticosterone during development in the rat. Am. J. Physiol., 235(5), E451–E456.
Herzog, C., & Otto, T. (1997). Odor-guided fear conditioning in rats: 2. Lesions of the anterior perirhinal cortex disrupt fear conditioned to the explicit conditioned stimulus but not to the training context. Behav. Neurosci., 111(6), 1265–1272.
Hess, E. (1962). Ethology: An approach to the complete analysis of behavior. In R. Brown, E. Galanter, E. Hess, & G. Mendler
expression of a learned association. J. Comp. Physiol. Psychol., 93(3), 430–445.
Mazaro, R., & Lamano-Carvalho, T. L. (2006). Prolonged deleterious effects of neonatal handling on reproductive parameters of pubertal male rats. Reprod. Fertil. Dev., 18(4), 497–500.
McCornick, C. M., Rioux, T., Fisher, R., Lang, K., MacLaury, K., & Teillon, S. M. (2001). Effects of neonatal corticosterone treatment on maze performance and HPA axis in juvenile rats. Physiol. Behav., 74, 371–379.
McEwen, B. S. (2002). Protective and damaging effects of stress mediators: The good and bad sides of the response to stress. Metabolism, 51(6, Suppl. 1), 2–4.
McGaugh, J. L. (2006). Make mild moments memorable: Add a little arousal. Trends Cogn. Sci., 10(8), 345–347.
McGaugh, J. L., Roozendaal, B., & Cahill, L. (1999). Modulation of memory storage by stress hormones and the amygdaloid complex. In M. Gazzaniga (Ed.), Cognitive neuroscience (2nd ed.). Cambridge, MA: MIT Press.
McLean, J. H., Darby-King, A., Sullivan, R. M., & King, S. R. (1993). Serotonergic influence on olfactory learning in the neonate rat. Behav. Neural Biol., 60(2), 152–162.
McLean, J. H., Harley, C. W., Darby-King, A., & Yuan, Q. (1999). pCREB in the neonate rat olfactory bulb is selectively and transiently increased by odor preference-conditioned training. Learn. Memory, 6(6), 608–618.
McLean, J. H., & Shipley, M. T. (1991). Postnatal development of the noradrenergic projection from locus coeruleus to the olfactory bulb in the rat. J. Comp. Neurol., 304(3), 467–477.
Meaney, M. J. (2001). Maternal care, gene expression, and the transmission of individual differences in stress reactivity across generations. Annu. Rev. Neurosci., 24, 1161–1192.
Meaney, M. J., Bhatnagar, S., Diorio, J., Larocque, S., Francis, D., O’Donnell, D., et al. (1993). Molecular basis for the development of individual differences in the hypothalamic-pituitary-adrenal stress response. Cell. Mol. Neurobiol., 13(4), 321–347.
Meaney, M. J., Diorio, J., Francis, D., Widdowson, J., LaPlante, P., Caldji, C., et al. (1996). Early environmental regulation of forebrain glucocorticoid receptor gene expression: Implications for adrenocortical responses to stress. Dev. Neurosci., 18(1–2), 49–72.
Meerlo, P., Horvath, K. M., Nagy, G. M., Bohus, B., & Koolhaas, J. M. (1999). The influence of postnatal handling on adult neuroendocrine and behavioural stress reactivity. J. Neuroendocrinol., 11(12), 925–933.
Melcer, T., Alberts, J. R., & Gubernick, D. J. (1985). Early weaning does not accelerate the expression of nursing-related taste aversions. Dev. Psychobiol., 18(5), 375–381.
Mennella, J. A., Johnson, A., & Beauchamp, G. K. (1995). Garlic ingestion by pregnant women alters the odor of amniotic fluid. Chem. Senses, 20(2), 207–209.
Miller, J. S., Jagielo, J. A., & Spear, N. E. (1989). Age-related differences in short-term retention of separable elements of an odor aversion. J. Exp. Psychol. [Anim. Behav.], 15(3), 194–201.
Miller, J. S., Molina, J. C., & Spear, N. E. (1990). Ontogenetic differences in the expression of odor-aversion learning in 4- and 8-day-old rats. Dev. Psychobiol., 23(4), 319–330.
Mizukawa, K., Tseng, I. M., & Otsuka, N. (1989). Quantitative electron microscopic analysis of postnatal development of zinc-positive nerve endings in the rat amygdala using Timm’s sulphide silver technique. Brain Res. Dev. Brain Res., 50(2), 197–203.
Moles, A., Kieffer, B. L., & D’Amato, F. R. (2004). Deficit in attachment behavior in mice lacking the mu-opioid receptor gene. Science, 304(5679), 1983–1986.
Molina, J. C., Hoffmann, H., & Spear, N. E. (1986). Conditioning of aversion to alcohol orosensory cues in 5- and 10-day rats: Subsequent reduction in alcohol ingestion. Dev. Psychobiol., 19(3), 175–183.
Moore, C., Jordan, L., & Wong, L. (1996). Early olfactory experience, novelty, and choice of sexual partner by male rats. Physiol. Behav., 60(5), 1361–1367.
Moriceau, S., Roth, T. L., Okotoghaide, T., & Sullivan, R. M. (2004). Corticosterone controls the developmental emergence of fear and amygdala function to predator odors in infant rat pups. Int. J. Dev. Neurosci., 22(5–6), 415–422.
Moriceau, S., & Sullivan, R. M. (2004). Unique neural circuitry for neonatal olfactory learning. J. Neurosci., 24(5), 1182–1189.
Moriceau, S., & Sullivan, R. M. (2006). Maternal presence serves as a switch between learning fear and attraction in infancy. Nat. Neurosci., 9(8), 1004–1006.
Moriceau, S., Wilson, D. A., Levine, S., & Sullivan, R. M. (2006). Dual circuitry for odor-shock conditioning during infancy: Corticosterone switches between fear and attraction via amygdala. J. Neurosci., 26(25), 6737–6748.
Morys, J., Berdel, B., Jagalska-Majewska, H., & Luczynska, A. (1999). The basolateral amygdaloid complex: Its development, morphology and functions. Folia Morphol. (Warsz.), 58(3, Suppl. 2), 29–46.
Myslivecek, J. (1997). Inhibitory learning and memory in newborn rats. Prog. Neurobiol., 53(4), 399–430.
Nair, H., & Gonzalez-Lima, F. (1999). Extinction of behavior in infant rats: Development of functional coupling between septal, hippocampal, and ventral tegmental regions. J. Neurosci., 19(19), 8646–8655.
Nakamura, S. T., & Sakaguchi, T. (1990). Development and plasticity of the locus coeruleus: A review of recent physiological and pharmacological experimentation. Prog. Neurobiol., 34, 505–526.
Nemeroff, C. B. (2004). Neurobiological consequences of childhood trauma. J. Clin. Psychiatry, 65(Suppl. 1), 18–28.
Okutani, F., Kaba, H., Takahashi, S., & Seto, K. (1998). The biphasic effects of locus coeruleus noradrenergic activation on dendrodendritic inhibition in the rat olfactory bulb. Brain Res., 783(2), 272–279.
Padoin, M. J., Cadore, L. P., Gomes, C. M., Barros, H. M., & Lucion, A. B. (2001). Long-lasting effects of neonatal stimulation on the behavior of rats. Behav. Neurosci., 115(6), 1332–1340.
Pape, H. C., & Stork, O. (2003). Genes and mechanisms in the amygdala involved in the formation of fear memory. Ann. NY Acad. Sci., 985, 92–105.
Pare, D., Quirk, G. J., & LeDoux, J. E. (2004). New vistas on amygdala networks in conditioned fear. J. Neurophysiol., 92(1), 1–9.
Pedersen, P. E., Williams, C. L., & Blass, E. M. (1982). Activation and odor conditioning of suckling behavior in 3-day-old albino rats. J. Exp. Psychol. [Anim. Behav.], 8(4), 329–341.
Plotsky, P. M., & Meaney, M. J. (1993). Early postnatal experience alters hypothalamic corticotropin-releasing factor (CRF) mRNA, median eminence CRF content and stress-induced release in adult rats. Mol. Brain Res., 18, 195–200.
Polan, H. J., & Hofer, M. A. (1998). Olfactory preference for mother over home nest shavings by newborn rats. Dev. Psychobiol., 33(1), 5–20.
Slagsvold, T., Hansen, B. T., Johannessen, L. E., & Lifjeld, J. T. (2002). Mate choice and imprinting in birds studied by cross-fostering in the wild. Proc. R. Soc. Lond. B Biol. Sci., 269(1499), 1449–1455.
Smotherman, W. P. (1982). Odor aversion learning by the rat fetus. Physiol. Behav., 29(5), 769–771.
Smotherman, W. P., Hennessy, J. W., & Levine, S. (1976). Plasma corticosterone levels during recovery from LiCl produced taste aversions. Behav. Biol., 16(4), 401–412.
Smotherman, W. P., & Robinson, S. R. (1985). The rat fetus in its environment: Behavioral adjustments to novel, familiar, aversive, and conditioned stimuli presented in utero. Behav. Neurosci., 99(3), 521–530.
Smotherman, W. P., & Robinson, S. R. (1990). Rat fetuses respond to chemical stimuli in gas phase. Physiol. Behav., 47(5), 863–868.
Sokoloff, G., & Blumberg, M. S. (1997). Thermogenic, respiratory, and ultrasonic responses of week-old rats across the transition from moderate to extreme cold exposure. Dev. Psychobiol., 30(3), 181–194.
Spear, N. (1978). Processing memories: Forgetting and retention. Hillsdale, NJ: Lawrence Erlbaum.
Spear, N. E., & Rudy, J. W. (1991). Tests of the ontogeny of learning and memory: Issues, methods, and results. In H. N. Shair, G. A. Barr, et al. (Eds.), Developmental psychobiology: New methods and changing concepts (pp. 84–113). New York: Oxford University Press.
Stanley, W. (1962). Differential human handling as reinforcing events and as treatments influencing later social behavior in Basenji puppies. Psychol. Reports, 10, 775–788.
Stanton, M., & Levine, S. (1990). Inhibition of infant glucocorticoid stress response: Specific role of maternal cues. Dev. Psychobiol., 23(5), 411–426.
Stanton, M. E., Wallstrom, J., & Levine, S. (1987). Maternal contact inhibits pituitary-adrenal stress responses in preweanling rats. Dev. Psychobiol., 20(2), 131–145.
Stehouwer, D. J., & Campbell, B. A. (1978). Habituation of the forelimb-withdrawal response in neonatal rats. J. Exp. Psychol. [Anim. Behav.], 4(2), 104–119.
Stickrod, G., Kimble, D. P., & Smotherman, W. P. (1982). In utero taste/odor aversion conditioning in the rat. Physiol. Behav., 28(1), 5–7.
Stien, P., & Kendall, J. (2004). Psychological trauma and the developing brain: Neurologically based interventions for troubled children. Binghamton, NY: Haworth Press.
Suchecki, D., Rosenfeld, P., & Levine, S. (1993). Maternal regulation of the hypothalamic-pituitary-adrenal axis in the infant rat: The roles of feeding and stroking. Brain Res., 75(2), 185–192.
Sullivan, R. M., Brake, S. C., Hofer, M. A., & Williams, C. L. (1986). Huddling and independent feeding of neonatal rats can be facilitated by a conditioned change in behavioral state. Dev. Psychobiol., 19(6), 625–635.
Sullivan, R. M., Hofer, M. A., & Brake, S. C. (1986). Olfactory-guided orientation in neonatal rats is enhanced by a conditioned change in behavioral state. Dev. Psychobiol., 19(6), 615–623.
Sullivan, R. M., Landers, M., Yeaman, B., & Wilson, D. A. (2000). Good memories of bad events in infancy. Nature, 407(6800), 38–39.
Sullivan, R. M., & Leon, M. (1986). Early olfactory learning induces an enhanced olfactory bulb response in young rats. Brain Res., 392(1–2), 278–282.
Sullivan, R. M., Stackenwalt, G., Nasr, F., Lemon, C., & Wilson, D. A. (2000). Association of an odor with activation of olfactory bulb noradrenergic beta-receptors or locus coeruleus stimulation is sufficient to produce learned approach responses to that odor in neonatal rats. Behav. Neurosci., 114(5), 957–962.
Sullivan, R. M., Taborsky-Barba, S., Mendoza, R., Itano, A., Lean, M., Cotman, C., Payne, T., & Lott, I. (1991). Olfactory classical conditioning in neonates. Pediatrics, 87, 511–518.
Sullivan, R. M., & Wilson, D. A. (1991). Neural correlates of conditioned odor avoidance in infant rats. Behav. Neurosci., 105(2), 307–312.
Sullivan, R. M., Wilson, D. A., Lemon, C., & Gerhardt, G. A. (1994). Bilateral 6-OHDA lesions of the locus coeruleus impair associative olfactory learning in newborn rats. Brain Res., 64(1–2), 306–309.
Sullivan, R. M., Wilson, D. A., & Leon, M. (1989). Norepinephrine and learning-induced plasticity in infant rat olfactory system. J. Neurosci., 9(11), 3998–4006.
Sullivan, R. M., Wilson, D. A., Wong, R., Correa, A., & Leon, M. (1990). Modified behavioral and olfactory bulb responses to maternal odors in preweanling rats. Brain Res. Dev. Brain Res., 53(2), 243–247.
Sullivan, R. M., Zyzak, D. R., Skierkowski, P., & Wilson, D. A. (1992). The role of olfactory bulb norepinephrine in early olfactory learning. Brain Res. Dev. Brain Res., 70(2), 279–282.
Suomi, S. J. (1997). Early determinants of behaviour: Evidence from primate studies. Br. Med. Bull., 53(1), 170–184.
Suomi, S. J. (2003). Gene-environment interactions and the neurobiology of social conflict. Ann. NY Acad. Sci., 1008, 132–139.
Swanson, L. W., & Petrovich, G. D. (1998). What is the amygdala? Trends Neurosci., 21(8), 323–331.
Takahashi, L. K. (1994). Organizing action of corticosterone on the development of behavioral inhibition in the preweanling rat. Brain Res. Dev. Brain Res., 81(1), 121–127.
Tao, X., Finkbeiner, S., Arnold, D. B., Shaywitz, A. J., & Greenberg, M. E. (1998). Ca2+ influx regulates BDNF transcription by a CREB family transcription factor-dependent mechanism. Neuron, 20(4), 709–726.
Teicher, M. H., Andersen, S. L., Polcari, A., Anderson, C. M., Navalta, C. P., & Kim, D. M. (2003). The neurobiological consequences of early stress and childhood maltreatment. Neurosci. Biobehav. Rev., 27(1–2), 33–44.
Teicher, M. H., & Blass, E. M. (1977). First suckling response of the newborn albino rat: The roles of olfaction and amniotic fluid. Science, 198(4317), 635–636.
Teicher, M. H., Flaum, L. E., Williams, M., Eckhert, S. J., & Lumia, A. R. (1978). Survival, growth and suckling behavior of neonatally bulbectomized rats. Physiol. Behav., 21(4), 553–561.
Terry, L. M., & Johanson, I. B. (1996). Effects of altered olfactory experiences on the development of infant rats’ responses to odors. Dev. Psychobiol., 29(4), 353–377.
Thompson, B. L., Erickson, K., Schulkin, J., & Rosen, J. B. (2004). Corticosterone facilitates retention of contextually conditioned fear and increases CRH mRNA expression in the amygdala. Behav. Brain Res., 149(2), 209–215.
Thompson, J. V., Sullivan, R. M., & Wilson, D. A. (2008). Developmental emergence of fear learning corresponds with changes in amygdala synaptic plasticity. Brain Res., 1200C, 58–65.
Touzani, K., & Sclafani, A. (2005). Critical role of amygdala in flavor but not taste preference learning in rats. Eur. J. Neurosci., 22(7), 1767–1774.
61 Emotional Reaction and Action: From Threat Processing to Goal-Directed Behavior
Joseph E. LeDoux, Daniela Schiller, and Christopher Cain
Abstract Fear was traditionally studied by using instrumental aversive responses, such as avoidance. This work on instrumental actions failed to lead to a clear understanding of the underlying fear circuitry. Over the past several decades, research on Pavlovian fear reactions has elucidated the circuits that mediate fear. Armed with this information, we return to a consideration of fear-based instrumental actions. While both reactions and actions depend on the amygdala, somewhat different circuits are involved. Fear reactions require connections from the lateral to the central nucleus and from there to the brain stem, while fear-based actions (or at least some such actions) involve connections from the lateral to the basal amygdala and from there to forebrain targets (possibly the striatum). Elucidating the neural mechanisms underlying interactions between Pavlovian and instrumental aversive learning will enhance our understanding of how the brain shifts from passive reactions to actions in the face of danger. This knowledge might aid in understanding how to break the vicious cycle of pathological avoidance in anxiety disorders and could also lead to better coping strategies and other therapeutic interventions.

In 1996, a bomb exploded in Olympic Village in Atlanta. As soon as the explosion occurred, everyone in the crowd was frozen in fear. A few seconds later, they began to run away. This scene, captured on video, illustrates two fundamental ways in which people respond in emotional situations. First we react, then we act (LeDoux, 1996a; LeDoux & Gorman, 2001). Reactions are inflexible responses that are automatically elicited by the stimulus, while actions are instrumental responses that are emitted. Reactions are inevitable consequences that have been programmed by evolution or individual experience, while actions are cognitively controlled responses that are flexibly selected at the moment to achieve a goal.

Research on the neural basis of emotion in animal models has traditionally focused on how emotional stimuli come to elicit fear reactions. Much has been learned over the past several decades, especially about how the brain acquires and controls reactions related to fear-arousing stimuli. Less is known about how emotional actions are acquired and controlled.

Because pathological states involving fear often include the performance of instrumental responses that are maladaptive, this is an important topic to understand. For example, a characteristic feature of pathological anxiety is the escape and avoidance of fear-arousing situations. These can be effective strategies in the short run, since they reduce exposure to situations in which fear-arousing stimuli occur and prevent threat escalation. Avoidance can also be effective in the long run, as long as it does not interfere with normal daily life. However, avoidance becomes maladaptive when routine activities are disrupted by excessive or inappropriate avoidance.

In this chapter, we give an overview of the relation between reaction and action in the context of fear. However, we will also consider positive or appetitive emotional states, as research in this area has provided both insights into, and challenges to, work on fear.

Behavioral distinctions between reaction and action

To clarify the distinction between emotional reaction and action requires that we consider these in more detail. To do this, we will put these into the context of a more general taxonomy of behavior.

Taxonomy of Behavior Many behaviors that people and other animals perform fall into one of four categories: reflex, reaction, action, and habit. (For other discussions of this topic, see Balleine & Dickinson, 1998; Cardinal, Parkinson, Hall, & Everitt, 2002; Lang & Davis, 2006; H. H. Yin & Knowlton, 2006.)

A reflex is a stimulus-evoked response that usually involves a single muscle or a limited group of muscles. A puff of air to the eye, for example, elicits a closure of that eye, while painful stimulation of the foot elicits withdrawal of that foot.

Joseph E. LeDoux, Daniela Schiller, and Christopher Cain: New York University, Center for Neural Science, New York, New York
Figure 61.1 Escape from fear learning represents instrumental learning that is motivated by fear and reinforced by fear reduction. We designed a new EFF task that controls for factors that made past results inconclusive about the role of fear reduction in EFF learning (Cain & LeDoux, 2007). One day after Pavlovian tone-shock pairings (Paired), tone-shock unpairings (Unpaired), or no training (Novel), rats were presented with 25 tone-alone presentations in a new context (EFF training, left). For EFF rats, rearing during a tone presentation led to its immediate termination (response-reinforcement pairing). Yoked rats received identical tone presentations independent of their behavior. One day after EFF training, rats were presented with a single, continuous 10-minute tone presentation to assess long-term EFF memory (EFF test, right). Rearing and freezing were assessed during both phases. Paired-EFF rats showed a twofold increase in the EFF escape response (rearing) during the training and testing session compared to Paired-Yoked rats (A and B). Unpaired-EFF and Novel-EFF rats had no fear of the CS and did not acquire the EFF response (enhanced rearing). Successful acquisition of this active escape response was also associated with less passive freezing to the tone (D; inset = minute-by-minute freezing during the EFF test). Further analysis demonstrated that EFF learning was response-specific and performance was motivated by fear (no difference in rearing in the absence of the CS; data not shown). Figure adapted with permission from Journal of Experimental Psychology: Animal Behavior Processes (Cain & LeDoux, 2007).
Figure 61.3 Groupings of amygdala nuclei. The various nuclei of the amygdala are often partitioned into an evolutionarily old division (the centromedial or corticomedial region) and an evolutionarily newer division (the basolateral region or basolateral complex). While these divisions have some value in understanding the phylogenetic and ontogenetic origins of the amygdala, they do not represent meaningful functional divisions, since functions are mediated by cells within much more localized regions, especially subnuclei and even subdivisions of subnuclei. Abbreviations: AB, accessory basal; AST, amygdalo-striatal transition area; B, basal nucleus; Ce, central nucleus; CPu, caudate putamen; CTX, cortex; La, lateral nucleus; M, medial nucleus. (See color plate 76.)
amygdala are continuous (anatomically and neurochemically)
with the lateral and medial divisions of the bed
nucleus of the stria terminalis and should be considered a
structural unit with functional significance, especially for
psychopathology. Swanson and Petrovich (1998) proposed a
more radical idea, arguing that “the amygdala,” whether
extended or not, does not exist as a structural unit. Instead,
they argue that the amygdala consists of regions that belong
to other regions or systems of the brain and that the designa-
tion “the amygdala” is not necessary. For example, in this
scheme, the lateral amygdala and basal amygdala are viewed
as nuclear extensions of the neocortex (rather than simply as
amygdala regions related to the neocortex), the central and
medial amygdala are ventral extensions of the striatum,
and the cortical nucleus is part of the olfactory system. While
this scheme has some merit, the present review focuses on
the organization and function of nuclei and subnuclei that
are traditionally said to be part of the amygdala, as these
perform their functions regardless of whether the amygdala
itself exists.
Figure 61.4 Subdivisions of the lateral nucleus of the amygdala. The lateral nucleus of the amygdala has three major subdivisions: dorsal (LAd), ventrolateral (LAvl), and medial (LAm). Each of these has additional partitions. The dorsal subnucleus, for example, contains a superior (sup) and inferior (inf) region. Cells in the superior region have been implicated in the acquisition of fear conditioning, and cells in the inferior region have been implicated in long-term memory storage (see text). Abbreviations: B, basal nucleus; CE, central nucleus. (See color plate 77.)

It is easy to be confused by the terminology that is used to describe the amygdala nuclei, as different sets of terms are used. This problem is especially acute with regard to the basolateral region of the amygdala. As was noted, the basolateral region consists of the lateral, basal, and accessory basal nuclei. However, in another terminological scheme, the basal and accessory basal nuclei are called the basolateral and basomedial nuclei, respectively. The use of the term basolateral to refer both to a specific nucleus (the basal or basolateral nucleus) and to the larger region that includes the lateral, basal, and accessory basal nuclei (the basolateral region) is the source of some difficulty, since authors do not always clearly identify whether they are referring to the nucleus or the region. Further, some studies use the term basolateral complex (BLA) to refer to the lateral and basal nuclei (and usually not the accessory basal).

Each of the nuclei of the amygdala can be further partitioned into subnuclei (Pitkänen, 2000; Pitkänen, Pikkarainen, Nurminen, & Ylinen, 1997). For example, the lateral nucleus has three major divisions: dorsal, ventrolateral, and medial (figure 61.4). Further division is also possible. The dorsal subdivision has a superior and an inferior region. The central nucleus, on the other hand, has lateral, capsular, and medial divisions. These subnuclear partitions of the lateral and central nuclei have turned out to have important functional significance.

Standard Model of Conditioned Fear Reactions: Serial Processing Within the Amygdala Two areas of the amygdala are generally considered to be especially important for the acquisition and expression of Pavlovian fear conditioning (figure 61.5). The lateral nucleus (LA) receives and integrates the CS and US and later processes the CS in the control of fear reactions. The central amygdala (CE), on the other hand, is especially involved in the expression of fear reactions. The basal nucleus (B) and the intercalated cell masses (ICT) also contribute.

The standard model evolved from a series of studies that first implicated the central amygdala (CE) starting in the late 1970s (Hitchcock & Davis, 1986; Kapp et al., 1979, 1984; LeDoux, Iwata, Cicchetti, & Reis, 1988). Given its connections to hypothalamic and brain stem areas that control species-typical behaviors and autonomic and endocrine responses related to fear, CE was viewed as important for the expression of fear-related CRs. Studies showing that the firing rate of cells in CE increased following CS-US pairing (Pascoe & Kapp, 1985) suggested that CE may also be a key site of plasticity in the formation of the CS-US association.

In the 1990s, emphasis began to shift to the lateral amygdala (LA) when it was shown that sensory inputs from the CS and US pathways mainly terminate in LA rather than in CE (Bordi & LeDoux, 1992; Clugnet & LeDoux, 1990; LeDoux, Sakaguchi, Iwata, & Reis, 1986; LeDoux et al., 1984), that damage to LA disrupts fear conditioning (LeDoux, Cicchetti, Xagoraris, & Romanski, 1990), and that CS and US inputs converge on single cells in LA
2001). These data are consistent with the notion that LA and CE are both sites of plasticity but that CE plasticity depends on LA plasticity.

The fact that LA is normally required for fear conditioning does not rule out the possibility that under some circumstances, fear conditioning can occur in animals with lesions of LA and B. Indeed, fear conditioning can occur in animals with lesions of LA and B when extensive training is given, especially in contextual conditioning but also in cued conditioning in some cases (Hall, Thomas, & Everitt, 2000; Killcross, Robbins, & Everitt, 1997; Lee, Dickinson, & Everitt, 2005; Maren, 1998, 1999).

A Challenge to the Standard Model: Parallel Processing Within the Amygdala In spite of the extensive evidence in support of the standard model, this view has been challenged. In these challenges, LA and B are usually considered together as an undifferentiated structure that is referred to as the BLA. It is argued that some aspects of fear conditioning can be mediated by the BLA independent of CE and that other aspects can be mediated by CE independent of BLA. Because BLA and CE are proposed to receive CS and US inputs separately and function independently, this is called the parallel model (Balleine & Killcross, 2006; Cardinal et al., 2002; Killcross et al., 1997; Killcross & Blundell, 2002).

First of all, proponents of the parallel view argue that fear conditioning can occur when BLA is damaged (Balleine & Killcross, 2006). While this is true, essentially all of the evidence for this has come from studies in which overtraining is used (Hall, Parkinson, Connor, Dickinson, & Everitt, 2000; Killcross et al., 1997; Lee et al., 2005; Maren, 1998, 1999). Most studies that have contributed to the serial model have involved training with a small number of trials (fewer than 10). With overtraining, weak intra-amygdala pathways, weak direct sensory inputs to CE, or more complex circuitous pathways involving other brain regions may undergo synaptic plasticity, potentiation of which may allow the pathways to be utilized in ways that are not possible with few training trials. Indeed, recent studies show that CE is necessary for learning in overtrained animals (Zimmerman, Rabinak, McLachlan, & Maren, 2007). Thus, while findings from overtraining are important and interesting, they likely involve different circuits than those involved in the rapid form of fear conditioning that occurs in natural situations in which organisms do not always have the opportunity to
from LA to B mediated aversive action learning (figure 61.7). Because the tasks that we and others before employed had several shortcomings, we developed a new task (Cain & LeDoux, 2007). This task will be especially useful in drawing firm conclusions about the role of amygdala areas in EFF learning and hence conditioned reinforcement.

Another study that is relevant was performed by Killcross and colleagues (1997). They trained rats on a concurrent Pavlovian conditioning and conditioned punishment task and assessed the effects of CE versus BLA lesions. In this task, a previously conditioned CS punishes and thereby weakens a previously established, appetitively motivated instrumental response. In the same sessions, Pavlovian conditioning was assessed by conditioned suppression of appetitive bar pressing, using a separate bar from that used for the punishment assessment. Their results suggested that BLA lesions, but not CE lesions, interfered with conditioned punishment, which matches well with the results of Amorapanth and colleagues. But they also found that conditioned suppression, the Pavlovian measure, was impaired by CE lesions but not by BLA lesions. This second finding is at odds with many studies on conditioned reactions supporting the serial model. However, this was a complex task involving significant overtraining, and the relevance of the results to simpler tasks that do not involve overtraining should be viewed cautiously (Lee et al., 2005). Specifically, typical fear-conditioning studies use fewer than 10 training trials (as few as one in many studies), whereas Killcross and colleagues used approximately 120 trials. Also, the instrumental task involved conditioned punishment rather than conditioned negative reinforcement. Further, they did not distinguish between LA and B. Nevertheless, the general conclusion regarding fear-based action is the same as that from our study: LA and B are essential for this influence of Pavlovian CS on instrumental learning.

These conclusions from aversive conditioning are also consistent with appetitive conditioning results that indicate that the BLA but not the CE contributes to conditioned reinforcement (Burns, Everitt, & Robbins, 1999). The appetitive work suggests further that connections from the BLA to the ventral striatum allow the conditioned reinforcer, formed in the BLA, to mediate the acquisition of the instrumental action. Extra-amygdala circuits will be discussed below.

Amygdala Contributions to Conditioned Motivation: Studies of PIT Excluding conditioned suppression studies, research on brain mechanisms of PIT mainly involves appetitive conditioning tasks in which a previously conditioned CS is used as a conditioned incentive and its effects on instrumental behavior are assessed (Balleine & Killcross, 2006; Corbit & Balleine, 2005; Everitt et al., 2003; Hall et al., 2001; Holland & Gallagher, 2003; Talmi et al., 2008). This work indicates that the amygdala is critical for appetitive PIT. Further, as will be described below, different amygdala nuclei appear to make distinct contributions to different forms of appetitive PIT.

Prior to PIT studies, early appetitive conditioning research identified separate contributions of CE and BLA to general affective responses and US-specific responses. Damage to
CE, but not to BLA, impaired the “preparatory” conditioned approach (Parkinson, Robbins, & Everitt, 2000) and orienting (Gallagher, Graham, & Holland, 1990). An opposite result was found with US-specific “consummatory” responses: Lesions of BLA, but not of CE, interfered with US devaluation effects (Hatfield, Han, Conley, Gallagher, & Holland, 1996) and potentiation of feeding by an appetitive CS (Gallagher & Holland, 1992; Holland & Petrovich, 2005; Holland, Petrovich, & Gallagher, 2002). Later studies pursued this dissociation, using appetitive PIT tasks (Balleine & Killcross, 2006; Corbit & Balleine, 2005; Everitt et al., 2003; Hall et al., 2001; Holland & Gallagher, 2003), in which Pavlovian CSs enhanced instrumental responding for food. Positive PIT (enhancement of responding) was observed when the USs in Pavlovian and instrumental conditioning matched (US-specific PIT) and when they were both appetitive but distinct (general PIT). BLA lesions disrupted US-specific PIT but had no effect on US-general PIT, while CE lesions disrupted general but not specific PIT. Contradictory results in earlier studies (Everitt et al., 2003; Hall et al., 2001; Holland & Gallagher, 2003) have been attributed to the failure to distinguish between specific and general PIT (Balleine & Killcross, 2006).

Balleine and Killcross (2006) and others (Cardinal et al., 2002; Killcross & Blundell, 2002) argue that the model inspired by appetitive conditioning results should also apply to aversive motivation, based in large part on the effects of amygdala lesions on conditioned suppression. In conditioned suppression, an aversive CS decreases food-motivated lever pressing. This is viewed as a form of general PIT. The finding that suppression depends on CE and not on BLA (Killcross et al., 1997) is thus used to argue that aversive