Annu. Rev. Neurosci. 1997. 20:331–53
Copyright © 1997 by Annual Reviews Inc. All rights reserved

NEUROBIOLOGY OF SPEECH PERCEPTION

R. Holly Fitch, Steve Miller, and Paula Tallal
Center for Molecular and Behavioral Neuroscience, Rutgers, The State University of New Jersey, 197 University Avenue, Newark, New Jersey 07102

KEY WORDS: auditory system, temporal processing, acoustic cues, timing, Wernicke's area

ABSTRACT

The mechanisms by which human speech is processed in the brain are reviewed from both behavioral and neurobiological perspectives. Special consideration is given to the separation of speech processing as a complex acoustic-processing task versus a linguistic task. Relevant animal research is reviewed, insofar as these data provide insight into the neurobiological basis of complex acoustic processing in the brain.

Introduction

The mechanisms through which the human brain can perceive and discriminate complex and rapidly changing components of human speech are not, as yet, well understood. At a very basic level, research has failed to determine the neural mechanisms that encode simple high-frequency sounds in less time than the refractory periods of individual neurons. On a larger scale, scientists have not yet deciphered the mechanisms by which the temporally complex acoustic signals of speech, composed of multiple frequencies (i.e. formants) changing over times as short as 10 ms, are encoded in auditory cortex and interpreted with the rich complexity of language—all within the time constraints of ongoing speech. Nevertheless, a variety of research approaches have been employed to address these questions, and resulting data—with special emphasis on data relevant to the neurobiological bases of speech perception—are reviewed here. The extent to which existing data support neurobiological representation of speech as a function of complex acoustic properties versus linguistic content is also examined.

Discussion of research avenues is roughly divided into neurological and behavioral studies of impaired populations; behavioral studies focusing on the relative importance of different acoustic cues to normal and abnormal speech perception; neuroimaging studies performed on intact or speech-impaired humans during speech perception and/or auditory processing tasks; and animal studies of discrimination for species-specific communicative stimuli or complex auditory stimuli, including speech. We do not cover issues pertaining to the neural bases of higher-order aspects of language function such as syntax or semantics (but see Garrett 1995, Petersen & Fiez 1993 for further discussion). Rather, we focus here on the elemental neural mechanisms for perceiving and discriminating the complex acoustic signals comprising the individual sounds of speech (phonemes).

The inclusion of studies on central auditory processing in nonhuman species within a paper on speech perception may be regarded by some as unjustly reductionistic, particularly by those who maintain the special nature of speech as compared to other forms of acoustic information processing. Indeed, it can be argued that the most fundamental controversy in the area of speech research pertains to whether human speech and language abilities emerged from a language "module" unique to the human brain (Wilkins & Wakefield 1995, Liberman & Mattingly 1985), or through the elaboration of more basic sensory, motor, and cognitive neural mechanisms common to human and nonhuman species (e.g. see Fitch et al 1993).
At the very least, studies that focus on auditory-processing mechanisms for complex signals, including speech, in nonhuman species provide essential comparative data that allow theories pertaining to the evolution of human speech perception to be empirically assessed.

What Is Speech?

Speech is an acoustic signal composed of multiple co-occurring frequencies, called formants. Whereas vowel sounds consist of specific combinations of temporally static, steady-state frequencies (see Figure 1), consonants contain variable onset times and rapid frequency transitions that move within syllables between the frequencies set by the place of articulation (determined by the position of the speech apparatus) and the frequencies required to produce the component vowels (see Figure 2).

Figure 1  Spectrograph for vowel stimuli /æ/ and /a/.

Figure 2  Spectrograph for consonant-vowel (CV) syllables /ba/ and /da/.

Although speakers vary widely in the size and shape of their vocal tract, and thus the fundamental frequency (pitch) of their speech, the relative combinations of frequency required to produce speech signals are consistent and replicable across speakers. Thus an /æ/ sound can be consistently identified by a normal listener regardless of the pitch of the speaker (e.g. female, male, or child). This phenomenon indicates that absolute frequency per se is not critical to speech recognition. Rather, recognition depends on the relative combination of co-occurring static or transient frequencies.

If a specific combination of frequencies consistently produced a specific speech sound, regardless of the preceding and ensuing sounds, then speech could be mapped according to relatively simple acoustic codes. However, the situation is more complex. Figure 3 shows spectrographs for a series of consonant-vowel (CV) syllables beginning with the same consonant. Note that the specific formant transitions produced when moving the articulators from the starting place for generation of the consonant to that of the ensuing vowel vary considerably, depending on the initial frequencies as well as those of the subsequent vowel. This is because the articulators "anticipate" the vowel even while the initial consonant is being produced. Thus, frequencies that comprise a consonant sound in a specific context also carry information indicating in advance which vowel is "coming up." This process is called co-articulation. Interestingly, despite such significant variations in acoustic temporal and spectral characteristics, normal listeners are able to consistently identify a given speech sound (phoneme) regardless of the context of other adjacent phonemes.

Figure 3  Acoustic structural differences in consonant formants as a function of ensuing vowel. [Reprinted with permission from Delattre et al (1955). Copyright © 1955 Acoustical Society of America.]

To compound this processing problem, many of the most significant cues needed to distinguish similar speech sounds (e.g. the cue that differentiates /ba/ from /da/) occur within extremely brief time windows. For example, the duration of the formant transitions shown in Figure 3 is approximately 40 ms, and these brief components of acoustic information must be encoded and identified within the time constraints of ongoing speech. How does the human brain accomplish this feat? In order to address this question, we need to understand the role of spectral and temporal acoustic structure in speech perception, as well as the mechanisms by which acoustic signals with considerable variance in acoustic structure come to be represented as the same speech sound (phoneme) in the brain.
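To make the acoustic structure described above concrete, the following minimal sketch (in Python) synthesizes two-"formant" CV-like tokens in which a brief F2 transition, rising for a /ba/-like token and falling for a /da/-like token, is the only difference between the syllables. This is a toy illustration of the formant patterns in Figures 2 and 3, not the stimuli used in the studies reviewed; the sampling rate and formant frequencies are assumptions chosen for illustration.

    import numpy as np

    FS = 16_000  # sampling rate in Hz (an assumption for this illustration)

    def glide(freqs, fs=FS):
        """Return a sinusoid whose instantaneous frequency follows `freqs`
        (one frequency value per output sample)."""
        phase = 2 * np.pi * np.cumsum(freqs) / fs
        return np.sin(phase)

    def cv_syllable(f2_onset, dur=0.250, transition=0.040, fs=FS):
        """Two-'formant' CV-like token: F1 is held static while F2 glides
        from `f2_onset` to the vowel's steady state over a ~40 ms transition."""
        n, n_tr = int(dur * fs), int(transition * fs)
        f1 = np.full(n, 700.0)                            # static F1 (/a/-like)
        f2 = np.full(n, 1200.0)                           # steady-state F2
        f2[:n_tr] = np.linspace(f2_onset, 1200.0, n_tr)   # rapid formant transition
        return 0.5 * (glide(f1, fs) + glide(f2, fs))

    ba_like = cv_syllable(f2_onset=900.0)    # rising F2 transition (/ba/-like)
    da_like = cv_syllable(f2_onset=1700.0)   # falling F2 transition (/da/-like)

Note that in this sketch the two tokens are identical except during the first 40 ms, which is precisely why such brief transitions carry so much of the burden in consonant identification.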
Basic Overview of the Auditory System

Acoustic information is encoded by physical transduction that occurs when sound waves (vibrations) are passed from the tympanic membrane into the cochlea. Vibrations within the organ of Corti cause hair cells to bend at the frequency of incoming sound, and this transduction excites the contacting spiral ganglion neurons, which pass the signal through the auditory nerve. The auditory nerve projects to the cochlear nucleus, at which point subsets of ascending fibers cross to the contralateral superior olive and inferior colliculus, and other fibers synapse on the ipsilateral superior olive. Projections from the superior olive move through the lateral lemniscus, reach the inferior colliculus (IC), and continue through the medial geniculate nucleus (MGN) of the thalamus, primary auditory cortex (A1), and secondary auditory cortex (A2) (see Aitkin et al 1984 or Miller & Towe 1979 for more detailed discussion of the auditory system). Within secondary auditory cortex, on the superior temporal gyrus, lies Wernicke's area, a region traditionally associated with the perception of speech (see Figure 4). More recent studies have also demonstrated activation in frontal regions of the brain (e.g. Broca's area) during speech and auditory perception tasks.

Figure 4  Wernicke's and Broca's areas in human temporal cortex. [From Geschwind (1979). Copyright © 1979 by Scientific American, Inc. All rights reserved.]

Cumulative studies on the functional organization of the auditory system for the processing of spectral (frequency) cues have delineated a highly organized pattern of tonotopic mapping throughout the primary ascending stations of the auditory system (e.g. Imig et al 1977, Imig & Morel 1983, Kelly 1980, Merzenich & Reid 1974). This pattern is reflected all the way from the functional organization of the cochlea itself to the central nucleus of the IC, the ventral nucleus of the MGN, and A1. In each of these regions, laminar tonotopic organization (or layered organization on the basis of preferred frequency) has been demonstrated.

Organization for temporal encoding within the auditory system is less well understood. Research suggests that temporal organization within the primary auditory stations may reflect a differential specialization of subregions based on temporal resolution (e.g. subregions of auditory structures may be specialized for different rates of temporal encoding) (Schreiner et al 1983; Schreiner & Urbas 1984, 1986, 1988; Schreiner & Langner 1988). Evidence also suggests that auditory relay centers are topographically organized, much as for spectral cues, on the basis of sensitivity to ranges of frequency modulation (Schreiner & Langner 1988). Moreover, temporal information appears to be encoded with the highest degree of resolution lower in the auditory system. As one progresses up the ascending pathway, auditory centers appear to respond to increasingly "segmented" components of temporal information and to correspondingly lose temporal resolution for individual bits (Rees & Moller 1983; Schreiner & Langner 1988).
Thus at the level of the auditory nerve, temporal information may be encoded on a virtual millisecond-to-millisecond basis (Palmer 1982), while in the medial geniculate nucleus and primary auditory cortex, neural responses as measured by electrophysiology appear to occur at the onset of segments of temporal change (Schreiner & Langner 1988). Such organization may render the cortex specialized for responding to sequences of events within complex signals. This organization may have special significance for speech perception, which requires transforming a complex acoustic signal characterized by significant variation into a specific representation within a given individual's speech repertoire. These higher-level questions of neural representation can be addressed, at least in part, by neuroimaging studies of cortical activity during speech processing tasks—a topic discussed below.

Neurologic and Behavioral Studies in Language-Impaired Populations

Studies of speech perception in humans with quantifiable brain damage, as well as morphometric and behavioral studies on individuals with impaired language processing abilities, have historically provided the foundation of knowledge pertaining to the involvement of specific neural regions in speech processing. The classic view of the neuroanatomical representation of speech perception in the brain was derived from clinical data obtained from adults with acquired lesions (gunshot wounds or stroke), which involved tissue damage to rather large and ill-defined cortical brain areas. Based on such neurological studies, speech production functions were ascribed to frontal regions anterior and superior to the sylvian fissure (i.e. Broca's area, Brodmann's area 44), whereas speech perception was thought to reside in temporal regions posterior and inferior to the sylvian fissure (i.e. Wernicke's area, Brodmann's area 22; see Figure 4). Little or no functional significance for speech processing was attributed to subcortical brain regions. In addition, neural substrates of speech were thought to reside in the left hemisphere, with little or no interference in speech and language processes following damage to homologous areas in the right hemisphere.

Ongoing research conducted over the past decade has, however, produced an overwhelming amount of data that has substantially modified our views on the neurobiology of human speech and language. For example, errors in phonological perception and production are seen when either Broca's or Wernicke's area is damaged (Blumstein 1995). Phonological perception errors are frequently restricted to patients with damage to the left hemisphere. Further, several studies have shown that left hemisphere damage interferes more with the perception of place of articulation (place) than with voice onset time (voicing) cues, whereas patients with right hemisphere damage and nondamaged controls process place and voicing cues equally well (Oscar-Berman et al 1975, Blumstein et al 1977, Miceli et al 1978, Perecman & Kellar 1981). These findings support, at least in part, a common pathway or representation for both the perception and production of phonological information. They also demonstrate that widespread cortical regions (both anterior and posterior), specifically in the left hemisphere, are involved in speech perception. Moreover, recent research has revealed a heretofore unrecognized role for subcortical structures in the processing of speech and language.
Data from extracellularly recorded neuronal activity in adults undergoing surgery for intractable epilepsy (Ojemann 1991) and behavioral deficits in individuals with acquired language syndromes (Damasio et al 1982, Robin & Schienberg 1990) have suggested more than a secondary role for subcortical brain structures, particularly the basal ganglia and thalamic nuclei, in language processes (see Crosson 1992 for a review of this literature). Importantly, the extent to which these subcortical brain areas are abnormal may directly reflect both the outcome and the nature of language impairments in children (Aram et al 1990, Ludlow et al 1986). For example, studies have demonstrated that children with left neocortical damage show language deficits that appear to be recoverable, whereas children with damage that extends into the caudate nucleus show more pervasive and lasting language deficits (Aram et al 1985). Research has also shown that damage to the thalamus, particularly the left ventro-lateral and pulvinar thalamic nuclei, impairs language processing (Crosson 1992, Mateer & Ojemann 1983). Moreover, Hugdahl and colleagues (1990) found that stimulation of the left thalamus increased the speech-processing ability of the right ear in patients undergoing surgery for Parkinsonian tremor, whereas left-sided lesions produced a marked decrease in this right ear advantage (see discussion of the dichotic listening paradigm, below). Hugdahl et al (1990) suggest that the thalamus may act as a gating way station for relevant speech and language information en route to target cortical areas, and that these thalamic mechanisms may be activated by stimulation and deactivated by lesions. Such studies have substantially enhanced our view of the diverse neural regions that contribute to speech perception.

Evidence has also shown concurrent deficits in processing rapidly changing acoustic cues and speech following left hemisphere damage (Efron 1963, Tallal & Newcombe 1978), suggesting a direct association between the perception of acoustic temporal cues and speech perception. Also, preliminary evidence suggests that circumscribed caudate damage may impair nonlingual auditory temporal processing (Tallal et al 1994), a finding consistent with reports that caudate volume as measured by MRI is reduced in language-learning impaired (LLI) children with severe auditory temporal processing deficits (Jernigan et al 1991). Combined with direct evidence from lesion studies showing that caudate damage impairs speech and language functions, these cumulative findings provide some neuropathological evidence linking basic auditory-processing mechanisms and speech-processing mechanisms at both subcortical and cortical levels.

MRI studies of dyslexic brains have also evidenced consistent anomalies, such as atypical patterns of cerebral lateralization (e.g. Jernigan et al 1991, Larsen et al 1990, Leonard et al 1993, Hynd & Semrud-Clikeman 1989). Neuropathological studies reveal cortical cellular anomalies (i.e. focal developmental cortical neuropathologies, including microgyric lesions, dysplasias, and ectopias) in the brains of human dyslexics (Galaburda & Kemper 1979, Galaburda et al 1985, Humphreys et al 1990). Research using animal models suggests that these anomalies arise as the consequence of interference with critical periods of neuromigration, possibly resulting from focal ischemic damage (e.g. Dvorak et al 1978, Humphreys et al 1991, Rosen et al 1992).
Animal studies examining the behavioral consequences of developmental neuropathologies (specifically, cortical microgyric lesions) have shown auditory temporal processing impairments in microgyric rats (Fitch et al 1994), and these processing deficits are highly similar to the auditory processing deficits seen in language-impaired children (Tallal & Piercy 1973, Tallal et al 1993). Combined findings suggest that anomalies evident in the brains of dyslexics may act, in part, to impair the encoding and consequent perception of rapidly changing auditory cues, such as those that occur in speech phonemes.

The hypothesis that acoustic-processing deficits could impair speech perception and lead to consequent disabilities in the development of the phonics necessary for language and reading development has strong historical support from studies of LLI children. These studies have shown that LLI children are profoundly impaired on rapid auditory processing tasks, even when nonlingual stimuli are used (Tallal & Piercy 1973, Tallal et al 1993; a sketch of this kind of task follows this section). Recent evidence has shown that specific auditory temporal training can significantly speed up auditory-processing rates in these children and that improved processing rates are correlated with improved speech processing (Merzenich et al 1996, Tallal et al 1996). This field of research highlights the critical dependence of speech perception upon more basic prerequisite mechanisms of auditory temporal encoding and perception. Further, the ability to assess auditory temporal processing abilities in infants (Benasich & Tallal 1996) may provide a valuable measure of an infant's risk for developing later speech and language difficulties.

Neuroanatomical evidence showing specific anomalies in the magnocellular neurons of the auditory thalamic nucleus (MGN) of dyslexic brains (Galaburda et al 1994) may also relate to auditory temporal processing deficits in this population. Galaburda et al (1994) speculate that these anomalies may parallel similar defects in the magnocellular subdivision of the visual thalamic nucleus (lateral geniculate nucleus, LGN) of dyslexics (Livingstone et al 1991). Specifically, magnocellular anomalies of the LGN are correlated with deficits in processing rapidly changing visual information (Livingstone et al 1991). The behavioral significance of magnocellular anomalies in the MGN, however, remained unclear until a recent series of animal studies showed that male rats with induced neocortical microgyria (like those seen in dyslexic brains) exhibited significant auditory temporal-processing deficits as well as specific anomalies in magnocellular cells of the MGN (Herman et al 1995). Moreover, behavioral performance on the auditory task was correlated with MGN morphology in sham, but not lesioned, males. These results suggested that cortico-thalamic sensory-processing systems are anatomically aberrant in subjects with neonatal cortical injury and that these anatomic defects are behaviorally expressed as sensory-processing deficits. This relationship could explain the coincidence of focal cortical anomalies and MGN morphological anomalies in human dyslexic brains, as well as the auditory-processing deficits seen in LLI populations. We anticipate that this animal model will provide an exciting new avenue for studying the neurobiological substrates of normal speech perception in humans.
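As a concrete illustration of the nonlingual rapid auditory processing tasks referred to above, the sketch below generates a two-tone sequence separated by a variable silent gap; shrinking the inter-stimulus interval is what makes the task hard for LLI listeners. The durations and frequencies are illustrative approximations in the style of Tallal & Piercy (1973), not their exact stimuli.

    import numpy as np

    FS = 16_000  # sampling rate in Hz (assumed)

    def tone(freq, dur=0.075, fs=FS):
        """A 75 ms pure tone (the original tasks used brief complex tones;
        pure tones are used here for simplicity)."""
        t = np.arange(int(dur * fs)) / fs
        return np.sin(2 * np.pi * freq * t)

    def two_tone_trial(order, isi, fs=FS):
        """Sequence of two brief tones separated by a silent inter-stimulus
        interval (ISI); very short ISIs tax rapid temporal processing."""
        elements = {"H": tone(305.0), "L": tone(100.0)}  # illustrative frequencies
        gap = np.zeros(int(isi * fs))
        return np.concatenate([elements[order[0]], gap, elements[order[1]]])

    easy_trial = two_tone_trial("HL", isi=0.400)  # long ISI: typically easy
    hard_trial = two_tone_trial("HL", isi=0.010)  # brief ISI: difficult for LLI children

The listener's task is simply to report the order of the two elements (e.g. high-low versus low-high); performance as a function of ISI indexes the rate at which successive acoustic events can be resolved.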
Psychophysical Studies of Speech Perception in Intact Humans

Behavioral studies on intact, healthy adults have also provided important insights into how speech is processed by the brain, including the mechanisms by which psychophysical parameters of speech relate to discrimination and perception. From a psychophysical perspective, speech can be regarded as a tertiary structure wherein frequency and amplitude cues are "wrapped up" inside a temporal structure or envelope. As noted above, evidence strongly suggests that these temporal changes in acoustic structure play a critical role in speech perception. Consistent with this assertion is the surprising demonstration that temporal cues could be extracted from human speech and applied to bands of noise (thus virtually eliminating spectral cues) and still be discriminated accurately as speech by normal listeners (Shannon et al 1995; see the vocoder sketch at the end of the Summary, below). The authors conclude that "the presentation of a dynamic temporal pattern in only a few broad spectral regions is sufficient for the recognition of speech" (Shannon et al 1995).

Behavioral studies on normal humans have also been used to address the issue of cerebral laterality for processing speech and language (see Bryden 1982 for review). As noted above, clinical evidence from neurologically impaired populations has provided long-standing evidence that speech and language functions are primarily (though not exclusively) lateralized to the left hemisphere in most adults. In order to study this phenomenon from a behavioral approach, scientists have employed the dichotic listening method, wherein competing information is presented simultaneously to the two ears and discrimination or recall is assessed separately for each ear (e.g. Kimura 1967). Since auditory pathways are primarily crossed, this method allows a relative comparison of the performance of each ear separately, from which the relative performance of the contralateral hemisphere can be inferred. Using the dichotic listening method, it has consistently been shown that most people exhibit a right-ear (left hemisphere) advantage (REA) for discriminating speech sounds (see Bryden 1982 for review). Interestingly, this REA appears to be strongly influenced by temporal parameters. Specifically, slowing down or speeding up the formant transitions within a speech syllable alters the magnitude of the REA for speech (Schwartz & Tallal 1980). These results indicate that specialization of the left hemisphere for rapid acoustic change may underlie specialization for speech perception. This hypothesis is further supported by evidence that intact human listeners exhibit an REA not only for speech, but also for tone sequences that change within the time frame critical to speech (Brown et al 1995). These findings are consistent with the hierarchical dependence of speech perception upon the basic ability to process rapid acoustic change, and they lead to a provocative hypothesis: that the left hemisphere regions that subserve speech may be fundamentally specialized for the processing of rapidly changing acoustic information—an assertion consistent with the observation of left hemisphere specialization for complex auditory discrimination in nonhuman species (e.g. Dewson 1977, Ehret 1987, Fitch et al 1993, Gaffan & Harrison 1991, Heffner & Heffner 1986, Petersen et al 1978).
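For concreteness, ear advantages of the kind just described are conventionally summarized with a laterality index; the (R - L)/(R + L) form sketched below is one common convention (formulas vary across studies), and the scores are hypothetical.

    def laterality_index(right_correct, left_correct):
        """Dichotic-listening laterality index, 100 * (R - L) / (R + L);
        positive values indicate a right-ear (left-hemisphere) advantage."""
        return 100.0 * (right_correct - left_correct) / (right_correct + left_correct)

    # Hypothetical per-ear recall scores from a dichotic syllable task:
    rea = laterality_index(right_correct=42, left_correct=30)  # ~16.7, an REA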
If these mammalian left hemisphere regions are indeed fundamentally specialized for processing rapid acoustic change, then it stands to reason that these regions would be recruited as the primary locus for higher-order speech perception and language-related functions in humans. This hypothesis requires further study from both developmental and comparative perspectives.

In sum, combined neuropathological and behavioral studies show that (a) speech perception can occur with limited spectral, but intact temporal, cues; (b) hemispheric specialization for speech appears to reflect processing of temporal acoustic cues; (c) children unable to process temporal acoustic cues exhibit concomitant language-processing deficits; (d) neurological damage to the left cerebral hemisphere and caudate nucleus appears to result in concurrent auditory temporal-processing and speech-perception deficits; and (e) anatomic anomalies are seen in the auditory thalamic nucleus (MGN) of human dyslexics, and in male rats such anomalies are correlated with deficiencies in rapid auditory processing. These diverse lines of research converge on the single notion that temporal change, or temporal structure, overlaid on frequency and intensity cues, is a necessary and perhaps sufficient acoustic cue in the processing of human speech. Moreover, defects in the neuroanatomic systems subserving these sensory functions appear to result in severely impaired auditory temporal processing—including, in humans, speech processing. In a developmental context, such deficits could lead to the impaired phonological perception and language development evident in clinically language-disabled populations.

Behavioral Studies and Categorical Perception of Speech

Research has shown that a given acoustic pattern is not directly translated, point-to-point, into a singular speech sound. Rather, the brain processes the complex acoustic information and assigns a label, based on known categories of speech signals, to represent each phoneme category. Although these phoneme categories may be innately predisposed, research with infants shows that the psychophysical boundaries of these categories are largely determined by experience beginning at birth (see Kuhl 1992 for review). Apparently, infants learn by listening to ongoing speech where to set up the acoustic "boundaries" that categorize the speech phonemes critical to their native language. This ability to create perceptual boundaries between speech sounds is termed categorical perception and results in a sharply defined (categorical) response pattern when an individual is presented with speech sounds that gradually change acoustically from one phoneme to another (Figure 5). This phenomenon was historically thought to distinguish the discrimination of human speech from other types of environmental or musical stimuli, where such categorical boundaries were not expected. Categorical perception by humans for speech thus became a defining hallmark for a unique "speech module" in the human brain, a module that did not apply to other complex auditory processes and had no homologous substrate in any nonhuman species.

The hypothesis that categorical perception provided psychophysical evidence for the unique nature of human speech processing was severely challenged in 1975, when Kuhl and Miller demonstrated that a nonhuman species (the chinchilla) showed categorical perception of human speech sounds [consonant-vowel (CV) syllables] (Kuhl 1981, 1987; Kuhl & Miller 1975, 1978).

Figure 5  Categorical perception of /ba/ versus /da/ as a function of linear change in stimulus acoustic properties (using computer-generated stimuli).
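The sharply nonlinear identification function depicted in Figure 5 can be illustrated with a toy labeling model: the acoustic parameter changes linearly along the continuum, yet the reported label jumps abruptly at a category boundary. The boundary location and slope below are hypothetical values chosen only to produce the characteristic sigmoid shape.

    import numpy as np

    def p_identify_da(f2_onset, boundary=1300.0, slope=0.02):
        """Idealized identification function: the probability of labeling a
        token /da/ rises sigmoidally around a category boundary, even though
        the acoustic parameter (F2 onset) changes linearly."""
        return 1.0 / (1.0 + np.exp(-slope * (f2_onset - boundary)))

    continuum = np.linspace(900.0, 1700.0, 9)  # a 9-step /ba/ -> /da/ continuum
    labels = p_identify_da(continuum)          # near 0, a steep rise, then near 1

In practice such curves are fit to listeners' responses, and the steepness of the fitted slope, together with enhanced discrimination at the boundary relative to within-category steps, is what operationally defines categorical perception.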
In an equally fatal blow to the "speech is special" hypothesis, psychophysicists began to show categorical perception for certain types of complex nonspeech acoustic signals in humans (Cutting & Rosner 1974, Miller et al 1976, Pisoni 1977). Since the 1970s, ongoing research has demonstrated categorical perception by monkeys of species-specific coos (May et al 1989) and by avian species of species-specific bird calls (e.g. see Dooling et al 1990). These results have severely weakened the argument that human speech perception is a unique process, and they highlight the critical value of animal research, as well as of studies of central auditory processing of complex temporal and spectral acoustic signals, in providing a comprehensive understanding of the neurobiological basis of human speech perception.

Neuroimaging and Electrophysiological Recording During Speech Perception and Auditory Processing Tasks

The advent of in vivo imaging in awake, behaving subjects has revolutionized our ability to relate structure to function in the human brain. These new tools have particularly important implications for the study of higher cortical processes such as speech perception. Structurally, three-dimensional magnetic resonance imaging (MRI) provides detailed static images of the in vivo brain, with spatial resolution on the order of cubic millimeters (Damasio & Frank 1992). Functionally, several imaging approaches currently provide images of in vivo brain activity during sensory stimulation, cognitive processing, or motor action. These approaches can be evaluated based on the extent to which they provide fine-grained temporal and/or spatial resolution. On the one hand, surface recordings of the electrical [electroencephalography (EEG), evoked potential (EP), event-related potential (ERP)] and magnetic [magnetoencephalography (MEG)] fields of the brain provide precise temporal resolution (1 ms) of brain activity, but relatively poor spatial resolution regarding localization or source (for review, see Hillyard 1993). In contrast, positron emission tomography (PET) and functional magnetic resonance imaging (fMRI) provide excellent spatial resolution for recording changes in regional cerebral blood flow and cellular metabolism during cognition, but these methods have poor temporal resolution as compared with that of MEG and ERPs, which is a significant drawback for studies of speech and other central auditory processes (Petersen & Fiez 1993).

Despite the relative limitations of each of these individual functional imaging approaches, converging results are nevertheless improving our understanding of the neural processes subserving speech. Consistent with the long-standing localization of Wernicke's area in the superior temporal gyrus, bilateral activation of the superior temporal gyrus has been reported during performance of an acoustic phonemic discrimination task using fMRI (Binder et al 1994a,b). Interestingly, however, Binder et al (1994a,b) also found activation of the superior temporal gyrus during the presentation of noise, and found no consistent semantic-specific differences in activation in response to different types of speech stimuli, including pseudo-words.
Other recent studies utilizing PET have shown colocalization of cortical activity with the presentation of rapidly changing tone sequences, as well as CV syllables and CVC words (Fiez et al 1995). Specifically, Brodmann area 45, a frontal region whose damage produces Broca's aphasia, showed activation for stimuli that incorporated rapid change (CV syllables, CVC words, and nonverbal tone triplets). Steady-state vowels, which are verbal but do not incorporate transient acoustic information, failed to significantly activate this area. These results pinpoint a functional organization of speech-processing regions that is based, at least in part, on acoustic structure, and not only on linguistic relevance or meaning. They show that similar secondary auditory cortical regions subserve the discrimination of complex acoustic stimuli, both speech and nonspeech.

Activation of this same frontal left hemisphere region (Broca's area) has also been observed during a phonemic discrimination task, as measured by PET (Zatorre et al 1992). Moreover, right hemisphere activation of this region occurred for pitch discrimination, an acoustic processing task that does not require the integration of information changing within brief time windows. More recently, Shaywitz et al (1995), using fMRI, observed activation of the left frontal area in men during a phonemic discrimination task, whereas they observed a bilateral pattern of activation of this area in women performing the same task. The latter result is consistent with the results of dichotic listening studies, which show a stronger REA in men than in women for discriminating speech sounds (Kimura & Harshman 1984, McGlone 1980). Although the studies by Zatorre et al (1992) and Shaywitz et al (1995) utilized phonemic stimuli carrying both acoustic and linguistic relevance, which makes separation of the relevant features underlying region-specific brain activation difficult, the PET studies by Fiez and colleagues (1995) specifically demonstrated activity in the same frontal cortical regions with speech and nonspeech auditory stimuli that incorporated rapid acoustic change. The latter result suggests that the findings of Zatorre et al (1992) and Shaywitz et al (1995) could be interpreted to reflect activation by rapidly changing acoustic spectra (including speech) rather than by phonological stimuli per se.

Functional imaging studies with LLI populations have also provided insight into how speech is processed in the brain under normal and anomalous conditions. For example, a series of imaging studies on LLI children supports the view that anomalies in anatomic asymmetry in frontal and posterior temporal regions may relate to the inability of LLI individuals to activate these same areas (Neville et al 1993, Rumsey et al 1992, Hagman et al 1992). Research by Stefanatos and colleagues (1989) found that children with primary receptive language problems showed reduced auditory-evoked responses specific to frequency-modulated tones. Neville and colleagues (1993) examined electrophysiological recordings from both language-impaired and control subjects during visual and auditory sensory-processing tasks and observed abnormal patterns of hemispheric activation in the language-impaired group.
Further, auditory ERP components—considered to represent activity in the perisylvian area (particularly the superior temporal sulcus)—were abnormal in a subset of children who had difficulties in rapid auditory processing (Neville et al 1993).

Although many studies have been performed to assess the functional activity of the brain during speech perception, only a few have investigated activity as a function of categorical perception. The ability of MEG and scalp-recorded ERP measures to provide temporal resolution on the order of milliseconds makes them ideal tools for investigating the neurophysiological basis of the categorical perception of speech stimuli. Initial investigations utilizing signal-averaging techniques have provided little evidence for a unique electrical or magnetic signature for the perception of phonetic categories that cannot be attributed to differences in temporal or spectral stimulus parameters (Lawson & Gaillard 1981; Aaltonen et al 1987; Sams et al 1990; Kraus et al 1992, 1993; Sharma et al 1993).[1]

[1] See Molfese & Betz (1988) for a review of ERP experiments (analyzed using factor analytic approaches) examining hemispheric specialization for speech processing.

In summary, several researchers using imaging techniques have examined the brain's functional activity in response to speech stimuli. Cumulative results support the critical relevance of temporal acoustic cues to functional speech perception and, moreover, suggest that defects in the underlying ability to process rapid acoustic change are associated with anomalous patterns of cortical activity during speech processing. Finally, although we must presume that higher cortical activity of the brain reflects, at some level, the transformation of complex acoustic signals into meaningful speech representations, evidence obtained from imaging studies still fails to strongly support the presence of a unique pattern of cortical activity that reflects processing of linguistic, but not other acoustically complex, forms of information. Such a result is consistent with psychophysical data from categorical perception studies, which failed to support the idea of unique processing of speech in humans. In sum, little empirical data support the idea of specialized brain activity in humans for processing speech as compared with other similarly spectrally and temporally complex acoustic signals.

Neurobiology of Central Auditory Processing in Animals

Our discussion thus far has addressed results from clinical studies of neuropathological populations, behavioral studies of normal and language-impaired subjects, and neuroimaging studies performed during speech perception tasks in normal and language-impaired populations. Each of these research approaches has contributed to our understanding of the mechanisms by which the human brain processes speech—however, these methods are also limited in their ability to probe the fundamental functions of the nervous system. For this reason, our basic neurobiological understanding of sensory and cognitive systems has traditionally been derived from the use of animal models. However, in contrast to the prolific animal research on visual processing, spatial navigation, and memory, which has contributed to our understanding of these functions in the human brain, the field of speech and language research has suffered from a long-standing unwillingness to use animal models in the study of mechanisms that may subserve speech perception.
This reluctance reflects a widely held belief that human speech and language represent processes unique to humans and, as such, not amenable to study in nonhuman species. Yet, as reviewed above, both human and animal research largely fails to support this view. Empirical evidence suggests that humans do not, in fact, possess a neural module that is activated only by speech and that makes humans distinct from all other animals. Rather, like every other sensory and cognitive process studied to date, critical precursors to the functions that underlie speech perception (specifically, complex acoustic processing) appear to be found in nonhuman species.

For example, deficits in complex auditory discrimination, including discrimination of species-specific calls, have been observed following temporal cortical ablation in monkeys (Dewson et al 1970; Dewson 1977; Gaffan & Harrison 1991; Heffner & Heffner 1986, 1989) and cats (Diamond & Neff 1957). Such findings parallel evidence of human aphasia and complex acoustic-processing deficits following temporal cortical damage. Interestingly, animal studies also support the assertion of left hemisphere specialization for the processing of complex acoustic stimuli in nonhuman species (Dewson et al 1970, Dewson 1977, Fitch et al 1993, Gaffan & Harrison 1991) and further suggest that this specialization may underlie left hemisphere specialization for the discrimination of species-specific communicative signals (Ehret 1987, Heffner & Heffner 1986, Petersen et al 1978). As such, these results support the assertion that left hemisphere specialization for the perception of complex acoustic stimuli may underlie the well-known left hemisphere specialization for human speech perception.

Moreover, at a basic level, animal studies can shed light on the mechanisms whereby temporal acoustic information is encoded in the auditory system. Research by Schreiner and colleagues (e.g. see Schreiner & Langner 1988, described above) specifically shows that (a) the fidelity of point-to-point temporal encoding is gradually replaced by segmented responses to units of complex temporal cues as one moves up the ascending auditory system and (b) auditory relay structures may contain subregions that are specialized for the encoding of high-resolution temporal information. The latter assertion is consistent with findings of magnocellular anomalies in the LGN and MGN of human dyslexics (Livingstone et al 1991, Galaburda et al 1994), particularly since animal studies have linked the presence of magnocellular anomalies in the MGN to auditory discrimination deficits (Herman et al 1995). The notion of temporally specialized subregions is also consistent with findings by Kraus et al (1994b), who demonstrated a neurophysiologic response to acoustic stimulus change (as measured by mismatch negativity, or MMN, to a deviant tone burst) in the caudo-medial region of the guinea pig auditory thalamic nucleus (MGN). Kraus et al (1994a) obtained similar results from the guinea pig thalamus by using CV syllables as stimuli. These results are of particular interest because a neurophysiologic response to temporal change was not elicited in the ventral MGN, the tonotopically organized central relay station of the auditory thalamus. These findings suggest that regions of auditory structures traditionally considered secondary because they are not strongly involved in spectral analysis may actually be primary for the temporal analysis critical to speech perception.
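As an aside on method, the MMN measure cited above is, at heart, a difference wave. Below is a minimal sketch of the conventional deviant-minus-standard computation; the array layout and label strings are assumptions for illustration, not a specific analysis pipeline from the studies cited.

    import numpy as np

    def mismatch_negativity(epochs, labels):
        """Difference wave from an oddball paradigm: the averaged response to
        rare 'deviant' stimuli minus the averaged response to frequent
        'standard' stimuli, computed per time sample."""
        epochs = np.asarray(epochs, dtype=float)   # shape: (n_trials, n_samples)
        labels = np.asarray(labels)                # 'standard' or 'deviant' per trial
        deviant_avg = epochs[labels == "deviant"].mean(axis=0)
        standard_avg = epochs[labels == "standard"].mean(axis=0)
        return deviant_avg - standard_avg

Because the standard and deviant differ only in the dimension manipulated (e.g. a temporal change), a reliable difference wave in a given structure is evidence that that structure encodes the manipulated acoustic feature.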
Animal research utilizing speech stimuli has also shed light on potential neurobiological speech-processing mechanisms in humans. In studies in which electrophysiological recordings were performed in monkeys during the auditory presentation of phonemes, Steinschneider et al (1994) found that a characteristic "double on" neural response pattern in auditory cortex may reflect the categorical perception of voiced versus unvoiced consonants. Specifically, the researchers found that the point at which the second burst (to voicing onset) dissipated marked the categorical boundary between consonants that differed in voice onset time (Steinschneider et al 1994). Moreover, the results supported assertions of increasingly segmented neural responses to complex acoustic stimuli as one ascends the auditory system: Electrophysiological responses measured in thalamocortical fibers reflected an initial transient response to the initiation, "on," of a complex acoustic signal (e.g. speech) followed by a phase-locked response to the syllable periodicity, whereas cortical (A1) responses were seen at the start of the periodic and aperiodic segments defining the voice onset time (VOT), thus accentuating the acoustic transients (see also Steinschneider et al 1995). These findings, as well as data obtained from neural modeling of categorical perception (Buonomano & Merzenich 1995), support an acoustic rather than linguistic basis for the categorical perception of phonetic stimuli and, moreover, provide neurobiological information about the mechanisms underlying this response.

Summary

Neural representation for speech processing apparently occurs over a more widely distributed neural system than was once believed. Data have shown that acoustic information is initially encoded on a point-to-point basis, including the frequency information underlying the temporal envelope, at the level of the auditory nerve. As speech signals ascend the auditory pathway, neuronal response patterns to the temporal envelope, which behavioral studies demonstrate is sufficient for speech perception (Shannon et al 1995), become increasingly segmented as neurons begin to respond with increasing preference to units of temporal acoustic information (Schreiner & Langner 1988; Steinschneider et al 1994, 1995). Primary and secondary cortical activity measured by human in vivo neuroimaging may reflect activation by complex acoustic stimuli incorporating very rapid acoustic changes, regardless of whether these rapid acoustic changes occur within speech (e.g. Fiez et al 1995; Binder et al 1994a,b). These results are consistent with behavioral and neuropathological evidence supporting the critical relevance of rapid auditory processing systems to normal speech perception (e.g. Tallal et al 1993, Galaburda et al 1994, Herman et al 1995). These results also support the conclusion that what is selectively damaged by left hemisphere lesions involves mechanisms critical to the processing of information within a time frame of tens of milliseconds. We suggest that a disruption of this mechanism leads to the phonological disorders so commonly seen in acquired and developmental aphasias. We hypothesize that these mechanisms are common to both the perception and production of speech information within this time range, and we point to the work of Kimura & Archibald (1974) and Ojemann (1984) in support of this hypothesis.
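The claim that the temporal envelope suffices for speech recognition (Shannon et al 1995) can be made concrete with a noise-band vocoder sketch. This is an approximation rather than a reproduction of their processor: they derived envelopes by rectification and low-pass filtering, whereas this sketch uses the Hilbert envelope for brevity, and the band count and band edges are illustrative.

    import numpy as np
    from scipy.signal import butter, sosfilt, hilbert

    def noise_vocode(speech, fs, n_bands=4):
        """Noise-band vocoding in the spirit of Shannon et al (1995): split
        the signal into a few broad bands, take each band's amplitude
        envelope, and use it to modulate band-limited noise, discarding fine
        spectral detail while preserving the temporal envelope."""
        rng = np.random.default_rng(0)
        edges = np.geomspace(100.0, 6000.0, n_bands + 1)   # illustrative band edges
        out = np.zeros(len(speech))
        for lo, hi in zip(edges[:-1], edges[1:]):
            sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
            band = sosfilt(sos, speech)
            envelope = np.abs(hilbert(band))               # amplitude envelope
            carrier = sosfilt(sos, rng.standard_normal(len(speech)))
            out += envelope * carrier                      # envelope-modulated noise
        return out / np.max(np.abs(out))                   # normalize amplitude

The striking behavioral finding is that listeners understand the output of such a processor even with as few as three or four bands, which is what licenses the conclusion that the temporal envelope in a few broad spectral regions carries most of the information needed for speech recognition.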
This review focuses on data derived from a wide variety of research approaches that provide insight into the neurobiological basis of speech perception. Speech perception has not, traditionally, been an area firmly embraced by modern-day neuroscientists. Historically, this may derive from a belief that the neurobiological mechanisms that subserve speech are uniquely human and, as such, not amenable to basic neuroscientific methods of study at the molecular or systems level of inquiry. However, the data overwhelmingly fail to provide support for separate or uniquely human neural-processing systems for speech. Indeed, the data converge to suggest that speech processing is subserved by neurobiological mechanisms specialized for the representation of temporally complex acoustic signals, regardless of communicative or linguistic relevance, in humans and nonhumans alike. Further, evidence obtained from animal studies suggests a neurobiological basis for the final matching of acoustic patterns to speech templates that is consistent with categorical perception (Steinschneider et al 1994, 1995). The fact that nonhuman species show categorical perception for speech phonemes at both the behavioral and neurobiological level undermines the traditional view that categorical perception reflects speech-specific processing that is unique to the human brain.

This is not to say that speech itself—and indeed, species-specific communication—is not a special form of acoustic information. It only says that the brain does not, as far as we are aware, process this information in a unique manner. Rather, it appears that neurobiological systems for processing temporally complex acoustic stimuli are recruited, or even exploited, for processing acoustically rich species-specific communications such as human speech.

As stated at the outset, we have not addressed the neurobiological mechanisms that underlie higher-order processing of speech signals that ultimately represent language. We discuss data relating to the neurobiological processing of subunits of human speech (phonemes) but in no way suggest that this review reflects the final steps in the processing of meaningful language. We do suggest, however, that in order to fully understand the neurobiological mechanisms underlying language processing as it relates to semantics, syntax, grammar, and ultimately conceptual thought, we first need to better understand how the building blocks of sentences and words—that is, phonemes—are processed. Phonology offers an important link between neurobiological systems (i.e. sensory, perceptual, and motor) and higher aspects of language (semantics and syntax). By studying speech processing at the level of the acoustic mechanisms that subserve it, neuroscientists are now in the position to make rapid and substantial advances in understanding the fundamental neurobiological underpinnings of language.

ACKNOWLEDGMENTS

The authors wish to thank Illya Shell for technical assistance with this manuscript and the NIDCD, Charles A. Dana Foundation, McDonnell-Pew Foundation, and March of Dimes for research funding.

Literature Cited

Aaltonen O, Niemi P, Nyrke T, Tuhkanen M. 1987. Event-related brain potentials and the perception of a phonetic continuum. Biol. Psychol. 24(3):197–207
Aitkin LM, Irvine DRF, Webster WR. 1984. Central neural mechanisms of hearing. In Handbook of Physiology, Sect. 1: The Nervous System, Sensory Processes, ed. I Darien-Smith, 3:675–737. Bethesda, MD: Am. Physiol. Soc.
Aram DM, Gillespie LL, Yamashita TS. 1990. Reading among children with left and right brain lesions. Dev. Neuropsychol. 6(4):301–17
Aram DM, Ekelman BL, Rose DF, Whitaker HA. 1985. Verbal and cognitive sequelae following unilateral lesions acquired in early childhood. J. Clin. Exp. Neuropsychol. 7(1):55–78
Benasich AA, Tallal P. 1996. Auditory temporal processing thresholds, habituation, and recognition memory over the first year. Infant Behav. Dev. 19:339–57
Binder JR, Rao SM, Hammeke TA, Yetkin FZ, Frost JA, et al. 1994a. Effects of stimulus rate on signal response during functional magnetic resonance imaging of auditory cortex. Cogn. Brain Res. 2(1):31–38
Binder JR, Rao SM, Hammeke TA, Yetkin FZ, Jesmanowicz A, et al. 1994b. Functional magnetic resonance imaging of human auditory cortex. Ann. Neurol. 35(6):662–72
Blumstein SE. 1995. The neurobiology of the sound structure of language. In The Cognitive Neurosciences, ed. M Gazzaniga, pp. 915–29. Cambridge, MA: MIT Press
Blumstein SE, Baker E, Goodglass H. 1977. Phonological factors in auditory comprehension in aphasia. Neuropsychologia 15:19–30
Brown CP, Fitch RH, Tallal P. 1995. Gender and hemispheric differences for auditory temporal processing. Soc. Neurosci. Abstr. 1:440
Bryden MP. 1982. Laterality: Functional Asymmetry in the Intact Brain. New York: Academic
Buonomano D, Merzenich MM. 1995. Temporal information transformed into a spatial code by a neural network with realistic properties. Science 267:1028–30
Crosson BA. 1992. Subcortical Functions in Language and Memory. New York: Guilford
Cutting JE, Rosner BS. 1974. Categorical boundaries in speech and music. Percept. Psychophys. 16:564–70
Damasio AR, Damasio H, Rizzo M, Varney N, Gersh F. 1982. Aphasia with nonhemorrhagic lesions in the basal ganglia and internal capsule. Arch. Neurol. 39:15–20
Damasio H, Frank R. 1992. Three-dimensional in vivo mapping of brain lesions in humans. Arch. Neurol. 49:137–43
Delattre P, Liberman A, Cooper FS. 1955. Acoustic loci and transitional cues for consonants. J. Acoust. Soc. Am. 27(4):769–73
Dewson JH III. 1977. Preliminary evidence of hemispheric asymmetry of auditory function in monkeys. In Lateralization in the Nervous System, ed. S Harnad, RW Doty, L Goldstein, J Jaynes, G Krauthamer, pp. 63–71. New York: Academic
Dewson JH III, Cowey A, Weiskrantz L. 1970. Disruptions of auditory sequence discrimination by unilateral and bilateral cortical ablations of superior temporal gyrus in the monkey. Exp. Neurol. 28:529–49
Diamond IT, Neff WD. 1957. Ablation of temporal cortex and discrimination of auditory patterns. J. Neurophysiol. 20:300–15
Dooling RJ, Brown SD, Park TJ, Okanoya K. 1990. Natural perceptual categories for vocal signals in budgerigars (Melopsittacus undulatus). In Comparative Perception: Complex Signals, ed. WC Stebbins, MA Berkley, 2:345–74. New York: Wiley
Dvorák K, Feit J, Juránková Z. 1978. Experimentally induced focal microgyria and status verrucosus deformis in rats—pathogenesis and interrelation. Histological and autoradiographical study. Acta Neuropathol. 44:121–29
Efron R. 1963. Temporal perception, aphasia and déjà vu. Brain 86:403–24
Ehret G. 1987. Left hemisphere advantage in the mouse brain for recognizing ultrasonic communication calls. Nature 325(6101):249–51
Fiez JA, Raichle ME, Miezin FM, Petersen SE, Tallal P, Katz WF. 1995. Activation of a left frontal area near Broca's area during auditory detection and phonological access tasks. J. Cogn. Neurosci. 7(3):357–75
Fitch RH, Brown CP, O'Connor K, Tallal P. 1993. Functional lateralization for auditory temporal processing in male and female rats. Behav. Neurosci. 107(5):844–50
Fitch RH, Brown CP, Tallal P. 1993. Left hemisphere specialization for auditory temporal processing in rats. In Temporal Information Processing in the Nervous System: Special Reference to Dyslexia and Dysphasia, ed. P Tallal, AM Galaburda, RR Llinás, C von Euler, pp. 346–47. New York: NY Acad. Sci.
Fitch RH, Tallal P, Brown CP, Galaburda AM, Rosen GD. 1994. Induced microgyria and auditory temporal processing in rats: a model for language impairment. Cereb. Cortex 4:260–70
Gaffan D, Harrison S. 1991. Auditory-visual associations, hemispheric specialization and temporal-frontal interaction in the rhesus monkey. Brain 114:2133–44
Galaburda AM, Kemper TL. 1979. Cytoarchitectonic abnormalities in developmental dyslexia: a case study. Ann. Neurol. 6(2):94–100
Galaburda AM, Menard MT, Rosen GD, Livingstone MS. 1994. Evidence for aberrant auditory anatomy in developmental dyslexia. Proc. Natl. Acad. Sci. USA 91:8010–13
Galaburda AM, Sherman GF, Rosen GD, Aboitiz F, Geschwind N. 1985. Developmental dyslexia: four consecutive patients with cortical anomalies. Ann. Neurol. 18(2):222–33
Garrett M. 1995. The structure of language processing: neurophysiological evidence. In The Cognitive Neurosciences, ed. MS Gazzaniga, pp. 881–99. Cambridge, MA: MIT Press
Geschwind N. 1979. Specialization of the human brain. Sci. Am. 241(3):180–99
Hagman JO, Wood F, Buchsbaum MS, Tallal P, Flowers L, Katz W. 1992. Cerebral brain metabolism in adult dyslexic subjects assessed with positron emission tomography during performance of an auditory task. Arch. Neurol. 49:734–39
Heffner HE, Heffner RS. 1986. Effects of unilateral and bilateral auditory cortex lesions on the discrimination of vocalizations by Japanese macaques. J. Neurophysiol. 56:683–701
Heffner HE, Heffner RS. 1989. Effects of restricted lesions on absolute thresholds and aphasia-like deficits in Japanese macaques. Behav. Neurosci. 103:158–69
Herman A, Fitch RH, Galaburda AM, Rosen GD. 1995. Induced microgyria and its effects on cell size, cell number, and cell packing density in the medial geniculate nucleus. Soc. Neurosci. Abstr. 21:1711
Hillyard SA. 1993. Electrical and magnetic brain recordings: contributions to cognitive neuroscience. Curr. Opin. Neurobiol. 3:217–24
Hugdahl K, Wester K, Asbjørnsen A. 1990. The role of the left and right thalamus in language asymmetry: dichotic listening in Parkinsonian patients undergoing stereotactic thalamotomy. Brain Lang. 39:1–13
Humphreys P, Kaufmann WE, Galaburda AM. 1990. Developmental dyslexia in women: neuropathological findings in three patients. Ann. Neurol. 28(6):727–38
Humphreys P, Rosen GD, Press DM, Sherman GF, Galaburda AM. 1991. Freezing lesions of the newborn rat: a model for cerebrocortical microgyria. J. Neuropathol. Exp. Neurol. 50:145–60
Hynd GW, Semrud-Clikeman M. 1989. Dyslexia and neurodevelopmental pathology: relationships to cognition, intelligence, and reading skill acquisition. J. Learn. Disabil. 22:204–15
Imig TJ, Morel A. 1983. Organization of the thalamocortical auditory system in the cat. Annu. Rev. Neurosci. 6:95–120
Imig TJ, Ruggero MA, Kitzes LM, Javel E, Brugge JF. 1977. Organization of auditory cortex in the owl monkey (Aotus trivirgatus). J. Comp. Neurol. 171:111–28
Jernigan TL, Hesselink JR, Sowell E, Tallal PA. 1991. Cerebral structure on magnetic resonance imaging in language- and learning-impaired children. Arch. Neurol. 48(5):539–45
Kelly J. 1980. The auditory cortex of the rat. In Cerebral Cortex of the Rat, ed. B Kolb, RC Tees, pp. 381–405. Cambridge, MA: MIT Press
Kimura D. 1967. Functional asymmetry of the brain in dichotic listening. Cortex 3:163–78
Kimura D, Archibald Y. 1974. Motor function of the left hemisphere. Brain 97:337–50
Kimura D, Harshman R. 1984. Sex differences in brain organization for verbal and nonverbal functions. In Progress in Brain Research, ed. GJ DeVries, 61:423–41. Amsterdam: Elsevier
Kraus N, McGee T, Carrell T, King C, Littman T, Nicol T. 1994a. Discrimination of speech-like contrasts in the auditory thalamus and cortex. J. Acoust. Soc. Am. 96:2758–68
Kraus N, McGee T, Carrell T, Sharma A, Micco A, Nicol T. 1993. Speech-evoked cortical potentials in children. J. Am. Acad. Audiol. 4(4):238–48
Kraus N, McGee T, Littman T, Nicol T. 1992. Reticular formation influences on primary and non-primary auditory pathways as reflected by the middle latency response. Brain Res. 587:186–94
Kraus N, McGee T, Littman T, Nicol T, King C. 1994b. Nonprimary auditory thalamic representation of acoustic change. J. Neurophysiol. 72:1270–77
Kuhl PK. 1981. Discrimination of speech by non-human animals: basic auditory sensitivities conducive to the perception of speech-sound categories. J. Acoust. Soc. Am. 70:340–49
Kuhl PK. 1987. The special-mechanisms debate in speech research: categorization tests on animals and infants. In Categorical Perception: The Groundwork of Cognition, ed. S Harnad, pp. 355–86. Cambridge: Cambridge Univ. Press
Kuhl PK. 1992. Psychoacoustics and speech perception: internal standards, perceptual anchors, and prototypes. In Developmental Psychoacoustics, ed. LA Werner, EW Rubel, pp. 293–332. Washington, DC: Am. Psychol. Assoc.
Kuhl PK, Miller JD. 1975. Speech perception by the chinchilla: voiced-voiceless distinction in alveolar plosive consonants. Science 190:69–72
Kuhl PK, Miller JD. 1978. Speech perception by the chinchilla: identification functions for synthetic VOT stimuli. J. Acoust. Soc. Am. 63:905–17
Larsen JP, Hoien T, Lundberg I, Odegaard H. 1990. MRI evaluation of the size and symmetry of the planum temporale in adolescents with developmental dyslexia. Brain Lang. 39:289–301
Lawson EA, Gaillard AW. 1981. Evoked potentials to consonant-vowel syllables. Acta Psychol. 49(1):17–25
Leonard CM, Voeller KKS, Lombardino LJ, Morris MK, Hynd GW, et al. 1993. Anomalous cerebral structure in dyslexia revealed with magnetic resonance imaging. Arch. Neurol. 50:461–69
Liberman AM, Mattingly IG. 1985. The motor theory of speech perception revised. Cognition 21(1):1–36
Livingstone MS, Rosen GD, Drislane FW, Galaburda AM. 1991. Physiological and anatomical evidence for a magnocellular defect in developmental dyslexia. Proc. Natl. Acad. Sci. USA 88:7943–47
Ludlow CL, Rosenberg J, Fair C, Buck D, Schesselman S, Salazar A. 1986. Brain lesions associated with nonfluent aphasia fifteen years following penetrating head injury. Brain 109:55–80
Mateer C, Ojemann GA. 1983. Thalamic mechanisms in language and memory. In Language Function and Brain Organization, ed. S Segalowitz, pp. 171–91. New York: Academic
May B, Moody DB, Stebbins WC. 1989. Categorical perception of conspecific communication sounds by Japanese macaques, Macaca fuscata. J. Acoust. Soc. Am. 85:837–47
McGlone J. 1980. Sex differences in human brain asymmetry: a critical review. Behav. Brain Sci. 3:215–63
Merzenich MM, Jenkins WM, Miller SL, Schreiner C, Tallal P, et al. 1996. Temporal processing deficits of language-learning impaired children ameliorated by training. Science 271:77–81
Merzenich MM, Reid MD. 1974. Representation of the cochlea within the inferior colliculus of the cat. Brain Res. 77:397–415
Miceli G, Caltagirone C, Gainotti G, Payer-Rigo P. 1978. Discrimination of voice versus place contrasts in aphasia. Brain Lang. 6:47–51
Miller JD, Wier CC, Pastore RE, Kelly WJ, Dooling RJ. 1976. Discrimination and labelling of noise-buzz sequences with varying noise-lead times: an example of categorical perception. J. Acoust. Soc. Am. 60:410–17
Miller JM, Towe AL. 1979. Audition: structural and acoustical properties. In Physiology and Biophysics, The Brain and Neural Function, ed. T Ruch, HD Patton, 1:339–75. Philadelphia: Saunders. 20th ed.
Molfese DL, Betz JC. 1988. Electrophysiological indices of the early development of lateralization for language and cognition, and their implications for predicting later language development. In Brain Lateralization in Children: Developmental Implications, ed. DL Molfese, SJ Segalowitz, pp. 171–90. New York: Guilford
Neville HJ, Coffey SA, Holcomb PJ, Tallal P. 1993. The neurobiology of sensory and language processing in language-impaired children. J. Cogn. Neurosci. 5:235–53
Ojemann GA. 1984. Common cortical and thalamic mechanisms for language and motor function. Am. J. Physiol. 246:R901–3
Ojemann GA. 1991. Cortical organization of language. J. Neurosci. 11(8):2281–87
Oscar-Berman M, Zurif E, Blumstein S. 1975. Effects of unilateral brain damage on the processing of speech sounds. Brain Lang. 2:345–55
Palmer AR. 1982. Encoding of rapid amplitude fluctuations by cochlear nerve fibers in the guinea pig. Arch. Otorhinolaryngol. 236:197–202
Perecman E, Kellar L. 1981. The effect of voice and place among aphasic, nonaphasic right-damaged, and normal subjects on a metalinguistic task. Brain Lang. 12:213–22
Petersen MR, Beecher MD, Zoloth SR, Moody DB, Stebbins WC. 1978. Neural lateralization of species-specific vocalizations by Japanese macaques (Macaca fuscata). Science 202:325–27
Petersen SE, Fiez JA. 1993. The processing of single words studied with positron emission tomography. Annu. Rev. Neurosci. 16:509–30
Pisoni DB. 1977. Identification and discrimination of the relative onset time of two component tones: implications for voicing perception in stops. J. Acoust. Soc. Am. 61:1352–61
Rees A, Moller AR. 1983. Responses of neurons in the inferior colliculus of the rat to AM and FM tones. Hear. Res. 10:301–30
Robin DA, Schienberg S. 1990. Subcortical lesions and aphasia. J. Speech Hear. Disord. 55:90–100
Rosen GD, Press DM, Sherman GF, Galaburda AM. 1992. The development of induced cerebrocortical microgyria in the rat. J. Neuropathol. Exp. Neurol. 51(6):601–11
Rumsey JM, Andreason P, Zametkin AJ, Aquino T, King AC, et al. 1992. Failure to activate the left temporoparietal cortex in dyslexia. Arch. Neurol. 49:527–34
Sams M, Aulanko R, Aaltonen O, Naatanen R. 1990. Event-related potentials to infrequent changes in synthesized phonetic stimuli. J. Cogn. Neurosci. 2(4):344–57
Schreiner CE, Langner G. 1988. Coding of temporal patterns in the central auditory nervous system. In Auditory Function: Neurobiological Bases of Hearing, ed. GM Edelman, WE Gall, WM Cowan, pp. 337–61. New York: Wiley
Schreiner CE, Urbas JV. 1984. Functional differentiation of cat auditory cortex areas demonstrated using amplitude modulation. Neurosci. Lett. 14:S334 (Suppl.)
Schreiner CE, Urbas JV. 1986. Representation of amplitude modulation in the auditory cortex of the cat. I. The anterior auditory field (AAF). Hear. Res. 21(3):227–41
Schreiner CE, Urbas JV. 1988. Representation of amplitude modulation in the auditory cortex of the cat. II. Comparison between cortical fields. Hear. Res. 32(1):49–64
Schreiner CE, Urbas JV, Mehrgardt S. 1983. Temporal resolution of amplitude modulation and complex signals in the auditory cortex of the cat. In Hearing: Physiological Bases and Psychophysics, ed. R Klinke, R Hartmann, pp. 169–75. Berlin: Springer-Verlag
Schwartz J, Tallal P. 1980. Rate of acoustic change may underlie hemispheric specialization for speech perception. Science 207:1380–81
Shannon RV, Zeng FG, Kamath V, Wygonski J, Ekelid M. 1995. Speech recognition with primarily temporal cues. Science 270(5234):303–4
Sharma A, Kraus N, McGee T, Carrell T, Nicol T. 1993. Acoustic versus phonetic representation of speech as reflected by the mismatch negativity event-related potential. Electroencephalogr. Clin. Neurophysiol. 88:64–71
Shaywitz BA, Shaywitz SE, Pugh KR, Constable RT, et al. 1995. Sex differences in the functional organization of the brain for language. Nature 373(6515):607–9
Stefanatos GA, Green GGR, Ratcliff GG. 1989. Neurophysiological evidence of auditory channel anomalies in developmental dysphasia. Arch. Neurol. 46:871–75
Steinschneider M, Schroeder CE, Arezzo JC, Vaughan HG. 1994. Speech-evoked activity in primary auditory cortex: effects of voice onset time. Electroencephalogr. Clin. Neurophysiol. 92(1):30–43
Steinschneider M, Schroeder CE, Arezzo JC, Vaughan HG. 1995. Physiologic correlates of the voice onset time boundary in primary auditory cortex (A1) of the awake monkey. Brain Lang. 48(3):326–40
Tallal P, Jernigan T, Trauner D. 1994. Developmental bilateral damage to the head of the caudate nuclei: implications for speech-language pathology. J. Med. Speech Lang. Pathol. 2:23–28
Tallal P, Miller S, Bedi G, Byma G, Wang X, et al. 1996. Language comprehension in language-learning impaired children improved with acoustically modified speech. Science 271:81–84
Tallal P, Miller S, Fitch RH. 1993. Neurobiological basis of speech: a case for the preeminence of temporal processing. Ann. NY Acad. Sci. 682:27–47
Tallal P, Newcombe F. 1978. Impairment of auditory perception and language comprehension in dysphasia. Brain Lang. 5:13–24
Tallal P, Piercy M. 1973. Defects of non-verbal auditory perception in children with developmental aphasia. Nature 241:468–69
Wilkins W, Wakefield J. 1995. Brain evolution and neurolinguistic preconditions. Behav. Brain Sci. 18:161–226
Zatorre RJ, Evans AC, Meyer E, Gjedde A. 1992. Lateralization of phonetic and pitch discrimination in speech processing. Science 256:846–49