
NEUROLINGUISTICS

third class:
neuroscience of hearing and speech

the ear


THE HEARING BRAIN

[Figure: the structure of the outer, middle and inner ear. Outer ear: pinna, external auditory canal. Middle ear: ear drum, malleus, incus, stapes. Inner ear: oval window, cochlea, semicircular canals, auditory nerve, temporal bone.]

sound = oscillation (waves, vibration) of pressure
in air (outer and middle ear), in water (inner ear)

KEY TERMS

Belt region: part of secondary auditory cortex, with projections from primary auditory cortex.

Parabelt region: part of secondary auditory cortex, receiving projections from the adjacent belt.

Tonotopic organization: the principle that sounds close to each other in frequency are represented by neurons that are spatially close to each other in the brain.

the cochlea (transducer)

A membrane within the cochlea, termed the basilar membrane, contains tiny
hair cells linked to receptors. Sound induces mechanical movement of the
basilar membrane and the hair cells on it. These movements induce a flow of
ions through stretch-sensitive ion channels, which initiates neural activity
(release of neurotransmitters).

from the cochlea to the cortex

[Textbook excerpt, partially cut off:] Secondary auditory areas (the belt) receive some input from subcortical relays, and hence damage to primary auditory cortex does not produce deafness but can lead to problems in discriminating sounds (Musiek et al.). The ascending auditory pathway is not a passive relay of information from the ear but performs active extraction and analysis of the auditory signal: the cochlear nucleus has roughly 90,000 neurons, the medial geniculate nucleus roughly 500,000, and the auditory cortex 100,000,000. In addition, there are descending projections that go as far back as the cochlear hair cells. Just as different parts of the basilar membrane respond maximally to different frequencies, so do neurons within the auditory system: the system can be said to have a tonotopic organization.

neuronal processing of sounds (in 5 synapses)

[Figure: the ascending auditory pathway. Auditory nerve → ventral and dorsal cochlear nuclei → superior olivary nucleus → inferior colliculus (brainstem) → medial geniculate nucleus (500,000 thalamic neurons) → auditory cortex (100,000,000 cortical neurons).]

from the cochlea to the cortex

tonotopic organization

central regions respond to lower frequencies, outer regions to higher frequencies

...and those responding to lower frequencies more centrally (Kiang et al., 1965). To some extent, this organization is carried upwards to the early cortical stages. In both humans (Formisano et al., 2003) and other animals (Merzenich et al., 1973) there is evidence that the central region of the primary auditory cortex responds to lower frequencies and the outer regions, on both sides, to higher frequencies.

KEY TERM
Sparse scanning: in fMRI, a short break in scanning to enable sounds to be presented in relative silence.

different pathways: where vs. what

[Figure: the core area (primary auditory cortex, A1) surrounded by the belt and parabelt areas (secondary, aka non-primary, auditory cortex); tonotopic frequency gradients (300 Hz, 900 Hz, 1,800 Hz, 3,000 Hz) repeat across core and belt.]

cognitive science questions:

how specific is speech processing, as compared to other sounds? (remember Wernicke)

VERY SPECIFIC!
4.3 Processes: the neuroanatomy of language
Classical Model

Broca's Aphasia
fluency, prosody, articulation, word finding and complex grammar disturbed
comprehension more or less preserved

Conduction Aphasia
both comprehension and production preserved BUT repetition impaired

Wernicke's Aphasia
comprehension disturbed
prosody preserved, production undisturbed or even excessive

how independent is speech processing from acoustic processing?

Structures Formelles du Langage

03.02.2016

at which level can comprehension be impaired?

acoustic sound?
phonological level?
lexical/word level?

a cascaded processing flow?

two possible models of speech processing in the brain (cf. Poeppel's lectures)

The predictions of these models differ, for example, in the case of neuropsychological patients.


If (a) is correct, then a lesion leading to problems with auditory pattern recognition (auditory agnosia) must also be associated with speech perception deficits. If (b) is correct, there is a double dissociation.

[Diagram: (a) serial model: EARLY HEARING → AUDITORY PATTERN RECOGNITION → SPEECH, with LOCALIZATION as a separate branch; (b) parallel model: EARLY HEARING feeding SPEECH, AUDITORY PATTERN RECOGNITION and LOCALIZATION as independent streams.]

double dissociation
between agnosia and
speech comprehension

MEET PURE WORD DEAFNESS

Test case: pure word deafness

Table 1. Auditory disorders following cortical and/or subcortical lesions

                                     Pure word deafness      Auditory agnosia        Cortical deafness
Speech comprehension                 impaired                +                       impaired
Speech repetition                    impaired                +                       impaired
Recognition of familiar
non-speech sounds                    +                       impaired                impaired
Recognition of music                 +/-                     impaired                impaired
Hearing sensitivity (audiometry)     + (or mildly impaired)  + (or mildly impaired)  impaired
Language I: spontaneous speech       +                       +                       +
Language II: reading comprehension   +                       +                       +
Language III: writing                +                       +                       +

A + sign indicates adequate performance in a given domain.

high functional specialization
(encapsulation? modules? independence?)
of different components

Scott et al. 2000
identification of a pathway for intelligible speech in the left temporal lobe

previous studies:
Dominance of the left hemisphere in language is clearly observable from lesion studies,
BUT prior functional imaging studies have shown bilateral activation by speech in healthy individuals.
These studies contrasted speech with rest, simple tones or noise.
Speech is a complex acoustic signal:
better baseline(s) might reveal left lateralization of speech and subtle distinctions within it.

Scott et al. 2000
identification of a pathway for intelligible speech in the left temporal lobe

experimental design (PET + subtractive method)
passive listening task

Intelligible Conditions:
Sp: Normal Speech
Vco: Noise-Vocoded Speech (harsh whisper, but comprehensible after training)

Unintelligible Conditions:
RSp: Rotated Speech (preserves some phonetic features, e.g. fricatives and the voiced/unvoiced distinction, and intonation)
RVco: Rotated Noise-Vocoded Speech (completely distinct from speech)

Speech Phonetics Contrast:
(Sp, RSp & Vco) − RVco
(+Phon +CA) − (+CA)

Intelligible Speech Contrast:
(Sp, Vco) − (RSp, RVco)
(+Int +CA) − (+CA)
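To make the subtraction logic concrete, here is a minimal sketch (mine, not from the paper): each condition is coded by the three features in the slide's labels (CA = complex acoustics, Phon = phonetics, Int = intelligibility), with invented 0/1 values, and each contrast is a difference of condition means.

```python
import numpy as np

# Feature composition of each condition (following the slide's labels):
# columns = [CA (complex acoustics), Phon (phonetic features), Int (intelligibility)]
conditions = {
    "Sp":   np.array([1, 1, 1]),  # normal speech: acoustics + phonetics + intelligible
    "Vco":  np.array([1, 1, 1]),  # noise-vocoded: intelligible after training
    "RSp":  np.array([1, 1, 0]),  # rotated speech: some phonetics, unintelligible
    "RVco": np.array([1, 0, 0]),  # rotated noise-vocoded: complex acoustics only
}

def mean_activation(names):
    """Average feature loading across a set of conditions."""
    return np.mean([conditions[n] for n in names], axis=0)

# Speech Phonetics Contrast: (Sp, RSp, Vco) - RVco
phonetics = mean_activation(["Sp", "RSp", "Vco"]) - mean_activation(["RVco"])

# Intelligible Speech Contrast: (Sp, Vco) - (RSp, RVco)
intelligibility = mean_activation(["Sp", "Vco"]) - mean_activation(["RSp", "RVco"])

print(phonetics)        # [0.   1.   0.67]: CA cancels; loads mainly on Phon
print(intelligibility)  # [0.   0.5  1.  ]: CA cancels; loads mainly on Int
```

The point of the design is visible in the output: the shared complex-acoustics component cancels exactly in both contrasts, so whatever survives must reflect phonetic or intelligibility-related processing.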

Scott et al. 2000
identification of a pathway for intelligible speech in the left temporal lobe

Phonetics Contrast: left STG activated (orange)
Intelligibility Contrast: left (more anterior) STS activated (yellow)

Scott et al. 2000
identification of a pathway for intelligible speech in the left temporal lobe

CONCLUSION:
with a subtle experimental design:
specificity of the left hemisphere for intelligible signals in passive listening
an anterolateral stream of neural information from the primary auditory cortex
(superior temporal sulcus: only activated by intelligible speech)
monosynaptic connections between A1 and STS,
confirmed by single-cell recording in primates (species-specific vocalizations)

Overath et al. 2015
the cortical analysis of speech-specific temporal structure revealed by responses to sound quilts

Speakers produce different sounds when expressing the same utterance:
linguistic analysis must be preceded by a stage of acoustic analysis,
mapping patterns of sound energy onto an intermediate, invariant representation of features/phonemes/syllables.

Overath et al. 2015
the cortical analysis of speech-specific temporal structure revealed by responses to sound quilts

two approaches in previous studies:
manipulating intelligibility, both linguistically and acoustically
manipulating speech-relevant acoustic processes

acoustic vs. linguistic processes?
a speech-specific mechanism?

goal of this study:
isolate the auditory analysis of speech, in particular its temporal attributes,
by using foreign speech (German, presented to English speakers):
no lexical-semantic or syntactic content.

Overath et al. 2015
the cortical analysis of speech-specific temporal structure revealed by responses to sound quilts

scrambling methodology (from vision): "A Sequence of Object-Processing Stages Revealed by fMRI in the Human Occipital Lobe", Grill-Spector et al. (1998), Human Brain Mapping

[Figure: as image scrambling increases, activation increases in early visual areas but decreases in higher object areas: suppressed activation for degraded object pictures.]

Overath et al. 2015
the cortical analysis of speech-specific temporal structure revealed by responses to sound quilts

speech segments: length varied parametrically (30 to 960 ms)

quilts: segments scrambled and reordered so that only local constraints are kept

[Figure: quilt construction. A source signal is cut into segments (30–960 ms) and reordered; at each junction the next segment is chosen to minimize the cochleogram change across the boundary,

$\sum_{t,f} \left[ C_n^{R}(t,f) - C_{n+1}^{L}(t,f) \right]^2$

where $C_n^{R}$ and $C_{n+1}^{L}$ are the cochleogram values at the right edge of segment n and the left edge of segment n+1. The structure of a quilted signal is similar to that of the source signal within a segment and across a segment's border, but differs from the source at larger scales for source signals that contain large-scale dependencies. Example cochleograms show speech and co-modulation control stimuli quilted with 30 ms and 960 ms segments.]
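A minimal sketch of the quilting idea (my simplification, not the authors' pipeline: the boundary cost here compares raw waveform samples rather than cochleogram channels, and the edge width, sample rate and source signal are placeholders):

```python
import numpy as np

def make_quilt(signal, segment_len, edge=16, rng=np.random.default_rng(0)):
    """Cut a 1-D signal into equal segments, then reorder them greedily so
    that each next segment minimizes the squared mismatch between its left
    edge and the previous segment's right edge (a waveform stand-in for the
    paper's cochleogram boundary cost, sum over t,f of [C_n^R - C_{n+1}^L]^2)."""
    n = len(signal) // segment_len
    segments = [signal[i * segment_len:(i + 1) * segment_len] for i in range(n)]
    remaining = list(range(n))
    order = [remaining.pop(rng.integers(len(remaining)))]  # random first segment
    while remaining:
        prev_right = segments[order[-1]][-edge:]
        costs = [np.sum((segments[i][:edge] - prev_right) ** 2) for i in remaining]
        order.append(remaining.pop(int(np.argmin(costs))))
    return np.concatenate([segments[i] for i in order])

# usage: quilts at two of the scales used in the study (30 ms and 960 ms),
# here from a random placeholder "signal" of 10 s at 48 kHz
fs = 48_000
source = np.random.default_rng(1).standard_normal(10 * fs)
quilt_30 = make_quilt(source, segment_len=int(0.030 * fs))   # only local structure kept
quilt_960 = make_quilt(source, segment_len=int(0.960 * fs))  # longer coherent chunks
```

Local constraints (segment interiors and matched boundaries) are preserved at every scale; only the longer-range structure differs between the 30 ms and 960 ms quilts, which is exactly the variable the study manipulates.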

hypothesis:

a region subserving speech-specific analysis should show an increasing response as the internal structure of the quilt is preserved (i.e. with longer segment length)

[Figure: cochleograms of source speech and of 30 ms vs. 960 ms quilts, for speech and for the co-modulation control stimuli.]
Overath et al. 2015
the cortical analysis of speech-specific temporal structure revealed by responses to sound quilts

analysis: Regions of Interest (ROIs)

defined anatomically:
primary auditory cortex (Heschl's gyrus, A1)
planum temporale (non-primary auditory cortex, A2)

defined functionally:
superior temporal sulcus (STS, an associative auditory area),
via the contrast between responses to 960 ms quilts vs. 30 ms quilts

[Figure: superior temporal sulcus, planum temporale, Heschl's gyrus.]
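A toy sketch of the functional-ROI logic (invented data shapes and threshold, not the paper's actual analysis): select voxels whose response to 960 ms quilts reliably exceeds their response to 30 ms quilts.

```python
import numpy as np

rng = np.random.default_rng(0)
n_vox, n_trials = 1000, 20                                  # placeholder dimensions
resp_960 = rng.standard_normal((n_vox, n_trials)) + 0.1    # per-voxel, per-trial responses
resp_30 = rng.standard_normal((n_vox, n_trials))

# Welch t statistic per voxel for the localizer contrast (960 ms > 30 ms)
diff = resp_960.mean(axis=1) - resp_30.mean(axis=1)
se = np.sqrt(resp_960.var(axis=1, ddof=1) / n_trials +
             resp_30.var(axis=1, ddof=1) / n_trials)
t = diff / se

froi = np.flatnonzero(t > 3.0)  # arbitrary threshold; these voxels form the fROI
print(f"{froi.size} voxels in the functional ROI")
```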

Overath et al. 2015
the cortical analysis of speech-specific temporal structure revealed by responses to sound quilts

[Figure: ROI responses to speech quilts as a function of segment length, in A1, A2 and STS.]

results:
decreasing response to short segments in STS! in both left and right hemispheres
the response to segmented speech plateaus at 480 ms

specialization for the processing of acoustic structure in speech in STS:
only STS, but not A1 and A2, shows strong sensitivity to the temporal extent of natural speech structure

is it really speech specificity? or is STS sensitive to more general features of the signal that were degraded in shorter segments?

CONTROLS!

Overath et al. 2015
the cortical analysis of speech-specific temporal structure revealed by responses to sound quilts

modulation control stimuli:
stimuli synthesized with an algorithm decomposing the speech signal using an auditory model
amplitude modulation properties maintained (similar power spectra, including the speech-typical 3–5 Hz modulation range)
long-term power spectra are very similar between source and quilt signals
but no pitch/prosody

[Figure: cochleograms and modulation spectra of speech, modulation control and co-modulation control stimuli; conditions Speech 30, Speech 960, Mod 30, Mod 960, left and right hemispheres.]

similar response in A1/A2,
but no modulation of the response by segment length in STS
Overath et al. 2015
the cortical analysis of speech-specific temporal structure revealed by responses to sound quilts

environmental sounds:
animal vocalisations (dogs barking, bird songs)
human actions (footsteps, sawing wood)

[Figure: responses (±s.e.m.) to speech quilts (S-30, S-960) and environmental-sound quilts (Env-30, Env-960), left and right hemispheres, as a proportion of the response to the localizer (960 ms).]

similar response in A1/A2: the response in HG showed a weak main effect of segment length (F(1,4) = 15.59, P < 0.05, η²p = 0.8), but no main effect of quilt type or interaction between quilt type and segment length
small response, and small modulation, in STS

Quilts made from environmental sounds evoked an overall response in the individual fROIs that was much lower than that for any of the speech quilts, irrespective of segment length (main effect of quilt type: F(1,4) = 72.59, P = 0.001, η²p = 0.95). The effect of segment length on the fROI response was also much larger for speech quilts than for environmental-sound quilts, producing an interaction between quilt type and segment length (F(1,4) = 9.45, P < 0.05, η²p = 0.7), although a pairwise comparison between the two environmental-sound quilt conditions was significant (P < 0.05).

Overath et al. 2015
the cortical analysis of speech-specific temporal structure revealed by responses to sound quilts

responses to quilts made from noise-vocoded speech:
a noise-vocoded version of speech eliminates pitch but maintains coarse spectral content sufficient for phonetic identification (cf. Scott et al. 2000); intelligibility remains high

[Figure 6: responses (±s.e.m.) in HG, PT and STS to speech quilts (S-30, S-960) and noise-vocoded speech quilts (NV-30, NV-960), left and right hemispheres.]

similar response in A1/A2
in STS: a big modulation (effect of segment size), but a smaller response than for real speech
this effect is therefore not due to pitch properties (noise-vocoded quilts do not have these properties)

Overath et al. 2015
the cortical analysis of speech-specific temporal structure revealed by responses to sound quilts

notable feature: the response to speech quilts plateaus at 480 ms

hyp 1. brain regions integrate sound information at timescales only up to half a second
hyp 2. the response plateau reflects the statistics of speech (acoustic dependencies only present in quilts made of segments longer than half a second)

test: compressed (faster) speech

[Figure: (a) fROI response and (b) naturalness ratings (±s.e.m.) as a function of segment duration (30, 60, 120, 240, 480, 960 ms, plus unquilted originals) for compressed and uncompressed speech quilts (N = 15, N = 11); naturalness ratings rise monotonically with segment duration and do not plateau.]

smaller response to compressed speech quilts, but the same plateau at 480 ms

Overath et al. 2015
the cortical analysis of speech-specific temporal structure revealed by responses to sound quilts

conclusions:
the STS (superior temporal sulcus) is bilaterally tuned to the acoustic structure of speech,
particularly its temporal structure (hence the use of quilts),
independently of lexical/syntactic information (German quilts, English listeners!)

the results cannot be explained in terms of sensitivity to
amplitude modulation
or prosodic pitch variation

Overath et al. 2015
the cortical analysis of speech-specific temporal structure revealed by responses to sound quilts

conclusions:
a hierarchy of selectivity across subsequent stages of auditory processing:
primary areas (A1, Heschl's gyrus) → secondary areas (planum temporale) → specialised associative areas (STS)
A1/A2 responses are stimulus-irrelevant; STS responses are stimulus-relevant: speech-specific analysis

parametric sensitivity up to half a second, then a plateau
why half-a-second sensitivity? syllables, words, acoustic-phonetic primitives
why weak lateralization? maybe intelligibility is critical for lateralization (cf. Scott et al. 2000)
failure to identify subregions in STS? limits in the spatial resolution of the hemodynamic BOLD response

Overath et al. 2015
the cortical analysis of speech-specific temporal structure revealed by responses to sound quilts

critiques?
what is STS processing? which speech-related representations?
syllables? phonemes? words? speech-related acoustic statistics?

for linguists these results are quite trivial:
we must have something in our brain dedicated to processing speech-specific features,
to account for the existence of all of the things above

...but neuroscientists are very skeptical:
these results are useful to convince them of some strong functional specialization of speech perception

intermezzo

NEUROSCIENCE
brain activation
how the brain reacts to specific behaviour
cortical networks & areas
functional specialization

LINGUISTIC THEORY
phonology: phonemes
syntax: constituents, phrase structures, syntactic operations and dependencies
semantics: interpretation, meaning, logic, abstraction, concepts
pragmatics: intentions, communication, goals, language use, context

?
is this the best way to understand how language works (in the brain)?
what are you doing here?

David Marr's levels of analysis in cognitive science

cognitive revolution: NEUROSCIENCE and LINGUISTIC THEORY are cognitive sciences!

can this help? THEY ALWAYS DO!

information processing systems must be understood at 3 levels of analysis
(one level is not enough!)

1) COMPUTATIONAL LEVEL

what the system does and why

2) ALGORITHMIC LEVEL

how it performs its computation

3) IMPLEMENTATION LEVEL

how it is realized in the brain


(what neural structures and activities)

David Marr's levels of analysis in cognitive science

1) COMPUTATIONAL LEVEL: LINGUISTIC THEORY
what the system does and why (representations)

2) ALGORITHMIC LEVEL: LINGUISTIC THEORY
how it performs its computation (operations/processes)

3) IMPLEMENTATION LEVEL: NEUROSCIENCE
how it is realized in the brain

David Marr's levels of analysis in cognitive science

representations
"A representation is a formal system for making explicit certain entities or types of information, together with a specification of how the system does this. And I shall call the result of using a representation to describe a given entity a description of the entity in that representation" (Marr and Nishihara, 1978).

For example, the Arabic, Roman, and binary numeral systems are all formal systems for representing numbers. The Arabic representation consists of a string of symbols drawn from the set (0, 1, 2, 3, 4, 5, 6, 7, 8, 9), and the rule for constructing the description of a particular integer n is that one decomposes n into a sum of multiples of powers of 10 and unites these multiples into a string with the largest powers on the left and the smallest on the right. Thus, thirty-seven equals $3 \times 10^{1} + 7 \times 10^{0}$, which becomes 37, the Arabic numeral system's description of the number. What this description makes explicit is the number's decomposition into powers of 10. The binary numeral system's description of the number 37 is 100101, and this description makes explicit the number's decomposition into powers of 2. In the Roman numeral system, thirty-seven is represented as XXXVII.

This definition of a representation is quite general. For example, a representation for shape would be a formal scheme for describing some aspects of shape, together with rules that specify how the scheme is applied to any particular shape. A musical score provides a way of representing a symphony; the alphabet allows the construction of a written representation of words; and so forth. The phrase "formal scheme" is critical to the definition, but the reader should not be frightened by it. The reason is simply that we are dealing with information-processing machines, and the way such machines work is by using symbols to stand for things, to represent things, in our terminology. To say that something is a formal scheme means only that it is a set of symbols with rules for putting them together, no more and no less.
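A small sketch (mine, not Marr's) of the numeral-system example: three formal schemes describing the same entity, each making a different decomposition explicit.

```python
def powers_of_10(n: int) -> str:
    """Arabic numerals make the decomposition into powers of 10 explicit."""
    digits = str(n)
    terms = [f"{d} x 10^{len(digits) - 1 - i}"
             for i, d in enumerate(digits) if d != "0"]
    return " + ".join(terms)

def powers_of_2(n: int) -> str:
    """Binary numerals make the decomposition into powers of 2 explicit."""
    bits = bin(n)[2:]
    terms = [f"2^{len(bits) - 1 - i}" for i, b in enumerate(bits) if b == "1"]
    return " + ".join(terms)

def roman(n: int) -> str:
    """Roman numerals: a different formal scheme for the same entity."""
    vals = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"), (100, "C"),
            (90, "XC"), (50, "L"), (40, "XL"), (10, "X"), (9, "IX"),
            (5, "V"), (4, "IV"), (1, "I")]
    out = []
    for v, s in vals:
        while n >= v:
            out.append(s)
            n -= v
    return "".join(out)

# the same number under three representations (Marr's example):
print(powers_of_10(37))  # 3 x 10^1 + 7 x 10^0  -> written "37"
print(powers_of_2(37))   # 2^5 + 2^2 + 2^0      -> written "100101"
print(roman(37))         # XXXVII
```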

David Marr's levels of analysis in cognitive science

representations

a formal system (made of symbols):
a formal scheme uses symbols to represent things,
plus the rules that specify how the scheme works

representations compress information
and allow for ABSTRACTION and GENERALIZATION

is this real? does it have to be like this?
COULD IT HAVE BEEN OTHERWISE?

David Marr's levels of analysis in cognitive science

representations

a formal system (made of symbols):
a formal scheme uses symbols to represent things,
plus the rules that specify how the scheme works

representations compress information and allow for ABSTRACTION and GENERALIZATION
trade-off: a representation makes certain information explicit
at the expense of other information that is pushed into the background;
it may be effortful to recover that information

more than one algorithm can process a given representation:
the choice of algorithm may depend on the hardware,
and it may not always be the optimal choice

which face is male and which is female?

which one is more bent?

do you see a woman?

which one is darker, A or B?

which one is darker, up or down?

which orange circle is bigger?

there are no black dots (only in your head)

bodies are of the same grey

sometimes things are not

..as they appear

is this guy going uphill or downhill?

your head now.

back to neurolinguistics

NEUROSCIENCE: brain activation, cortical networks & areas, functional specialization
LINGUISTIC THEORY: phonology vs. phonetics

representations:
phonemes (/t/, /d/, /a/): the basic discrete symbolic units of language
phonetic representations: continuous, non-symbolic, more numerous than phonological categories

operations/processes:
phonological/phonetic processing/recognition: mapping analog speech onto phonemes

phonemes:
phonological minimal pairs in English: phonemes differ by 1 feature
Conclusion: /p/, /b/, /m/, /s/, /n/ and /g/ are phonemes in English.

phonology

phonetic category: fine-grained discrimination; more numerous than phonemes
phonological category (phoneme): discrete; combined into lexical forms; NO fine-grained discrimination

Phonetic vs. Phonological Units

Phonetics                 Phonology
Continuous/Gradient       Discrete/Symbolic/Categorical
Physical                  Abstract
General                   Language Specific

phonological rules apply equally to all members of a phonological category (all-or-nothing property)
compressed information storage: CAT = /k/ /ae/ /t/
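A toy illustration (a deliberately simplified version of English plural allomorphy; sibilant stems, which take /ɪz/, are omitted) of the all-or-nothing property: a phonological rule applies to every member of a category at once, regardless of fine phonetic detail.

```python
# A tiny, illustrative category: the rule cares only about category
# membership, not about how any individual token was pronounced.
VOICELESS = {"p", "t", "k", "f", "s"}

def plural_suffix(stem_phonemes: list[str]) -> str:
    """Simplified English plural: /s/ after any voiceless final segment,
    /z/ otherwise (sibilant cases deliberately omitted)."""
    return "s" if stem_phonemes[-1] in VOICELESS else "z"

print(plural_suffix(["k", "ae", "t"]))  # cat -> /s/
print(plural_suffix(["d", "o", "g"]))   # dog -> /z/
```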

Phillips et al. 2000
Auditory cortex accesses phonological categories: an MEG mismatch study

categorical perception: a nonmonotonic identification function
when speakers are asked to identify speech sounds from a continuum that shows monotonic acoustic variation

better discrimination of between-category contrasts (/t/ vs. /d/) than of within-category contrasts (/d/ vs. /d/)

newborns are able to discriminate any phonetic difference;
they lose this ability at about 6 months of life (2 weeks time window)
humans can detect speech from sine waves
nonmonotonic perception is also found in non-humans, independently of phonological categories:
a phonetic representation vs. phonological categories, where all within-category contrasts are lost
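A toy sketch (not the paper's fitted model; the boundary and slope values are invented) of a quasi-categorical identification function over a monotonic VOT continuum, and of why equal acoustic steps are discriminated unequally.

```python
import math

def p_ta(vot_ms: float, boundary: float = 40.0, slope: float = 0.35) -> float:
    """Toy identification function: probability of labeling a token /ta/.
    Acoustics (VOT) vary monotonically; the labeling response is a steep
    sigmoid, i.e. quasi-categorical. Boundary/slope are placeholders."""
    return 1.0 / (1.0 + math.exp(-slope * (vot_ms - boundary)))

def label_discriminability(vot1: float, vot2: float) -> float:
    """If listeners discriminate via category labels, discriminability tracks
    the difference in labeling probability, not the raw acoustic distance."""
    return abs(p_ta(vot1) - p_ta(vot2))

# the same 20 ms acoustic step yields very different perceptual steps:
print(label_discriminability(30, 50))  # between-category (/da/ vs /ta/): ~0.94
print(label_discriminability(50, 70))  # within-category (/ta/ vs /ta/): ~0.03
```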

Phillips et al. 2000
Auditory cortex accesses phonological categories: an MEG mismatch study

Voice Onset Time (VOT)
P: "en pil" vs. B: "en bil"

[Figure: sound pressure amplitude around the release of the stop; the onset of glottal vibrations follows at the vowel. Long VOT vs. short VOT.]

Phillips et al. 2000
Auditory cortex accesses phonological categories: an MEG mismatch study

categorical perception: a nonmonotonic identification function
the continuous nature of speech vs. a (sort of) discrete identification function

Phillips et al. 2000
Auditory cortex accesses phonological categories: an MEG mismatch study

previous results from fMRI studies:
phonological vs. phonetic processing: parts of the superior planum temporale and the superior temporal gyrus (STG)

ERP and MEG studies:
auditory cortex (A1) has access to phonetic information
MMN = Mismatch Negativity (ERP)
MMF = Mismatch Field (MEG)

oddball paradigm

Phillips et al. 2000
Auditory cortex accesses phonological categories: an MEG mismatch study

oddball paradigm:
X X X X X Y X X X X Y X X X X X X Y X X X Y X ...

MMF: standards and deviants elicit different MEG responses.
Averaged responses to the deviant stimuli show a characteristic response component not observed in the averaged responses to standards.

contrasts used: pure tones, pitch, intensity, duration...
many-to-one ratio (the oddball is the "one")

unclear whether A1 has access to phonological categories (phonemes)

Phillips et al. 2000
Auditory cortex accesses phonological categories: an MEG mismatch study

synthetic /da/–/ta/ VOT continuum (60 msec)

the same parameter is varied in within-standard & within-deviant contrasts

(Source: Colin Phillips)

Phillips et al. 2000
Auditory cortex accesses phonological categories: an MEG mismatch study

pretest:
forced-choice identification task
for each subject: perceptual boundary (/da/ vs. /ta/)

real experiment:
stimuli randomly sampled from one of the phonological categories
many-to-one ratio (the oddball is the "one")

two experiments: Phonological Contrast vs. Acoustic Contrast

In the phonological mismatch experiment, standards were randomly chosen tokens of one of the phonological categories, and deviants were randomly chosen tokens of the other phonological category. The acoustic contrast experiment shifted all VOT values by 20 ms: this preserved the ratio of shorter-VOT to longer-VOT stimuli, but substantially changed the ratio among phonological categories. If the MMF simply reflects grouping of sounds at an acoustic level, then there should be no difference between the phonological and acoustic experiments.

NB: the acoustic difference within-category was greater than the difference between categories

many-to-one ratio with the phonological contrast;
VOT shifted by 20 ms: same acoustic difference, but the many-to-one phonological contrast was dramatically modified
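A minimal sketch of the two stimulus designs (all VOT values and the boundary are illustrative placeholders, not the paper's stimuli): the many-to-one ratio holds at the category level in the phonological experiment and is destroyed, at the same acoustic spacing, by the 20 ms shift.

```python
import random

def oddball_sequence(standard_vots, deviant_vots, n=200, p_deviant=0.125, seed=0):
    """Many-to-one oddball sequence: most trials draw a token from the
    standard set, occasional deviants come from the other set. Token VOTs
    vary, so any many-to-one ratio exists only at the category level."""
    rng = random.Random(seed)
    return [rng.choice(deviant_vots) if rng.random() < p_deviant
            else rng.choice(standard_vots) for _ in range(n)]

BOUNDARY = 40  # illustrative /da/-/ta/ boundary (ms); measured per subject

# Phonological experiment: standards and deviants straddle the boundary, so
# ~87.5% of tokens are /da/ and ~12.5% are /ta/ (many-to-one in categories).
phono = oddball_sequence(standard_vots=[0, 8, 16, 24], deviant_vots=[48, 56, 64])

# Acoustic experiment: every VOT shifted by +20 ms. Acoustic spacing is
# unchanged, but some "standards" now cross the boundary, so the many-to-one
# grouping by phonological category is dramatically modified.
acoustic = oddball_sequence(standard_vots=[20, 28, 36, 44], deviant_vots=[68, 76, 84])

share_ta = sum(v > BOUNDARY for v in acoustic) / len(acoustic)
print(f"/ta/ share in the acoustic experiment: {share_ta:.2f}")  # no longer ~0.125
```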

Phillips et al. 2000
Auditory cortex accesses phonological categories: an MEG mismatch study

results: a strong bipolar MMF in the phonological experiment

[Figure 3: grand average responses to standard and deviant /d/ across the sensor array and at representative anterior/superior and posterior/inferior channels (eight subjects). A polarity reversal can be seen in the difference between standards and deviants from anterior/superior to inferior/posterior channels, indicating a source of the difference wave around the center of the sensor array.]

[Article excerpt, partially cut off:] Responses to /d/ show the stimulus-type × channel interaction characteristic of an MMF (F(36, 7) = 6.38, p < .01), and again in the 210–270 msec time interval; this interaction was not significant or marginally significant at any other time interval. Comparison of responses to standard and deviant /t/ shows a significant stimulus-type × channel interaction at the 150–210 msec time interval (F(36, 7) = 1.95, ε = 0.061). When the average field strength data are taken from a narrower 170–190 msec time interval, corresponding to the peak of the difference wave, the stimulus-type × channel interaction approaches significance when the less conservative Huynh–Feldt correction is used.

Phillips et al. 2000
Auditory cortex accesses phonological categories: an MEG mismatch study

[Figure: comparison of difference waves in the phonological and acoustic contrast experiments. (a) Grand average difference wave: /d/–/t/. (b) Grand average difference wave: control comparison.]

As the figure clearly shows, the dipolar MMF observed in the phonological contrast experiment is not observed in the acoustic contrast experiment. This observation is confirmed by statistical analyses. A repeated-measures ANOVA was run using the same latency intervals and factors used in the phonological contrast experiment. The only difference was that the levels of the category factor (/d/ and /t/ in the phonological experiment) were replaced with the categories low VOT and high VOT. Two separate ANOVAs were conducted, one for the high-VOT standards and deviants, excluding the low-VOT stimuli, and one for all stimuli combined; the combined data matches the range of conditions entered into the analysis reported for the phonological contrast experiment.

[Article excerpt, partially cut off:] Not surprisingly, a main effect of channel was observed in the 90–150 msec time interval, due to the dipolar distribution of the response components. In the 90–150 msec interval there was also a marginally significant stimulus-type × category interaction (F(1, 5) = 4.28): responses to standards differed from responses to deviants for stimuli in one VOT group, with the reverse pattern in the other VOT group. At the 210–270 msec interval there was again a significant stimulus-type × category interaction (F(1, 5) = 8.12, p < .05), due to the pattern of means described above. No other effects or interactions were significant or marginally significant at any of the other time intervals.

results: strong MMF in the phonological experiment; no MMF in the acoustic experiment

In the acoustic experiment, the field strength of the difference wave does not rise above the level observed in the prestimulus interval. This contrast between the phonological and acoustic experiments shows the all-or-nothing property of phonological categories.

Phillips et al. 2000

[Figure 7: RMS of the adjusted grand-average difference wave in the main and control conditions, comparing the /t/-deviant condition from the phonological experiment with the low-VOT-deviant condition from the acoustic experiment: a clear peak in the phonological experiment, none in the acoustic experiment.]

Phillips et al. 2000
Auditory cortex accesses phonological categories: an MEG mismatch study

MMF source localization: primary auditory cortex (A1, Heschl's gyrus)

Phillips et al. 2000
Auditory cortex accesses phonological categories: an MEG mismatch study

discussion:
MMF in the phonological experiment; absence of an MMF in the acoustic experiment

an oddball many-to-one ratio of phonological categories is sufficient to elicit an MMF:
the phonological category is responsible for the MMF observed in the first experiment

left A1 has access to symbolic representations of phonological categories
are they really stored there? we can't say: considering the late onset (150–200 ms),
it is likely that the generator is elsewhere; phonemes may be stored/computed somewhere else (STG?)

abstract, discrete linguistic categories are available in a part of the brain (A1)
that is known to be involved in relatively low-level auditory processing

end of today's class

readings:
Ward: chapter 10 (just a little..)

articles:
Overath et al. 2015: the cortical analysis of speech-specific temporal structure revealed by responses to sound quilts
Scott et al. 2000: identification of a pathway for intelligible speech in the left temporal lobe
Phillips et al. 2000: Auditory cortex accesses phonological categories: an MEG mismatch study

credits (to serious Neurolinguists):
Andrea Santi
David Poeppel
