
Intelligibility in Speaking

2 Setting Priorities

What Teachers and Researchers Say

Currently there is very little debate about whether intelligibility is an
appropriate goal for teaching spoken language. At least since the late
1980s, a majority of knowledgeable teachers and researchers have
advocated some version of the Intelligibility Principle for teaching
pronunciation. This does not always affect how pronunciation is
actually taught, or how published teaching materials are constructed,
but the advocacy has influenced both implicit and explicit discussions
of priorities.
The opposite of the Intelligibility Principle, the Nativeness Principle
(Levis, 2005a), is now rarely put forth by teachers or researchers as a
worthwhile goal, even though it remains vibrant, living on in popular
beliefs about language learning, in spy novels (where effective spies
always seem to become native-like in an unusually short time), and in
accent-reduction advertising. The Nativeness Principle, by definition,
says that learners have to completely match a native speaker’s produc-
tion of all pronunciation targets, segmental and suprasegmental, in
order to have achieved the target pronunciation in a foreign language.
A number of pronunciation class texts appear to follow the Nativeness
Principle. Such books include a nearly exhaustive collection of exer-
cises for teaching segmentals and segmental contrasts (e.g., Orion
2002), including sound contrasts that are rarely applicable, although
often with a less than complete treatment of suprasegmentals. The
advantage of such an approach to pronunciation teaching is that it
defines what must be mastered and articulates a final objective –
native-like speech. There is no obvious need to prioritize because the
entire L2 sound system must be learned to mastery.
The power of the Nativeness Principle can be seen in the fact that it
is not unusual to hear language learners themselves say they want to
sound like a native speaker (NS). Every time I teach pronunciation,
one or two students tell me unbidden that this is what they want. This
is not surprising, as it shows that learners believe nativeness to be a


Downloaded from https://www.cambridge.org/core. University of Western Ontario, on 24 Sep 2018 at 02:51:23, subject to the Cambridge Core
terms of use, available at https://www.cambridge.org/core/terms. https://doi.org/10.1017/9781108241564.005
34 A Framework for the Teaching of Spoken Language

desirable goal, and it shows that they understand that any deviation
from that norm can mark them as being different or as being hard to
understand. Their desire assumes a dichotomous state of affairs: native
(and therefore unaccented and easy to understand) or nonnative (and
therefore accented and harder to understand). Even though it is
unlikely that they will ever reach their goal, their desire comes from
a noble motive in that they want their spoken language to facilitate
their communication. Not being language teachers, they do not realize
that reaching a good-enough pronunciation is neither a sign of weak-
ness nor of low standards, but is instead all that is really necessary (or
all that is really possible, in most cases).
Unfortunately, few teachers or students have the time, the aptitude, or
the age necessary to achieve the kind of mastery needed for native-like
pronunciation. An intelligibility-based approach, in contrast, requires
prioritizing what is taught. Teaching to achieve intelligibility is chal-
lenging precisely because of this prioritized approach to errors. Such
an approach is based on the assumption that some errors have a
relatively large effect on understanding and others a relatively small
effect. Judy Gilbert (personal communication) talks about priorities in
terms of a battlefield medical image – triage. When faced with many
people who are wounded, medics must prioritize treatment. Applied to
pronunciation, the image suggests that certain errors (injuries to com-
munication, to follow the metaphor) should be treated first because
they are more likely to harm communication than are others. Other
injuries to communication are far less problematic, and neither listen-
ers nor speakers will be harmed by lack of accuracy in such cases.
What such an approach should look like in the classroom is not
clear, however, partly because proposals based on the Intelligibility
Principle conflict, and partly because the Nativeness Principle
continues to strongly influence classroom practice and teachers’ atti-
tudes. Also, the Intelligibility Principle must be context-sensitive and
connected to both speaking and listening – speakers need to be intelli-
gible to listeners, and listeners need to be able to understand speakers.
So decisions about priorities not only involve helping learners produce
speech in an accessible way for listeners, but also involve teaching
those same learners to understand the speech they hear.

What Is Involved in Pronunciation?


An intelligibility-based approach to teaching pronunciation requires a
clear description of the key elements of pronunciation and how they
relate to one another. This description is the goal of the following
section.


When we use the term pronunciation, we are talking about
an interrelated system of sounds and prosody that communicates
meaning through categorical contrasts (e.g., phonemes), systematic
variations (e.g., allophones), and individual variations that may mark
gradient differences such as gender, age, origin, etc. As such, the
system of pronunciation can be divided in a variety of ways, but the
divisions are merely a way to understand pieces of the system, even
though all parts of the system interact with each other in ways that
often make it impossible to separate their effects upon understanding.
The classification of the features that I use here is somewhat unusual
(Figure 2.1). Typically, pronunciation is divided into segmentals
and suprasegmentals, but from the viewpoint of spoken language
understanding, word-level features are those most likely to impact
intelligibility at the lexical level, and discourse-level features are likely
to affect intelligibility at the semantic and pragmatic levels, and they
are also more likely to impact comprehensibility. There are also
spoken language elements related to pronunciation that are nonethe-
less not included in my categories for pronunciation. Any of the
categories (word-level, discourse-level, and related areas) may impact
perceptions of speech. In addition, although Figure 2.1 divides features
into separate categories, this is a failure of the visual in representing
the ways in which the categories interact. For example, word stress
and rhythm both affect the pronunciation of vowels, with unstressed
syllables strongly leaning toward schwa. Schwa is clearly a frequent
segmental; it is not a phoneme of English, however, but rather a variant of
other vowels when they occur in unstressed syllables (Ladefoged,
1980). Stressed syllables in any part of a word are also the location
for aspirated voiceless stop consonants, as in pill, repeat, till, return,
kill, and acute, creating a dependency between some consonant allophones
and stress. The rhythm of the phrase level and the stress patterns of
words affect the ways sounds change in connected speech (e.g., the
phoneme /t/ as realized in city, nature, button, and “can’t you” is
not the phonetic sound [t] in North American English). Pronunciation
of the individual segments, in other words, is dependent upon where
they occur in words, which are in turn affected by where they occur
within a phrase. Prominence occurs on syllables that are emphasized
through a combination of pitch, syllable duration (a rhythmic feature),
and loudness. Prominence typically (but not always) occurs on the
stressed syllable of the prominent word, connecting prominence to
word stress. Prominence results in segments that are pronounced with
particular clarity and precision. The prominent syllable in a phrase is
often marked by pitch, and is the beginning of the final pitch contour
in the phrase – that is, its intonation.

[Figure 2.1 Pronunciation features related to intelligibility. English pronunciation is divided into word-level features (segmentals, word stress), discourse-level features (rhythm, intonation), and related areas (rate, loudness, fluency, voice quality).]

Word-Level Features
Word-level features include segmentals (vowels and consonants) and
word stress (Figure 2.2). They also include consonant clusters, a type
of segmental which may, when mispronounced, have an effect on
syllable structure. This introduction is provided to define how
segmentals and word stress are related to intelligibility and
comprehensibility.
Segmentals include approximately forty phonemes for most
varieties of English (around twenty-four consonant phonemes and
fourteen or more vowel phonemes). The phoneme count masks the number
of sounds that English uses, since phonemes often have multiple well-known
allophones that are important for pronunciation teaching. For
example, English regularly employs a glottal stop ([ʔ]) before
vowel-initial words (e.g., I → [ʔaɪ]) or as an allophone for /t/ before a final
nasal (e.g., button – [bʌʔn̩]). Other well-known allophones include
aspirated voiceless stops and affricates in pill/till/kill/chill, the dark
(velarized) /l/ in all (as opposed to the light /l/ in word-initial position,
e.g., lap), the flapped /t/ in city, the now increasingly rare voiceless
labial-velar fricative [ʍ] in which (sounds like [hw]), and many others.
For vowels, allophones are almost too numerous to count, as vowel
quality often shifts noticeably in the presence of nasals, before /ɹ/ or
dark /l/, before /g/, and in unstressed syllables.

[Figure 2.2 Word-based features important to intelligibility: segmentals (consonants, vowels) and word stress (duration, clarity).]

Regarding word stress, English is a free-stress language in that stress
is not fixed to a particular syllable. Stress can occur on first, second, or
third syllables, but the placement is fixed for individual words. For
example, the main stress for a word may fall on the first syllable
(COMfort, BEAUtiful), the second (caNOE, rePULsive), the third
(referENdum, questionnAIRE), etc. Stressed syllables typically have
greater segmental clarity, greater syllable duration, and greater intensity
than unstressed syllables. When they are in particular discourse
contexts, they may also be marked with pitch movement (Ladd &
Cutler, 1983).
Both segmentals and word stress are likely to impact intelligibility at
the lexical level in that mispronunciations may lead listeners to fail to
decode the intended words. This failure may come from identifying
other possible words (as happens with minimal pairs) or failing to
identify any word that matches the speech signal (as with segments
that are distorted). Both consonants and vowels are affected by the
stress patterns of words. Mis-stressed words may especially affect the
ways that vowels are pronounced in English because of the ubiquity of
the unstressed vowel schwa (33 percent of all vowels according to
Woods, 2005). Schwa is a key perceptual clue to lack of stress in
English, and listeners tend to classify full vowels, even in unstressed
syllables, as stressed (Fear, Cutler, & Butterfield, 1995). Because of
these interactions, word stress and segmentals are inseparable in their
impact on intelligibility (Zielinski, 2008). An unexpected stress pattern
on a word (e.g., FORtune → forTUNE) also may affect the ways that
vowels and consonants are produced (e.g., [ˈfɔɹtʃən] versus [fɚˈtʰun]).

Discourse-Level Features
Discourse-level features (Figure 2.3) include suprasegmental features
that carry categorical (i.e., phonological) meaning differences. In rela-
tion to how listeners may (mis)understand speakers, these supraseg-
mentals are not likely to cause listeners to misunderstand individual
words (making words unintelligible) but are likely to cause listeners to
process meaning with greater difficulty (making speech more effortful
to understand). Another suprasegmental, word stress, as discussed
above, is included as a word-level feature because it is more likely to
impact intelligibility (though it may also impact comprehensibility, or
ability to process speech, even without a change in vowel quality, as in
Slowiaczek [1990]).

[Figure 2.3 Discourse-based pronunciation features important to intelligibility: rhythm (duration, timing), intonation (tune, range), and prominence.]

Rhythm, the first discourse-level feature, involves at the very least
the relative durations of syllables and the timing of syllabic beats. The
constructed sentence in (2.1), made up of all single-syllable words,
varies between longer, stressed syllables (in CAPS) and shorter,
unstressed syllables (in lower case). The stressed words have a pro-
nunciation that will be closer to the citation form, while the unstressed
words are prone to simplification (e.g., has is likely to have the [h]
deleted or even to be contracted with John).

(2.1)
JOHN has CLIMBED the TREE to GET the CAT that’s been STUCK for a TIME.

In pronunciation teaching, rhythm has often been described in terms
of stress timing and syllable timing (e.g., Pike, 1945). Stress-timed
languages (like English) are asserted to have large durational differ-
ences between stressed and unstressed syllables, with relatively equal
timing between stresses. Syllable-timed languages are considered to
have quite similar durations between syllables and thus timing that is
at the level of the syllable rather than at the level of stressed syllables.
This well-known formulation is overly simplistic, however, and stress
timing and syllable timing are tendencies rather than absolutes (Dauer,
1983). The rhythmic characteristics of many languages remain of
interest to researchers, and a wide variety of rhythm metrics have been
tested for different L1 speakers (e.g., Low, Grabe, & Nolan, 2000), L2
speakers (Yang & Chu, 2016), and L1–L2 comparisons (White &
Mattys, 2007). However, there is great uncertainty about how well
various rhythm metrics actually capture perceived rhythmic differ-
ences between languages (Arvaniti, 2012).
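
To make the idea of a rhythm metric concrete, the sketch below computes one widely used measure, the normalized Pairwise Variability Index (nPVI), which averages the normalized duration difference between successive intervals. It is offered only as an illustration of the kind of metric tested in this literature: the function name is hypothetical and the syllable durations are invented for the example, not measured data.

```python
def npvi(durations):
    """Normalized Pairwise Variability Index for a sequence of durations.

    For each pair of neighbouring intervals, take the absolute difference
    divided by the pair's mean duration, then average and scale by 100.
    """
    if len(durations) < 2:
        raise ValueError("need at least two durations")
    pairs = zip(durations, durations[1:])
    terms = [abs(a - b) / ((a + b) / 2) for a, b in pairs]
    return 100 * sum(terms) / len(terms)


# Invented syllable durations in milliseconds (assumptions, not real data).
alternating = [220, 80, 240, 70, 230, 90]  # long-short alternation
even = [150, 150, 150, 150, 150, 150]      # equal-length syllables

print("alternating:", round(npvi(alternating), 1))
print("even:", npvi(even))
```

A higher value reflects greater durational contrast between neighbouring syllables, which is the pattern traditionally attributed to stress timing. As the caution from Arvaniti (2012) suggests, a number of this kind should not be read as a direct measure of perceived rhythm.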


In English, rhythm and word stress are similar. The discourse level
for rhythm in many ways mirrors the word-level rhythm of lexical
stress. A major difference is that word stress typically is limited to
multi-syllabic words, whereas rhythm includes stress for single-
syllable words. In English, for example, content words (e.g., nouns,
verbs, adjectives, adverbs, negatives), including those of one syllable,
are normally stressed in discourse. Single-syllable function words (e.g.,
prepositions, auxiliary verbs, pronouns, determiners) are typically
unstressed in discourse. Many single-syllable function words are also
among the most frequent words in English, helping to contribute to
the perception of stress timing.
Intonation, the second suprasegmental, includes at least three dis-
tinct ways in which meaning is communicated: prominence, tune, and
range. The example in (2.2) illustrates these three. For context,
imagine that the sentence is spoken in the middle of a lecture.

(2.2)

Now, let’s move on to the NEXT topic

The initial extra-high pitch range in the example is meant to signal a
topic shift, or what has been called a paratone (paragraph tone, see
Wichmann, 2000). Pitch range may also signal gradient meanings
related to emotional engagement. NEXT is a prominent syllable with
a jump up in pitch to call attention to the importance of the information
(in this case, NEXT is likely related to PREVIOUS, the topic(s)
that came before). The last use of pitch is the drop in pitch from NEXT
to the end of the utterance. This is the tune. Each of these uses is part
of how intonation works in English.
Intonation is the system that uses voice pitch changes to communi-
cate meaning. However, intonation may include more than voice
pitch. Prominent syllables are not only higher or lower in pitch than
the syllables that precede them, they also have greater duration (a
rhythmic feature) and more clearly enunciated segmentals. In addition,
the varied intonational categories are closely related. The final prom-
inent syllable in a phrase (the nucleus) is also the beginning of the tune,
and both may be pronounced with greater or lesser pitch range.
In regard to comprehensibility, the specific contribution of these
suprasegmental features is understudied and needs greater attention.
Isaacs and Trofimovich (2012) found that more native-like vowel
reduction and pitch contours correlated with better comprehensibility
ratings, while pitch range was not significantly related to comprehensibility.
Kang (2010) found that pitch range was instead associated
with accentedness, but not comprehensibility. Tune, on the other
hand, has also been suggested to have an effect on comprehensibility.
Pickering (2001), for example, found that Koreans teaching in English
used a greater number of falling tunes than would be expected, and
that the relative numbers of rising and falling tunes made their speech
more challenging for listeners. (This is an interpretation of Pickering,
given that she worked within a model that considers tunes to commu-
nicate differences in information structure.)

Related Areas
Comprehensibility and intelligibility are not only associated with
pronunciation, but also with other characteristics of spoken language
(Figure 2.4) that have an indirect connection to pronunciation. These
areas include fluency (Derwing, Munro, & Thomson, 2007), speech
rate (Kang, 2010), loudness (not typically addressed for L2 pronunci-
ation research), and voice quality (Esling & Wong, 1983; Ladd, Silver-
man, Tolkmitt, Bergmann, & Scherer, 1985; Munro, Derwing, &
Burgess, 2010). Generally, this book will not address these character-
istics in detail, because other than fluency and speech rate (which is a
component of fluency), these things are more idiosyncratic than the
other features.
In particular, research suggests that fluency and speech rate have
significant effects on judgments of comprehensibility and accentedness.
Pronunciation research on voice quality and its inclusion in
teaching materials have never been common, despite its seeming promise
in pedagogy (Jones & Evans, 1995; Pennington, 1989). Loudness is
an especially important issue in regard to hearing loss, hearing in
noise, and the intelligibility of speech for those with cochlear implants.
Anecdotally, some L2 learners can become more intelligible simply by
speaking at a volume more appropriate to the context (e.g., in a large
classroom), but this has not been typically considered important for an
L2 pronunciation syllabus.

[Figure 2.4 Spoken language features sometimes associated with pronunciation: fluency, rate, loudness, voice quality.]

Fluency is sometimes associated with general proficiency (Fillmore,
1979) or smoothness of speech (Lennon, 1990). I use fluency to refer
to smoothness of speech. However, judgments of fluency are also
closely tied to many discrete elements of speech, including the numbers
of silent pauses and filled pauses, whether junctures in spoken phrases
are grammatically logical, mean length of run, repetitions in speech,
and more (Lennon, 1990), as well as the level of automaticity in
speaking (Segalowitz, 2000), phonological memory (O’Brien, Segalowitz,
Freed, & Collentine, 2007), and attention control (Segalowitz,
2007). Remedial attention to pronunciation is more likely to be successful
when learners are relatively comfortable speaking and listening
in the L2 – that is, when they are sufficiently fluent. The development
of fluency is clearly an important part of the big picture of L2 speaking
development and instruction (Firth, 1992), and comfortable fluency
can help give a global structure to other elements of spoken language,
but fluency is not, by itself, part of L2 pronunciation. The fact that it
impacts comprehensibility and is critical in communicative language
teaching (Rossiter, Derwing, Manimtim, & Thomson, 2010) means
that, like pronunciation, it should be prioritized in teaching speaking
and listening.
Speech rate is predictive of fluency judgments (Cucchiarini, Strik, &
Boves, 2000; Kormos & Dénes, 2004). Fluent speakers can speak at
different rates, so a fluent speaker could speak at a slower rate than a
speaker who is judged less fluent. Speech rate is typically measured in
syllables per second or words per minute. This rate may include all
silences and filled pauses, or they may be removed, providing a measure
of articulation rate. L2 speakers tend to speak more slowly than L1
speakers, and their comprehensibility may be helped by faster speech.
However, excessively fast or slow speech is more likely to be rated as
less comprehensible and more accented (Derwing & Munro, 2001;
Munro & Derwing, 1998).
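
The difference between the two rate measures described above can be shown with a short calculation. This is a minimal sketch under stated assumptions: the function names are hypothetical, and the syllable count and timings are invented for illustration rather than taken from any study cited here.

```python
def speech_rate(n_syllables, total_seconds):
    """Syllables per second over the whole sample, pauses included."""
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    return n_syllables / total_seconds


def articulation_rate(n_syllables, total_seconds, pause_seconds):
    """Syllables per second after silent and filled pauses are removed."""
    speaking_time = total_seconds - pause_seconds
    if speaking_time <= 0:
        raise ValueError("pause time must be shorter than total time")
    return n_syllables / speaking_time


# Invented sample: 90 syllables in 30 seconds, 6 of which are pauses.
print(speech_rate(90, 30))           # rate with pauses included
print(articulation_rate(90, 30, 6))  # rate with pauses removed
```

Because pause time is subtracted from the denominator, articulation rate can never be lower than speech rate for the same sample, and the gap between the two grows with the amount of pausing.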

Prioritizing: A Summary and Critique of Recommendations


Various writers have made recommendations about priorities for
pronunciation teaching. These recommendations go from very general
(learners should try for “listener-friendly pronunciation,” in the words
of Olle Kjellin) to more detailed descriptions of what might be
included in instruction. Kenworthy’s (1987) early approach to intelli-
gibility was based on describing learner pronunciation issues that may
result in unintelligibility. She lists both segmental and suprasegmental
issues, including substituting one sound for another, deleting or
adding sounds, connecting one word to another, mis-stressing words,
not using stress-based rhythm, and the misuse of intonation or use of
unfamiliar intonation. Kenworthy does not further prioritize these
potential sources of unintelligibility.
In another early attempt to prioritize, Jenner described a “common
core for pronunciation” that would “guarantee intelligibility and
acceptability anywhere in the world” by specifying “what all native
speakers of all native varieties have in common which enables them to
communicate effectively with native speakers of varieties other than
their own” (1989, p. 2). Rather than focusing on the differences
between varieties, or indeed ruling out certain native varieties a priori
while elevating others as models, Jenner thought it essential to look at
commonalities, recognizing that speakers of native varieties have a
better chance of being intelligible to each other because of what they
share. Jenner suggested that the commonalities included vowel quan-
tity (phonetic length differences), most consonants, syllable structure,
stress-based rhythm, and varied commonalities of intonation,
including tonic syllables and final movements of pitch. Jenner’s rec-
ommendations refer to segmental and suprasegmental features, and
they often distinguish between a category (e.g., consonants) and sec-
ondary features that are not essential in the category (e.g., distinctions
between [l] and [ɫ]).
In another analysis focused on segmental pronunciation, Brown
(1988) proposed functional load as a way to determine pronunciation
priorities. Functional load, a topic that had been put forth long before
Brown’s application to pronunciation teaching, measures the “fre-
quency with which two phonemes contrast in all environments”
(Brown, 1988, p. 591). Functional load is therefore inherently con-
trastive and the use of minimal pairs gives a way to quantify priorities.
Phoneme contrasts that have a higher functional load are more likely
to cause confusion if mispronounced, and they should be given prior-
ity over those with lower functional loads (assuming, of course,
that students have difficulty with the contrast). Brown’s proposal for
measuring functional load takes into account not only the number
of minimal pairs for a contrast, but other issues such as the number of
minimal pairs for the same part of speech, the extent to which a
mispronunciation is stigmatized in native accents, and the acoustic
similarity of the sounds involved in the contrast (not all minimal pairs
are likely to be confused). For example, Brown’s analysis lists con-
trasts such as /p, b/, /p, f/, /l, ɹ/, and /l, n/ as of the highest importance,
and contrasts such as /f, θ/, /ð, d/, and /ʤ, j/ as very low-priority
contrasts. The proposal quantifies importance, but Brown recognizes
that quantification alone cannot completely determine priorities. In
addition, functional load is a measure of segmental importance only
and cannot be applied to suprasegmental features, even to relatively
uncomplicated ones like word stress.
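
The minimal-pair component of functional load can be sketched as a simple count. The example below is a deliberately reduced illustration, not Brown's measure: it ignores part of speech, stigmatization, and acoustic similarity, all of which Brown takes into account. The function name and the eight-word lexicon are hypothetical, so the counts say nothing about the real functional load of these contrasts in English.

```python
from itertools import combinations


def minimal_pairs(lexicon, a, b):
    """Return word pairs whose transcriptions differ only in a vs. b.

    `lexicon` maps each word to a tuple of phoneme symbols. Two words
    form a minimal pair for the contrast if they have the same length
    and differ in exactly one position, where one has `a` and the other `b`.
    """
    found = []
    for (w1, p1), (w2, p2) in combinations(lexicon.items(), 2):
        if len(p1) != len(p2):
            continue
        diffs = [(x, y) for x, y in zip(p1, p2) if x != y]
        if len(diffs) == 1 and set(diffs[0]) == {a, b}:
            found.append((w1, w2))
    return found


# Toy lexicon with simplified transcriptions (an assumption for illustration).
lexicon = {
    "pat": ("p", "æ", "t"),
    "bat": ("b", "æ", "t"),
    "pit": ("p", "ɪ", "t"),
    "bit": ("b", "ɪ", "t"),
    "cap": ("k", "æ", "p"),
    "cab": ("k", "æ", "b"),
    "fin": ("f", "ɪ", "n"),
    "thin": ("θ", "ɪ", "n"),
}

print(minimal_pairs(lexicon, "p", "b"))
print(minimal_pairs(lexicon, "f", "θ"))
```

Run over a full pronouncing dictionary, a count of this kind would give the raw material for ranking contrasts, which would then need to be weighed against the other factors Brown describes.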
Firth prioritizes differently by proposing a “zoom principle” for
teaching in which “a pronunciation syllabus should begin with the
widest possible focus and move gradually in on specific problems”
(1992, p. 173). By this, she means that pronunciation instruction
should start with general speaking abilities and ability to communicate
before moving on to phonetic details. This is put forth as the most
likely way to promote comprehensible speech. Priorities beyond gen-
eral speaking ability (including volume and clarity) include intonation
and stress/rhythm (features that are more related to general speaking
ability) before considering consonants and vowels. Interestingly,
Firth’s recommendations led to some paradoxical suggestions that
seem to have little to do with communicative importance. Despite its
not being seen as critically important for understanding, /θ/ is given
high priority within the consonantal system because students perceive
it to be important and because it is relatively easy to teach effectively,
giving students confidence to try for more difficult sounds. Practically
speaking, this may also be a feature to spend modest time on if it leads
to greater commitment to features more likely to improve intelligibility
(Derwing & Munro, 2015).
Evidence for Firth’s recommendations comes from a recent study by
Isaacs and Trofimovich (2012). With a goal of making explicit the
issues involved in comprehensibility ratings used in a variety of spoken
assessment tools, the researchers examined which factors were salient
in ratings of comprehensibility. Using criteria collected from research
studies, the researchers identified nineteen quantitatively scored speech
measures, including segmentals, suprasegmentals, fluency, vocabulary,
grammar, and discourse. Speech samples of forty French learners of
English were analyzed, and the scores derived from the analysis were
then correlated with naive NS raters’ comprehensibility ratings (based
on Munro & Derwing, 1995). Following this, three experienced ESL
(English as a second language) teachers with minimal training in
pronunciation listened to the speech samples and identified the factors
they noticed when evaluating the speech of the French-speaking
learners. Five factors were identified as important: fluency, breadth
of vocabulary, grammatical control, construction of the discourse,
and word-stress accuracy. Of these five measures, only one involves
pronunciation in an explicit way, while the others are more global
measures. The study concludes that expert raters do not focus overtly
on pronunciation when evaluating speech comprehensibility, but
that instead they take account of features that are not often included
in pronunciation-oriented instruction. Other evidence for Firth’s
contention can be seen from the work of Derwing, Munro, and Wiebe
(1998), who found that listeners rated L2 learners’ comprehensibility
higher after a pronunciation course focusing on global skills and
suprasegmentals than after a course focusing on segmental improve-
ment. Similar results for other groups of learners can be found in the
work of Gordon and Darcy (2016).
McNerney and Mendelsohn (1992) argue for suprasegmentals to be
given top priority in pronunciation instruction. Suprasegmentals con-
trol how information is related, and they also are said to have a special
role in conveying attitudinal meaning (cf. Pike, 1945). McNerney and
Mendelsohn say that “a short-term pronunciation course should focus
first and foremost on suprasegmentals as they have the greatest impact
on the comprehensibility of learners’ English” and because through
suprasegmentals “greater change can be effected in a short time”
(1992, p. 186), although they provide no empirical evidence for their
confident assertion. Suprasegmental features include:
• word stress and rhythm;
• major sentence stress (or focus);
• intonation;
• linking and pausing;
• palatalization in rhythmically related contexts such as “can’t you”
and “did you.”
In another attempt to specify priorities for a short pronunciation
course, Henderson (2008) identified pacing (stressed words per
minute), speech rate (syllables per second), and word stress as import-
ant in promoting more comprehensible speech in spontaneous and
prepared speaking tasks. The author argued that these three areas
were most amenable to changes in the short term, and that learners
of English in this university-level planned speaking course were most
likely to be successful by learning to vary their pacing and speech rate.
Word stress, which was asserted to be important in promoting under-
standing, was also presented as a feature that may not be easy to
change in the short run. It may be significant that both Isaacs and
Trofimovich (2012) and Henderson (2008) studied French learners of
English, a group for whom word stress may be particularly important.
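
The two rate measures named above are simple ratios, and a short worked sketch may make the units concrete. The function names and the sample counts below are hypothetical illustrations, not data or procedures taken from Henderson (2008); the sketch assumes that syllables and stressed words have already been counted by hand for a timed speech sample.

```python
def speech_rate(syllables: int, duration_seconds: float) -> float:
    """Speech rate in syllables per second."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    return syllables / duration_seconds


def pacing(stressed_words: int, duration_seconds: float) -> float:
    """Pacing in stressed words per minute."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    return stressed_words / (duration_seconds / 60.0)


# Hypothetical sample: 30 seconds of speech, 120 syllables, 40 stressed words.
print(speech_rate(120, 30.0))  # 4.0 syllables per second
print(pacing(40, 30.0))        # 80.0 stressed words per minute
```

Because the two measures use different units of time, a speaker can hold speech rate steady while changing pacing, for example by stressing fewer words in the same stretch of speech.
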
Gilbert (2001) suggests priorities for beginning learners based on
her experience and repeated attempts to distill that experience into
pronunciation features that are likely to be learnable and to make a
difference in comprehensibility. She lists the following as essential:
sound–symbol correspondences for the spelling of key vowel sounds,
consonant sounds (mostly final) that serve as signals of grammatical
meaning, linking between words, epenthesis and deletion of syllables,
word stress, differences between weak and strong syllables (stress
timing), and emphasis (prominence or nuclear stress). This set of
priorities emphasizes suprasegmentals, but also says that certain seg-
mental targets are both important enough to pay attention to and
likely to be learnable even at the beginning levels of proficiency.
Not all writers who agree that a greater emphasis on suprasegmen-
tals is important for pronunciation instruction prioritize what should
be taught. Morley (1991), in her historical review of the evolution of
pronunciation in TESOL (teaching English to speakers of other lan-
guages), described many of the changes from the traditional,
segmental-based approach of the 1950s and 1960s to the more com-
municatively oriented approach becoming evident in the 1980s and
1990s. However, Morley’s recommendations called for “an expanded
concept of what constitutes the domain of pronunciation” (1991,
p. 493) rather than setting clear priorities. This expanded domain
saw pronunciation’s proper sphere of influence as encompassing com-
munication skills, suprasegmentals, segmentals, voice quality, body
language, greater learner and teacher involvement in developing self-
monitoring skills, contextualization, linking listening to speaking,
greater attention to sound–spelling relationships, and attention to
individual differences among ESL learners. These recommendations
cannot be faulted in and of themselves, but taken together their
approach to teaching pronunciation suggests that previous lack of
success came not because of misplaced priorities or goals, but because
the program of instruction was too limited to work. While Gilbert’s
triage metaphor suggests that certain pronunciation needs are medic-
ally critical, identifiable, and should be dealt with immediately, Mor-
ley’s recommendations sound like an extended stay at a luxury
pronunciation spa with personal accent trainers.

Prioritizing for Nonnative Speaker–Nonnative Speaker Communication
All these approaches assume that native listeners are the appropriate
audience for determining what is intelligible and what is not. This
assumption is called into question by Jenkins (2000) in her proposal
for a prioritized set of features for pronunciation teaching, the Lingua
Franca Core (LFC). Jenkins recognizes that most nonnative speakers
(NNS) of English around the world interact in English not with NSs
but with other NNSs. This difference in audience prompted Jenkins to
consider changes in how priorities are determined, with mutual intel-
ligibility being the standard by which pronunciation features are to be
judged for importance. Jenkins studied the interactive interlanguage
talk of NNS dyads doing communicative tasks. She analyzed where
communication failed in these communicative tasks and determined
the causes of the forty instances in which communication broke down.
Of these, twenty-seven were related to pronunciation deviations. The
deviations that caused loss of intelligibility were candidates for the
LFC. Those features that did not cause a loss of intelligibility were
typically excluded from the core. The core included most consonant
sounds, some consonant cluster simplifications, vowel length, and
nuclear stress (i.e., prominence). Consonants that were excluded were
the interdental fricatives and [ɫ], the velarized, or dark, allophone of /l/.
Consonant cluster deletions were included in the LFC because loss of
sounds was argued to be more likely to impact intelligibility than was
epenthesis. Perhaps the place in which the LFC departs from the other
recommendations most radically is in its treatment of suprasegmen-
tals. Only one, nuclear stress, is included in the core. Others, such as
stress-based rhythm, intonation, and word stress are all excluded
based on Jenkins’ data, her appeal to teachability/learnability, and
the impact of universals. Some of these decisions have been criticized,
especially regarding word stress (Dauer, 2005; McCrocklin, 2012). In
one description of pronunciation teaching in China that used the LFC
as a rubric, Deterding (2010) showed what most experienced teachers
know very well: Pronunciation difficulties are varied, and they include
errors that reflect both core and non-core features.
In a replication of Jenkins’ (2000) work, Kennedy (2012) found that
the most common sources of unintelligibility were vowel and conson-
ant segments, either individually or in combination. The only supra-
segmental feature implicated in unintelligibility was word stress.
Nuclear stress (prominence) was not a source of unintelligibility.
Kennedy also suggested that learners may not always indicate that
they do not understand a speaker and that researchers and teachers
may not realize that pronunciation is a factor. This problem may be
connected to the types of interactive tasks that are used to collect data,
such that both listeners and speakers have to demonstrate
understanding.
Walker (2010) applies the LFC extensively to pronunciation teach-
ing, giving the original recommendations a classroom teacher’s per-
spective. His defense of ELF (English as a lingua franca) priorities
describes why he thinks the LFC is appropriate, including issues
related to bottom-up rather than top-down processing, mutual intelli-
gibility, speaker identity, and teachability. One of the benefits of the
LFC, Walker argues, is that it recognizes that ELF speakers make
greater use of bottom-up processing in their interactions. This means
that they are much more reliant on the details of the acoustic signal in
interactions, and thus more likely to be affected by unexpected
pronunciations of segmentals. NSs, in contrast, make greater use of
top-down processing, in that they can more easily guess at the content
of a spoken message even when the segmentals deviate from what is
expected. The LFC, however, remains controversial in many of its
recommendations. It lacks robust empirical support, assumes that all
NNS contexts are similar, and does not take into account the import-
ance of stigma associated with otherwise intelligible pronunciations
(LeVelle & Levis, 2014). The actual details of what should and should
not be included thus have found uncertain acceptance despite its
appeal, and influential researchers have criticized its consistency,
calling for empirical verification of its recommendations (e.g.,
Szpyra-Kozłowska, 2015, pp. 77–84).
The second area discussed is mutual intelligibility. Intelligibility is
a context-sensitive feature of spoken discourse, a characterization
that will find little disagreement from almost anyone interested in
teaching pronunciation. Walker is more nuanced than Jenkins, rec-
ognizing that the LFC may need to be adjusted in some ways because
interlocutors may also include NSs. For example, the teaching of
weak forms and vowel reduction may be strictly non-core as far as
production, but mutual intelligibility suggests that it is a core feature
for perception. NS interlocutors will reduce vowels, and it is import-
ant for ELF listeners to be able to understand such speech. For
communication to happen, both speakers and listeners must be intel-
ligible to each other, and learning materials typically make use of
native speech.
The third positive aspect of an LFC approach is that it recognizes
the importance of speaker identity. The LFC recognizes that achieving
a native accent is not necessary, and that the influence of the speaker’s
L1 should be accepted, as long as intelligibility is not compromised.
This is clearly not just part of NNS–NNS communication. Most
people in inner-circle communities, especially in larger cities with
significant immigrant communities, also think that an NS accent is
not needed.
Finally, Walker discusses Jenkins’ concept of teachability. Walker
says that “many features that are essential in a traditional EFL syllabus
are largely unteachable. This was the case with tone and stress-timing,
and with the use of weak forms and certain connected speech changes.
In contrast, most of the items in the LFC are teachable, with classroom
teaching leading to learning” (2010, p. 63). While it makes little sense
to teach things that our learners cannot learn, little evidence is pro-
vided regarding teachability. In fact, many of the features that are said
to be unteachable are ones that Judy Gilbert (2001), who is a
notorious stickler for teaching only those things that can be learned,
provides as priorities for beginning learners: linking, word stress, and
distinguishing strong and weak syllables.
What does it mean for a feature to be teachable? Walker (2010) says
this about nuclear stress placement:
[It] is teachable in the sense that the rules are simple enough for learners to
master in the classroom, although for some learners there may be a noticeable
gap between receptive and productive competence. As a result, our primary
aim in the classroom will be to make learners aware of the existence and
importance of nuclear stress. This should make them more sensitive to its use
by other speakers, and consequently more likely to acquire competence in
its use. (Walker, 2010, p. 64)

Teachability thus seems to mean a topic whose rules can be learned
and applied by learners, leading to acquisition. It does not simply mean
that a topic can be taught; any topic can be taught. What matters is the extent to
which the teaching, the input, becomes learning, or intake. The
principle of teachability/learnability will be discussed in detail in
Principle 6 in Chapter 8.
Walker suggests ways the LFC might apply to speakers of different
languages. Following Jenkins, he includes in the core rhotic /ɹ/ in all
positions (characteristic of most North American English speakers),
the non-flapped /t/ characteristic of British English, and word stress
(an admitted gray area in Jenkins, 2000) because of its impact on the
core feature of nuclear stress, as well as vowel reduction and weak
forms for receptive competence. In addition, certain errors (such as
final glottal stops) that are common among certain users of English
should be addressed because they may cause loss of intelligibility by
masking the character of final stop consonants (see Walker, 2010,
p. 44; cf. Gilbert, 2001).
The LFC’s recommendations have been used to examine features
that promote the mutual intelligibility of emerging South-East Asian
Englishes and the international intelligibility of Hong Kong English.
Deterding and Kirkpatrick (2006) examined conversational inter-
actions in English among speakers from ten South East Asian coun-
tries and identified features of speech that seem to form the basis of a
developing regional variety. Features of this variety shared by speakers
from at least four countries were the use of a stop for the voiceless
dental fricative (dis for this), reduced initial aspiration of voiceless
stops (pill sounds like bill), monophthongal mid-front and back
vowels (take and goat do not have the extra glide typical of inner-
circle varieties, so that they may sound like tech and gut), a lack of
reduced vowels, stressed pronouns, and phrase-final discourse stress
(e.g., Give it to HIM). Of these, several are features that Jenkins
suggests should be treated as part of the LFC core (aspiration and
nuclear stress) and others are part of her non-core features (dental
fricatives and reduced vowels). Others are less obvious and may take
finer analysis to determine whether they should be seen as core or
non-core. Monophthongal vowels, however, may violate the quantity
criterion for vowels while keeping the quality intact.
In the work of Kirkpatrick, Deterding, and Wong (2008), the intel-
ligibility of the Hong Kong English of highly educated students was
rated by university students in Singapore (a transitioning outer-circle
country where English has an official role but is not the native lan-
guage of all) and Australia (an inner-circle country in which English is
the native language of most people). The students in Australia were
both NSs and NNSs of English. Recordings of the Hong Kong speakers’ speech were
played for the subjects, who did a listening comprehension task about
the content of the speech. In addition to this measure of intelligibility,
raters considered the speakers in terms of intelligence and likeability,
two concepts well-attested in other studies to be associated with
speech, and in our terms, with the potential for irritation. In an
interesting finding, the speakers who were the most intelligible were
also seen as less intelligent and less likeable, often based on things they
said, but sometimes on the basis of the speech being too good, sug-
gesting that the raters thought the speaker was showy and proud.
Clearly, intelligibility, a good thing in itself, may be judged
negatively in some contexts based on unforeseen social values. Overall,
Hong Kong English was widely intelligible in this area of the
world, where it is likely to be a familiar variety of English. However,
not all speakers were equally intelligible.
These different attempts to specify priorities for intelligibility-based
instruction are interesting both in what they agree on and in what
they do not. The variety found in the recommendations
comes primarily from a heavy reliance on reasoning and a paucity
of empirical evidence. Table 2.1 provides a summary of the
recommendations.

Table 2.1 Pronunciation priority recommendations from various authors

| Study | Recommended targets | Recommended for exclusion | Source of evidence |
| --- | --- | --- | --- |
| Studies related to ESL/EFL contexts | | | |
| Kenworthy (1987) | Sound substitutions, deletions, and additions; linking; word stress; rhythm; intonation | | Reasoning based on experience |
| Jenner (1989) | Vowel length; most consonants; syllable structure; stress-based rhythm; prominence; movements of pitch | Vowel quality; [ɫ] | Reasoning based on features shared by most NS varieties |
| Brown (1991) | High functional load contrasts, e.g., /p, b/, /p, f/, /l, r/, /l, n/, /æ, ɛ/ | Low functional load contrasts, e.g., /f, θ/, /ð, d/, /ʤ, j/, /u, ʊ/ | Functional load calculations based on minimal pair frequency modified by other criteria |
| Firth (1992) | In descending order: general speaking abilities; intonation; stress/rhythm; consonants and vowels | None specified | Based on a “Zoom Principle,” a pedagogical approach that prioritizes general speaking habits over phonetic details |
| Isaacs and Trofimovich (2012) | Word stress; lexical richness; grammatical control; use of discourse features; fluency | Pitch range | Based on correlations between scalar comprehensibility ratings and careful quantitative analysis, informed by teacher’s verbal protocols (one L1 only) |
| McNerney and Mendelsohn (1992) | Word-level stress/unstress; sentence-level stress/unstress; major sentence stress (or focus); intonation; linking and pausing; palatalization in rhythmically related contexts | None specified | Reasoning based on the belief that more change can be achieved by focusing on suprasegmentals in a short-term course |
| Gilbert (2001) | Key vowel sound/spelling correspondences; final consonants signaling grammatical meaning; linking; word stress; strong and weak syllables; emphasis (prominence) | | Priorities for beginning learners based on experience as a teacher and textbook writer |
| Morley (1991) | An expanded domain for pronunciation, including (in no particular order): communication skills; suprasegmental; segmentals; voice quality; body language; greater learner and teacher involvement in developing self-monitoring skills; contextualization; linking; greater attention to sound–spelling relationships; attention to individual differences among learners | None specified | Reasoning based upon the asserted need for pronunciation to take on expanded roles in the language classroom |
| Henderson (2008) | Pacing of speech; rate of speech; word stress | | A review of principles put forth by other writers. The choice of features for the short course are not clearly justified |
| Studies related to English as an international language (EIL)/ELF contexts | | | |
| Jenkins (2000) | Most consonant sounds; some consonant cluster simplifications involving deletions; vowel length; nuclear stress (i.e., prominence) | Interdental fricatives; [ɫ]; consonant cluster epenthesis; stress-based rhythm; weak forms; intonation; lexical stress | Forty errors in NNS–NNS interaction, twenty-seven of which were directly related to pronunciation. Additional criteria of teachability and learnability |
| Walker (2010) | Same as Jenkins (2000), including rhotic [ɹ] in all positions; intervocalic [t] rather than flap in city, beauty; word stress as a basis for nuclear stress; weak forms and vowel reduction for receptive competence | Same as Jenkins (2000) with some modifications for specific language groups | Jenkins’ (2000) findings modified by trying to implement the LFC. Other research findings also consulted |
| Deterding and Kirkpatrick (2006) | No priorities given | | A descriptive study of the features that may be part of an emerging South-East Asian variety of English |
| Kirkpatrick et al. (2008) | No priorities given | | A study of the intelligibility of Hong Kong English to listeners in Singapore and Australia. Hong Kong English speakers were generally highly intelligible, but high intelligibility did not guarantee perceptions of likeability or intelligence |

Critiquing the Recommendations
The different attempts to specify priorities are a mishmash of incom-
plete and contradictory recommendations. Some of the studies offer
recommendations based on experience (Firth, 1992; Gilbert, 2001;
Kenworthy, 1987; McNerney & Mendelsohn, 1992), others provide
priorities based on analysis of similarities and differences between
English varieties (Jenner, 1989) or careful experimental evidence
(Isaacs & Trofimovich, 2012) or other objective analyses based on
models of intelligibility (Brown, 1991; Jenkins, 2000); some study
intelligibility without any intention of recommending pronunciation
priorities (Deterding & Kirkpatrick, 2006; Kirkpatrick et al., 2008),
and others describe not only what should be included but also what
should be excluded (Brown, 1991; Jenkins, 2000; Jenner, 1989).
Features such as word stress are seen to be essential in some research
(Isaacs & Trofimovich, 2012), while they are seen as relatively unim-
portant in other recommendations (Jenkins, 2000) or potentially
important in relation to other features (Walker, 2010). A focus on
suprasegmentals is encouraged by some authors (McNerney & Men-
delsohn, 1992), while it is largely bypassed in favor of segmentals in
other accounts (Jenkins, 2000). Some writers seek to achieve quicker
rates of improvement in intelligibility by focusing first and foremost on
features that are not usually part of pronunciation instruction (Firth,
1992), while other recommendations read like a pronunciation wish-
list with no attempts to prioritize (Morley, 1991). Sounds such as /θ/
are left off many lists, including Jenkins’ influential LFC, but other
writers long to keep /θ/ because its supposed teachability may make it
easier for learners to feel success and thus try harder sounds (Firth,
1992) or because there are situations in which /θ/ can affect intelligi-
bility (Deterding, 2005; Henderson, 2008). Some of these seemingly
contradictory recommendations are likely due to different L1 learner
groups, limited numbers in each study, and the context in which the
study took place.
It is clear that there is a further need to examine principles that help
teachers decide on priorities based on context, allowing a finer-grained
analysis than any single study can provide. All teachers have to priori-
tize, and it is best to have explicit, research-based support for setting
priorities (Derwing & Munro, 2005). Much that has been written
about priorities cannot be called research-based, and the articles that
are based on empirical data should therefore have greater weight.
Jenkins (2000), for example, has been much discussed, much praised,
and much derided, but her recommendations are valuable because
they are based upon evidence. However, the amount of evidence is
small and some of the recommendations have been called into ques-
tion (Dauer, 2005; McCrocklin, 2012). Her twenty-seven pronunci-
ation errors (of forty total errors impacting intelligibility) allow us to
make suggestive recommendations about what should and should not
be included in the core features. /θ/, for example, did not lead to loss of
intelligibility in her data. Clearly, there must have been many other
phonemes that likewise did not lead to loss of intelligibility, but
Jenkins made recommendations against /θ/ as a core item, and for
other sounds not only based on her evidence but also on her view of
English’s role in the world. This indicates that decisions about prior-
ities must be made not only on explicit evidence, but on how implicit
evidence is interpreted regarding pronunciation’s role in communica-
tive success or failure in particular communicative contexts.

Context and Intelligibility


Finally, intelligibility is sensitive to the context in which communi-
cation takes place. What this means is that the degree of accuracy that
determines intelligibility changes from one situation to the next,
depending on the type of language use required. Intelligibility can be
seen as the lowest possible standard that a speaker has to meet in order
to get by. Hinofotis and Bailey (1981), in a now famous phrase, talk
about “an intelligibility threshold” that speakers must meet in order to
communicate effectively. The intelligibility threshold, rather than
being an objective criterion, is actually a moving target that includes
much more than pronunciation (Tyler, 1992). Much of what causes
the target to move is the context in which speech takes place. The
influence of context is understudied, but it is likely to be an important
determinant of how intelligible a particular speech sample is.
It should be obvious that certain contexts of use have higher stakes
for both the speaker and the listener. For example, if a clerk’s job involves
staffing a cash register in an area in which ethnic shops are the norm
(e.g., Chinatown), the clerk’s need for understandable pronunciation in
English may be relatively low. Most of the customers are likely to
either be from the same speech community or outsiders who have
decided to take the extra step to shop there rather than somewhere
else. The people the clerk interacts with are either sympathetic or are
unlikely to come back regularly (tourists). Contrast this context with
an instructor in a university class. Interactions are regular and required
(students have to come to the lab or breakout session), high-stakes
(performance in the class is highly dependent on being able to under-
stand material through the mediation of the instructor), and subject to
significant cross-cultural conflict (misunderstanding is likely to
be seen as caused by the inability of the other to play the expected role;
cf. Rubin, 1992). High-stakes contexts include education, health, and
translation, all areas in which speakers and listeners have to negotiate
language and culture barriers and where the cost for failure is
very real.
In North American higher education, many basic classes in the
natural sciences, mathematics, and engineering are taught by NNSs
of English. While some of these teachers are regular faculty, many are
graduate teaching assistants who help fund their education through
teaching. The classes they teach include not only majors in the field,
but also students from other fields who are required to take courses
that they may not feel comfortable with or enjoy. The teaching assistants’
overall proficiency in English is very high. These NNSs have, after all, been
admitted to demanding graduate degree programs in a foreign-
language setting, and their needs for language support are targeted
to specific areas such as speaking or writing.
Having these students teach sets up a natural context in which
undergraduates’ stress from learning demanding new content (e.g.,
organic chemistry) can interact with stress from the way the content is
presented (which may be due to the graduate students being inexperi-
enced teachers or due to cultural views of appropriate teacher and
student behavior), mixed in with unfamiliar or hard-to-understand
accents. Pedagogical effectiveness and unfamiliar or inadequate pro-
nunciation or language skills may both be implicated in lack of
achievement, but inadequate language skills are most likely to be
blamed for ineffective teaching. Many programs for international teaching
assistants (ITAs) and for faculty development recognize that working on
teaching and presentation skills can lead to greater success (and,
presumably, a better perception of comprehensibility) even without
extensive work on language skills.
Other studies also make clear that comprehensibility is not based
only on pronunciation. Tyler (1992) asked raters to listen to two
presentations, one given by an ITA and one by an NTA (native
teaching assistant). Both presentations were then transcribed and read
aloud by an NS of English so that pronunciation would not be a factor
in how the presentations were rated. Raters evaluated the ITA as being
less effective and less easy to follow than the NTA. The researcher
argued that the ITA’s use of unexpected, nonparallel discourse
markers (e.g., “the first one” followed by “and then” and “after
that”), not establishing clear synonyms or clearly linking pronominal
forms to the original noun phrases, and overuse of coordination and
underuse of subordination, caused a loss of understanding.
The use of discourse markers may also be involved in how easy it is
to understand speech. Williams (1992) found that when discourse
moves were explicitly marked, ITA presentations were rated as being
more comprehensible. Tyler and Bro (1992), however, found that
ITAs overused simple additive connectors that were ambiguous in
the connections between ideas in the discourse. Liao (2009) examined
Chinese ITAs’ use of common English spoken discourse markers in
interactions with an interviewer and found that the ITAs overused
some (especially yeah) and underused others (e.g., well, I mean).
Overall, their markers were more restricted in range than for NTAs
and included innovations that were not likely to be understood easily
by NS interlocutors. At the level of grammatical competence, lexico-
grammatical features may hamper ITAs’ ability to communicate infor-
mation clearly. Tyler, Jefferies, and Davies (1988) found that NTAs
often used strategies to focus listener attention on information to be
foregrounded and backgrounded, but ITAs did not.
In addition, even speech that is completely intelligible may be heard
as heavily accented or take more effort to understand (Munro &
Derwing, 1995). Expectations and implicit stereotypes may affect
how well listeners understand a speaker, despite the speaker being
intelligible. Rubin (1992) played a short lecture given by a female
speaker of General American English to undergraduate students under
two guises: a Caucasian guise and an Asian guise. In the study, some
listeners heard the lecture while looking at a picture of a blond Cauca-
sian woman, while other listeners heard the same lecture (spoken by
the same voice) while looking at a picture of an equally attractive Asian
woman. When asked to demonstrate their understanding of the lecture,
the listeners who heard the lecture in the Asian guise understood
significantly less well than those who heard the lecture in the Caucasian
guise. Comprehension was measured with a cloze test of the passage in
which every seventh word was deleted. In addition, listeners completed a
semantic differential instrument with scales measuring their attitudes and
issues related to background, values, and appearance, as well as items
related to accent, ethnicity, and teaching qualifications. While the Munro
and Derwing (1995) study demonstrated that listeners can decode speech
with 100 percent intelligibility yet find it heavily accented, Rubin’s
study suggests that understanding can be affected by seemingly unrelated
nonlanguage factors, in this case the unconscious biases that
listeners bring with them to interactions. More recent research suggests
that this bias may also be connected to congruence between the visual
and the aural. McGowan (2015) used transcription accuracy in noise
to examine whether listeners would be more accurate in transcribing
Chinese-accented speech when presented with a Chinese face, a
Caucasian face, or an unspecified silhouette. Listeners transcribed more
successfully when presented with a congruent face (a Chinese face), a
finding that held regardless of differences in listeners’ experience with
listening to Chinese-accented English.
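
The cloze measure mentioned above can be illustrated with a minimal Python
sketch that deletes every seventh word of a passage and scores the responses.
The function names make_cloze and score_cloze, the blank marker, and the
exact-word scoring rule are assumptions made for this illustration only; the
text reports that every seventh word was deleted but does not describe how
responses were scored.

```python
import string


def make_cloze(passage, n=7, blank="_____"):
    """Replace every n-th word of `passage` with a blank.

    Returns the gapped text and the list of deleted words (the answer key).
    Punctuation attached to a deleted word is left in the gapped text.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    gapped, answers = [], []
    for position, token in enumerate(passage.split(), start=1):
        core = token.strip(string.punctuation)
        if position % n == 0 and core:
            answers.append(core)
            gapped.append(token.replace(core, blank, 1))
        else:
            gapped.append(token)
    return " ".join(gapped), answers


def score_cloze(responses, answers):
    """Exact-word scoring (an assumption): share of blanks restored exactly,
    ignoring case and surrounding whitespace."""
    if len(responses) != len(answers):
        raise ValueError("one response is needed for each blank")
    if not answers:
        return 0.0
    correct = sum(
        r.strip().lower() == a.lower() for r, a in zip(responses, answers)
    )
    return correct / len(answers)


if __name__ == "__main__":
    text = ("one two three four five six seven eight nine ten "
            "eleven twelve thirteen fourteen.")
    gapped_text, key = make_cloze(text)
    print(gapped_text)
    print(key)
    print(score_cloze(["Seven", "forty"], key))
```

Other scoring rules, such as accepting any contextually acceptable word, would
change only score_cloze and would likely produce higher scores than the
exact-word rule sketched here.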

Conclusion
Many recommendations about priorities for pronunciation teaching
are painted in broad brushstrokes that probably mask distinctions
between important and unimportant features within the same
category. For example, Isaacs and Trofimovich (2012) found that
pitch movement correlated relatively strongly with comprehensibility
ratings, whereas pitch range showed no correlation. Yet both
are considered to be part of intonation. Levis (1999a) suggests that
final intonation may be important for certain grammatical structures
(e.g., declaratives), while the same pitch movements are relatively
unimportant for others (e.g., yes–no questions). Syllable structure
modifications are part of Jenner’s (1989) core and Gilbert’s (2001)
recommendations, while Jenkins (2000) distinguishes between initial
deletions (core) and some medial and final deletions and epenthesis
(non-core).
It seems clear from the often conflicting, and sometimes contradictory,
recommendations that the criteria we use to set priorities are often
themselves unclear: they are based on (un)informed intuition and are
unsupported by research findings, perhaps because the research findings
themselves have many gaps. We are still trying to understand a picture in
which too many elements are missing, like trying to make out the image on a
jigsaw puzzle with only half the pieces available. A first step toward
understanding the bigger picture is to try to specify and justify
guidelines that may help us describe what an intelligibility-based approach
might look like in the classroom (as in Chapters 8 and 9). It may also
be that we need a better picture not only of how pronunciation affects
understanding in a vacuum (i.e., via the speech signal alone), but also
of how it affects understanding in social and communicative contexts.
