

Recognizing words across regional accents: The role of perceptual assimilation in lexical competition

2013

INTERSPEECH 2013

Catherine T. Best (1), Jason A. Shaw (1), Elizabeth Clancy (1)
(1) MARCS Institute, University of Western Sydney, Australia
c.best@uws.edu.au, j.shaw@uws.edu.au, e.clancy01@gmail.com

Abstract

Unfamiliar regional accents disrupt spoken word recognition by L2 and L1 learners and L1 adults, and confuse ASR and smart systems. Little is known, however, about which aspects of non-native accents hinder word recognition, or what processes are involved. We assessed how Australian English (AusE) listeners’ recognition of words in unfamiliar accents is affected by two types of cross-accent perceptual assimilation: 1) other-accent phones that constitute ‘deviant’ versions of the matching AusE phonemes (Category Goodness assimilation: CG); 2) phones that cross a native phonological boundary, i.e., assimilate to mismatching AusE phonemes (Category Shift: CS). Eyetracking (“visual world”) revealed the timecourse of lexical competition during online identification of words spoken in Jamaican (JaME: vowel differences from AusE) and Cockney English (CknE: consonant differences), while choosing among four printed choice words: target, onset and offset competitors, unrelated distracter. Recognition was slower, and both competitor types were considered more and longer for JaME and CknE than AusE pronunciations; these effects were stronger for CS than CG differences. We conclude that: 1) perceptual assimilation plays a key role in cross-accent word recognition; 2) lexical competition involves not only onsets but also later aspects of words; 3) vowel and consonant variations affect lexical competition similarly.

Index Terms: spoken word recognition, regional accent, phonological categories, perceptual assimilation

1. Introduction

Regional accent variation is known to perturb spoken word recognition, especially in second language (L2) learners [1, 2], but does so even in native (L1) adults [3-6] and L1 learners [7-9]. Indeed, accent differences plague not only humans but also automatic speech recognition systems (ASRs) and “smart” devices [10, 11]. Little is known, however, about what processes underlie cross-accent recognition, or about which types of variations between the native/more familiar accent and the unknown/less familiar accents cause those difficulties.

Models of native spoken word recognition provide crucial guidance on the likely processes involved, i.e., those identified by prior research that relied on stimuli in the listeners’ native accent. The primary debate in that literature has been over whether lexical access from spoken words is accomplished by abstract processes involving identification of the words’ component phonemes [12-15], or by episodic memories or stored traces of experienced exemplars [16-18]. More recently, proponents of both views have acknowledged that lexical access requires both processes, and have called for development of hybrid models [17, 19, 20]; such models have yet to be fleshed out, however. Investigations with words spoken in L1 regional accents that are unfamiliar and phonetically disparate from listeners’ native accent offer new insights on abstract and episodic effects in lexical access.

A likely factor in recognizing words in other L1 accents is the ways in which the unfamiliar vowel and consonant pronunciations are perceptually assimilated to phonemes in the listener’s native accent. Therefore, we extended the principles of the Perceptual Assimilation Model [PAM: 21, 22] to cross-accent word recognition, using critical vowels and consonants that should be assimilated either to the same native-accent phoneme but as a deviant token (Category Goodness type: CG), or to a different, contrasting native phoneme (Category Shifting type: CS, a novel extension of PAM’s Two Category [TC] assimilation type). We designed two experiments to compare how CG and CS accent differences in specific vowels and consonants influence spoken word recognition.

2. Overview of experiments

We adapted the visual world paradigm, which provides sensitive indices of the timecourse of lexical competition during spoken word recognition [23, 24], as follows: 1) spoken target words were presented in isolation rather than in carrier phrases, because phonetic and phonological properties of carrier sentences in either the native or unfamiliar accents could confound target word recognition [25]; 2) the onscreen choices participants used to indicate what they had heard were printed words instead of pictures [26, 27]; 3) “not there” appeared centrally to increase task sensitivity [26, 27]; 4) the choice sets included the target word, a phonetically and orthographically unrelated distractor, a target word onset competitor, and an offset competitor, an innovation added to probe how later portions of spoken words affect lexical access. Some word recognition models posit left-to-right lexical access that privileges onsets [13, 28], but others allow effects from later in the word. For example, Shortlist [14, 15] assumes bottom-up phoneme activation while exemplar models [16-18] assume the lexicon is built of stored exemplars, yet in both views any word position can contribute to lexical competition.

2.1. Participants

Fourteen native Australian English listeners participated in both experiments in a single session, all recruited from the Intro Psychology pool at UWS. Two further participants were tested but removed from the data as English was not their L1. All reported having no exposure to the two accents of this study: Jamaican Mesolect and Cockney English (SE London).

Copyright © 2013 ISCA
25-29 August 2013, Lyon, France

2.2. Key manipulations

2.2.1. Accents

Each experiment used words spoken in Australian English (AusE) versus another regional accent rarely heard in Australia. Because English accents differ mainly in their vowels, Experiment 1 used a non-native regional accent with many vowel differences from AusE: Jamaican Mesolect English (JaME) [29-31]. Consonant differences among English accents are less frequent and more restricted (e.g., to specific words or positions). Yet the impact of consonant differences on word recognition is of interest, given evidence that consonants and vowels play different roles in word structure and processing [32-35]. Thus, Experiment 2 used a non-native accent differing from AusE primarily in certain consonants: Cockney English (CknE: southeast London) [36].

2.2.2. Assimilation types

Target words were selected to have a single target vowel (JaME: Exp. 1) or consonant (CknE: Exp. 2) that differed from the AusE pronunciation. The critical phoneme in half of the words for each unfamiliar accent displayed a CG difference from AusE; those for the remainder showed a CS difference.

2.2.3. Printed choice word sets

The printed choice words for onset competitors were selected to have the same onset [(C)(C)V] as the predicted AusE assimilation of the target word when it was spoken in JaME (Exp. 1) or CknE (Exp. 2). Offset competitors for monosyllable targets shared the coda [V(C)(C)]; those for bisyllable targets shared the final syllable [(C)(C)V(C)(C)], relative to expected AusE assimilation of the target. Unrelated distractors had no matching letters or phonemes, including expected assimilations, in the same position as in the target.

2.3. Procedure

Participants were seated in a quiet room in front of a computer monitor with an eye-tracker below it (Tobii x120). They positioned their chin and forehead on a chin rest located 70 cm from the monitor. The audio target words played from a laptop computer through a loudspeaker beside the monitor. Prior to testing, the eye-tracker was calibrated to the participant’s gaze.

The trial procedure was designed to assure the participant was fixating the center of the monitor as each audio target word played. Trials began with a display of the four printed choice words, one per quadrant, with a central crosshair. Participants were asked to silently read the four words, then click on the crosshair and gaze at it until a red rectangle outline appeared, triggered by the eye-tracker’s detection of fixation. When gaze duration to the crosshair reached 200 ms, it and the rectangle were replaced with the words “Not there” (Figure 1). The target word played over the loudspeaker 100 ms later. This allowed participants to preview the choice words prior to hearing the target, and ensured central fixation at its onset. Participants completed 8 practice trials before the test phase.

Figure 1: Schematic of an experimental trial. Participants click on the crosshair after reading the words, left panel; fixate on the crosshair until the eye-tracker detects their eyes (red square), middle panel; then “not there” replaces the crosshair, the spoken target word is played, and the participant clicks on the printed word corresponding to the word they heard, final panel. The example word set assesses perceptual assimilation of a Category Shifting (CS) vowel difference: AusE DU(de) [dud] onset distractor and (t)OUR [tuə] offset distractor compete with the target DOOR as spoken in JaME: [du].

3. Experiment 1: Jamaican English

The first experiment allowed us to test for differential effects of CS and CG vowel differences on word recognition, by using Jamaican Mesolect English (JaME) as our unfamiliar accent. JaME differs from Australian English (AusE) primarily in its vowels. Consequently, most accent differences are localized in syllable nuclei rather than in syllable margins.

3.1. Materials

3.1.1. Audio target words

Target words were selected from an existing recorded corpus of multiple tokens of isolated words produced by two female native speakers of JaME (recorded in St Catherine’s parish, Jamaica) and two of AusE (Sydney) chosen to match the JaME speakers’ voice qualities and age. In each word used in the present study the critical vowel in the JaME realization differs from AusE such that our listeners should perceptually assimilate it as either a CG (/ /) or CS (/ /) difference from their native accent. All other phonemes were pronounced similarly to AusE. We used 64 monosyllabic and 64 bisyllabic words (critical vowel in the stressed initial syllable), evenly divided between high and low frequency words (re: British [Celex] and/or AusE [SMH]). We used one token per word per speaker, selected for best match of voice quality and pitch contour among the speakers. We added 35 dB of white noise to all experimental trial targets (but not practice targets), to assure below-ceiling performance.

3.2. Results

Figure 2 shows the proportion of looks (fixation proportion) to each printed word type (Onset competitor, Offset competitor, and Unrelated distracter) as a function of time. The time window shown in Figure 2 spans 500 ms to 1500 ms after the start of a trial. This window was selected for analysis because: 1) at 500 ms the mean fixation proportion to the center (“Not there”) in all conditions had fallen below 0.5 but was still somewhat greater than that to any other choice, and 2) target fixations had reached a plateau by 1500 ms in all conditions.

The four panels in Figure 2 show the different stimulus conditions. The top two panels show the control condition, AusE-accented words. The bottom two panels show fixation proportions to JaME-accented English. The panels on the left show Category-Shifting (CS) words and those on the right show Category-Goodness (CG) words. Comparison of the top two panels with the bottom two panels reveals that, relative to AusE controls (top panels), looks are more liberally distributed across items for JaME-accented words. First, it is clear that the trajectory of looks toward the target word is steeper for AusE-accented (top) than JaME-accented (bottom) words. Second, the decline in looks to the distracters, particularly onset competitors, is much more gradual for JaME-accented words than for AusE-accented words. This indicates that the competitor words were more distracting when target words were heard in the unfamiliar accent.

Figure 2 also shows differences between CS and CG words in JaME-accented English. Fixation proportion for onset competitors, but also for offset competitors, shows a more gradual decline for CS words than for CG words. This indicates accent differences that cross a category boundary evoke greater competition between lexical items, even to some degree for the later portion of the word (offset competitors).

[Figure 2 panels: rows = accent (AusE, JaME); columns = assimilation type (CG, CS); x-axis: Time (ms), 500-1500; y-axis: mean fixation proportion, 0.0-0.6; lines: Target, Onset, Offset, Unrelated, NotThere.]

Figure 2: Mean fixation proportion to each printed choice item (separate lines) by condition (accent*assimilation type). The top two panels show AusE. The bottom two panels show looks to JaME. Category Goodness [CG] (left) and Category Shifting [CS] (right) words are shown for both accents.

Figure 3 shows mean fixation proportion across the 500-1500 ms window by condition and distracter type for Experiment 1. For onset and offset distracters, assimilation type modulates the effect of accent. Differences in fixation proportion are greater for CS words than for CG words. To evaluate statistical significance of this pattern, we conducted a three-way repeated measures ANOVA on arcsine-transformed fixation proportions (averages across the 500-1500 ms window). The factors were accent {AusE, JaME}, assimilation type {CS, CG}, and distracter type {Onset competitor, Offset competitor, Unrelated distracter}.

Figure 3: Mean fixation proportion to distracters by accent (JaME vs. AusE) and assimilation type (CG vs. CS) between 500-1500 ms. Error bars display standard errors of the mean (s.e.m.).

The main effects were all significant: accent [F(1,13) = 73.86, p < .001], distracter type [F(2,12) = 65.70, p < .001], and assimilation type [F(1,13) = 7.41, p < .05]. The interactions between accent and assimilation type [F(1,13) = 25.81, p < .001] and between accent and distracter type [F(2,12) = 9.18, p < .01] were also significant, as was the three-way interaction [F(2,12) = 4.25, p < .05]. Separate post-hoc ANOVAs were run for each distracter type, with accent and assimilation type as factors. The effect of accent was significant for onset distracters [F(1,13) = 47.83, p < .001] and unrelated distracters [F(1,13) = 19.60, p < .001] and was nearly significant for offset distracters [F(1,13) = 4.52, p = .053]. The CS-CG assimilation type difference was not itself significant for any of the distracter types. However, the accent x assimilation type interaction was significant for both onset [F(1,13) = 12.55, p < .01] and offset distracters [F(1,13) = 19.06, p < .001] but n.s. for unrelated distracters [F(1,13) < 1].

3.3. Discussion

As expected, the unfamiliar accent slowed word recognition. Jamaican-accented English evoked more looks to on-screen competitors than did the same words produced in Australian English, indicating that perceptual assimilation of other-accent vowels to listeners’ native accent systematically affects lexical competition and slows the process of spoken word recognition. Importantly, the design of the study and its target words revealed three additional novel findings: 1) vowels that display category-shifting (CS) differences from the native accent hinder word recognition more than those showing only within-category goodness differences (CG); 2) nonetheless, CG vowel variants also hinder recognition, and according to the same patterns; 3) while lexical competition is strongest for word onset phonetic similarities, it is non-negligibly affected as well by word offset similarities (rime portion).

Next, we addressed whether consonant variations also elicit lexical competition, specifically whether or not they elicit the same patterns found with vowels. As noted earlier, converging evidence suggests consonants and vowels play substantively different roles in the phonological organization of words and their recognition. Thus, it was uncertain whether consonant differences would impact word processing similarly to vowels, especially whether the different and more varied positions of the consonants enhance or reduce their impact.

4. Experiment 2: Cockney English (SE London)

4.1. Materials

4.1.1. Audio target words

Cockney was chosen for this experiment because most of its vowels are very similar to AusE while certain consonants differ, including both CG type assimilations (initial /t/ as [ts]; /r/ as [w]) and CS type assimilations (CknE /θ/ as /f/; initial /h/ as [ ]; medial/final /t/ as [ ]; medial /ð/ as [v]; final /l/ as [ ]). Given the many phonotactic constraints on these consonant realization differences, the critical consonant in CknE vs. AusE target words varied among initial, medial and final position. The JaME words did not display such positional variation, as their critical vowel was always the nucleus of the stressed syllable. We selected 1- and 2-syllable target words following the same principles as Experiment 1. We again used an existing corpus of words produced by two adult female speakers of CknE (recorded in southeast London) and two new female AusE speakers (Sydney), matched for voice qualities and age. Again, 35 dB of white noise was added to all tokens.
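As an illustration of the dependent measure used in both experiments, the sketch below averages fixations over the 500-1500 ms analysis window and applies an arcsine transform to the resulting proportions. It is a minimal sketch, not the authors' analysis code: the (time, region) sample format, the function names, the example samples, and the arcsine square-root form of the transform are all assumptions, since the text says only that fixation proportions were arcsine-transformed before the ANOVA.

```python
import math
from collections import Counter

# Analysis window reported in the text (ms after trial start).
WINDOW_MS = (500, 1500)


def fixation_proportions(samples, window=WINDOW_MS):
    """Proportion of in-window gaze samples falling on each screen region.

    `samples` is an iterable of (time_ms, region) pairs; this layout is a
    hypothetical simplification of eye-tracker output.
    """
    start, end = window
    in_window = [region for t, region in samples if start <= t <= end]
    if not in_window:
        return {}
    counts = Counter(in_window)
    total = len(in_window)
    return {region: n / total for region, n in counts.items()}


def arcsine_transform(p):
    """Arcsine square-root transform of a proportion in [0, 1].

    Assumed variant: the paper does not state which arcsine form was used.
    """
    return math.asin(math.sqrt(p))


# Hypothetical gaze samples for one trial (not data from the study).
samples = [
    (400, "NotThere"), (500, "NotThere"), (700, "Onset"), (900, "Onset"),
    (1100, "Target"), (1300, "Target"), (1500, "Target"), (1600, "Target"),
]

props = fixation_proportions(samples)
transformed = {region: arcsine_transform(p) for region, p in props.items()}
for region in sorted(transformed):
    print(region, round(props[region], 3), round(transformed[region], 3))
```

In a full analysis the transformed values would be averaged per participant and condition (accent x assimilation type x distracter type) before being entered into the repeated measures ANOVA; that step is omitted here.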
4.2. Results

Figure 4 is structurally parallel to Figure 2. It shows proportion of looks (fixation proportion) over time for each distracter type (Onset competitor, Offset competitor, Unrelated distracter) as a function of condition: accent (AusE, top; CknE, bottom) x assimilation type (CG, left; CS, right). The same time window was used for analyses as in Experiment 1, 500 ms to 1500 ms, for the reasons given earlier. Comparison of the top two panels with the bottom two panels reveals that, relative to AusE controls (top panels), looks are more liberally distributed across items for CknE-accented words. Again, as in Experiment 1, the trajectory of looks toward the target word is steeper for AusE-accented (top) than CknE-accented (bottom) words, and the decline in looks to distracters (particularly onset competitors) is much more gradual for CknE-accented words than AusE-accented words. Thus, competitors were again more distracting for target words in the unfamiliar accent.

Figure 4 also shows differences between CG and CS words produced in Cockney-accented English. Fixation proportion for onset competitors shows a more gradual decline for CS words than for CG words. This indicates that accent differences that cross a category boundary evoke greater competition between lexical items.

[Figure 4 panels: rows = accent (AusE, CknE); columns = assimilation type (CG, CS); x-axis: Time (ms), 500-1500; y-axis: mean fixation proportion, 0.0-0.6; lines: Target, Onset, Offset, Unrelated, NotThere.]

Figure 4: Mean fixation proportion to each item on the screen (separate lines) by condition (accent*assimilation type). The top two panels show AusE. The bottom two panels show looks to CknE. Category Shifting [CS] (left) and Category Goodness [CG] (right) words are shown for both accents.

Figure 5 shows mean fixation proportion across the 500-1500 ms window (Figure 4) by condition and distracter type. The patterns reveal variable effects of assimilation type across accents and distracter types. There are more looks to both onset and offset distracters for CknE-accented English than for AusE-accented English. However, it appears that the effect of accent is modulated by assimilation type in different ways for onset and offset distracters. CS words draw more looks to onset distracters than do CG words. Offset distracters show the opposite pattern: CG words draw more looks than CS words.

Figure 5: Mean fixation proportion to distracters by accent (CknE vs. AusE) and assimilation type (CG vs. CS) between 500-1500 ms. Error bars display s.e.m. values.

We again conducted a 3-way repeated measures ANOVA, as in Experiment 1. The main effects of accent [F(1,13) = 45.62, p < .001] and distracter type [F(2,12) = 40.29, p < .001] were significant, but that for assimilation type was not [F(1,13) < 1]. Nor was the interaction between accent and assimilation type significant. However, the assimilation type x distracter type interaction was significant [F(2,12) = 21.01, p < .001], as was the three-way interaction for accent, assimilation and distracter type [F(2,12) = 43.57, p < .001].

To pursue the source of the three-way interaction, separate post-hoc ANOVAs were run for each distracter type with accent and assimilation type as factors. These showed that accent had a significant effect on onset competitors [F(1,13) = 45.78, p < .001] and offset competitors [F(1,13) = 24.06, p < .001]. The interaction between accent and assimilation type was also significant for both onset [F(1,13) = 7.90, p < .05] and offset competitors [F(1,13) = 10.85, p < .01]. For unrelated competitors, neither accent [F(1,13) = 3.33, p = .09] nor the interaction between accent and assimilation type [F(1,13) = 4.53, p = .08] reached significance.

4.3. Discussion

The key results replicated the Experiment 1 findings with a very different unfamiliar accent, CknE, in which the target words had been selected to exploit CG and CS type consonant realization differences, rather than vowel differences, from AusE. Remarkably, these similarities emerged despite the different and more varied word positions of the critical consonants (initial, medial and final word positions), as compared to the constant nuclear position of the critical vowels in the stressed syllable of Experiment 1 target words.

5. Conclusions

Across both vowel and consonant differences, spoken word recognition was slower for words spoken in the two unfamiliar regional English accents than in listeners’ native AusE accent. Moreover, while recognition of non-native-accented words was disrupted more by onset than offset competitors, the latter did systematically affect JaME and CknE word recognition. As predicted, effects were larger for CS than CG type accent differences. We conclude that: 1) perceptual assimilation plays a key role in cross-accent recognition; 2) lexical competition occurs not only in onsets but also later in words; 3) vowel and consonant variations affect lexical competition similarly.

6. Acknowledgements

Australian Research Council research grants DP0772441 and DP120104596 contributed support to this research.

7. References

[1] Escudero, P. and Chladkova, K. “Spanish listeners’ perception of American and Southern British English vowels”, J Acou Soc Am, 128: EL254-260, 2010.
[2] Mitterer, H. and McQueen, J. “Processing reduced word-forms in speech perception using probabilistic knowledge about speech production”, J Exp Psy: Hum Perc Perf, 35: 244–263, 2009.
[3] Adank, P., Evans, B., Stuart-Smith, J. and Scott, S. “Comprehension of familiar and unfamiliar native accents under adverse listening conditions”, J Exp Psy: Hum Perc Perf, 35: 520–529, 2009.
[4] Bradlow, A. and Bent, T. “Perceptual adaptation to non-native speech”, Cognition, 106: 707-729, 2008.
[5] Floccia et al., “Does a regional accent perturb speech processing?”, J Exp Psy: Hum Perc Perf, 32: 1276–1293, 2006.
[6] Floccia et al., “Regional and foreign accent processing in English: Can listeners adapt?”, J PsychoLing Res, 38: 379–412, 2009.
[7] Best, C.T., Tyler, M.D., Gooding, T., Orlando, C. and Quann, C. “Development of phonological constancy: Toddlers’ perception of native- and Jamaican-accented words”, Psychological Science, 20: 539-542, 2009.
[8] Nathan, L., Wells, B. and Donlan, C. “Children's comprehension of unfamiliar regional accents: a preliminary investigation”, J Child Lang, 25: 343-365, 1998.
[9] Schmale, R., Cristia, A., Seidl, A. and Johnson, E.K. “Developmental changes in infants’ ability to cope with dialect variation in word recognition”, Infancy, 15(6): 650–662, 2010.
[10] Benzeghiba, M., et al. “Automatic speech recognition and speech variability”, Speech Comm, 49: 763-786, 2007.
[11] Henton, C. “Bitter pills to swallow: ASR and TTS have drug problems”, Int J Speech Tech, 8: 247-257, 2005.
[12] Cutler, A., Eisner, M., McQueen, J. and Norris, D. “How abstract phonemic categories are necessary for coping with speaker-related variation”, in Fougeron C et al [Eds], Laboratory Phonology 10, 91-111, de Gruyter, 2010.
[13] Marslen-Wilson, W.D. “Functional parallelism in spoken word-recognition”, Cognition, 25(1–2): 71-102, 1987.
[14] Norris, D. “Shortlist: A connectionist model of continuous speech recognition”, Cognition, 52(3): 189–234, 1994.
[15] Norris, D. and McQueen, J. “Shortlist B: a Bayesian model of continuous speech recognition”, Psych Rev, 115(2): 357–395, 2008.
[16] Goldinger, S. “Words and voices: Episodic traces in spoken word identification and recognition memory”, J Exp Psy: Learn Mem Cog, 22: 1166-1183, 1996.
[17] Goldinger, S. “A complementary-systems approach to abstract and episodic speech perception”, Int Congr Phon Sci, 16: 49-54, 2007.
[18] Johnson, K. “Speech perception without speaker normalization: An exemplar model”, in K. Johnson and J. Mullennix [Eds], Talker variability in speech processing, 145-16, Academic, 1997.
[19] Cutler, A. “Abstraction-based efficiency in the lexicon”, Laboratory Phonology, 1: 301-318, 2010.
[20] Pierrehumbert, J. “The next toolkit”, J Phon, 34: 516-530, 2006.
[21] Best, C.T. “A direct realist perspective on cross-language speech perception”, in W. Strange [Ed], Speech perception and linguistic experience, 167-200, York Press, 1995.
[22] Best, C.T. and Tyler, M.D. “Nonnative and second-language speech perception”, in M. Munro and O. Bohn [Eds], Second Language Speech Learning, 13-34, Benjamins, 2007.
[23] Tanenhaus, M.K., Spivey-Knowlton, M.J., Eberhard, K.M., and Sedivy, J.C. “Integration of visual and linguistic information during spoken language comprehension”, Science, 268: 1632–1634, 1995.
[24] Weber, A. and Cutler, A. “Lexical competition in non-native spoken-word recognition”, J. Memory & Lang., 50: 1-25, 2004.
[25] McQueen, J.M. and Viebahn, M.C. “Tracking recognition of spoken words by tracking looks to printed words”, Q J Exp Psych, 60(5): 661-671, 2007.
[26] Brouwer, S. “Processing strongly reduced forms in casual speech” (PhD dissertation, Radboud University Nijmegen). http://pubman.mpdl.mpg.de/pubman/item/escidoc:576607:5/component/escidoc:1480745/Brouwer(2010)_THESIS_authorVersion.pdf, 2010.
[27] Mitterer, H. “The mental lexicon is fully specified: Evidence from eye-tracking”, J Exp Psych: Hum Perc Perf, 37(2): 496-513, 2011.
[28] Marslen-Wilson, W., and Zwitserlood, P. “Accessing spoken words: The importance of word onsets”, J Exp Psych: Hum Perc Perf, 15(3): 576–585, 1989.
[29] Devonish, H. and Harry, O.G. “Jamaican Creole or Jamaican English: Phonology”, in E.W. Schneider, K. Burridge, B. Kortmann, R. Mesthrie, and C. Upton [Eds], A Handbook of Varieties of English: Phonology, 450-480, Mouton de Gruyter, 2004.
[30] Patrick, P.L. “Jamaican Creole morphology and syntax”, in E.W. Schneider, K. Burridge, B. Kortmann, R. Mesthrie, and C. Upton [Eds], A Handbook of Varieties of English. Volume 2: Morphology and Syntax, 407-439, Mouton de Gruyter, 1999.
[31] Wassink, A.B. “Theme and variation in Jamaican vowels”, Language Variation and Change, 13: 135-159, 2001.
[32] Goldsmith, J. “An overview of autosegmental phonology”, Linguistic Analysis, 2: 23–68, 1976.
[33] Cutler, A., Sebastián-Gallés, N., Soler-Vilageliu, O., and Van Ooijen, B. “Constraints of vowels and consonants on lexical selection: Cross-linguistic comparisons”, Mem & Cogn, 28(5): 746-755, 2000.
[34] Van Ooijen, B. “Vowel mutability and lexical selection in English: Evidence from a word reconstruction task”, Memory & Cognition, 24(5): 573-583, 1996.
[35] Bonatti, L.L., Peña, M., Nespor, M. and Mehler, J. “Linguistic constraints on statistical computations: The role of consonants and vowels in continuous speech processing”, Psychological Science, 16: 451-459, 2005.
[36] Foulkes, P. and Docherty, G. “Urban Voices: Accent Studies in the British Isles”, Arnold, 1999.