ABSTRACT Virtual humans have become part of our everyday life (movies, internet, and computer games). Even though they are increasingly realistic, their speech capabilities are, most of the time, limited and not coherent and/or not synchronous with the corresponding acoustic signal. We describe a method to convert a virtual human avatar (animated through key frames and interpolation) into a more naturalistic talking head. Speech capabilities were added to the avatar using real speech production data. Electromagnetic articulography (EMA) data provided lip, jaw and tongue trajectories of a speaker involved in face-to-face communication. An articulatory model driving jaw, lip and tongue movements was built. Constraining the key frame values, a corresponding high-definition tongue articulatory model was developed. The resulting avatar was able to produce visible and partly occluded facial speech movements coherent and synchronous with the acoustic signal.
The Journal of the Acoustical Society of America, Oct 1, 2016
Substantial research has established that places of articulation of stop consonants (labial, alveolar, velar) are reliably differentiated using a number of acoustic measures such as closure duration, voice onset time (VOT), and spectral measures such as centre of gravity and the relative energy distribution in the mid-to-high spectral range of the burst. It is unclear, however, whether such measurable acoustic differences are present in multiple place-of-articulation contrasts among coronal stops. This article presents evidence from the highly endangered indigenous Australian language Wubuy, which maintains a 4-way coronal stop place contrast series in all word positions. The authors examine the temporal and burst characteristics of / t̪ t ʈ/ in three prosodic positions (utterance-initial, word-initial but phrase-medial, and word-medial). The results indicate that VOT, closure duration, and the spectral quality of the burst may indeed differentiate multiple coronal place contrasts, i...
Proceedings of the 13th Australasian International Conference on Speech Science and Technology, 14-16 December 2010, Melbourne, Australia, 2015
The Big Australian Speech Corpus (The Big ASC) Michael Wagner 1 , Dat Tran 1 , Roberto Togneri 2 , Phil Rose 3 , David Powers 4 , Mark Onslow 5 , Debbie Loakes 6 , Trent Lewis 4 , Takaaki Kuratate 7 , Yuko Kinoshita 1 , Nenagh Kemp 8 , ...
Selected Proceedings of the 2008 HCSNet Workshop on Designing the Australian National Corpus: Mustering Languages, University of New South Wales, 4-5 December 2008, 2015
What is the role of temporal integration in the development of speech perception? The answer depends on how one construes "temporal integration." Over which time scale does the integration take place: the microscopic scale of stimulus time, the more extended scale of ...
Italian roots in Australian soil: coronal obstruents in native dialect speech of Italian-Australians from two areas of Veneto.
We will discuss the maintenance of the heritage dialect coronal fricatives in the speech of Italian-Australian trilinguals (dialect/Italian/English) originating from North Veneto, Italy, as compared to the variability found in the productions of comparable Italian-Australian trilinguals originating from Central Veneto. Results on the distribution of coronal fricatives, based on narrow phonetic transcriptions, and on their acoustic characteristics, based on spectral moments analysis, show that the immigrants have generally maintained the fine-grained features of their dialect. After more than five decades of residence in Australia, traces of interlinguistic influences exerted by L3-English are evident in one speaker only. We consider both internal (linguistic) and external (sociolinguistic) factors for this difference in maintenance of first language speech features between immigrants from two geographical areas of the same region of origin in Italy (Veneto).
Papers by Catherine Best