DOI: 10.1007/978-3-540-70872-8_17
Chapter

Expressive Speech Synthesis Using Emotion-Specific Speech Inventories

Published: 17 December 2008

Abstract

In this paper we explore the use of emotion-specific speech inventories for expressive speech synthesis. We recorded a semantically neutral sentence and 26 logatoms containing all the diphones and CVC triphones necessary to synthesize that same sentence. The speech material was produced by a professional actress, who expressed all logatoms and the sentence with each of the six basic emotions and in a neutral tone. Seven emotion-dependent inventories were constructed from the logatoms. These seven inventories, paired with the prosody extracted from the seven natural sentences, were used to synthesize 49 sentences. A total of 194 listeners evaluated the emotions expressed in the logatoms and in the natural and synthetic sentences. The intended emotion was recognized above chance level for 99% of the logatoms and for all natural sentences. Recognition rates significantly above chance level were obtained for each emotion, and the recognition rate for some synthetic sentences exceeded that of the natural ones.
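
The evaluation described above is a seven-alternative forced-choice test, so chance level is 1/7, and the 49 synthetic stimuli arise from pairing each of the seven emotion-specific inventories with each of the seven natural-prosody targets. The following Python sketch, which is not taken from the paper, merely illustrates these two points; the emotion labels, the example count of correct responses, and the helper name binomial_p_above_chance are assumptions made for illustration.

# Illustrative sketch (not the authors' code): enumerate the 7x7
# inventory/prosody pairings described in the abstract and check whether a
# forced-choice recognition count exceeds the 1/7 chance level with an exact
# one-sided binomial test. Emotion labels are assumed Ekman-style categories.
from itertools import product
from math import comb

EMOTIONS = ["neutral", "anger", "disgust", "fear", "joy", "sadness", "surprise"]
CHANCE = 1.0 / len(EMOTIONS)      # seven-alternative forced choice
N_LISTENERS = 194                 # number of listeners reported in the abstract

# 7 emotion-specific inventories x 7 natural-prosody targets = 49 synthetic sentences
synthesis_conditions = list(product(EMOTIONS, EMOTIONS))
assert len(synthesis_conditions) == 49

def binomial_p_above_chance(correct: int, n: int = N_LISTENERS, p: float = CHANCE) -> float:
    """Exact one-sided P(X >= correct) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(correct, n + 1))

# Hypothetical example: if 60 of 194 listeners pick the intended emotion,
# how likely is a count at least that large under pure guessing?
print(f"p-value = {binomial_p_above_chance(60):.2e}")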


Cited By

  • Harmonic model for female voice emotional synthesis. In: Proceedings of the 2009 Joint COST 2101 and 2102 International Conference on Biometric ID Management and Multimodal Communication, pp. 41-48 (2009). DOI: 10.5555/1812740.1812748. Online publication date: 16 September 2009.



Published In

Verbal and Nonverbal Features of Human-Human and Human-Machine Interaction: COST Action 2102 International Conference, Patras, Greece, October 29-31, 2007. Revised Papers
December 2008
279 pages
ISBN: 9783540708711
Editors: Anna Esposito, Nikolaos G. Bourbakis, Nikolaos Avouris, Ioannis Hatzilygeroudis

Publisher

Springer-Verlag

Berlin, Heidelberg

Publication History

Published: 17 December 2008

Author Tags

  1. Expressive speech synthesis
  2. basic emotions
  3. diphone and triphone inventory
  4. forced choice
  5. listening test

Qualifiers

  • Chapter


