Abstract
In this paper we present AmuS, a database of about three hours of amused-speech data recorded from two male and one female subjects, in two languages: French and English. We review previous work on smiled speech and speech-laughs. We describe an acoustic analysis of part of our database, and a perception test comparing speech-laughs with smiled and neutral speech. We show the usefulness of the AmuS data for amused speech synthesis by training HMM-based models for neutral and smiled speech for each voice and comparing them through an on-line CMOS test.
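As an illustration of how a CMOS (Comparative Mean Opinion Score) test of this kind is typically scored — a minimal sketch under common assumptions, not the authors' actual evaluation code — each listener rates a smiled-speech sample against a neutral baseline on a scale from -3 (strong preference for the baseline) to +3 (strong preference for the test system), and the ratings are averaged:

```python
# Minimal sketch of CMOS scoring: each rating is a comparative judgement
# on a -3..+3 preference scale; the CMOS is the mean over all listeners.
from statistics import mean

def cmos_score(ratings):
    """Average comparative ratings; positive values favour the test system."""
    if not ratings:
        raise ValueError("no ratings collected")
    return mean(ratings)

# Hypothetical ratings from a small listening test (illustrative values only)
ratings = [1, 2, 0, -1, 2, 1, 3, 0]
print(cmos_score(ratings))  # 1.0
```

A positive average indicates that listeners, on the whole, preferred the synthesized smiled speech over the neutral version.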
Notes
1. AmuS is available at: http://tcts.fpms.ac.be/~elhaddad/AmuS/.
Copyright information
© 2017 Springer International Publishing AG
Cite this paper
El Haddad, K. et al. (2017). Introducing AmuS: The Amused Speech Database. In: Camelin, N., Estève, Y., Martín-Vide, C. (eds) Statistical Language and Speech Processing. SLSP 2017. Lecture Notes in Computer Science(), vol 10583. Springer, Cham. https://doi.org/10.1007/978-3-319-68456-7_19
DOI: https://doi.org/10.1007/978-3-319-68456-7_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-68455-0
Online ISBN: 978-3-319-68456-7