Toward More Expressive Speech Communication in Human-Robot Interaction

  • Conference paper
Interactive Collaborative Robotics (ICR 2018)

Abstract

Speech communication is an important component of human-robot interaction. This paper presents our experience from the project “Design of Robots as Assistive Technology for the Treatment of Children with Developmental Disorders”, focusing on the development of more expressive dialogue systems based on automatic speech recognition (ASR) and text-to-speech synthesis (TTS) in South Slavic languages. It reports our most recent results on expressive conversational human-robot interaction, specifically on voice and style conversion of synthesized speech using a new generation of deep neural network (DNN) based speech synthesis algorithms, and on emotional speech recognition. The second part of the paper describes the development of dialogue strategies in more detail, along with experience from their clinical application in the treatment of children with cerebral palsy.

Acknowledgments

Research was supported in part by the Ministry of Education, Science and Technological Development of Serbia (grants TR32035 and III44008).

Corresponding author

Correspondence to Vlado Delić.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Delić, V. et al. (2018). Toward More Expressive Speech Communication in Human-Robot Interaction. In: Ronzhin, A., Rigoll, G., Meshcheryakov, R. (eds) Interactive Collaborative Robotics. ICR 2018. Lecture Notes in Computer Science, vol 11097. Springer, Cham. https://doi.org/10.1007/978-3-319-99582-3_5

  • DOI: https://doi.org/10.1007/978-3-319-99582-3_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-99581-6

  • Online ISBN: 978-3-319-99582-3

  • eBook Packages: Computer Science, Computer Science (R0)
