research-article

Learning to control listening-oriented dialogue using partially observable Markov decision processes

Published: 03 January 2014

Abstract

Our aim is to build listening agents that attentively listen to their users and satisfy their desire to speak and be heard. This article investigates how to automatically create a dialogue control component for such a listening agent. We collected a large number of listening-oriented dialogues with user satisfaction ratings and used them to create a dialogue control component that satisfies users by means of Partially Observable Markov Decision Processes (POMDPs). Using a hybrid dialogue controller, in which high-level dialogue acts are chosen by a statistical policy and low-level slot values are populated by a wizard, we evaluated our dialogue control method in a Wizard-of-Oz experiment. The experimental results show that our POMDP-based method achieves significantly higher user satisfaction than other stochastic models, confirming the validity of our approach. This article is the first to verify, with human users, the usefulness of POMDP-based dialogue control for improving user satisfaction in non-task-oriented dialogue systems.
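
To make the idea of POMDP-based control with a hybrid controller concrete, the following is a minimal sketch and not the article's implementation. It shows a standard discrete belief update over a hidden user state, followed by a choice of high-level dialogue act whose content is filled in by a callback standing in for the wizard. The state names, dialogue acts, observations, probabilities, and rewards are all made-up assumptions, and the greedy one-step act selection is a simple stand-in for a properly optimized POMDP policy.

```python
# Toy POMDP sketch. Every name and number below is hypothetical and
# chosen only for illustration; none of it comes from the article.
STATES = ["satisfied", "unsatisfied"]   # hidden user state (assumed)
ACTS = ["question", "sympathy"]         # high-level system dialogue acts (assumed)

# T[a][s][s2]: probability of moving from state s to s2 after system act a.
T = {
    "question": {"satisfied": {"satisfied": 0.8, "unsatisfied": 0.2},
                 "unsatisfied": {"satisfied": 0.4, "unsatisfied": 0.6}},
    "sympathy": {"satisfied": {"satisfied": 0.9, "unsatisfied": 0.1},
                 "unsatisfied": {"satisfied": 0.6, "unsatisfied": 0.4}},
}

# O[s2][o]: probability of observing user act o when the user is in state s2.
O = {
    "satisfied": {"self_disclosure": 0.7, "short_reply": 0.3},
    "unsatisfied": {"self_disclosure": 0.2, "short_reply": 0.8},
}

# R[s][a]: immediate reward for taking act a in state s.
R = {
    "satisfied": {"question": 1.0, "sympathy": 0.5},
    "unsatisfied": {"question": -1.0, "sympathy": 0.5},
}


def belief_update(belief, act, observation):
    """Return b'(s2) proportional to O[s2][o] * sum_s T[act][s][s2] * b(s)."""
    unnormalized = {}
    for s2 in STATES:
        predicted = sum(T[act][s][s2] * belief[s] for s in STATES)
        unnormalized[s2] = O[s2][observation] * predicted
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return {s2: p / total for s2, p in unnormalized.items()}


def choose_act(belief):
    """Greedy stand-in policy: maximize expected immediate reward under the belief."""
    return max(ACTS, key=lambda a: sum(belief[s] * R[s][a] for s in STATES))


def respond(belief, fill_content):
    """Hybrid step: the policy picks the act, fill_content (the wizard) supplies the content."""
    act = choose_act(belief)
    return act, fill_content(act)


if __name__ == "__main__":
    b = {"satisfied": 0.5, "unsatisfied": 0.5}
    act, text = respond(b, lambda a: f"<content for {a}>")
    b = belief_update(b, act, "self_disclosure")
    print(act, text, b)
```

In a real system the greedy choice would be replaced by a policy computed offline from the learned model, and the belief would be tracked over every turn of the dialogue.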

    Published In

    ACM Transactions on Speech and Language Processing, Volume 10, Issue 4
    December 2013
    206 pages
    ISSN:1550-4875
    EISSN:1550-4883
    DOI:10.1145/2560566
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 03 January 2014
    Accepted: 01 March 2013
    Revised: 01 November 2012
    Received: 01 August 2012
    Published in TSLP Volume 10, Issue 4

    Author Tags

    1. Dialogue systems
    2. dialogue control
    3. listening-oriented dialogue
    4. partially observable Markov decision processes

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Article Metrics

    • Downloads (Last 12 months): 15
    • Downloads (Last 6 weeks): 3
    Reflects downloads up to 22 Nov 2024

    Cited By

    • (2024) Implementation and Evaluation of Interviewer’s Response Generation Model Considering Semantic Content (意味内容に基づくインタビュアー応答生成モデルの作成と評価). Transactions of the Japanese Society for Artificial Intelligence 39:3 (IDS6-A_1-15). DOI: 10.1527/tjsai.39-3_IDS6-A. Online publication date: 1-May-2024
    • (2023) Recording and Analyzing the Process of Building Common Ground in Dialogues in a Collaborative Task (共同作業を行う対話における共通基盤構築過程の記録と分析). Journal of Natural Language Processing 30:3 (907-934). DOI: 10.5715/jnlp.30.907. Online publication date: 2023
    • (2020) Collection and Analysis of Perceived Information in Chat-oriented Dialogue (雑談対話における言外の情報の収集と類型化). Transactions of the Japanese Society for Artificial Intelligence 35:1 (DSI-E_1-10). DOI: 10.1527/tjsai.DSI-E. Online publication date: 1-Jan-2020
    • (2020) An Evaluation of a Chat-oriented Dialogue System that Remembers and Uses User Information over Multiple Days (ユーザ情報を記憶する雑談対話システムの構築とその複数日にまたがる評価). Transactions of the Japanese Society for Artificial Intelligence 35:1 (DSI-B_1-10). DOI: 10.1527/tjsai.DSI-B. Online publication date: 1-Jan-2020
    • (2019) Positive Emotion Elicitation in Chat-Based Dialogue Systems. IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP) 27:4 (866-877). DOI: 10.1109/TASLP.2019.2900910. Online publication date: 17-May-2019
    • (2017) Processing negative emotions through social communication: Multimodal database construction and analysis. 2017 Seventh International Conference on Affective Computing and Intelligent Interaction (ACII) (79-85). DOI: 10.1109/ACII.2017.8273582. Online publication date: Oct-2017
    • (2016) User Information Extraction for Personalized Dialogue Systems. Transactions of the Japanese Society for Artificial Intelligence 31:1 (DSF-B_1-10). DOI: 10.1527/tjsai.DSF-512. Online publication date: 2016
    • (2015) Micro-Counseling Dialog System Based on Semantic Content. Natural Language Dialog Systems and Intelligent Assistants (63-72). DOI: 10.1007/978-3-319-19291-8_6. Online publication date: 2015
