Crowdsourcing a self-evolving dialog graph

Authors: Fethiye Irmak Doğan, Gabriel Skantze

CUI '19: Proceedings of the 1st International Conference on Conversational User Interfaces

Article No.: 14, Pages 1 - 8

https://doi.org/10.1145/3342775.3342790

Published: 22 August 2019

Abstract

In this paper we present a crowdsourcing-based approach for collecting dialog data for a social chat dialog system, which gradually builds a dialog graph from actual user responses and crowd-sourced system answers, conditioned by a given persona and other instructions. This approach was tested during the second instalment of the Amazon Alexa Prize 2018 (AP2018), both for the data collection and to feed a simple dialog system which would use the graph to provide answers. As users interacted with the system, a graph which maintained the structure of the dialogs was built, identifying parts where more coverage was needed. In an offline evaluation, we have compared the corpus collected during the competition with other potential corpora for training chatbots, including movie subtitles, online chat forums and conversational data. The results show that the proposed methodology creates data that is more representative of actual user utterances, and leads to more coherent and engaging answers from the agent. An implementation of the proposed method is available as open-source code.
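
The data structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class and function names (`Node`, `DialogGraph`, `normalize`, `coverage_gaps`) are hypothetical, the structure is simplified to a tree, and utterances are matched by exact comparison of normalized text, which is an assumption made only to keep the example short. The sketch shows how logged dialogs could be merged into a shared structure and how user utterances that still lack a system answer could be ranked as candidates for crowdsourcing.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One utterance in the graph; speaker is "user" or "system"."""
    text: str
    speaker: str
    visits: int = 0
    children: dict = field(default_factory=dict)  # normalized text -> Node


def normalize(text):
    # Assumption: utterances match if equal after lowercasing and
    # collapsing whitespace. A real system would need fuzzier matching.
    return " ".join(text.lower().split())


class DialogGraph:
    def __init__(self):
        self.root = Node(text="<start>", speaker="system")

    def add_dialog(self, turns):
        """Merge one dialog, given as a list of (speaker, text) pairs."""
        node = self.root
        for speaker, text in turns:
            key = normalize(text)
            if key not in node.children:
                node.children[key] = Node(text=text, speaker=speaker)
            node = node.children[key]
            node.visits += 1

    def coverage_gaps(self):
        """Return user nodes without a system answer, most visited first."""
        gaps = []
        stack = [self.root]
        while stack:
            node = stack.pop()
            if node.speaker == "user" and not node.children:
                gaps.append(node)
            stack.extend(node.children.values())
        return sorted(gaps, key=lambda n: n.visits, reverse=True)


graph = DialogGraph()
graph.add_dialog([("user", "Hi there"),
                  ("system", "Hello! Seen any good movies?"),
                  ("user", "Not really")])
graph.add_dialog([("user", "hi there"),
                  ("system", "Hello! Seen any good movies?"),
                  ("user", "not really")])
graph.add_dialog([("user", "Tell me a joke")])

# Unanswered user utterances, ordered by how often users reached them.
gaps = [(n.text, n.visits) for n in graph.coverage_gaps()]
print(gaps)
```

In this sketch the visit count serves as a simple priority signal: an unanswered utterance that many users reach is a more valuable target for a crowd-sourced answer than one seen only once.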

Index Terms

Computing methodologies → Artificial intelligence → Natural language processing → Discourse, dialogue and pragmatics

Information

Published In

CUI '19: Proceedings of the 1st International Conference on Conversational User Interfaces

August 2019, 131 pages

ISBN: 9781450371872

DOI: 10.1145/3342775

General Chairs: Benjamin R Cowan (University College Dublin, Dublin, Ireland), Leigh Clark (University College Dublin, Dublin, Ireland)

Copyright © 2019 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

CogSIS Project
ADAPT Centre
Irish Research Council

In-Cooperation

SIGCHI: ACM Special Interest Group on Computer-Human Interaction

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

Swedish Foundation for Strategic Research
Swedish Research Council

Conference

CUI 2019: 1st International Conference on Conversational User Interfaces

August 22-23, 2019

Dublin, Ireland

Acceptance Rates

CUI '19 paper acceptance rate: 9 of 28 submissions (32%)

Overall acceptance rate: 34 of 100 submissions (34%)

Bibliometrics

Total citations: 8
Total downloads: 324
Downloads (last 12 months): 24
Downloads (last 6 weeks): 1

Reflects downloads up to 21 Nov 2024.

Cited By

Starke A, Lee M. (2022). Unifying Recommender Systems and Conversational User Interfaces. Proceedings of the 4th Conference on Conversational User Interfaces, 1-7. Online publication date: 26-Jul-2022.
https://dl.acm.org/doi/10.1145/3543829.3544524
Falduti M, Tessaris S. (2022). On the Use of Chatbots to Report Non-consensual Intimate Images Abuses: the Legal Expert Perspective. Proceedings of the 2022 ACM Conference on Information Technology for Social Good, 96-102. Online publication date: 7-Sep-2022.
https://dl.acm.org/doi/10.1145/3524458.3547247
Engwall O, Lopes J, Cumbal R. (2022). Is a Wizard-of-Oz Required for Robot-Led Conversation Practice in a Second Language? International Journal of Social Robotics 14:4, 1067-1085. Online publication date: 5-Jan-2022.
https://doi.org/10.1007/s12369-021-00849-8
Maenhout L, Peuters C, Cardon G, Compernolle S, Crombez G, DeSmet A. (2021). Participatory Development and Pilot Testing of an Adolescent Health Promotion Chatbot. Frontiers in Public Health 9. Online publication date: 11-Nov-2021.
https://doi.org/10.3389/fpubh.2021.724779
Frommherz Y, Zarcone A. (2021). Crowdsourcing Ecologically-Valid Dialogue Data for German. Frontiers in Computer Science 3. Online publication date: 21-Jun-2021.
https://doi.org/10.3389/fcomp.2021.686050
Choi Y, Monserrat T, Park J, Shin H, Lee N, Kim J. (2021). ProtoChat. Proceedings of the ACM on Human-Computer Interaction 4:CSCW3, 1-27. Online publication date: 5-Jan-2021.
https://dl.acm.org/doi/10.1145/3432924
Castle-Green T, Reeves S, Fischer J, Koleva B, Torres M, Schlögl S, Clark L, Porcheron M. (2020). Decision Trees as Sociotechnical Objects in Chatbot Design. Proceedings of the 2nd Conference on Conversational User Interfaces, 1-3. Online publication date: 22-Jul-2020.
https://dl.acm.org/doi/10.1145/3405755.3406133
Pérez-Soler S, Guerra E, de Lara J. (2020). Model-Driven Chatbot Development. Conceptual Modeling, 207-222. Online publication date: 29-Oct-2020.
https://doi.org/10.1007/978-3-030-62522-1_15
