research-article

Open access

Leveraging Large Language Models to Power Chatbots for Collecting User Self-Reported Data

Authors:

Young-Ho Kim

Proceedings of the ACM on Human-Computer Interaction, Volume 8, Issue CSCW1

Article No.: 87, Pages 1 - 35

https://doi.org/10.1145/3637364

Published: 26 April 2024

Abstract

Large language models (LLMs) provide a new way to build chatbots by accepting natural language prompts. Yet, it is unclear how to design prompts that enable chatbots to carry on naturalistic conversations while pursuing a given goal, such as collecting self-report data from users. We explore which design factors of prompts can help steer chatbots to talk naturally and collect data reliably. To this end, we formulated four prompt designs with different structures and personas. Through an online study (N = 48) in which participants conversed with chatbots driven by different prompt designs, we assessed how prompt designs and conversation topics affected the conversation flows and users' perceptions of the chatbots. Our chatbots covered 79% of the desired information slots during conversations, and both the prompt designs and the topics significantly influenced the conversation flows and the data collection performance. We discuss the opportunities and challenges of building chatbots with LLMs.

Index Terms

Leveraging Large Language Models to Power Chatbots for Collecting User Self-Reported Data
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
      1. Natural language generation
2. Human-centered computing
  1. Human computer interaction (HCI)
    1. Empirical studies in HCI

Information & Contributors

Information

Published In

Proceedings of the ACM on Human-Computer Interaction Volume 8, Issue CSCW1

CSCW

April 2024

6294 pages

EISSN:2573-0142

DOI:10.1145/3661497

Editor:
Jeff Nichols
Apple Inc., United States

Copyright © 2024 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 26 April 2024

Published in PACMHCI Volume 8, Issue CSCW1

Qualifiers

Research-article

Bibliometrics & Citations

Article Metrics

Total Citations: 9
Total Downloads: 830
Downloads (Last 12 months): 830
Downloads (Last 6 weeks): 225

Reflects downloads up to 22 Sep 2024

Cited By

Ogunleye B, Zakariyyah K, Ajao O, Olayinka O, Sharma H (2024). A Systematic Review of Generative AI for Teaching and Learning Practice. Education Sciences 14:6 (636). Online publication date: 13-Jun-2024.
https://doi.org/10.3390/educsci14060636

Yang Z, Xu X, Yao B, Rogers E, Zhang S, Intille S, Shara N, Gao G, Wang D (2024). Talk2Care: An LLM-based Voice Assistant for Communication between Healthcare Providers and Older Adults. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 8:2 (1-35). Online publication date: 15-May-2024.
https://dl.acm.org/doi/10.1145/3659625

Kim T, Bae S, Kim H, Lee S, Hong H, Yang C, Kim Y (2024). MindfulDiary: Harnessing Large Language Model to Support Psychiatric Patients' Journaling. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems (1-20). Online publication date: 11-May-2024.
https://dl.acm.org/doi/10.1145/3613904.3642937

Wu R, Yu C, Pan X, Liu Y, Zhang N, Fu Y, Wang Y, Zheng Z, Chen L, Jiang Q, Xu X, Shi Y (2024). MindShift: Leveraging Large Language Models for Mental-States-Based Problematic Smartphone Use Intervention. Proceedings of the CHI Conference on Human Factors in Computing Systems (1-24). Online publication date: 11-May-2024.
https://dl.acm.org/doi/10.1145/3613904.3642790

Jo E, Jeong Y, Park S, Epstein D, Kim Y (2024). Understanding the Impact of Long-Term Memory on Self-Disclosure with Large Language Model-Driven Chatbots for Public Health Intervention. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems (1-21). Online publication date: 11-May-2024.
https://dl.acm.org/doi/10.1145/3613904.3642420

Seo W, Yang C, Kim Y (2024). ChaCha: Leveraging Large Language Models to Prompt Children to Share Their Emotions about Personal Events. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems (1-20). Online publication date: 11-May-2024.
https://dl.acm.org/doi/10.1145/3613904.3642152

Yao X, Xi Y (2024). Pathways linking expectations for AI chatbots to loyalty: A moderated mediation analysis. Technology in Society 78 (102625). Online publication date: Sep-2024.
https://doi.org/10.1016/j.techsoc.2024.102625

Chen Z, Wang Q, Sun Y, Cai H, Lu X (2024). Chat-ePRO: Development and pilot study of an electronic patient-reported outcomes system based on ChatGPT. Journal of Biomedical Informatics 154 (104651). Online publication date: Jun-2024.
https://doi.org/10.1016/j.jbi.2024.104651

Guo Z, Lai A, Deng Z, Li K (2024). Evaluating the Feasibility and Acceptability of a GPT-Based Chatbot for Depression Screening: A Mixed-Methods Study. Artificial Intelligence in Healthcare (249-263). Online publication date: 14-Aug-2024.
https://doi.org/10.1007/978-3-031-67278-1_20
