
DOI: 10.1145/3604237.3626838
research-article

Fine-Tuning Pretrained Language Models to Enhance Dialogue Summarization in Customer Service Centers

Published: 25 November 2023

Abstract

The application of pretrained language models (PLMs) in real-world business domains has gained significant attention. However, research on the practical use of generative artificial intelligence (AI) to address real-world downstream tasks is limited. This study aims to enhance the routine tasks of customer service (CS) representatives, particularly in the finance domain, by applying a fine-tuning method to dialogue summarization in CS centers. KakaoBank handles an average of 15,000 CS calls daily. By employing a fine-tuning method using real-world CS dialogue data, we can reduce the time required to summarize CS dialogues and standardize summarization skills. To ensure effective dialogue summarization in the finance domain, pretrained language models should acquire additional knowledge and skills, such as specific knowledge of financial products, problem-solving abilities, and the capacity to handle emotionally charged customers. In this study, we developed a reference fine-tuned model using Polyglot-Ko (5.8B) as the baseline PLM and a dataset containing a wide range of zero-shot instructions, only part of which were summarization instructions. We compared this reference model with another model fine-tuned using KakaoBank’s CS dialogues and summarization data as the instruct dataset. The results demonstrated that the fine-tuned model based on KakaoBank’s internal datasets outperformed the reference model, showing 199% and 12% improvements in ROUGE-L and RDASS, respectively. This study emphasizes the significance of task-specific fine-tuning using appropriate instruct datasets for effective performance in specific downstream tasks. Considering its practical use, we suggest that fine-tuning with real-world instruct datasets is a powerful and cost-effective technique for developing generative AI in the business domain.
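
The abstract does not include implementation details. As an illustration only, the sketch below shows what instruction-style fine-tuning of a causal PLM such as Polyglot-Ko (5.8B) for dialogue summarization could look like with the Hugging Face `transformers` Trainer; the model ID, prompt template, data file `cs_dialogue_summaries.jsonl`, and hyperparameters are assumptions for the sketch, not the authors' actual setup.

```python
# Minimal sketch: instruction-style fine-tuning of a causal LM for dialogue
# summarization. All names below (model ID, prompt format, file name,
# hyperparameters) are illustrative assumptions, not the paper's configuration.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

MODEL_ID = "EleutherAI/polyglot-ko-5.8b"  # assumed public Polyglot-Ko checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

PROMPT = ("### Instruction:\nSummarize the following customer service dialogue.\n\n"
          "### Dialogue:\n{dialogue}\n\n### Summary:\n{summary}")

def to_features(example):
    # Concatenate instruction, dialogue, and reference summary into one training string.
    text = PROMPT.format(**example) + tokenizer.eos_token
    return tokenizer(text, truncation=True, max_length=2048)

# Hypothetical JSONL file of {"dialogue": ..., "summary": ...} records.
dataset = load_dataset("json", data_files="cs_dialogue_summaries.jsonl")["train"]
dataset = dataset.map(to_features, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="polyglot-ko-cs-summarizer",
                           per_device_train_batch_size=1,
                           gradient_accumulation_steps=16,
                           num_train_epochs=3,
                           learning_rate=1e-5,
                           bf16=True),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

Generated summaries from such a model could then be scored against reference summaries with any standard ROUGE implementation (ROUGE-L) and, as in the paper, with RDASS; the exact evaluation pipeline used by the authors is not described on this page.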


Cited By

  • (2024) Embedding Large Language Models into Extended Reality: Opportunities and Challenges for Inclusion, Engagement, and Privacy. In Proceedings of the 6th ACM Conference on Conversational User Interfaces, 1–7. https://doi.org/10.1145/3640794.3665563 (online publication date: 8 July 2024)



        Published In

        ICAIF '23: Proceedings of the Fourth ACM International Conference on AI in Finance
        November 2023
        697 pages
        ISBN:9798400702402
        DOI:10.1145/3604237
        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        Published: 25 November 2023


        Author Tags

        1. Korean language model
        2. dialogue summarization
        3. fine-tuning
        4. instruct tuning

        Qualifiers

        • Research-article
        • Research
        • Refereed limited

        Conference

        ICAIF '23


        Article Metrics

        • Downloads (last 12 months): 182
        • Downloads (last 6 weeks): 5

        Reflects downloads up to 14 Feb 2025

