
Token Trails: Navigating Contextual Depths in Conversational AI with ChatLLM

  • Conference paper
Natural Language Processing and Information Systems (NLDB 2024)

Abstract

Conversational modeling with Large Language Models (LLMs) requires a nuanced understanding of context to generate coherent and contextually relevant responses. In this paper, we present Token Trails, a novel approach that leverages Token-Type Embeddings to navigate the contextual structure of conversations. Our framework uses Token-Type Embeddings to distinguish user utterances from bot responses, facilitating the generation of context-aware replies. Through comprehensive experiments, we demonstrate that Token Trails improves conversational understanding and response generation, achieving state-of-the-art performance. Our results highlight the importance of contextual modeling in conversational AI and the potential of Token Trails to enable more sophisticated, contextually aware chatbot interactions. Model and source code are available at: https://huggingface.co/Kowsher/TokenTrails.
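
As a concrete illustration of the mechanism described above, the sketch below (PyTorch, not the authors' released implementation; the class name, vocabulary size, and hidden size are illustrative assumptions) shows how a learned token-type embedding can tag each token as user- or bot-produced and be added to the ordinary token embedding before it enters the transformer stack, in the spirit of BERT-style segment embeddings.

    # Minimal sketch: role-aware input embeddings for a conversational LLM.
    # Assumption: two roles (0 = user, 1 = bot); all sizes and ids are placeholders.
    import torch
    import torch.nn as nn

    class TokenTypeEmbedding(nn.Module):
        def __init__(self, vocab_size: int, hidden_size: int, num_roles: int = 2):
            super().__init__()
            self.token_emb = nn.Embedding(vocab_size, hidden_size)
            self.role_emb = nn.Embedding(num_roles, hidden_size)  # marks who produced each token

        def forward(self, input_ids: torch.Tensor, role_ids: torch.Tensor) -> torch.Tensor:
            # input_ids, role_ids: (batch, seq_len). The role embedding is added to
            # the token embedding, so downstream attention can condition on whether
            # a token came from the user or the bot.
            return self.token_emb(input_ids) + self.role_emb(role_ids)

    # Toy usage: one turn pair where the first four tokens are the user's and
    # the last three are the bot's.
    emb = TokenTypeEmbedding(vocab_size=50257, hidden_size=768)
    input_ids = torch.tensor([[11, 42, 7, 99, 313, 5, 2]])
    role_ids = torch.tensor([[0, 0, 0, 0, 1, 1, 1]])
    hidden = emb(input_ids, role_ids)  # shape (1, 7, 768), fed to the transformer layers

Adding the role embedding (rather than concatenating it) keeps the hidden dimensionality unchanged, mirroring how segment embeddings are handled in BERT-style encoders.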

P. Bhat—This work does not relate to Prakash’s position at Amazon.



Author information


Corresponding author

Correspondence to Prakash Bhat.


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Kowsher, M., Panditi, R., Prottasha, N.J., Bhat, P., Bairagi, A.K., Arefin, M.S. (2024). Token Trails: Navigating Contextual Depths in Conversational AI with ChatLLM. In: Rapp, A., Di Caro, L., Meziane, F., Sugumaran, V. (eds) Natural Language Processing and Information Systems. NLDB 2024. Lecture Notes in Computer Science, vol 14763. Springer, Cham. https://doi.org/10.1007/978-3-031-70242-6_6


  • DOI: https://doi.org/10.1007/978-3-031-70242-6_6


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-70241-9

  • Online ISBN: 978-3-031-70242-6

  • eBook Packages: Computer Science, Computer Science (R0)
