Abstract
Conversational modeling using Large Language Models (LLMs) requires a nuanced understanding of context to generate coherent and contextually relevant responses. In this paper, we present Token Trails, a novel approach that leverages Token-Type Embeddings to navigate the intricate contextual nuances within conversations. Our framework utilizes Token-Type Embeddings to distinguish between user utterances and bot responses, facilitating the generation of context-aware replies. Through comprehensive experimentation and evaluation, we demonstrate the effectiveness of Token Trails in improving conversational understanding and response generation, achieving state-of-the-art performance. Our results highlight the significance of contextual modeling in conversational AI and underscore the promising potential of Token Trails to advance the field, paving the way for more sophisticated and contextually aware chatbot interactions. Model and source code available at: https://huggingface.co/Kowsher/TokenTrails.
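The core idea described above, distinguishing user utterances from bot responses via Token-Type Embeddings, can be illustrated with a minimal sketch. This is not the paper's implementation; all names, sizes, and the use of random numpy arrays are illustrative assumptions, mirroring how BERT-style segment embeddings are summed into the input representation.

```python
import numpy as np

# Illustrative sketch: token-type (speaker-role) embeddings for a dialogue model.
# All shapes and values below are toy assumptions, not the paper's actual setup.
rng = np.random.default_rng(0)
vocab_size, d_model = 100, 16
tok_emb = rng.normal(size=(vocab_size, d_model))   # ordinary token embedding table
type_emb = rng.normal(size=(2, d_model))           # two types: 0 = user, 1 = bot

token_ids = np.array([5, 17, 42, 8, 99])           # toy token sequence
type_ids = np.array([0, 0, 0, 1, 1])               # first three tokens from the user,
                                                   # last two from the bot response

# Input representation = token embedding + token-type embedding,
# so the model can condition on who produced each token.
x = tok_emb[token_ids] + type_emb[type_ids]
print(x.shape)  # (5, 16)
```

In a full model, `x` would then be fed to the transformer layers; the type embedding gives every position an explicit signal about its speaker role, which is what lets the model generate context-aware replies.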
P. Bhat—This work does not relate to Prakash’s position at Amazon.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Kowsher, M., Panditi, R., Prottasha, N.J., Bhat, P., Bairagi, A.K., Arefin, M.S. (2024). Token Trails: Navigating Contextual Depths in Conversational AI with ChatLLM. In: Rapp, A., Di Caro, L., Meziane, F., Sugumaran, V. (eds) Natural Language Processing and Information Systems. NLDB 2024. Lecture Notes in Computer Science, vol 14763. Springer, Cham. https://doi.org/10.1007/978-3-031-70242-6_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-70241-9
Online ISBN: 978-3-031-70242-6