Topic Shift Detection in Chinese Dialogues: Corpus and Benchmark

  • Conference paper
Document Analysis and Recognition - ICDAR 2023 (ICDAR 2023)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 14189))

Abstract

Dialogue topic shift detection aims to determine whether the ongoing topic in a dialogue has shifted or should shift. It can be divided into two tasks: the response-known task and the response-unknown task. Only a few studies have investigated the latter, because predicting a topic shift without the response information remains challenging. In this paper, we first annotate a Chinese Natural Topic Dialogue (CNTD) corpus of 1308 dialogues to fill the gap in Chinese natural-conversation topic corpora. We then focus on the response-unknown task and propose a teacher-student framework based on hierarchical contrastive learning to predict the topic shift without the response. Specifically, the response is introduced at the high level (teacher-student) to build contrastive learning between the response and the context, while label contrastive learning is constructed at the low level (student). Experimental results on our Chinese CNTD corpus and the English TIAGE corpus show the effectiveness of the proposed model.
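To make the idea of contrastive learning between the response and the context concrete, the following is a minimal sketch of a generic InfoNCE-style loss with in-batch negatives. It is not the objective used in the paper, whose exact form the abstract does not give; the function name `info_nce`, the `temperature` value, and the random vectors standing in for encoder outputs are all assumptions made for illustration.

```python
import numpy as np


def info_nce(context, response, temperature=0.1):
    """Generic InfoNCE loss with in-batch negatives (illustrative only).

    context, response: arrays of shape (batch, dim). Row i of `context`
    and row i of `response` form a positive pair; every other row in the
    batch serves as a negative.
    """
    # L2-normalise so that dot products are cosine similarities.
    c = context / np.linalg.norm(context, axis=1, keepdims=True)
    r = response / np.linalg.norm(response, axis=1, keepdims=True)

    # Pairwise similarity matrix, scaled by the temperature.
    logits = (c @ r.T) / temperature

    # Numerically stable row-wise log-softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Positive pairs sit on the diagonal.
    return float(-np.mean(np.diag(log_probs)))


# Hypothetical stand-ins for context and response representations.
rng = np.random.default_rng(0)
context_vecs = rng.normal(size=(8, 16))
response_vecs = context_vecs + 0.01 * rng.normal(size=(8, 16))

loss_aligned = info_nce(context_vecs, response_vecs)
loss_mismatched = info_nce(context_vecs, response_vecs[::-1])
print(loss_aligned < loss_mismatched)
```

In this sketch the loss is small when each context vector is closest to its own response and larger when the pairing is broken, which is the property a contrastive objective relies on to pull matching context and response representations together.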

Acknowledgements

The authors would like to thank the three anonymous reviewers for their comments on this paper. This research was supported by the National Natural Science Foundation of China (Nos. 62276177, 61836007 and 62006167), and Project Funded by the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD).

Author information

Corresponding author

Correspondence to Peifeng Li.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Lin, J., Fan, Y., Jiang, F., Chu, X., Li, P. (2023). Topic Shift Detection in Chinese Dialogues: Corpus and Benchmark. In: Fink, G.A., Jain, R., Kise, K., Zanibbi, R. (eds) Document Analysis and Recognition - ICDAR 2023. ICDAR 2023. Lecture Notes in Computer Science, vol 14189. Springer, Cham. https://doi.org/10.1007/978-3-031-41682-8_11

  • DOI: https://doi.org/10.1007/978-3-031-41682-8_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-41681-1

  • Online ISBN: 978-3-031-41682-8

  • eBook Packages: Computer Science, Computer Science (R0)
