
MatchACNN: A Multi-Granularity Deep Matching Model

Neural Processing Letters

Abstract

This paper presents a deep learning approach to relevance ranking in information retrieval (IR). In recent years, deep neural networks have led to breakthroughs in speech recognition, computer vision, and natural language processing (NLP); multi-granularity deep matching models, however, have yielded few positive results. Existing deep IR models match the query and document only at the granularity of words. By analogy with the human inquiry process, matching should instead be performed at multiple granularities: words, phrases, and even sentences. This study presents MatchACNN, a new deep learning architecture that simulates this human assessment process. To address these problems, the model treats text matching as image recognition: it extracts features along different dimensions and applies a two-dimensional convolutional neural network together with an attention mechanism, as used in image recognition. Experiments on the WikiQA corpus, NFCorpus, and TREC QA show that MatchACNN significantly outperforms existing deep learning methods.
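The "text matching as image recognition" idea the abstract refers to can be illustrated with a minimal sketch: build a two-dimensional matrix of word-level similarities between query and document, which a 2D CNN can then process like an image. This is not the authors' implementation; function and variable names here are illustrative, and toy vectors stand in for learned word embeddings.

```python
# Illustrative sketch (not the authors' code) of the matching-matrix
# construction that "text matching as image recognition" models build on:
# each cell holds the cosine similarity between one query word embedding
# and one document word embedding.
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def match_matrix(query_emb, doc_emb):
    """One row per query term, one column per document term."""
    return [[cosine(q, d) for d in doc_emb] for q in query_emb]

# Toy 3-dimensional "embeddings" for a 2-word query and a 3-word document.
query = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
doc = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.5, 0.5, 0.0]]
M = match_matrix(query, doc)
print(len(M), len(M[0]))  # 2 3
```

A convolutional network with attention, as described in the abstract, would then extract multi-granularity features (word, phrase, sentence) from this matrix the way image models extract edges and textures from pixels.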




Acknowledgements

This work is supported by the Technology Innovation and Application Development Projects of Chongqing (Grant No. cstc2021jscx-gksbX0032, cstc2021jscx-gksbX0029); the Chongqing Research Program of Basic Research and Frontier Technology (Grant No. cstc2021jcy-jmsxmX0530); the Key R & D plan of Hainan Province (Grant No. ZDYF2021GXJS006); the National Natural Science Foundation of China (Grant No. 62106030); the Natural Science Foundation of Chongqing (Grant No. cstc2018jcyjAX0314).

Author information


Corresponding author

Correspondence to Weihan Wang.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Chang, G., Wang, W. & Hu, S. MatchACNN: A Multi-Granularity Deep Matching Model. Neural Process Lett 55, 4419–4438 (2023). https://doi.org/10.1007/s11063-022-11047-6

