DOI: 10.1145/3494885.3494902

Chinese Text Similarity Calculation Model Based on Multi-Attention Siamese Bi-LSTM

Published: 20 December 2021

Abstract

Measuring text similarity is a key research area in natural language processing. In this research, we propose a multi-attention Siamese bi-directional long short-term memory (MAS-Bi-LSTM) model to calculate the semantic similarity between two Chinese texts. The model uses Bi-LSTM as the basic framework of the Siamese network, introduces a multi-head attention mechanism to capture the key features of the text, and uses the Manhattan distance to calculate the similarity. Experiments were conducted on the large-scale Chinese question matching corpus dataset. Results show that our model achieves higher accuracy than other comparable models, reaching an F1 value of 0.8070. The contributions of this research are the use of the multi-head attention mechanism to re-weight semantic features and an exploration of how different pre-training corpora, distance formulas, and numbers of attention heads affect the model.
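To make the architecture outlined in the abstract concrete, the following is a minimal PyTorch sketch of a Siamese Bi-LSTM with multi-head self-attention and a Manhattan-distance similarity head. It illustrates the general technique only: the embedding size, hidden size, number of heads, mean pooling, and the exp(-distance) similarity mapping are assumptions for illustration, not values or details taken from the paper.

    import torch
    import torch.nn as nn

    class MASBiLSTM(nn.Module):
        """Illustrative Siamese Bi-LSTM encoder with multi-head self-attention."""

        def __init__(self, vocab_size, embed_dim=300, hidden_dim=128, num_heads=8):
            super().__init__()
            # Shared embedding layer (the paper initialises embeddings from a pre-training corpus).
            self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
            # Bi-LSTM serves as the basic encoder of each (weight-shared) Siamese branch.
            self.bilstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True, bidirectional=True)
            # Multi-head self-attention re-weights the Bi-LSTM outputs;
            # 2 * hidden_dim must be divisible by num_heads.
            self.attention = nn.MultiheadAttention(2 * hidden_dim, num_heads, batch_first=True)

        def encode(self, token_ids):
            emb = self.embedding(token_ids)                 # (batch, seq, embed_dim)
            states, _ = self.bilstm(emb)                    # (batch, seq, 2 * hidden_dim)
            attended, _ = self.attention(states, states, states)
            return attended.mean(dim=1)                     # pooled sentence vector

        def forward(self, sent_a, sent_b):
            # Both sentences pass through the same branch (shared weights).
            vec_a, vec_b = self.encode(sent_a), self.encode(sent_b)
            # Manhattan-distance similarity mapped into (0, 1].
            manhattan = torch.sum(torch.abs(vec_a - vec_b), dim=1)
            return torch.exp(-manhattan)

    # Toy usage with random token ids and a hypothetical vocabulary of 5000 tokens.
    model = MASBiLSTM(vocab_size=5000)
    a = torch.randint(1, 5000, (4, 20))
    b = torch.randint(1, 5000, (4, 20))
    print(model(a, b).shape)  # torch.Size([4]), one similarity score per pair

In Siamese setups of this kind, the exponential of the negative Manhattan distance keeps the score in (0, 1], which pairs naturally with a binary matching label; the paper's own pooling strategy and training objective may differ.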


Cited By

  • (2024) Semantic analysis and construction of English discourse based on neural network. Applied Mathematics and Nonlinear Sciences, 9(1). DOI: 10.2478/amns-2024-2546. Online publication date: 3-Sep-2024.
  • (2024) Text Similarity Calculation Model Based on Semantic Information and Syntactic Structure Fusion Weighting. 2024 6th International Conference on Communications, Information System and Computer Engineering (CISCE), pp. 531-540. DOI: 10.1109/CISCE62493.2024.10653095. Online publication date: 10-May-2024.



Published In

CSSE '21: Proceedings of the 4th International Conference on Computer Science and Software Engineering
October 2021
366 pages
ISBN:9781450390675
DOI:10.1145/3494885
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 20 December 2021


Author Tags

  1. Bi-LSTM
  2. Multi-Attention
  3. Siamese Network
  4. Text Similarity

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • Project of University Excellent Talents of Education Bureau of Anhui Province, China
  • Major Project of Natural Science Research Foundation of Education Bureau of Anhui Province, China

Conference

CSSE 2021

Acceptance Rates

Overall Acceptance Rate 33 of 74 submissions, 45%

