STMAP: A novel semantic text matching model augmented with embedding perturbations

Published: 01 January 2024

Abstract

Semantic text matching models have achieved outstanding performance, but traditional methods struggle with few-shot learning, and conventional data augmentation techniques can suffer from semantic deviation. To address these problems, we propose STMAP, which approaches the task from a data augmentation perspective, perturbing embeddings with Gaussian noise and a noise-mask signal. We also employ an adaptive optimization network to dynamically weight the multiple training targets generated by the augmentation. We evaluated our model on four English datasets, MRPC, SciTail, SICK, and RTE, achieving scores of 90.3%, 94.2%, 88.9%, and 68.8%, respectively; our model obtained state-of-the-art (SOTA) results on three of them. We further assessed our approach on three Chinese datasets, achieving an average improvement of 1.3% over the baseline model. In the few-shot learning experiments, our model outperformed the baseline by 5%, notably when the data volume was reduced by around 40%. Ablation experiments further validated the effectiveness of STMAP. Our source code is available at https://github.com/wangyanhao0517/STMAP.
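The key idea is to perturb the encoder's token embeddings rather than the raw text, so the augmented views stay close to the original semantics. Below is a minimal PyTorch sketch of that idea, assuming additive Gaussian noise and a token-level noise mask; the function name, tensor shapes, and hyperparameters are illustrative assumptions, not the released implementation.

```python
import torch

def perturb_embeddings(emb: torch.Tensor, sigma: float = 0.01, mask_prob: float = 0.1):
    # Hypothetical sketch, not the authors' code.
    # emb: (batch, seq_len, hidden) token embeddings from the encoder.

    # View 1: additive Gaussian noise on every embedding dimension.
    gauss_view = emb + torch.randn_like(emb) * sigma

    # View 2: "noise mask" -- randomly zero out whole token embeddings.
    keep = (torch.rand(emb.shape[:2], device=emb.device) > mask_prob).float()
    mask_view = emb * keep.unsqueeze(-1)

    return gauss_view, mask_view
```

Each perturbed view can then be passed through the same matching head as the clean embeddings, yielding the multiple training targets the abstract refers to.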

Highlights

Proposing the Semantic Text Matching model Augmented with Perturbations (STMAP).
Introducing a noise-perturbation augmentation scheme based on Gaussian noise.
Proposing a multi-task adaptive optimization scheme (a hedged weighting sketch follows this list).
STMAP demonstrates strong generalization capabilities.
STMAP achieves state-of-the-art results and performs well in few-shot scenarios.
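The page does not spell out how the adaptive optimization network weights the per-view losses. One plausible realization, shown here purely as a sketch, is learnable uncertainty-based weighting in the style of Kendall et al. (2018); the class name, task count, and weighting rule are assumptions, not the authors' method.

```python
import torch
import torch.nn as nn

class AdaptiveLossWeighting(nn.Module):
    # Hypothetical adaptive weighting of per-view training losses.
    def __init__(self, num_tasks: int = 3):
        super().__init__()
        # One learnable log-variance per training target
        # (e.g., clean, Gaussian-noised, and masked views).
        self.log_vars = nn.Parameter(torch.zeros(num_tasks))

    def forward(self, losses):
        # losses: list of scalar losses, one per augmented view.
        total = torch.zeros((), device=self.log_vars.device)
        for i, loss in enumerate(losses):
            precision = torch.exp(-self.log_vars[i])
            total = total + precision * loss + self.log_vars[i]
        return total
```

In use, something like `AdaptiveLossWeighting(3)([loss_clean, loss_gauss, loss_mask])` would combine the targets into one scalar, letting gradient descent down-weight noisier views automatically.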


Cited By

  • (2024) Enhancing Chinese abbreviation prediction with LLM generation and contrastive evaluation. Information Processing and Management: an International Journal 61(4). https://doi.org/10.1016/j.ipm.2024.103768. Online publication date: 18-Jul-2024.

Published In

Information Processing and Management: an International Journal, Volume 61, Issue 1
Jan 2024
823 pages

Publisher

Pergamon Press, Inc.
United States

Author Tags

1. Semantic text matching
2. Data augmentation
3. Embedding perturbations
4. Adaptive networks
5. Few-shot

Qualifiers

• Research-article
