Abstract
Semantic sentence matching is a fundamental task in natural language processing. In previous work, neural networks with attention mechanisms have been successfully applied to semantic matching. However, existing deep models often rely on simple operations such as summation and max-pooling to compress a whole sentence into a single distributed representation. We present a deep architecture for matching two Chinese sentences. After an attention mechanism extracts interaction information between the sentence pair, the model relies only on alignment rather than a recurrent neural network, which makes it simpler and more lightweight. To retain enough of the original features, we employ a pooling operation, named attention-pooling, to aggregate information from the whole sentence. We also evaluate, on Chinese data, several models that perform well on English data. The experimental results show that our method achieves better results than the other models on the Chinese dataset.
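To make the idea of attention-based pooling concrete, the sketch below shows one common formulation: each token vector is scored against a query vector, the scores are normalised with a softmax, and the sentence representation is the weighted sum of the token vectors. This is only an illustration under stated assumptions. The abstract does not give the chapter's exact attention-pooling formula, and the names `attention_pool` and `query`, the dot-product scoring, and the toy vectors are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(token_vectors, query):
    """Collapse a sentence (a list of token vectors) into a single vector.

    Assumption: tokens are scored by a dot product with `query`, a
    hypothetical stand-in for a learned parameter. The chapter's own
    scoring function may differ.
    """
    # One scalar relevance score per token.
    scores = [sum(t * q for t, q in zip(vec, query)) for vec in token_vectors]
    # Normalise the scores so the weights sum to 1.
    weights = softmax(scores)
    # Weighted sum of token vectors, computed one dimension at a time.
    dim = len(token_vectors[0])
    pooled = [
        sum(w * vec[i] for w, vec in zip(weights, token_vectors))
        for i in range(dim)
    ]
    return pooled, weights


# Toy sentence of three 2-dimensional token vectors (made-up values).
sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [0.5, 0.5]
pooled, weights = attention_pool(sentence, query)

# With an all-zero query every token gets the same weight.
mean_pooled, uniform_weights = attention_pool(sentence, [0.0, 0.0])
```

Unlike summation or max-pooling, the weights here depend on the token content, so informative tokens can contribute more to the sentence vector. With an all-zero query the weights become uniform and the operation reduces to mean pooling.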
Acknowledgements
We are especially grateful to Ant Financial for allowing us to use the dataset from Ant Financial Artificial Competition for experiments.
Copyright information
© 2020 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Lai, H., Tao, Y., Wang, C., Xu, L., Tang, D., Li, G. (2020). A Deep Architecture for Chinese Semantic Matching with Pairwise Comparisons and Attention-Pooling. In: Lu, H. (eds) Cognitive Internet of Things: Frameworks, Tools and Applications. ISAIR 2018. Studies in Computational Intelligence, vol 810. Springer, Cham. https://doi.org/10.1007/978-3-030-04946-1_22
DOI: https://doi.org/10.1007/978-3-030-04946-1_22
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-04945-4
Online ISBN: 978-3-030-04946-1
eBook Packages: Intelligent Technologies and Robotics (R0)