Generating Adversarial Texts for Recurrent Neural Networks

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12396)

Included in the following conference series:

International Conference on Artificial Neural Networks

Abstract

Adversarial examples have received increasing attention recently because of their value in evaluating and improving the robustness of deep neural networks. Existing adversarial attack algorithms achieve good results on most images. However, these algorithms cannot be applied directly to text, because text data is discrete. In this paper, we extend two state-of-the-art attack algorithms, PGD and C&W, to craft adversarial text examples for RNN-based models. The Extend-PGD attack identifies the words that are important for classification by computing the Jacobian matrix of the classifier, and uses them to generate adversarial text examples effectively. The Extend-C&W attack uses $\mathcal{L}_{1}$ regularization to minimize the alteration of the original input text. We conduct comparison experiments on two recurrent neural networks trained to classify texts from two real-world datasets. Experimental results show that the Extend-PGD attack has the advantage in attack success rate, while the Extend-C&W attack has the advantage in semantics-preserving ability.
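
The word-importance step described for the Jacobian-based attack can be illustrated with a minimal sketch. This is not the authors' implementation: the toy vocabulary, the embedding values, the weight matrices (`EMB`, `W_IN`, `W_H`, `W_OUT`) and the helper names `predict` and `word_saliency` are all hypothetical, and central finite differences stand in for the backpropagated Jacobian a real framework would provide. The sketch scores each word by the norm of the derivative of the classifier output with respect to that word's embedding, then ranks the words by that score.

```python
import math

# Hypothetical toy parameters (not from the paper): 2-dim embeddings,
# a 2-unit tanh RNN, and a sigmoid output for binary sentiment.
EMB = {
    "the": [0.1, 0.0],
    "movie": [0.2, 0.1],
    "was": [0.0, 0.1],
    "terrible": [-1.0, 0.8],
}
W_IN = [[0.9, -0.4], [0.3, 0.7]]   # input-to-hidden weights
W_H = [[0.5, 0.1], [-0.2, 0.4]]    # hidden-to-hidden weights
W_OUT = [1.2, -0.8]                # hidden-to-logit weights


def predict(embeddings):
    """Return P(positive) for a sequence of embedding vectors."""
    h = [0.0, 0.0]
    for x in embeddings:
        # One recurrent step: h_new = tanh(W_IN x + W_H h_old).
        h = [
            math.tanh(
                sum(W_IN[i][j] * x[j] for j in range(2))
                + sum(W_H[i][j] * h[j] for j in range(2))
            )
            for i in range(2)
        ]
    logit = sum(W_OUT[i] * h[i] for i in range(2))
    return 1.0 / (1.0 + math.exp(-logit))


def word_saliency(words, eps=1e-5):
    """Score each word by the L2 norm of d predict / d embedding.

    Central finite differences approximate one row block of the Jacobian
    per word; this is an illustrative stand-in for backpropagation.
    """
    base = [list(EMB[w]) for w in words]
    scores = []
    for t in range(len(words)):
        grad = []
        for d in range(2):
            plus = [list(v) for v in base]
            minus = [list(v) for v in base]
            plus[t][d] += eps
            minus[t][d] -= eps
            grad.append((predict(plus) - predict(minus)) / (2 * eps))
        scores.append(math.sqrt(sum(g * g for g in grad)))
    return scores


sentence = ["the", "movie", "was", "terrible"]
scores = word_saliency(sentence)
# Words with the largest gradient norm are the first candidates to perturb.
ranking = sorted(zip(sentence, scores), key=lambda p: p[1], reverse=True)
for word, score in ranking:
    print(f"{word:10s} {score:.4f}")
```

In a sketch like this, an attack would replace only the top-ranked words, which keeps the number of edits to the input text small.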

Author information

Authors and Affiliations

Software Engineering Institute, East China Normal University, Shanghai, China
Chang Liu & Zhengfeng Yang
School of Information Science and Technology, Zhejiang Sci-Tech University, Hangzhou, China
Wang Lin

Corresponding author

Correspondence to Chang Liu.

Editor information

Editors and Affiliations

Department of Applied Informatics, Comenius University in Bratislava, Bratislava, Slovakia
Igor Farkaš
Department of Applied Mathematics and Computer Science, Technical University of Denmark, Kgs. Lyngby, Denmark
Paolo Masulli
Department of Informatics, University of Hamburg, Hamburg, Germany
Stefan Wermter

About this paper

Cite this paper

Liu, C., Lin, W., Yang, Z. (2020). Generating Adversarial Texts for Recurrent Neural Networks. In: Farkaš, I., Masulli, P., Wermter, S. (eds) Artificial Neural Networks and Machine Learning – ICANN 2020. ICANN 2020. Lecture Notes in Computer Science, vol 12396. Springer, Cham. https://doi.org/10.1007/978-3-030-61609-0_4

DOI: https://doi.org/10.1007/978-3-030-61609-0_4
Published: 14 October 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-61608-3
Online ISBN: 978-3-030-61609-0
eBook Packages: Computer Science, Computer Science (R0)
