Improving machine translation systems via isotopic replacement

Authors:

Mike Papadakis,

Lu Zhang

ICSE '22: Proceedings of the 44th International Conference on Software Engineering

Pages 1181 - 1192

https://doi.org/10.1145/3510003.3510206

Published: 05 July 2022

Abstract

Machine translation plays an essential role in people's daily international communication. However, machine translation systems are far from perfect. To tackle this problem, researchers have proposed several approaches to testing machine translation. A promising trend among these approaches is to use word replacement, where only one word in the original sentence is replaced with another word to form a sentence pair. However, precise control of the impact of word replacement remains an outstanding issue in these approaches.
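The sentence-pair construction described above can be sketched as follows. This is a minimal illustration, assuming whitespace tokenization; the helper names `make_sentence_pair` and `differing_positions` are hypothetical and do not come from the paper.

```python
from typing import List, Tuple


def make_sentence_pair(sentence: str, index: int, replacement: str) -> Tuple[str, str]:
    """Return (original, mutated); mutated differs from original in one word."""
    words = sentence.split()  # assumption: words are separated by whitespace
    if not 0 <= index < len(words):
        raise IndexError(f"word index {index} out of range for {len(words)} words")
    mutated = list(words)
    mutated[index] = replacement
    return " ".join(words), " ".join(mutated)


def differing_positions(first: str, second: str) -> List[int]:
    """List the word positions at which two equal-length sentences differ."""
    a, b = first.split(), second.split()
    if len(a) != len(b):
        raise ValueError("sentences must have the same number of words")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


original, mutated = make_sentence_pair("The cat sat on the mat", 1, "dog")
print(original)
print(mutated)
print(differing_positions(original, mutated))
```

A testing approach would then translate both sentences and compare the outputs; how much difference between the two translations is acceptable depends on how strongly the replacement changes the meaning, which is the control problem the abstract refers to.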

To address this issue, we propose CAT, a novel word-replacement-based approach, whose basic idea is to identify word replacement with controlled impact (referred to as isotopic replacement). To achieve this purpose, we use a neural-based language model to encode the sentence context, and design a neural-network-based algorithm to evaluate context-aware semantic similarity between two words. Furthermore, similar to TransRepair, a state-of-the-art word-replacement-based approach, CAT also provides automatic fixing of revealed bugs without model retraining.
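As a rough illustration of filtering replacements by context-aware similarity, the sketch below keeps a candidate word only if its in-context vector stays close to that of the original word. Everything here is an assumption made for illustration: the abstract does not specify the similarity algorithm, so plain cosine similarity stands in for it, and `toy_encode`, `TOY_VECTORS`, `isotopic_candidates`, and the 0.9 threshold are hypothetical stand-ins for a real neural language model and CAT's actual procedure.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]
# Hypothetical encoder interface: (tokens, position) -> vector of the word in context.
Encoder = Callable[[List[str], int], Vector]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def isotopic_candidates(encode: Encoder, tokens: List[str], index: int,
                        candidates: List[str], threshold: float = 0.9
                        ) -> List[Tuple[str, float]]:
    """Keep candidates whose in-context vector stays close to the original's."""
    base = encode(tokens, index)
    kept = []
    for word in candidates:
        mutated = tokens[:index] + [word] + tokens[index + 1:]
        score = cosine(base, encode(mutated, index))
        if score >= threshold:
            kept.append((word, score))
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Toy stand-in for a neural language model (illustrative values only).
TOY_VECTORS: Dict[str, Tuple[float, float]] = {
    "the": (0.5, 0.5), "sat": (0.6, 0.4),
    "cat": (1.0, 0.1), "dog": (0.9, 0.2), "car": (0.1, 1.0),
}


def toy_encode(tokens: List[str], index: int) -> Vector:
    """Word vector plus half the mean of the other words' vectors."""
    wx, wy = TOY_VECTORS.get(tokens[index], (0.0, 0.0))
    others = [TOY_VECTORS.get(t, (0.0, 0.0)) for i, t in enumerate(tokens) if i != index]
    if not others:
        return (wx, wy)
    mean_x = sum(x for x, _ in others) / len(others)
    mean_y = sum(y for _, y in others) / len(others)
    return (wx + 0.5 * mean_x, wy + 0.5 * mean_y)


result = isotopic_candidates(toy_encode, ["the", "cat", "sat"], 1, ["dog", "car"])
print([word for word, _ in result])
```

With the toy vectors, "dog" passes the threshold and "car" does not. In practice the encoder would be a pretrained contextual model, and the threshold would need to be tuned.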

Our evaluation on Google Translate and Transformer indicates that CAT achieves significant improvements over TransRepair. In particular, 1) CAT detects seven more types of bugs than TransRepair; 2) CAT detects 129% more translation bugs than TransRepair; 3) CAT repairs twice more bugs than TransRepair, many of which may have serious consequences if left unfixed; and 4) CAT is more efficient than TransRepair in input generation (0.01s vs. 0.41s) and comparably efficient in bug repair (1.92s vs. 1.34s).
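To put the reported figures on a common scale, the short calculation below converts them into ratios. It assumes that each timing pair lists CAT first and TransRepair second, which the abstract implies but does not state outright; the function and variable names are illustrative only.

```python
def times_as_many(percent_more: float) -> float:
    """Convert 'X% more' into a multiplicative factor."""
    return 1 + percent_more / 100


# Figures taken from the abstract; the CAT-first ordering of timings is an assumption.
CAT_INPUT_GEN_S, TRANSREPAIR_INPUT_GEN_S = 0.01, 0.41
CAT_REPAIR_S, TRANSREPAIR_REPAIR_S = 1.92, 1.34

detection_factor = times_as_many(129)
input_gen_speedup = TRANSREPAIR_INPUT_GEN_S / CAT_INPUT_GEN_S
repair_time_ratio = CAT_REPAIR_S / TRANSREPAIR_REPAIR_S

print(f"bugs detected: {detection_factor:.2f}x TransRepair")
print(f"input generation: {input_gen_speedup:.0f}x faster")
print(f"bug repair: {repair_time_ratio:.2f}x TransRepair's time")
```

Under these assumptions, "129% more" corresponds to roughly 2.3 times as many detected bugs, input generation is about 41 times faster, and repair takes about 1.4 times as long per bug.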

Information

Published In

ICSE '22: Proceedings of the 44th International Conference on Software Engineering

May 2022

2508 pages

ISBN:9781450392211

DOI:10.1145/3510003

General Chair: Matthew B Dwyer (University of Virginia)

Program Chairs: Daniela Damian (University of Victoria, Canada), Andreas Zeller (CISPA, Germany)

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGSOFT: ACM Special Interest Group on Software Engineering

In-Cooperation

IEEE CS

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

the Innovation and Technology Commission of HKSAR
the National Key Research and Development Program of China
the Luxembourg National Research Fund (FNR)
the ERC advanced
National Natural Science Foundation of China

Conference

ICSE '22

Sponsor: SIGSOFT

ICSE '22: 44th International Conference on Software Engineering

May 21 - 29, 2022

Pittsburgh, Pennsylvania

Acceptance Rates

Overall Acceptance Rate 276 of 1,856 submissions, 15%

Cited By

Xie X, Jin S, Chen S, Cheung S. (2024). Word Closure-Based Metamorphic Testing for Machine Translation. ACM Transactions on Software Engineering and Methodology. Online publication date: 1-Jul-2024.
https://dl.acm.org/doi/10.1145/3675396

Sun Z, Chen Z, Zhang J, Hao D. (2024). Fairness Testing of Machine Translation Systems. ACM Transactions on Software Engineering and Methodology 33:6 (1-27). Online publication date: 27-Jun-2024.
https://dl.acm.org/doi/10.1145/3664608

Chen Z, Zhang J, Hort M, Harman M, Sarro F. (2024). Fairness Testing: A Comprehensive Survey and Analysis of Trends. ACM Transactions on Software Engineering and Methodology 33:5 (1-59). Online publication date: 4-Jun-2024.
https://dl.acm.org/doi/10.1145/3652155

Yang J, Jiang J, Sun Z, Chen J, Christakis M, Pradel M. (2024). A Large-Scale Empirical Study on Improving the Fairness of Image Classification Models. Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis (210-222). Online publication date: 11-Sep-2024.
https://dl.acm.org/doi/10.1145/3650212.3652122

Wang G, Li Y, Liu Y, Deng G, Li T, Xu G, Liu Y, Wang H, Wang K, Lo D, Penta M, Xia X, Hu X. (2024). MeTMaP: Metamorphic Testing for Detecting False Vector Matching Problems in LLM Augmented Generation. Proceedings of the 2024 IEEE/ACM First International Conference on AI Foundation Models and Software Engineering (12-23). Online publication date: 14-Apr-2024.
https://dl.acm.org/doi/10.1145/3650105.3652297

Ma W, Yang C, Kästner C, Bosch J, Lewis G, Cleland-Huang J, Muccini H. (2024). (Why) Is My Prompt Getting Worse? Rethinking Regression Testing for Evolving LLM APIs. Proceedings of the IEEE/ACM 3rd International Conference on AI Engineering - Software Engineering for AI (166-171). Online publication date: 14-Apr-2024.
https://dl.acm.org/doi/10.1145/3644815.3644950

Jiang W, Zhai J, Ma S, Zhang X, Shen C. (2024). COSTELLO: Contrastive Testing for Embedding-Based Large Language Model as a Service Embeddings. Proceedings of the ACM on Software Engineering 1:FSE (906-928). Online publication date: 12-Jul-2024.
https://dl.acm.org/doi/10.1145/3643767

Zhang Q, Zhai J, Fang C, Liu J, Sun W, Hu H, Wang Q. (2024). Machine Translation Testing via Syntactic Tree Pruning. ACM Transactions on Software Engineering and Methodology 33:5 (1-39). Online publication date: 4-Jun-2024.
https://dl.acm.org/doi/10.1145/3640329

Wang J, Li Y, Chen Z, Chen L, Zhang X, Zhou Y, Roychoudhury A, Paiva A, Abreu R, Storey M. (2024). Knowledge Graph Driven Inference Testing for Question Answering Software. Proceedings of the IEEE/ACM 46th International Conference on Software Engineering (1-13). Online publication date: 20-May-2024.
https://dl.acm.org/doi/10.1145/3597503.3639109

Meng L, Li Y, Chen L, Ma M, Zhou Y, Xu B. (2024). Hybrid mutation driven testing for natural language inference. Journal of Software: Evolution and Process. Online publication date: 17-Jun-2024.
https://doi.org/10.1002/smr.2694
