Word, Subword or Character? An Empirical Study of Granularity in Chinese-English NMT

  • Conference paper
Machine Translation (CWMT 2017)

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 787))

Abstract

Neural machine translation (NMT) has become a new approach to machine translation and has been shown to outperform conventional statistical machine translation (SMT) across a variety of language pairs. Most existing NMT systems operate with a fixed vocabulary, but translation is an open-vocabulary problem. Hence, previous work mainly handles rare and unknown words by using different translation granularities, such as character, subword, and hybrid word-character. Although translation involving Chinese has proved to be one of the most difficult tasks, no study has yet demonstrated which translation granularity is most suitable for Chinese in NMT. In this paper, we conduct an extensive comparison using Chinese-English NMT as a case study. Furthermore, we discuss the advantages and disadvantages of the various translation granularities in detail. Our experiments show that the subword model performs best for Chinese-to-English translation, while the hybrid word-character model is most suitable for English-to-Chinese translation.
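To make the notion of translation granularity concrete, the following is a minimal sketch of two of the granularities named above: character-level segmentation and subword segmentation by byte pair encoding (BPE). It is a toy illustration and not the configuration used in the experiments. The function names (`learn_bpe`, `apply_bpe`, `merge_pair`, `to_characters`), the `</w>` end-of-word marker, and the tie-breaking rule are assumptions made for this example, and the hybrid word-character granularity is not covered.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(words, num_merges):
    """Learn up to `num_merges` merge operations from a list of training words."""
    # Each word starts as a sequence of characters plus an end-of-word marker
    # (the "</w>" marker is an assumption of this sketch).
    vocab = Counter(tuple(w) + ("</w>",) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count how often each adjacent symbol pair occurs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        # Pick the most frequent pair; ties are broken by the pair itself so the
        # result is deterministic (a choice made for this sketch).
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[merge_pair(symbols, best)] += freq
        vocab = new_vocab
    return merges


def apply_bpe(word, merges):
    """Segment one word into subword units by replaying the learned merges in order."""
    symbols = tuple(word) + ("</w>",)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


def to_characters(sentence):
    """Character-level granularity: every non-space character is one token."""
    return [c for c in sentence if not c.isspace()]


if __name__ == "__main__":
    training_words = ["low", "low", "lower", "lowest", "newest", "newest"]
    merges = learn_bpe(training_words, num_merges=10)
    print(apply_bpe("lowest", merges))   # subword units for a seen word
    print(apply_bpe("newer", merges))    # an unseen word still gets segmented
    print(to_characters("我 爱 机器 翻译"))  # character units for a segmented Chinese sentence
```

Because an unseen word always decomposes into known subword units, in the worst case single characters, both granularities avoid the unknown-word problem that a fixed word-level vocabulary faces.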

Notes

  1. The corpora include LDC2000T50, LDC2002T01, LDC2002E18, LDC2003E07, LDC2003E14, LDC2003T17 and LDC2004T07.

  2. https://github.com/isi-nlp/Zoph_RNN

  3. https://github.com/rsennrich/subword-nmt

  4. https://github.com/google/sentencepiece

Acknowledgments

The research work has been funded by the Natural Science Foundation of China under Grant Nos. 61333018 and 61402478, and it is also supported by the Strategic Priority Research Program of the CAS under Grant No. XDB02070007.

Author information

Corresponding author

Correspondence to Jiajun Zhang.

Copyright information

© 2017 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Wang, Y., Zhou, L., Zhang, J., Zong, C. (2017). Word, Subword or Character? An Empirical Study of Granularity in Chinese-English NMT. In: Wong, D., Xiong, D. (eds) Machine Translation. CWMT 2017. Communications in Computer and Information Science, vol 787. Springer, Singapore. https://doi.org/10.1007/978-981-10-7134-8_4

  • DOI: https://doi.org/10.1007/978-981-10-7134-8_4

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-7133-1

  • Online ISBN: 978-981-10-7134-8

  • eBook Packages: Computer Science (R0)
