Abstract
Sentiment classification system relies on high-quality emotional resources. However, these resources are imbalanced in different languages. The way of how to leverage rich labeled data of one language (source language) for the sentiment classification of resource-poor language (target language), namely cross-lingual sentiment classification (CLSC), becomes a focus topic. This paper utilizes rich English resources for Chinese sentiment classification. To eliminate the language gap between English and Chinese, this paper proposes a combination CLSC approach based on denoising autoencoder. First, two classifiers based on denoising autoencoder are learned respectively in English and Chinese views by using English corpus and English-to-Chinese corpus. Second, we classify Chinese test data and Chinese-to-English test data with the two classifiers trained in the two views. Last, the final sentiment classification results are obtained by the combination of the two results in two views. Experiments are carried out on NLP&CC 2013 CLSC dataset including book, DVD and music categories. The results show that our approach achieves the accuracy of 80.02%, which outperforms the current state-of-the-art systems.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Turney, P.D.: Thumbs up or thumbs down? Semantic Orientation Applied to Unsupervised Classification of Reviews. In: Proceedings of the 40th Annual Meeting of the Association for Computational Linguistic, pp. 417–424. Association for Computational Linguistics (2002)
Wei, B., Pal, C.: Cross lingual adaptation: an experiment on sentiment classifications. In: Proceedings of the ACL 2010 Conference Short Papers, pp. 258–262. Association for Computational Linguistics (2010)
Zhao, Y.Y., Qin, B., Liu, T.: Sentiment Analysis (in Chinese). Journal of Software 21(8), 1834–1848 (2010)
Pang, B., Lee, L., Vaithyanathan, S.: Thumbs up? Sentiment Classification using Machine Learning Techniques. In: Proceedings of the ACL-2002 Conference on Empirical Methods in Natural Language Processing, vol. 10, pp. 79–86. Association for Computational Linguistics (2002)
Kennedy, A., Inkpen, D.: Sentiment classification of movie reviews using contextual valence shifters. Computational Intelligence 22(2), 110–125 (2006)
Li, S., Xia, R., Zong, C.Q., Huang, C.R.: A framework of feature selection methods for text categorization. In: Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th IJFNLP of the AFNLP, pp. 692–700. ACL and AFNLP (2009)
Wan, X.J.: Co-training for cross-lingual sentiment classification. In: Proceedings of the 47th Annual Meeting of the ACL and the 4th IJCNLP of the AFNLP, vol. 1, pp. 235–243. Association for Computational Linguistics (2009)
Gui, L., Xu, R., Xu, J., Yuan, L., Yao, Y., Zhou, J., Qiu, Q., Wang, S., Wong, K.-F., Cheung, R.: A Mixed Model for Cross Lingual Opinion Analysis. In: Zhou, G., Li, J., Zhao, D., Feng, Y. (eds.) NLPCC 2013. CCIS, vol. 400, pp. 93–104. Springer, Heidelberg (2013)
Li, S., Wang, R., Liu, H., Huang, C.-R.: Active Learning for Cross-Lingual Sentiment Classification. In: Zhou, G., Li, J., Zhao, D., Feng, Y. (eds.) NLPCC 2013. CCIS, vol. 400, pp. 236–246. Springer, Heidelberg (2013)
Chen, Q., He, Y.X., Liu, X.L., Sun, S.T., Peng, M., Li, F.: Cross-Language Sentiment Analysis Based on Parser. Acta Scientiarum Naturalium Universitatis Pekinensis 50(1), 55–60 (2014) (in Chinese)
Bengio, Y.: Learning deep architectures for AI. Foundations and Trends in Machine Learning 2, 1–127 (2009)
Bengio, Y., Delalleau, O.: On the expressive power of deep architectures. In: Kivinen, J., Szepesvári, C., Ukkonen, E., Zeugmann, T. (eds.) ALT 2011. LNCS, vol. 6925, pp. 18–36. Springer, Heidelberg (2011)
Bengio, Y., LeCun, Y.: Scaling learning algorithms towards AI. Large-Scale Kernel Machines 34, 1–41 (2007)
Hinton, G., Salakhutdinov, R.: Reducing the dimensionality of data with neural networks. Science 313(5786), 504–507 (2006)
Bengio, Y., Lamblin, P., Popovici, D., Larochelle, H.: Greedy layer-wise training of deep networks. Advances in Neural Information Processing Systems 19, 153 (2007)
Vincent, P., Larochelle, H., Bengio, Y., Manzagol, P.: Extracting and composing robust features with denoising autoencoders. In: Proceedings of the 25th International Conference on Machine Learning, pp. 1096–1103. ACM, New York (2008)
Farabet, C., Couprie, C., Najman, L., LeCun, Y.: Learning hierarchical features for scene labeling. IEEE Transactions on Pattern Analysis and Machine Intelligence 35(8), 1915–1929 (2013)
Dahl, G.E., Yu, D., Deng, L., Acero, A.: Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition. IEEE Transactions on Audio, Speech, and Language Processing 20(1), 30–42 (2012)
Collobert, R., Weston, J., Bottou, L., Karlen, M., Kavukcuoglu, K., Kuksa, P.: Natural language processing (almost) from scratch. The Journal of Machine Learning Research 12, 2493–2537 (2011)
Tang, D., Qin, B., Liu, T., Li, Z.: Learning Sentence Representation for Emotion Classification on Microblogs. In: Zhou, G., Li, J., Zhao, D., Feng, Y. (eds.) NLPCC 2013. CCIS, vol. 400, pp. 212–223. Springer, Heidelberg (2013)
Zhou, S.S., Chen, Q.C., Wang, X.L.: Active deep networks for semi-supervised sentiment classification. In: COLING 2010 Proceedings of the 23rd International Conference on Computational Linguistics, pp. 1515–1523. Association for Computational Linguistics (2010)
Galavotti, L., Sebastiani, F., Simi, M.: Feature selection and negative evidence in automated text categorization. In: Proceedings of KDD (2000)
Salton, G., Buckley, C.: Term-weighting approaches in automatic text retrieval. Information processing & management 24(5), 513–523 (1988)
Bergstra, J., Breuleux, O., Bastien, F., et al.: Theano: a CPU and GPU math expression compiler. In: Proceedings of the Python for Scientific Computing Conference, SciPy (2010)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhou, H., Chen, L., Huang, D. (2014). Cross-Lingual Sentiment Classification Based on Denoising Autoencoder. In: Zong, C., Nie, JY., Zhao, D., Feng, Y. (eds) Natural Language Processing and Chinese Computing. NLPCC 2014. Communications in Computer and Information Science, vol 496. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-45924-9_17
Download citation
DOI: https://doi.org/10.1007/978-3-662-45924-9_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-45923-2
Online ISBN: 978-3-662-45924-9
eBook Packages: Computer ScienceComputer Science (R0)