Abstract
Feature summarization is an important problem in opinion mining of product reviews. Current methods mostly cluster feature expressions using unsupervised learning based on lexical similarity or context similarity. Although several semi-supervised methods have been proposed to address the problem, their labeled sets must be defined manually, which requires professional knowledge to decide whether features belong to the same aspect or to different aspects. In this paper, we propose a semi-supervised method that uses domain information from web sites to generate the labeled set and builds a novel context-information model with the EM method to solve the problem. Experimental results show that the proposed method achieves better performance than existing approaches.
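The abstract does not give the details of the model, so the following is only a minimal sketch of the general idea of semi-supervised EM over context words, not the authors' method. It assumes a multinomial naive Bayes classifier, represents each feature expression by a bag of context words, and uses a tiny hand-written seed set in place of labels generated from web-site domain information. All function names, the toy data, and the smoothing parameter `alpha` are hypothetical.

```python
import math
from collections import Counter


def train_nb(docs, resp, n_classes, vocab, alpha=1.0):
    """M-step: estimate log priors and log word likelihoods from soft labels."""
    priors = [alpha] * n_classes
    word_counts = [Counter() for _ in range(n_classes)]
    for doc, r in zip(docs, resp):
        for c in range(n_classes):
            priors[c] += r[c]
            for w in doc:
                word_counts[c][w] += r[c]
    total = sum(priors)
    log_prior = [math.log(p / total) for p in priors]
    log_lik = []
    for c in range(n_classes):
        # Laplace-style smoothing so unseen words keep non-zero probability.
        denom = sum(word_counts[c].values()) + alpha * len(vocab)
        log_lik.append(
            {w: math.log((word_counts[c][w] + alpha) / denom) for w in vocab}
        )
    return log_prior, log_lik


def posterior(doc, log_prior, log_lik):
    """E-step for one item: P(aspect | context words) under naive Bayes."""
    scores = [
        lp + sum(ll[w] for w in doc if w in ll)
        for lp, ll in zip(log_prior, log_lik)
    ]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def semi_supervised_em(labeled, unlabeled, n_classes, n_iter=10):
    """Fit on labeled seeds, then alternate E- and M-steps over all data."""
    vocab = sorted({w for d, _ in labeled for w in d} | {w for d in unlabeled for w in d})
    lab_docs = [d for d, _ in labeled]
    # Labeled items keep hard (one-hot) assignments throughout.
    lab_resp = [[1.0 if c == y else 0.0 for c in range(n_classes)] for _, y in labeled]
    model = train_nb(lab_docs, lab_resp, n_classes, vocab)
    for _ in range(n_iter):
        unl_resp = [posterior(d, *model) for d in unlabeled]
        model = train_nb(lab_docs + unlabeled, lab_resp + unl_resp, n_classes, vocab)
    return model


# Toy data (hypothetical): context words for feature expressions.
# Aspect 0 stands for "image quality", aspect 1 for "battery".
labeled = [
    (["clear", "sharp", "blurry"], 0),   # seed for "picture"
    (["lasts", "charge", "hours"], 1),   # seed for "battery"
]
unlabeled = [
    ["sharp", "clear", "color"],         # contexts of "photo"
    ["charge", "hours", "drain"],        # contexts of "power"
]
model = semi_supervised_em(labeled, unlabeled, n_classes=2)
```

Calling `posterior(["sharp", "clear", "color"], *model)` returns a probability for each aspect. In this toy setup the unlabeled expressions are pulled toward the seed whose context words they share, which is the intuition behind grouping feature expressions by context similarity.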
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wang, T., Cai, Y., Zhang, G., Liu, Y., Chen, J., Min, H. (2013). Product Feature Summarization by Incorporating Domain Information. In: Hong, B., Meng, X., Chen, L., Winiwarter, W., Song, W. (eds) Database Systems for Advanced Applications. DASFAA 2013. Lecture Notes in Computer Science, vol 7827. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40270-8_19
DOI: https://doi.org/10.1007/978-3-642-40270-8_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40269-2
Online ISBN: 978-3-642-40270-8
eBook Packages: Computer Science, Computer Science (R0)