CCRM: An Effective Algorithm for Mining Commodity Information from Threaded Chinese Customer Reviews

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2007)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4426)

Abstract

This paper addresses the problem of mining commodity information from threaded Chinese customer reviews. Chinese online commodity forums, which are developing rapidly, provide a good environment for customers to share reviews. However, because of noise and navigational limitations, it is hard to get a clear view of a commodity from thousands of related reviews. Furthermore, because Chinese and English have different characteristics, research approaches may differ considerably. This paper aims to automatically mine key information from commodity reviews. An effective algorithm, the Chinese Commodity Review Miner (CCRM), is proposed. The algorithm has two parts. First, we propose an efficient rule-based algorithm for commodity feature extraction, together with a probabilistic model for feature ranking. Second, we propose a top-down algorithm that reorganizes the extracted features into a hierarchical structure. A prototype system based on CCRM is also implemented. Using CCRM, users can easily obtain an outline of a commodity and navigate freely within it.
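The abstract gives only the outline of the pipeline (rule-based extraction, probabilistic ranking, top-down reorganization), not its details. The toy sketch below illustrates that outline and nothing more. Every specific in it is an assumption made for illustration, not the paper's method: the opinion lexicon `OPINION_WORDS`, the "token followed by an opinion word" rule, relative frequency as the ranking score, substring containment as the parent-child test, and the function names. Reviews are assumed to be already word-segmented.

```python
from collections import Counter

# Hypothetical opinion lexicon; a real system would need a far larger one.
OPINION_WORDS = {"好", "差", "清晰", "耐用"}

def extract_features(reviews):
    """Assumed rule: a token directly followed by an opinion word is a candidate feature."""
    counts = Counter()
    for tokens in reviews:
        for cur, nxt in zip(tokens, tokens[1:]):
            if nxt in OPINION_WORDS and cur not in OPINION_WORDS:
                counts[cur] += 1
    return counts

def rank_features(counts):
    """Score by relative frequency, a simple stand-in for a probabilistic ranking model."""
    total = sum(counts.values())
    scored = [(feat, n / total) for feat, n in counts.items()]
    # Highest score first; ties are broken by name so the order is deterministic.
    return sorted(scored, key=lambda item: (-item[1], item[0]))

def build_hierarchy(root, features):
    """Top-down placement: shorter names first, each feature attached under the
    longest already-placed feature contained in its name, otherwise under the root."""
    tree = {root: []}
    for feat in sorted(features, key=lambda f: (len(f), f)):
        parents = [p for p in tree if p != root and p in feat]
        parent = max(parents, key=len) if parents else root
        tree[parent].append(feat)
        tree[feat] = []
    return tree

# Pre-segmented toy reviews about a phone (手机).
reviews = [
    ["屏幕", "清晰", "电池", "耐用"],
    ["屏幕", "好", "屏幕亮度", "差"],
    ["电池", "差"],
]
counts = extract_features(reviews)
ranked = rank_features(counts)
tree = build_hierarchy("手机", [feat for feat, _ in ranked])
```

Here `tree` maps each node to its list of children, so a feature such as 屏幕亮度 (screen brightness) ends up under 屏幕 (screen) rather than directly under the commodity root, which is the kind of navigable outline the abstract describes.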

Editor information

Zhi-Hua Zhou, Hang Li, Qiang Yang

Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

Duan, H., Bao, S., Yu, Y. (2007). CCRM: An Effective Algorithm for Mining Commodity Information from Threaded Chinese Customer Reviews. In: Zhou, ZH., Li, H., Yang, Q. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2007. Lecture Notes in Computer Science, vol 4426. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71701-0_48

  • DOI: https://doi.org/10.1007/978-3-540-71701-0_48

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-71700-3

  • Online ISBN: 978-3-540-71701-0

  • eBook Packages: Computer Science, Computer Science (R0)
