DOI: 10.1145/1341531.1341560

Opinion spam and analysis

Published: 11 February 2008

Abstract

Evaluative texts on the Web have become a valuable source of opinions on products, services, events, individuals, etc. Recently, many researchers have studied such opinion sources as product reviews, forum posts, and blogs. However, existing research has focused on classification and summarization of opinions using natural language processing and data mining techniques. An important issue that has been neglected so far is opinion spam, or the trustworthiness of online opinions. In this paper, we study this issue in the context of product reviews, which are opinion-rich and widely used by consumers and product manufacturers. In the past two years, several startup companies have also appeared that aggregate opinions from product reviews. It is thus high time to study spam in reviews. To the best of our knowledge, there is still no published study on this topic, although Web spam and email spam have been investigated extensively. We will see that opinion spam is quite different from Web spam and email spam, and thus requires different detection techniques. Based on an analysis of 5.8 million reviews and 2.14 million reviewers from amazon.com, we show that opinion spam in reviews is widespread. This paper analyzes such spam activities and presents some novel techniques to detect them.


Published In

WSDM '08: Proceedings of the 2008 International Conference on Web Search and Data Mining
February 2008
270 pages
ISBN: 9781595939272
DOI: 10.1145/1341531

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. fake reviews
  2. opinion spam
  3. review analysis
  4. review spam

Qualifiers

  • Research-article

Acceptance Rates

Overall Acceptance Rate 498 of 2,863 submissions, 17%
