DOI: 10.1145/2024288.2024306
research-article

Privacy-aware spam detection in social bookmarking systems

Published: 07 September 2011

Abstract

With the increased popularity of Web 2.0 services in recent years, data privacy has become a major concern for users. The more personal data users reveal, the more difficult it becomes to control its disclosure on the web. For Web 2.0 service providers, however, the data provided by users is a valuable source for offering effective, personalised data mining services. One major application is the detection of spam in social bookmarking systems: to prevent a decrease in content quality, providers need to identify spammers and exclude them from the system. Providers thereby face a conflict of interest: on the one hand, they need to identify spammers based on the information they collect about users; on the other hand, they need to respect privacy concerns and process as little personal data as possible. It would therefore be of great help for system developers and users to know which personal data are needed for spam detection and which can be ignored. In this paper we address these questions by presenting a data-privacy-aware feature engineering approach. It consists of designing features for spam classification and evaluating them according to both performance and privacy conditions. Experiments using data from the social bookmarking system BibSonomy show that the two conditions need not exclude each other.


Cited By

  • (2012) Mining social media: key players, sentiments, and communities. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery 2:5 (411-419). DOI: 10.1002/widm.1069. Online publication date: 1-Sep-2012

Published In

i-KNOW '11: Proceedings of the 11th International Conference on Knowledge Management and Knowledge Technologies
September 2011
306 pages
ISBN:9781450307321
DOI:10.1145/2024288
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tag

  1. privacy social-bookmarking spam-detection

Qualifiers

  • Research-article

Conference

i-KNOW '11

Acceptance Rates

Overall Acceptance Rate 77 of 238 submissions, 32%
