DOI: 10.5555/2671225.2671229
Article

XRay: enhancing the web's transparency with differential correlation

Published: 20 August 2014

Abstract

Today's Web services, such as Google, Amazon, and Facebook, leverage user data for varied purposes, including personalizing recommendations, targeting advertisements, and adjusting prices. At present, users have little insight into how their data is being used. Hence, they cannot make informed choices about the services they use.
To increase transparency, we developed XRay, the first fine-grained, robust, and scalable personal data tracking system for the Web. XRay predicts which data in an arbitrary Web account (such as emails, searches, or viewed products) is being used to target which outputs (such as ads, recommended products, or prices). XRay's core functions are service agnostic and easy to instantiate for new services, and they can track data within and across services. To make predictions independent of the audited service, XRay relies on the following insight: by comparing outputs from different accounts with similar, but not identical, subsets of data, one can pinpoint targeting through correlation. We show, both theoretically and through experiments on Gmail, Amazon, and YouTube, that XRay achieves high precision and recall by correlating data from a surprisingly small number of extra accounts.
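
The insight of comparing accounts that hold overlapping subsets of data can be illustrated with a small sketch. This is not XRay's actual algorithm, which the abstract does not specify. It is a toy scoring rule, assumed here for illustration, that measures how much more often an output appears in accounts containing a given input than in accounts lacking it. The function name `targeting_scores`, the account contents, and the ad label are all hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Each account is a pair: (inputs placed in the account, outputs observed).
Account = Tuple[Set[str], Set[str]]


def targeting_scores(accounts: List[Account], output: str) -> Dict[str, float]:
    """Toy differential score (an assumption, not XRay's method).

    For each input, return the rate at which `output` appears in accounts
    that contain the input minus the rate in accounts that do not.
    """
    all_inputs: Set[str] = set().union(*(ins for ins, _ in accounts))
    scores: Dict[str, float] = {}
    for item in sorted(all_inputs):
        with_item = [outs for ins, outs in accounts if item in ins]
        without_item = [outs for ins, outs in accounts if item not in ins]
        # with_item is never empty: item was drawn from some account.
        rate_with = sum(output in o for o in with_item) / len(with_item)
        rate_without = (
            sum(output in o for o in without_item) / len(without_item)
            if without_item
            else 0.0
        )
        scores[item] = rate_with - rate_without
    return scores


# Hypothetical accounts with similar, but not identical, subsets of emails.
accounts: List[Account] = [
    ({"flight email", "recipe email"}, {"airline ad"}),
    ({"flight email", "job email"}, {"airline ad"}),
    ({"recipe email", "job email"}, set()),
    ({"job email"}, set()),
]

scores = targeting_scores(accounts, "airline ad")
best = max(scores, key=scores.get)
print(best, scores[best])
```

In this made-up data the ad appears in every account holding the flight email and in no account without it, so that input receives the highest score. A real audit would also have to handle noise and outputs targeted on combinations of inputs, which this sketch ignores.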



Information & Contributors

Information

Published In

SEC'14: Proceedings of the 23rd USENIX conference on Security Symposium
August 2014
1067 pages
ISBN: 9781931971157
  • Program Chair: Kevin Fu

Sponsors

  • Akamai
  • Google Inc.
  • IBM Research
  • NSF
  • Microsoft Research
  • USENIX Association

Publisher

USENIX Association

United States

Qualifiers

  • Article

Cited By

  • (2019) FAIRY. Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining, pages 240-248. DOI: 10.1145/3289600.3290990. Online publication date: 30-Jan-2019
  • (2018) Treads. Proceedings of the 17th ACM Workshop on Hot Topics in Networks, pages 169-175. DOI: 10.1145/3286062.3286089. Online publication date: 15-Nov-2018
  • (2018) The Accuracy of the Demographic Inferences Shown on Google's Ad Settings. Proceedings of the 2018 Workshop on Privacy in the Electronic Society, pages 33-41. DOI: 10.1145/3267323.3268962. Online publication date: 15-Oct-2018
  • (2018) Investigating the Impact of Gender on Rank in Resume Search Engines. Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, pages 1-14. DOI: 10.1145/3173574.3174225. Online publication date: 21-Apr-2018
  • (2018) Unpacking Perceptions of Data-Driven Inferences Underlying Online Targeting and Personalization. Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, pages 1-12. DOI: 10.1145/3173574.3174067. Online publication date: 21-Apr-2018
  • (2017) Enhancing Privacy Using Crowdsourcing Mechanisms. Proceedings of the 21st Pan-Hellenic Conference on Informatics, pages 1-5. DOI: 10.1145/3139367.3139405. Online publication date: 28-Sep-2017
  • (2017) Bias in Online Freelance Marketplaces. Proceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing, pages 1914-1933. DOI: 10.1145/2998181.2998327. Online publication date: 25-Feb-2017
  • (2017) Uncovering Influence Cookbooks. Proceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing, pages 1413-1418. DOI: 10.1145/2998181.2998257. Online publication date: 25-Feb-2017
  • (2016) Online Tracking. Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, pages 1388-1401. DOI: 10.1145/2976749.2978313. Online publication date: 24-Oct-2016
  • (2016) Don’t Let Google Know I’m Lonely. ACM Transactions on Privacy and Security, 19:1, pages 1-25. DOI: 10.1145/2937754. Online publication date: 5-Aug-2016
