DOI: 10.1145/1401890.1401982
research-article

Anonymizing transaction databases for publication

Published: 24 August 2008

Abstract

This paper considers the problem of publishing "transaction data" for research purposes. Each transaction is an arbitrary set of items chosen from a large universe. Detailed transaction data provides an electronic image of a person's life, which has two implications. First, transaction data is an excellent candidate for data mining research. Second, using transaction data raises serious concerns about individual privacy. Therefore, before transaction data is released for data mining, it must be anonymized so that data subjects cannot be re-identified. The challenge is that transaction data has no structure and can be extremely high dimensional, so traditional anonymization methods lose too much information on such data. To date, no satisfactory privacy notion or solution has been proposed for anonymizing transaction data. This paper proposes one way to address this issue.
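
As a rough illustration of the data model described in the abstract (not the paper's method), the sketch below treats each transaction as a set of items, encodes the database with one 0/1 column per item in the universe, and lists the transactions that have no identical twin. The toy data and the helper names `universe`, `to_binary_rows`, and `unique_transactions` are assumptions made up for this example.

```python
from collections import Counter

# Hypothetical toy database: each transaction is an arbitrary set of items.
transactions = [
    {"milk", "bread"},
    {"milk", "bread"},
    {"beer", "diapers", "milk"},
    {"insulin", "bread"},
]


def universe(db):
    """Return the sorted list of all distinct items that appear in db."""
    return sorted(set().union(*db))


def to_binary_rows(db):
    """Encode each transaction as a 0/1 vector with one column per item."""
    items = universe(db)
    return [[1 if item in t else 0 for item in items] for t in db]


def unique_transactions(db):
    """Return the transactions that occur exactly once in db."""
    counts = Counter(frozenset(t) for t in db)
    return [set(t) for t, n in counts.items() if n == 1]


items = universe(transactions)            # one dimension per distinct item
rows = to_binary_rows(transactions)       # mostly zeros when the universe is large
singletons = unique_transactions(transactions)

print(len(items), "dimensions for", len(transactions), "transactions")
print("transactions with no identical twin:", singletons)
```

With a realistic universe of thousands of items, each row would be almost entirely zeros and most transactions would be unique, which gives an intuition for why the abstract calls such data high dimensional and hard to anonymize without heavy information loss.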

Published In

KDD '08: Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
August 2008
1116 pages
ISBN: 9781605581934
DOI: 10.1145/1401890
General Chair: Ying Li
Program Chairs: Bing Liu, Sunita Sarawagi
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. anonymity
  2. data publishing
  3. privacy
  4. transaction database

Qualifiers

  • Research-article

Conference

KDD08

Acceptance Rates

KDD '08 Paper Acceptance Rate 118 of 593 submissions, 20%;
Overall Acceptance Rate 1,133 of 8,635 submissions, 13%
