Ordinal, Continuous and Heterogeneous k-Anonymity Through Microaggregation

Josep Domingo-Ferrer¹ &
Vicenç Torra²

Abstract

k-Anonymity is a useful concept for resolving the tension between data utility and respondent privacy in individual data (microdata) protection. However, the generalization and suppression approach proposed in the literature to achieve k-anonymity is not equally suited to all types of attributes: (i) generalization/suppression is one of the few possibilities for nominal categorical attributes; (ii) it is just one possibility for ordinal categorical attributes, and one that does not always preserve ordinality; and (iii) it is completely unsuitable for continuous attributes, as it causes them to lose their numerical meaning. Since attributes leading to disclosure (and thus needing k-anonymization) may be nominal, ordinal or continuous, it is important to devise k-anonymization procedures that preserve the semantics of each attribute type as much as possible. In this paper we propose categorical microaggregation as an alternative to generalization/suppression for nominal and ordinal k-anonymization; we also propose continuous microaggregation as the method for continuous k-anonymization.
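
To make the idea of continuous microaggregation concrete, the following is a minimal sketch and not the procedure evaluated in the paper. It assumes a single continuous attribute, sorts the records, forms groups of at least k neighbouring values, and replaces each value by its group mean, so that every released value is shared by at least k records while staying numerical. The function names `microaggregate` and `is_k_anonymous` and the sample ages are hypothetical, chosen only for illustration.

```python
from collections import Counter


def microaggregate(values, k):
    """Univariate fixed-size microaggregation (illustrative sketch only).

    Records are sorted by value and cut into groups of k consecutive
    records; each value is replaced by the mean of its group.
    """
    if k < 1 or len(values) < k:
        raise ValueError("need k >= 1 and at least k records")

    # Indices of the records, ordered by attribute value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    n_groups = len(values) // k
    masked = [0.0] * len(values)

    for g in range(n_groups):
        start = g * k
        # The last group absorbs the remainder, so it holds k to 2k-1 records.
        end = len(values) if g == n_groups - 1 else start + k
        group = order[start:end]
        mean = sum(values[i] for i in group) / len(group)
        for i in group:
            masked[i] = mean

    return masked


def is_k_anonymous(masked, k):
    """Check that every released value is shared by at least k records."""
    return all(count >= k for count in Counter(masked).values())


# Hypothetical sample data: seven ages, protected with k = 3.
ages = [23, 25, 31, 38, 41, 47, 52]
masked_ages = microaggregate(ages, 3)
print(masked_ages)
print(is_k_anonymous(masked_ages, 3))
```

Unlike generalization to an interval such as "20 to 35", the masked values remain numbers, so means and other numerical statistics can still be computed on the released data; group means also leave the overall mean of the attribute unchanged.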

Acknowledgments

Francesc Sebé's help in obtaining the results reported for continuous data is gratefully acknowledged. Comments by William Winkler were also particularly useful to improve this paper. This work was partly funded by the Spanish Ministry of Science and Technology and the European FEDER Fund under project TIC2001-0633-C03-01/03 “STREAMOBILE” and also by the Spanish Ministry of Education and Science under project SEG2004-04352-C04-01/02 “PROPRIETAS”.

Author information

Authors and Affiliations

Department of Computer Engineering and Maths, Rovira i Virgili University of Tarragona, Av. Països Catalans 26, E-43007, Tarragona, Catalonia, Spain
Josep Domingo-Ferrer
Institut d'Investigació en Intel·ligència Artificial-CSIC, Campus UAB, E-08193, Bellaterra, Catalonia, Spain
Vicenç Torra

Corresponding author

Correspondence to Josep Domingo-Ferrer.

Additional information

Editor: Geoff Webb

About this article

Cite this article

Domingo-Ferrer, J., Torra, V. Ordinal, Continuous and Heterogeneous k-Anonymity Through Microaggregation. Data Min Knowl Disc 11, 195–212 (2005). https://doi.org/10.1007/s10618-005-0007-5

Received: 27 October 2004
Accepted: 14 April 2005
Published: 23 August 2005
Issue Date: September 2005
DOI: https://doi.org/10.1007/s10618-005-0007-5
