Mixing Genetic Algorithms and V-MDAV to Protect Microdata

Agusti Solanas, Úrsula González-Nicolás & Antoni Martínez-Ballesté

Part of the book series: Studies in Computational Intelligence ((SCI,volume 394))

Abstract

Protecting the privacy of individuals whose data are released to untrusted parties is a problem that has held the attention of the scientific community for years. Several techniques have been proposed to address it. Among them, microaggregation provides a good trade-off between information loss and disclosure risk, and much effort has therefore been devoted to its study. Microaggregation is a statistical disclosure control (SDC) technique that protects the privacy of individual respondents by aggregating the information of similar respondents so that they become indistinguishable. Although microaggregation is an attractive approach, optimally microaggregating multivariate data sets is known to be an NP-hard problem. Consequently, heuristics have been suggested as a way to solve the problem in reasonable time. Specifically, genetic algorithms (GA) have been shown to find good solutions to the microaggregation problem for small multivariate data sets. However, due to the very nature of the problem, GA can hardly cope with large multivariate data sets. To apply GA to large data sets, these must first be partitioned into smaller disjoint subsets that the GA can handle separately. In this chapter, we summarise several proposals for partitioning data sets so that GA can be applied to microaggregate them. In addition, we elaborate on a partitioning strategy based on the variable-MDAV (V-MDAV) algorithm and study the effect of several parameters, such as the dimension, the aggregation parameter (k), and the size of the data sets. We also compare it with the most relevant previous proposals.
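
To make the idea of microaggregation in the abstract concrete, the following is a minimal Python sketch, not the chapter's method. It assumes numerical records, Euclidean distance, and a simple greedy fixed-size heuristic (seed on the record farthest from the centroid, then group it with its nearest neighbours); it is neither V-MDAV nor a genetic algorithm. The function names `microaggregate`, `anonymise` and `sse` are hypothetical and chosen only for illustration. Information loss is measured here as the within-group sum of squared errors, which is one common choice and an assumption on our part.

```python
import random


def centroid(records):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(records)
    return tuple(sum(col) / n for col in zip(*records))


def sq_dist(a, b):
    """Squared Euclidean distance between two records."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def microaggregate(records, k):
    """Greedy grouping in which every group holds between k and 2k - 1 records.

    Returns a list of groups, each a list of indices into `records`.
    """
    if k < 1 or len(records) < k:
        raise ValueError("need k >= 1 and at least k records")
    remaining = list(range(len(records)))
    groups = []
    while len(remaining) >= 2 * k:
        c = centroid([records[i] for i in remaining])
        # Seed: the remaining record farthest from the current centroid.
        seed = max(remaining, key=lambda i: sq_dist(records[i], c))
        # Sort by distance to the seed; the tie-break keeps the seed first.
        remaining.sort(key=lambda i: (sq_dist(records[i], records[seed]), i != seed))
        groups.append(remaining[:k])   # the seed plus its k - 1 nearest records
        remaining = remaining[k:]
    # Between k and 2k - 1 records are left; they form the final group.
    groups.append(remaining)
    return groups


def anonymise(records, groups):
    """Replace every record by the centroid of the group it belongs to."""
    masked = [None] * len(records)
    for group in groups:
        c = centroid([records[i] for i in group])
        for i in group:
            masked[i] = c
    return masked


def sse(records, masked):
    """Information loss as the sum of squared errors (lower is better)."""
    return sum(sq_dist(r, m) for r, m in zip(records, masked))


random.seed(0)
data = [(random.random(), random.random()) for _ in range(20)]
groups = microaggregate(data, k=3)
masked = anonymise(data, groups)
print("groups:", len(groups), "information loss (SSE):", sse(data, masked))
```

In the masked data every record shares its values with at least k - 1 others, which is what makes respondents indistinguishable within a group. A larger k lowers disclosure risk but tends to raise the SSE, which is the trade-off the abstract refers to.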

Author information

Authors and Affiliations

UNESCO Chair in Data Privacy in the Department of Computer Engineering and Mathematics, Universitat Rovira i Virgili, Av. Paisos Catalans 26, 43007, Tarragona, Catalonia, Spain
Agusti Solanas, Úrsula González-Nicolás & Antoni Martínez-Ballesté

Corresponding author

Correspondence to Agusti Solanas.

Editor information

Editors and Affiliations

The Gateway, De Montfort University, Leicester, LE1 9BH, United Kingdom
David A. Elizondo
Department of Computer Engineering and, Universitat Rovira i Virgili, Av. Paisos Catalans 26, Tarragona, 43007, Spain
Agusti Solanas
Department of Computer Engineering and, Universitat Rovira i Virgili, Av. Paisos Catalans 26, Tarragona, 43007, Spain
Antoni Martinez-Balleste

About this chapter

Cite this chapter

Solanas, A., González-Nicolás, Ú., Martínez-Ballesté, A. (2012). Mixing Genetic Algorithms and V-MDAV to Protect Microdata. In: Elizondo, D., Solanas, A., Martinez-Balleste, A. (eds) Computational Intelligence for Privacy and Security. Studies in Computational Intelligence, vol 394. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25237-2_8

DOI: https://doi.org/10.1007/978-3-642-25237-2_8
Published: 10 January 2012
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-25236-5
Online ISBN: 978-3-642-25237-2
eBook Packages: Engineering, Engineering (R0)
