Dataset Anonyization on Cloud: Open Problems and Perspectives

Matteo Cristani & Claudio Tomazzoli

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11609)

Included in the following conference series:

International Conference on Web Engineering

Abstract

Data anonymization is the process of transforming the information contained in a group of data so that, after the process, it is not possible to identify unique references to single elements of the group. When applied to datasets used for statistical inference, the process is bound to exhibit analogous behaviour on certain indices before and after anonymization. In this paper we study the anonymization pipeline for datasets when the pipeline is managed on cloud technology, where cryptography is not applicable at all because the datasets are available in an open setting. We examine the open problems and devise a method to address them in a logical framework.
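The requirement that certain indices behave analogously before and after anonymization can be illustrated with a small sketch. The code below is not the method proposed in the paper; it assumes microaggregation, a standard technique swapped in purely for illustration, and the function name `microaggregate`, the parameter `k`, and the sample ages are all hypothetical. Each value is replaced by the mean of a group of at least `k` neighbouring values, so no single element keeps a unique value, while the overall mean is preserved and the variance cannot increase.

```python
from statistics import mean, pvariance


def microaggregate(values, k):
    """Replace each value with the mean of its group of >= k sorted neighbours.

    Illustrative assumption only: this is generic microaggregation,
    not the pipeline described in the paper.
    """
    if k < 1 or len(values) < k:
        raise ValueError("need k >= 1 and at least k values")

    # Indices of the values in ascending order of value.
    order = sorted(range(len(values)), key=lambda i: values[i])

    # Cut the sorted indices into consecutive groups of size k.
    groups = [order[i:i + k] for i in range(0, len(order), k)]

    # A short trailing group is merged into the previous one,
    # so that every group has at least k members.
    if len(groups) > 1 and len(groups[-1]) < k:
        tail = groups.pop()
        groups[-1].extend(tail)

    out = [0.0] * len(values)
    for group in groups:
        centroid = mean(values[i] for i in group)
        for i in group:
            out[i] = centroid
    return out


if __name__ == "__main__":
    ages = [23, 25, 31, 38, 41, 47, 52]  # hypothetical sample data
    anonymized = microaggregate(ages, k=3)
    print("mean before/after:", mean(ages), mean(anonymized))
    print("variance before/after:", pvariance(ages), pvariance(anonymized))
```

In this sketch the mean is an index that survives anonymization unchanged (up to floating-point rounding), whereas the variance is only bounded by its original value; which indices must be preserved, and how closely, is exactly the kind of question the abstract raises.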

Notes

1. We introduce here the notion of analytical properties in terms of statistical measures, the most common properties desired in dataset anonymization.

Author information

Authors and Affiliations

Department of Computer Science, University of Verona, Verona, Italy
Matteo Cristani & Claudio Tomazzoli

Authors

Matteo Cristani
Claudio Tomazzoli

Corresponding author

Correspondence to Matteo Cristani.

Editor information

Editors and Affiliations

Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Milan, Italy
Marco Brambilla
Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Milan, Italy
Cinzia Cappiello
Department of Software Engineering, University of Malaya, Kuala Lumpur, Malaysia
Siew Hock Ow

About this paper

Cite this paper

Cristani, M., Tomazzoli, C. (2020). Dataset Anonyization on Cloud: Open Problems and Perspectives. In: Brambilla, M., Cappiello, C., Ow, S. (eds) Current Trends in Web Engineering. ICWE 2019. Lecture Notes in Computer Science, vol 11609. Springer, Cham. https://doi.org/10.1007/978-3-030-51253-8_9

DOI: https://doi.org/10.1007/978-3-030-51253-8_9
Published: 30 June 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-51252-1
Online ISBN: 978-3-030-51253-8
eBook Packages: Computer Science, Computer Science (R0)
