DOI: 10.1007/978-3-030-31578-8_27

A Framework with Randomized Encoding for a Fast Privacy Preserving Calculation of Non-linear Kernels for Machine Learning Applications in Precision Medicine

Published: 25 October 2019

Abstract

For many diseases, large cohorts of patients must be gathered to obtain enough statistical power to discover the important factors. In this setting, it is essential to preserve the privacy of each patient and, ideally, to remove the need to gather all data in one place. Examples include genomic research on cancer, infectious diseases, and Alzheimer's disease. This problem motivates the development of privacy-preserving machine learning algorithms. Existing studies either compute one specific function privately, and therefore lack generality, or rely on computationally expensive encryption to preserve privacy, which slows the computation significantly. In this study, we propose a framework based on randomized encoding in which the four basic arithmetic operations (addition, subtraction, multiplication, and division) can be performed, allowing machine learning algorithms that involve one type of these operations to be computed privately. Among the suitable machine learning algorithms, we apply the oligo kernel and the radial basis function (RBF) kernel to HIV coreceptor usage prediction, using the framework to calculate the kernel functions. The results show that privacy does not come at the cost of predictive performance, as measured by F1-score and AUROC. Furthermore, in the oligo kernel experiments, the execution time of the framework is comparable to that of the non-private computation. In the RBF kernel experiments, our framework is also considerably faster than existing approaches based on integer vector homomorphic encryption, and consequently than homomorphic-encryption-based solutions, which indicates that our approach has potential for application to many other diseases and data types.
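
To make the idea of a randomized encoding for a single arithmetic operation concrete, the following is a minimal sketch of the textbook additive-masking encoding of a sum, followed by an RBF kernel assembled from dot products. It is an illustration under stated assumptions, not the protocol proposed in the paper: the function names, the modulus, the use of a shared random mask, and the parameterization exp(-||x - y||^2 / (2 sigma^2)) of the RBF kernel are all hypothetical choices made here.

```python
import math
import secrets

# Hypothetical modulus for this illustration; the paper does not specify one here.
MODULUS = 2**61 - 1


def share_mask():
    """Draw a fresh uniform mask that both input parties are assumed to share."""
    return secrets.randbelow(MODULUS)


def encode_first(x, r):
    """First party hides its input x by adding the shared mask r."""
    return (x + r) % MODULUS


def encode_second(y, r):
    """Second party hides its input y by subtracting the same mask r."""
    return (y - r) % MODULUS


def decode_sum(c1, c2):
    """The evaluator adds the two encodings; the mask cancels, leaving x + y."""
    return (c1 + c2) % MODULUS


def rbf_from_dot_products(xx, yy, xy, sigma):
    """RBF kernel value from the dot products x.x, y.y and x.y.

    Uses the identity ||x - y||^2 = x.x - 2 x.y + y.y, so the kernel needs
    only dot products as inputs (assumed parameterization with sigma).
    """
    squared_distance = xx - 2.0 * xy + yy
    return math.exp(-squared_distance / (2.0 * sigma**2))
```

In this sketch each encoding on its own is uniformly distributed, so the evaluator learns only the sum modulo `MODULUS`. The second function shows why a kernel such as the RBF kernel can be reduced to a small number of arithmetic operations on dot products, which is the kind of computation a framework of this sort would need to protect.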

Published In

Cryptology and Network Security: 18th International Conference, CANS 2019, Fuzhou, China, October 25–27, 2019, Proceedings
Oct 2019
534 pages
ISBN: 978-3-030-31577-1
DOI: 10.1007/978-3-030-31578-8
Editors: Yi Mu, Robert H. Deng, Xinyi Huang

Publisher

Springer-Verlag

Berlin, Heidelberg

Author Tags

  1. Privacy preserving machine learning
  2. Randomized encoding
  3. String kernel
  4. RBF kernel
  5. Precision medicine

Qualifiers

  • Article
