Secure data types: a simple abstraction for confidentiality-preserving data analytics

Research article, public access
DOI: 10.1145/3127479.3129256
Published: 24 September 2017

Abstract

Cloud computing offers a cost-efficient data analytics platform. However, due to the sensitive nature of their data, many organizations are reluctant to analyze it in public clouds. Both software-based and hardware-based solutions have been proposed to address this stalemate, yet all have substantial limitations. We observe that a main issue cutting across all of these solutions is that they attempt to support confidentiality in data queries in a way that is transparent to the queries. We propose the novel abstraction of secure data types, with corresponding annotations that let programmers conveniently denote security-relevant constraints. These abstractions are leveraged by novel compilation techniques in our system, Cuttlefish, to compute data analytics queries in public cloud infrastructures while keeping sensitive data confidential. Cuttlefish encrypts all sensitive data residing in the cloud and employs partially homomorphic encryption schemes to perform operations securely. Guided by a novel planner engine, it otherwise resorts to client-side completion, re-encryption, or, when available, secure hardware-based re-encryption using Intel's SGX. Our evaluation shows that our prototype can execute all queries in standard benchmarks such as TPC-H and TPC-DS with average overheads of 2.34× and 1.69×, respectively, compared to a plaintext execution that reveals all data.
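
To make the idea of computing over partially homomorphic ciphertexts concrete, the following is a minimal sketch of additive homomorphism, followed by decryption on the client. It is an illustration only and not Cuttlefish's implementation: the choice of the Paillier scheme, the toy primes, and the function names (`keygen`, `encrypt`, `add`, `decrypt`) are all assumptions made for this example, and the key size is far too small for real use.

```python
import math
import random

# Toy Paillier-style scheme (assumed here for illustration only).
# The primes are deliberately tiny; real deployments use much larger keys.

def keygen(p=1009, q=1013):
    """Return the public modulus n and the private pair (lam, mu)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse, valid because g = n + 1
    return n, (lam, mu)

def encrypt(n, m):
    """Encrypt an integer 0 <= m < n under public modulus n."""
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def add(n, c1, c2):
    """Untrusted side: multiplying ciphertexts adds the plaintexts."""
    return (c1 * c2) % (n * n)

def decrypt(n, priv, c):
    """Trusted client side: recover the plaintext from a ciphertext."""
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n

if __name__ == "__main__":
    n, priv = keygen()
    # The cloud only ever sees the two ciphertexts and their product.
    total = add(n, encrypt(n, 20), encrypt(n, 22))
    print(decrypt(n, priv, total))  # 42
```

In this sketch the party calling `add` never holds the private key, so the sum is computed without revealing either operand. An operation that the scheme does not support, such as multiplying two ciphertexts, would have to be handled some other way, which is the kind of case the abstract's client-side completion and re-encryption options address.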


      Published In

      SoCC '17: Proceedings of the 2017 Symposium on Cloud Computing
      September 2017
      672 pages
      ISBN:9781450350280
      DOI:10.1145/3127479
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. cloud computing
      2. data confidentiality
      3. homomorphic encryption

      Qualifiers

      • Research-article

      Conference

      SoCC '17: ACM Symposium on Cloud Computing
      September 24-27, 2017
      Santa Clara, California

      Acceptance Rates

      Overall Acceptance Rate 169 of 722 submissions, 23%

      Article Metrics

      • Downloads (last 12 months): 118
      • Downloads (last 6 weeks): 13
      Reflects downloads up to 26 Nov 2024

      Cited By

      • (2023) Generalized Policy-Based Noninterference for Efficient Confidentiality-Preservation. Proceedings of the ACM on Programming Languages 7:PLDI (267-291). DOI: 10.1145/3591231. Online publication date: 6-Jun-2023
      • (2023) Structured encryption for triangle counting on graph data. Future Generation Computer Systems 145 (200-210). DOI: 10.1016/j.future.2023.03.030. Online publication date: Aug-2023
      • (2020) Efficient confidentiality-preserving data analytics over symmetrically encrypted datasets. Proceedings of the VLDB Endowment 13:8 (1290-1303). DOI: 10.14778/3389133.3389144. Online publication date: 3-May-2020
      • (2019) Ensuring Confidentiality in the Cloud of Things. IEEE Pervasive Computing 18:1 (10-18). DOI: 10.1109/MPRV.2018.2877286. Online publication date: 2-May-2019
      • (2018) Picking a Partner. Proceedings of the 2018 Applied Networking Research Workshop (33-39). DOI: 10.1145/3232755.3232785. Online publication date: 16-Jul-2018
      • (2018) Secure Data Communication in Autonomous V2X Systems. 2018 IEEE International Congress on Internet of Things (ICIOT) (156-163). DOI: 10.1109/ICIOT.2018.00029. Online publication date: Jul-2018
