DOI: 10.1145/3332186.3332250
research-article
Open access

Making it More Secure: The Technical and Social Challenges of Expanding the Functionality of an Existing HPC Cluster to Meet University and Federal Data Security Requirements

Published: 28 July 2019

Abstract

As High Performance Computing and Big Data analytics become more commonplace, we see researchers applying these tools in new areas. Indeed, in the past few years, we've seen the use of HPC in diverse areas such as archeology, public policy, and digital humanities. So it comes as no surprise that many life science researchers are now approaching us to use large-scale computation and data analytics on their sensitive data sets, such as de-identified patient or genomics data, for the purposes of scientific inquiry. At UC Berkeley, this has become a pressing issue, as existing faculty need a place to do research on sensitive data, and we knew of at least one instance where it affected the campus' ability to recruit a new faculty member. We had a clear imperative for action!
An informal survey showed that most other institutions built a new, dedicated system to support their sensitive data research, including identified and HIPAA data. This paper is a case study of how we met this need by using a methodology for applying our campus cybersecurity framework, with the help of our institution's cybersecurity team, to convert our traditional production HPC cluster, with over 2000 users across 100 research groups, and our virtual machine service offering to also support this type of research. Our efforts show that this approach is not only possible but also a practical alternative to building a new environment.
The field of information security focuses on defense-in-depth and as yet offers no turnkey solutions that would prevent security incidents and data breaches. As a result, most university and research lab information security groups focus on preparing to detect a breach after the fact and to limit its scope of impact. These realities mean that to do secure computing on a high performance computing (HPC) cluster or on virtual machines in the Cloud, one must implement technical security controls and write a host of process and audit documentation, work that is both labor-intensive and ongoing.
The paper describes our work at UC Berkeley to take an existing HPC cluster, with a base level of data security controls and procedures in place, and reconfigure it to meet more stringent university and federal requirements, while maintaining the same computing experience and functionality on the system for users who are not computing over sensitive data. In other words, this is a study in configuring a hybrid HPC system for computing over non-sensitive and sensitive data alike, and in developing the policy and procedures to meet our information security requirements. It describes in detail the technical as well as the educational and partner-building work we did at our institution to make this effort a success.

      Published In

      PEARC '19: Practice and Experience in Advanced Research Computing 2019: Rise of the Machines (learning)
      July 2019
      775 pages
      ISBN: 9781450372275
      DOI: 10.1145/3332186
      General Chair: Tom Furlani
      Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. High Performance Computing
      2. information security
      3. research computing

      Qualifiers

      • Research-article
      • Research
      • Refereed limited

      Conference

      PEARC '19

      Acceptance Rates

      Overall Acceptance Rate 133 of 202 submissions, 66%

      Article Metrics

      • Downloads (last 12 months): 62
      • Downloads (last 6 weeks): 12
      Reflects downloads up to 01 Nov 2024

      Cited By

      • (2023) Secure HPC. Future Generation Computer Systems 141:C (677-691). DOI: 10.1016/j.future.2022.12.019. Online publication date: 15-Feb-2023
      • (2022) Cybersecurity and Research are not a Dichotomy. Practice and Experience in Advanced Research Computing 2022: Revolutionary: Computing, Connections, You (1-4). DOI: 10.1145/3491418.3535180. Online publication date: 8-Jul-2022
      • (2022) Corralling sensitive data in the Wild West: supporting research with highly sensitive data. Practice and Experience in Advanced Research Computing 2022: Revolutionary: Computing, Connections, You (1-5). DOI: 10.1145/3491418.3535155. Online publication date: 8-Jul-2022
      • (2022) A Secure Workflow for Shared HPC Systems. 2022 22nd IEEE International Symposium on Cluster, Cloud and Internet Computing (CCGrid) (965-974). DOI: 10.1109/CCGrid54584.2022.00118. Online publication date: May-2022
