DOI: 10.1145/3311790.3396625
research-article
Open access

Demonstrating a Pre-Exascale, Cost-Effective Multi-Cloud Environment for Scientific Computing: Producing a fp32 ExaFLOP hour worth of IceCube simulation data in a single workday

Published: 26 July 2020

Abstract

Scientific computing needs are growing dramatically over time and are expanding into science domains that were previously not compute intensive. When compute workflows spike well beyond the capacity of the local compute resource, additional capacity should be temporarily provisioned elsewhere, both to meet deadlines and to increase scientific output. Public Clouds have become an attractive option because they can be provisioned with minimal advance notice. The available capacity of cost-effective instances, however, is not well understood. This paper describes the expansion of IceCube's production HTCondor pool with cost-effective GPU instances in preemptible mode, gathered from the three major Cloud providers: Amazon Web Services, Microsoft Azure and the Google Cloud Platform. Using this setup, we sustained about 15k GPUs for a whole workday, corresponding to around 170 PFLOP32s, and integrated over one EFLOP32 hour worth of science output for a price tag of about $60k. We provide the reasoning behind the Cloud instance selection, a description of the setup and an analysis of the provisioned resources, as well as a short description of the actual science output of the exercise.
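
The headline figures in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is not taken from the paper: it uses only the rounded numbers quoted in the abstract, and the 8-hour workday is an assumption made here for illustration. The derived values (per-GPU throughput, hours needed, cost per GPU hour) are therefore rough estimates, not reported results.

```python
# Rounded figures quoted in the abstract (all approximate).
GPUS = 15_000                # "about 15k GPUs"
SUSTAINED_PFLOP32S = 170.0   # "around 170 PFLOP32s"
TOTAL_COST_USD = 60_000.0    # "about $60k"
TARGET_EFLOP32_HOURS = 1.0   # "over one EFLOP32 hour"

# ASSUMPTION (not stated in the abstract): a "workday" is taken as 8 hours.
ASSUMED_WORKDAY_HOURS = 8.0


def tflop32s_per_gpu(pflops: float, gpus: int) -> float:
    """Average fp32 throughput per GPU, in TFLOP32s (1 PFLOP = 1000 TFLOP)."""
    return pflops * 1000.0 / gpus


def hours_to_reach(eflop_hours: float, pflops: float) -> float:
    """Hours at a constant rate needed to integrate the given EFLOP32 hours."""
    return eflop_hours * 1000.0 / pflops


def integrated_eflop32_hours(pflops: float, hours: float) -> float:
    """EFLOP32 hours accumulated at a constant rate over the given duration."""
    return pflops * hours / 1000.0


def cost_per_gpu_hour(cost_usd: float, gpus: int, hours: float) -> float:
    """Average cost per GPU hour, assuming all GPUs run for the full duration."""
    return cost_usd / (gpus * hours)


if __name__ == "__main__":
    print(f"Per-GPU average: {tflop32s_per_gpu(SUSTAINED_PFLOP32S, GPUS):.1f} TFLOP32s")
    print(f"Hours to reach 1 EFLOP32 hour: "
          f"{hours_to_reach(TARGET_EFLOP32_HOURS, SUSTAINED_PFLOP32S):.1f}")
    print(f"EFLOP32 hours in the assumed workday: "
          f"{integrated_eflop32_hours(SUSTAINED_PFLOP32S, ASSUMED_WORKDAY_HOURS):.2f}")
    print(f"Cost per GPU hour (assumed workday): "
          f"${cost_per_gpu_hour(TOTAL_COST_USD, GPUS, ASSUMED_WORKDAY_HOURS):.2f}")
```

At the quoted sustained rate, one EFLOP32 hour takes a little under six hours to accumulate, which is consistent with the claim of exceeding that total within a single workday. The cost per GPU hour depends directly on the assumed workday length and on the simplification that every GPU ran for the whole period.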

Supplemental Material

MP4 File
Presentation video

Published In

PEARC '20: Practice and Experience in Advanced Research Computing 2020: Catch the Wave
July 2020
556 pages
ISBN: 9781450366892
DOI: 10.1145/3311790
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Cloud
  2. GPU
  3. HTCondor
  4. Hybrid-Cloud
  5. IceCube
  6. Multi-Cloud
  7. astrophysics
  8. cost analysis

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

PEARC '20

Acceptance Rates

Overall Acceptance Rate 133 of 202 submissions, 66%

Article Metrics

  • Downloads (Last 12 months): 53
  • Downloads (Last 6 weeks): 7

Reflects downloads up to 16 Nov 2024

Cited By

  • (2024) NSDF-Services: Integrating Networking, Storage, and Computing Services into a Testbed for Democratization of Data Delivery. Proceedings of the IEEE/ACM 16th International Conference on Utility and Cloud Computing, 1-10. DOI: 10.1145/3603166.3632136. Online publication date: 4-Apr-2024
  • (2024) Evaluation of ARM CPUs for IceCube available through Google Kubernetes Engine. EPJ Web of Conferences 295, 11009. DOI: 10.1051/epjconf/202429511009. Online publication date: 6-May-2024
  • (2023) Ultra high energy cosmic rays The intersection of the Cosmic and Energy Frontiers. Astroparticle Physics 149, 102819. DOI: 10.1016/j.astropartphys.2023.102819. Online publication date: Jul-2023
  • (2023) Publisher's Note: Astroparticle Physics 147, 102794. DOI: 10.1016/j.astropartphys.2022.102794. Online publication date: May-2023
  • (2023) Efficient GPU Cloud architectures for outsourcing high-performance processing to the Cloud. The International Journal of Advanced Manufacturing Technology 133:1-2, 949-958. DOI: 10.1007/s00170-023-11252-0. Online publication date: 26-May-2023
  • (2022) Data-Intensive Physics Analysis in Azure Cloud. Computer Networks, Big Data and IoT, 257-268. DOI: 10.1007/978-981-19-0898-9_20. Online publication date: 22-May-2022
  • (2021) Managing Cloud networking costs for data-intensive applications by provisioning dedicated network links. Practice and Experience in Advanced Research Computing 2021: Evolution Across All Dimensions, 1-8. DOI: 10.1145/3437359.3465563. Online publication date: 17-Jul-2021
  • (2021) Expanding IceCube GPU computing into the Clouds. 2021 IEEE 17th International Conference on eScience (eScience), 227-228. DOI: 10.1109/eScience51609.2021.00034. Online publication date: Sep-2021
  • (2021) Pushing the Cloud Limits in Support of IceCube Science. IEEE Internet Computing 25:1, 71-75. DOI: 10.1109/MIC.2020.3045209. Online publication date: 1-Jan-2021
