Research article. DOI: 10.1145/3603166.3632142

End-to-end Integration of Scientific Workflows on Distributed Cyberinfrastructures: Challenges and Lessons Learned with an Earth Science Application

Published: 04 April 2024

Abstract

Distributed cyberinfrastructures (CI) present both opportunities and challenges for executing scientific workflows, especially for Earth science applications. They provide heterogeneous resources that can meet the needs of the applications within a workflow and deliver the performance and scalability required to achieve scientific goals. However, it is difficult to find the right resources for each application and to orchestrate workflow execution end to end, from resource provisioning through job execution to delivery of the final results. In some cases, a poor choice of resources results in slow execution or outright failure. In this paper, we present Advanced Cyberinfrastructure Coordination Ecosystem: Services & Support (ACCESS) Pegasus, a CI solution built as part of the U.S. National Science Foundation ACCESS program that provides automated execution of scientific applications. We demonstrate Pegasus's capabilities with the SOil MOisture SPatial Inference Engine (SOMOSPIE), a multi-component Earth science application for fine-grained soil moisture predictions. We identify a roadmap for migrating applications such as SOMOSPIE to ACCESS resources with the support of ACCESS Pegasus, outlining both the strengths and weaknesses of this approach.

    Published In

    UCC '23: Proceedings of the IEEE/ACM 16th International Conference on Utility and Cloud Computing
    December 2023
    502 pages
    ISBN:9798400702341
    DOI:10.1145/3603166
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from the publisher.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. workflows
    2. machine learning
    3. soil moisture
    4. high throughput computing
    5. containers

    Qualifiers

    • Research-article

    Acceptance Rates

    Overall Acceptance Rate 38 of 125 submissions, 30%
