DOI: 10.1145/3642978.3652833
research-article
Open access

Workflows' applications in computational environmental science: a survey

Published: 22 April 2024

Abstract

This survey paper explores the different applications of workflows in computational environmental science. Workflows are crucial in streamlining complex computational processes, enabling researchers to manage and analyze large-scale environmental data effectively. The paper reviews existing literature, methodologies, and tools associated with workflow applications in environmental science, highlighting their impact on research efficiency, reproducibility, and collaboration. By examining case studies and emerging trends, this survey aims to provide insights into the current landscape of workflow applications within the computational environmental science domain.


Published In

WiDE '24: Proceedings of the 2nd Workshop on Workflows in Distributed Environments
April 2024
32 pages
ISBN: 9798400705465
DOI: 10.1145/3642978
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. workflows
  2. computational environmental science
  3. survey

Qualifiers

  • Research-article
