research-article

DfAnalyzer: Runtime Dataflow Analysis of Scientific Applications using Provenance

Published: 01 August 2018

Abstract

We present DfAnalyzer, a tool that enables monitoring, debugging, steering, and analysis of dataflows while they are being generated by scientific applications. It captures strategic domain data and registers provenance and execution data so that both can be queried at runtime. DfAnalyzer provides lightweight dataflow monitoring components that are invoked by high-performance applications. These components can be plugged into scientific scripts or Spark applications in the same way that users already plug in visualization library components. During this demo, we will show how DfAnalyzer captures dataflow and provenance data, and how it supports runtime data analysis of applications. We will also encourage attendees to use DfAnalyzer for their own applications.
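The capture-then-query pattern described in the abstract can be illustrated with a minimal sketch. This is not DfAnalyzer's API: the `ProvenanceStore` class, its methods, the two-table schema, and the toy `run_simulation` loop are all hypothetical names chosen for illustration, and an in-memory SQLite database stands in for whatever provenance database the tool actually uses.

```python
import sqlite3
import time


class ProvenanceStore:
    """Hypothetical in-memory provenance store (illustrative only, not DfAnalyzer's API)."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        # One row per executed task (execution data).
        self.db.execute(
            "CREATE TABLE task ("
            "id INTEGER PRIMARY KEY, transformation TEXT, started REAL, ended REAL)"
        )
        # One row per captured input or output value (domain data).
        self.db.execute(
            "CREATE TABLE element (task_id INTEGER, role TEXT, name TEXT, value REAL)"
        )

    def begin_task(self, transformation, inputs):
        """Register the start of a task together with the inputs it consumes."""
        cur = self.db.execute(
            "INSERT INTO task (transformation, started) VALUES (?, ?)",
            (transformation, time.time()),
        )
        task_id = cur.lastrowid
        self._record(task_id, "input", inputs)
        return task_id

    def end_task(self, task_id, outputs):
        """Register the outputs a task produced and mark it as finished."""
        self._record(task_id, "output", outputs)
        self.db.execute(
            "UPDATE task SET ended = ? WHERE id = ?", (time.time(), task_id)
        )

    def _record(self, task_id, role, values):
        self.db.executemany(
            "INSERT INTO element VALUES (?, ?, ?, ?)",
            [(task_id, role, name, float(v)) for name, v in values.items()],
        )

    def query(self, sql, params=()):
        """Run an ad hoc query while the application is still executing."""
        return self.db.execute(sql, params).fetchall()


def run_simulation(store, steps=5):
    """Toy solver loop instrumented with capture calls; returns the monitored values."""
    residual = 1.0
    monitored = []
    for step in range(steps):
        task = store.begin_task("solver_step", {"step": step, "residual_in": residual})
        residual *= 0.5  # stand-in for the real computation
        store.end_task(task, {"residual_out": residual})
        # Runtime query: read back the most recent residual while the loop is running.
        latest = store.query(
            "SELECT value FROM element "
            "WHERE role = 'output' AND name = 'residual_out' "
            "ORDER BY task_id DESC LIMIT 1"
        )[0][0]
        monitored.append(latest)
    return monitored
```

Calling `run_simulation(ProvenanceStore())` records one task per solver step and queries the store after each one. The capture calls sit next to the computation in the same way a visualization call would, which is the plug-in style the abstract refers to.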



Information & Contributors

Information

Published In

Proceedings of the VLDB Endowment  Volume 11, Issue 12
August 2018
426 pages
ISSN:2150-8097

Publisher

VLDB Endowment

Publication History

Published: 01 August 2018
Published in PVLDB Volume 11, Issue 12

Qualifiers

  • Research-article


Cited By

  • (2024) AkôFlow: um Middleware para Execução de Workflows Científicos em Múltiplos Ambientes Conteinerizados. Anais do XXXIX Simpósio Brasileiro de Banco de Dados (SBBD 2024), pp. 27-39. DOI: 10.5753/sbbd.2024.241126. Online publication date: 14-Oct-2024
  • (2024) Measuring Application Interference With System-Level Instrumentation. 2024 IEEE International Conference on Big Data (BigData), pp. 3648-3653. DOI: 10.1109/BigData62323.2024.10825462. Online publication date: 15-Dec-2024
  • (2023) Summarizing Provenance of Aggregate Query Results in Relational Databases. IEEE Transactions on Knowledge and Data Engineering, 35:10, pp. 10695-10709. DOI: 10.1109/TKDE.2023.3265840. Online publication date: 1-Oct-2023
  • (2023) Life Science Workflow Services (LifeSWS): Motivations and Architecture. Transactions on Large-Scale Data- and Knowledge-Centered Systems LV, pp. 1-24. DOI: 10.1007/978-3-662-68100-8_1. Online publication date: 28-Sep-2023
  • (2021) BioProv - A provenance library for bioinformatics workflows. Journal of Open Source Software, 6:67, article 3622. DOI: 10.21105/joss.03622. Online publication date: Nov-2021
  • (2021) Scientific Workflows Management and Scheduling in Cloud Computing: Taxonomy, Prospects, and Challenges. IEEE Access, 9, pp. 53491-53508. DOI: 10.1109/ACCESS.2021.3070785. Online publication date: 2021
  • (2021) Workflow provenance in the lifecycle of scientific machine learning. Concurrency and Computation: Practice and Experience, 34:14. DOI: 10.1002/cpe.6544. Online publication date: 22-Aug-2021
  • (2020) DfAnalyzer: Runtime dataflow analysis tool for Computational Science and Engineering applications. SoftwareX, 12, article 100592. DOI: 10.1016/j.softx.2020.100592. Online publication date: Jul-2020
  • (2020) Capturing and Analyzing Provenance from Spark-based Scientific Workflows with SAMbA-RaP. Future Generation Computer Systems, 112, pp. 658-669. DOI: 10.1016/j.future.2020.05.031. Online publication date: Nov-2020
  • (2020) Provenance Supporting Hyperparameter Analysis in Deep Neural Networks. Provenance and Annotation of Data and Processes, pp. 20-38. DOI: 10.1007/978-3-030-80960-7_2. Online publication date: 22-Jun-2020
