research-article

DfAnalyzer: Runtime Dataflow Analysis of Scientific Applications using Provenance

Published: 01 August 2018

Abstract

We present DfAnalyzer, a tool that enables monitoring, debugging, steering, and analysis of dataflows while they are being generated by scientific applications. It works by capturing strategic domain data and registering provenance and execution data so that they can be queried at runtime. DfAnalyzer provides lightweight dataflow monitoring components that are invoked by high-performance applications. These components can be plugged into scientific scripts or Spark applications in the same way users already plug in visualization library components. During this demo, we will show how DfAnalyzer captures dataflow and provenance data, and how it provides runtime data analyses of applications. We will also encourage attendees to use DfAnalyzer with their own applications.
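The general idea of capturing selected domain data alongside execution data, and querying it while the application is still running, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `ProvenanceStore` class, the `task` table, and the `solve_step` function are hypothetical and do not reflect DfAnalyzer's actual components or schema. An in-memory SQLite database stands in for the provenance database.

```python
import json
import sqlite3
import time


class ProvenanceStore:
    """Hypothetical in-memory store; not DfAnalyzer's actual API."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE task ("
            "id INTEGER PRIMARY KEY, transformation TEXT, "
            "inputs TEXT, outputs TEXT, elapsed REAL)"
        )

    def record(self, transformation, inputs, outputs, elapsed):
        # Store selected domain values as JSON next to execution data.
        self.db.execute(
            "INSERT INTO task (transformation, inputs, outputs, elapsed) "
            "VALUES (?, ?, ?, ?)",
            (transformation, json.dumps(inputs), json.dumps(outputs), elapsed),
        )
        self.db.commit()

    def query(self, sql, params=()):
        return self.db.execute(sql, params).fetchall()


def solve_step(store, step, dt):
    """Toy 'simulation' step instrumented with one provenance call."""
    start = time.perf_counter()
    residual = 1.0 / (step + 1)  # stand-in for a real solver result
    store.record(
        "solve_step",
        {"step": step, "dt": dt},
        {"residual": residual},
        time.perf_counter() - start,
    )
    return residual


store = ProvenanceStore()
for step in range(5):
    solve_step(store, step, dt=0.1)
    # Runtime query: inspect captured data while the loop is still running.
    latest = store.query("SELECT outputs FROM task ORDER BY id DESC LIMIT 1")
    last_residual = json.loads(latest[0][0])["residual"]

rows = store.query(
    "SELECT COUNT(*) FROM task WHERE transformation = ?", ("solve_step",)
)
```

In this sketch the instrumentation is a single `record` call inside the application code, which mirrors the "plug in a component" style described in the abstract; a real deployment would write to a shared database rather than an in-process one.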


Cited By

  • (2020) Provenance Supporting Hyperparameter Analysis in Deep Neural Networks. Provenance and Annotation of Data and Processes, pp. 20-38. DOI: 10.1007/978-3-030-80960-7_2. Online publication date: 22-Jun-2020
  • (2020) Experiencing DfAnalyzer for Runtime Analysis of Phylogenomic Dataflows. Advances in Bioinformatics and Computational Biology, pp. 105-116. DOI: 10.1007/978-3-030-65775-8_10. Online publication date: 23-Nov-2020
  • (2019) Orchestrating Big Data Analysis Workflows in the Cloud. ACM Computing Surveys 52:5, pp. 1-41. DOI: 10.1145/3332301. Online publication date: 13-Sep-2019


Published In

Proceedings of the VLDB Endowment, Volume 11, Issue 12
August 2018
426 pages
ISSN: 2150-8097

Publisher

VLDB Endowment
