

ATLAS Note
Report number ATL-SOFT-PROC-2024-001
Title A Function-as-a-Task Workflow Management Approach with PanDA and iDDS
Author(s)

Guan, Wen (Brookhaven National Laboratory (US)) ; Maeno, Tadashi (Brookhaven National Laboratory (US)) ; Wenaus, Torre (Brookhaven National Laboratory (US)) ; Alekseev, Aleksandr (University of Texas at Arlington (US)) ; Barreiro Megino, Fernando Harald (University of Texas at Arlington (US)) ; De, Kaushik (University of Texas at Arlington (US)) ; Karavakis, Edward (Brookhaven National Laboratory (US)) ; Korchuganova, Tatiana (University of Pittsburgh (US)) ; Lin, Fa-Hui (University of Texas at Arlington (US)) ; Nilsson, Paul (Brookhaven National Laboratory (US)) ; Yang, Zhaoyu (Brookhaven National Laboratory (US)) ; Zhang, Rui (University of Wisconsin Madison (US)) ; Zhao, Xin (Brookhaven National Laboratory (US))

Corporate Author(s) The ATLAS collaboration
Publication 2024
Imprint 22 Jul 2024
Number of pages 7
In: 22nd International Workshop on Advanced Computing and Analysis Techniques in Physics Research, Stony Brook, United States, 11 - 15 Mar 2024
Subject category Particle Physics - Experiment
Accelerator/Facility, Experiment CERN LHC ; ATLAS
Abstract The growing complexity of high energy physics analysis often involves running various distinct applications. This demands a multi-step data processing approach, with each step requiring different resources and carrying dependencies on preceding steps. A tool that automates these diverse steps efficiently is therefore both important and useful. With the Production and Distributed Analysis (PanDA) system and the intelligent Data Delivery Service (iDDS), we provide a platform that describes data processing steps as tasks and their sequences as workflows, seamlessly orchestrating their execution in a specified order and under predefined conditions, thereby automating the entire task sequence. In this presentation, we will start by giving an overview of the platform's architecture. Following that, we'll introduce a user-friendly interface where workflows are defined in Python with multiple code blocks, with each block implemented as a Python function. We will then explain the process of converting Python functions into executable tasks, scheduling them across distributed heterogeneous resources, and managing their outputs through a messaging-based asynchronous result-processing mechanism. Finally, we'll showcase a practical example illustrating how this platform converts a machine learning hyperparameter optimization process into a distributed workflow.
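The function-as-a-task pattern the abstract describes can be illustrated with a minimal, self-contained sketch. The `task` decorator, `_REGISTRY`, and `run_workflow` below are hypothetical stand-ins, not the actual PanDA/iDDS interface: each workflow step is declared as a plain Python function, the framework records its dependencies, and local futures play the role of the messaging-based asynchronous result-processing mechanism (in the real platform, tasks are dispatched to distributed heterogeneous resources rather than a local thread pool).

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sketch of the function-as-a-task idea (not the real iDDS API).
_REGISTRY = []

def task(deps=()):
    """Register a function as a workflow task with named dependencies."""
    def wrap(func):
        _REGISTRY.append((func, tuple(deps)))
        return func
    return wrap

def run_workflow():
    """Run registered tasks in order, feeding each the outputs of its
    dependencies and collecting results asynchronously via futures."""
    results = {}
    with ThreadPoolExecutor() as pool:
        for func, deps in _REGISTRY:  # assumed already topologically ordered
            args = [results[d] for d in deps]
            results[func.__name__] = pool.submit(func, *args).result()
    return results

@task()
def prepare():
    return [1, 2, 3]

@task(deps=("prepare",))
def train(data):
    return sum(data)

print(run_workflow())  # {'prepare': [1, 2, 3], 'train': 6}
```

A real hyperparameter optimization workflow, as in the abstract's closing example, would register many `train`-style functions with different parameter sets and let the platform fan them out across sites.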



 Record created 2024-07-22, last modified 2024-07-22


Full text:
Download full text (PDF)
External link:
Original Communication (restricted to ATLAS)