DAG-based workflows scheduling using Actor–Critic Deep Reinforcement Learning

Published: 01 January 2024


High-Performance Computing (HPC) is essential to support advances in multiple research and industrial fields. Despite the recent growth in processing and networking power, HPC Data Centers (DCs) are finite and must be carefully managed to host multiple jobs. Scheduling the tasks that compose a job is crucial and complex, since the effects of the scheduler’s decisions are perceptible both to users (e.g., slowdown) and to infrastructure administrators (e.g., resource usage and queue length). The process of scheduling workflows atop a DC can be modeled as a graph mapping problem: an undirected graph represents the DC, while a Directed Acyclic Graph (DAG) expresses the task dependencies. Vertices and edges of both graphs can carry weights, denoting the residual capacities of DC resources and the computing and networking demands of workflows. Motivated by the combinatorial explosion of this scheduling problem, Machine Learning (ML) is already being integrated to generate or improve scheduling policies. However, most proposals in the specialized literature either use simplified models to reduce the search space or are trained for specific scenarios, which leads to policies that eventually fall short of the expectations of real DCs. Given this challenge, this work applies Actor–Critic (AC) Reinforcement Learning (RL) to schedule DAG-based workflows. Instead of proposing a new policy, the AC RL is used to select the appropriate scheduling policy from a pool of consolidated algorithms, guided by the DAG workload and DC usage. The AC RL-based scheduler analyzes the DAG queue and the DC status to define which algorithms are better suited to improve the overall performance indicators in each scenario instance.
The simulation protocol comprises multiple analyses with distinct workload configurations, numbers of jobs, queue ordering policies, and strategies to select the target DC servers. The results demonstrated that the AC RL selects the scheduling policy that fits the current workload and DC status.
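
To make the graph mapping formulation concrete, the following is a minimal sketch, not the authors' model. It assumes a single resource type per server (CPU), direct links only (no multi-hop routing), and that all tasks of a workflow hold their resources at the same time. The names `DataCenter`, `Workflow`, `task_order`, and `is_feasible` are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter
from typing import Dict, FrozenSet, List, Tuple


@dataclass
class DataCenter:
    """Undirected graph: vertex and edge weights are residual capacities."""
    cpu: Dict[str, float]                   # server -> residual CPU
    bandwidth: Dict[FrozenSet[str], float]  # {server_a, server_b} -> residual bandwidth


@dataclass
class Workflow:
    """DAG: vertex and edge weights are computing and networking demands."""
    demand: Dict[str, float]                # task -> CPU demand
    edges: Dict[Tuple[str, str], float]     # (predecessor, successor) -> bandwidth demand


def task_order(wf: Workflow) -> List[str]:
    """Return tasks in dependency order; raises graphlib.CycleError on a cycle."""
    predecessors = {task: set() for task in wf.demand}
    for (pred, succ) in wf.edges:
        predecessors[succ].add(pred)
    return list(TopologicalSorter(predecessors).static_order())


def is_feasible(dc: DataCenter, wf: Workflow, mapping: Dict[str, str]) -> bool:
    """Check that a task -> server mapping respects CPU and link capacities."""
    used_cpu: Dict[str, float] = {}
    for task, server in mapping.items():
        used_cpu[server] = used_cpu.get(server, 0.0) + wf.demand[task]
    if any(used > dc.cpu[server] for server, used in used_cpu.items()):
        return False

    used_bw: Dict[FrozenSet[str], float] = {}
    for (pred, succ), bw in wf.edges.items():
        a, b = mapping[pred], mapping[succ]
        if a == b:
            continue  # co-located tasks exchange data without using the network
        link = frozenset((a, b))
        if link not in dc.bandwidth:
            return False  # assumption: only directly connected servers can communicate
        used_bw[link] = used_bw.get(link, 0.0) + bw
    return all(used <= dc.bandwidth[link] for link, used in used_bw.items())
```

A scheduler built on such a model would enumerate or search candidate mappings and keep only those for which `is_feasible` holds; the size of that search space is the combinatorial explosion the abstract refers to.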


We investigated DAG-based workflow scheduling considering both the users’ and the HPC DC’s perspectives.
We proposed, implemented, and analyzed an AC RL scheduler that selects the appropriate combination of queueing policies.
The AC RL prototype runs alongside state-of-the-art consolidated policies and can be easily extended.
The AC RL prototype is simple: it is based on well-known queueing policies and actor–critic reinforcement learning.
The simulation protocol demonstrated how the AC RL prototype can learn and improve the overall indicators.
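
The idea of using actor–critic learning to pick a policy from a pool, rather than to produce a schedule directly, can be illustrated with a small sketch. This is not the authors' prototype: it replaces the deep networks with lookup tables, reduces the DC and queue status to a discrete state index, and uses an invented toy reward. The class `ActorCriticSelector`, the helper `train_on_toy_queue`, and the policy names in the pool are all assumptions made for illustration.

```python
import math
import random


class ActorCriticSelector:
    """Tabular one-step actor-critic that selects a policy index per state."""

    def __init__(self, n_states, policies, actor_lr=0.1, critic_lr=0.1,
                 gamma=0.9, seed=0):
        self.policies = list(policies)
        # Actor: one preference (logit) per state and policy.
        self.prefs = [[0.0] * len(self.policies) for _ in range(n_states)]
        # Critic: one value estimate per state.
        self.values = [0.0] * n_states
        self.actor_lr = actor_lr
        self.critic_lr = critic_lr
        self.gamma = gamma
        self.rng = random.Random(seed)

    def probabilities(self, state):
        """Softmax over the actor preferences of one state."""
        logits = self.prefs[state]
        top = max(logits)
        exps = [math.exp(x - top) for x in logits]
        total = sum(exps)
        return [e / total for e in exps]

    def select(self, state):
        """Sample a policy index from the current softmax distribution."""
        probs = self.probabilities(state)
        return self.rng.choices(range(len(self.policies)), weights=probs)[0]

    def update(self, state, action, reward, next_state):
        """Apply one TD(0) critic update and one policy-gradient actor update."""
        td_error = (reward + self.gamma * self.values[next_state]
                    - self.values[state])
        self.values[state] += self.critic_lr * td_error
        probs = self.probabilities(state)
        for a in range(len(self.policies)):
            # Gradient of log softmax with respect to each preference.
            grad = (1.0 if a == action else 0.0) - probs[a]
            self.prefs[state][a] += self.actor_lr * td_error * grad
        return td_error


def train_on_toy_queue(steps=4000, gamma=0.9, seed=0):
    """Toy setup (assumption): state 0 = short queue, state 1 = long queue."""
    policies = ["FIFO", "SJF", "LJF"]  # hypothetical pool of queue policies
    best = {0: 0, 1: 1}                # invented: which policy pays off per state
    agent = ActorCriticSelector(2, policies, gamma=gamma, seed=seed)
    rng = random.Random(seed + 1)
    state = rng.randrange(2)
    for _ in range(steps):
        action = agent.select(state)
        reward = 1.0 if action == best[state] else 0.0
        next_state = rng.randrange(2)
        agent.update(state, action, reward, next_state)
        state = next_state
    return agent
```

Calling `train_on_toy_queue()` and then inspecting `agent.probabilities(0)` and `agent.probabilities(1)` shows how the selection probabilities shift toward the policy that the toy reward favors in each state. In a real scheduler the reward would come from performance indicators such as slowdown or resource usage, and the state would encode the DAG queue and DC status.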





Future Generation Computer Systems, Volume 150, Issue C
Jan 2024
451 pages



  Scheduling
  Actor–critic
  Deep reinforcement learning
  DAG
  Tasks
  Jobs
  Workflow



