Publication

Discovering the SUPER in computing – dagster-slurm for reproducible research on HPC

Dagster is a modern data orchestrator that emphasises reproducibility, observability, and a strong developer experience (Elementl, 2024b). In parallel, most high-performance computing (HPC) centres continue to rely on Slurm for batch scheduling and resource governance (Yoo et al., 2003).

The two ecosystems rarely meet in practice: Dagster projects often target cloud or single-node deployments, while Slurm users maintain bespoke submission scripts with limited reuse or visibility. This paper introduces dagster-slurm, an open-source integration that allows the same Dagster assets to run unchanged across laptops, CI pipelines, containerised Slurm clusters, and Tier-0 supercomputers.

The project packages dependencies with Pixi (prefix.dev, 2024), submits workloads through Slurm using Dagster Pipes (Elementl, 2024a), and streams logs plus scheduler metrics back to the Dagster UI. The key contribution is a unified compute resource (ComputeResource) that hides SSH transport (including password-only jump hosts and OTP prompts), dependency packaging, and queue configuration while still respecting Slurm’s scheduling semantics.

The project ships two production-ready execution modes (local for laptop/CI development and slurm for one-job-per-asset submissions) and two stable launchers: Bash for script-based workloads and Ray for multi-node distributed computing. Experimental support for Spark, session-based allocation reuse, and heterogeneous jobs is under active development.
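
The idea of running the same workload in either mode can be illustrated with a minimal sketch. It is not the dagster-slurm API: the names ExecutionMode, QueueConfig, and build_command are hypothetical, as are the default partition and time limit. It only assumes that the local mode runs a script with Bash and that the slurm mode submits one job per asset through the standard sbatch command.

```python
from dataclasses import dataclass
from enum import Enum


class ExecutionMode(Enum):
    """Hypothetical stand-in for the two production-ready modes."""
    LOCAL = "local"
    SLURM = "slurm"


@dataclass(frozen=True)
class QueueConfig:
    """Hypothetical queue settings; the defaults are illustrative only."""
    partition: str = "compute"
    time_limit: str = "00:10:00"
    nodes: int = 1


def build_command(mode: ExecutionMode, script: str, job_name: str,
                  queue: QueueConfig = QueueConfig()) -> list[str]:
    """Return the argv that would run `script` in the given mode.

    The script itself is identical in both modes; only the way it is
    launched changes.
    """
    if mode is ExecutionMode.LOCAL:
        # Laptop/CI development: run the script directly with Bash.
        return ["bash", script]
    # One Slurm job per asset, submitted with standard sbatch options.
    return [
        "sbatch",
        "--parsable",  # print only the job id, which is easy to track
        f"--job-name={job_name}",
        f"--partition={queue.partition}",
        f"--time={queue.time_limit}",
        f"--nodes={queue.nodes}",
        script,
    ]
```

Calling build_command(ExecutionMode.LOCAL, "train.sh", "train") and build_command(ExecutionMode.SLURM, "train.sh", "train") yields two different launch commands for the same train.sh, which is the property the abstract describes: the workload stays unchanged while the transport and queue configuration vary.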

H. Picatto, M. Heß, G. Heiler, M. Pfister, Discovering the SUPER in computing – dagster-slurm for reproducible research on HPC, The Journal of Open Source Software 11(119) (2026) 9795.


Georg Heiler
