Tutorial
You can use enjoy-slurm to submit and manage Slurm jobs from Python.
NOTE: This tutorial was run on Levante at DKRZ. To run it elsewhere, adapt the partition names and the account to your system.
Let’s assume you have a shell script test.sh:
[1]:
!printf '#!/bin/sh\necho "Hello World from $(hostname)"\n' > test.sh
You can submit this using sbatch:
[2]:
import enjoy_slurm as slurm
jobid = slurm.sbatch("test.sh", account="ch0636", partition="shared")
jobid
[2]:
13657854
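For orientation, the keyword arguments above correspond to options of the `sbatch` command line. The following is a minimal sketch of such a mapping, not enjoy-slurm's actual implementation: the helper `build_sbatch_command` is hypothetical, and the use of `--parsable` (which makes `sbatch` print the job ID in a machine-readable form) is an assumption about how one could obtain the ID.

```python
def build_sbatch_command(script, **options):
    """Return an sbatch argument list (hypothetical helper, for illustration only)."""
    # --parsable makes sbatch print the job ID (plus the cluster name, if any)
    cmd = ["sbatch", "--parsable"]
    for key, value in options.items():
        # Python keyword names use underscores, Slurm long options use hyphens
        flag = "--" + key.replace("_", "-")
        cmd.append(f"{flag}={value}")
    cmd.append(script)
    return cmd


cmd = build_sbatch_command("test.sh", account="ch0636", partition="shared")
print(" ".join(cmd))
```

The resulting list could be passed to `subprocess.run` on a machine where Slurm is installed; the sketch itself only builds the command and does not submit anything.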
Now you can check the state of your job using sacct:
[4]:
slurm.sacct(jobid)
[4]:
| | JobID | Elapsed | NCPUS | NTasks | State | Start | End | JobName |
|---|---|---|---|---|---|---|---|---|
| 0 | 13657854 | 00:00:00 | 0 | <NA> | PENDING | Unknown | Unknown | test.sh |
You can also get some job information into a dictionary:
[5]:
slurm.jobinfo(jobid)
[5]:
{13657854: {'Elapsed': '00:00:00',
'NCPUS': 0,
'NTasks': <NA>,
'State': 'PENDING',
'Start': 'Unknown',
'End': 'Unknown',
'JobName': 'test.sh'}}
By now the job should have completed:
[8]:
slurm.sacct(jobid)
[8]:
| | JobID | Elapsed | NCPUS | NTasks | State | Start | End | JobName |
|---|---|---|---|---|---|---|---|---|
| 0 | 13657854 | 00:00:04 | 2 | <NA> | COMPLETED | 2024-11-01T22:14:18 | 2024-11-01T22:14:22 | test.sh |
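Instead of re-running the query by hand, you could poll until the job reaches a final state. The sketch below is an illustration, not part of enjoy-slurm: `wait_for_job` and `TERMINAL_STATES` are hypothetical names, and the `get_info` argument is assumed to return a dictionary shaped like the `slurm.jobinfo` output shown earlier. A stand-in function replaces the real query so that the example runs without a Slurm cluster.

```python
import time

# Slurm job states after which a job no longer changes (not an exhaustive list)
TERMINAL_STATES = {
    "COMPLETED", "FAILED", "CANCELLED", "TIMEOUT", "NODE_FAIL", "OUT_OF_MEMORY",
}


def wait_for_job(jobid, get_info, interval=5.0, timeout=600.0):
    """Poll get_info(jobid) until the job is in a terminal state (hypothetical helper)."""
    deadline = time.monotonic() + timeout
    while True:
        state = get_info(jobid)[jobid]["State"]
        # sacct may report e.g. "CANCELLED by 1234", so compare the first word only
        if state.split(" ")[0] in TERMINAL_STATES:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {jobid} still in state {state!r}")
        time.sleep(interval)


# Stand-in for the real query: reports PENDING, then RUNNING, then COMPLETED
_states = iter(["PENDING", "RUNNING", "COMPLETED"])


def fake_info(jobid):
    return {jobid: {"State": next(_states)}}


final = wait_for_job(13657854, fake_info, interval=0.0)
print(final)
```

On a real system you would pass `slurm.jobinfo` as `get_info`, assuming it keeps returning the structure shown above, and choose a polling interval that does not put unnecessary load on the Slurm accounting database.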
enjoy-slurm becomes more useful when you need to manage many jobs, which is easy to do in Python, e.g.:
[9]:
jobids = [
    slurm.sbatch("test.sh", account="ch0636", partition="shared") for _ in range(10)
]
Check the accounting:
[10]:
slurm.sacct(name="test.sh", state="PENDING")
[10]:
| | JobID | Elapsed | NCPUS | NTasks | State | Start | End | JobName |
|---|---|---|---|---|---|---|---|---|
| 0 | 13657860 | 00:00:00 | 0 | <NA> | PENDING | Unknown | Unknown | test.sh |
| 1 | 13657861 | 00:00:00 | 0 | <NA> | PENDING | Unknown | Unknown | test.sh |
| 2 | 13657862 | 00:00:00 | 0 | <NA> | PENDING | Unknown | Unknown | test.sh |
| 3 | 13657863 | 00:00:00 | 0 | <NA> | PENDING | Unknown | Unknown | test.sh |
| 4 | 13657864 | 00:00:00 | 0 | <NA> | PENDING | Unknown | Unknown | test.sh |
| 5 | 13657865 | 00:00:00 | 0 | <NA> | PENDING | Unknown | Unknown | test.sh |
| 6 | 13657866 | 00:00:00 | 0 | <NA> | PENDING | Unknown | Unknown | test.sh |
| 7 | 13657867 | 00:00:00 | 0 | <NA> | PENDING | Unknown | Unknown | test.sh |
| 8 | 13657868 | 00:00:00 | 0 | <NA> | PENDING | Unknown | Unknown | test.sh |
| 9 | 13657869 | 00:00:00 | 0 | <NA> | PENDING | Unknown | Unknown | test.sh |
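With many jobs, a per-state summary is often easier to read than the full table. The following is a small sketch under the assumption that you have collected job information into one dictionary keyed by job ID, in the same shape as the `slurm.jobinfo` output shown earlier; `count_states` is a hypothetical helper, and the data here is hard-coded to mirror the table above.

```python
from collections import Counter


def count_states(info):
    """Count how many jobs are in each state (hypothetical helper)."""
    return Counter(record["State"] for record in info.values())


# Hard-coded stand-in mirroring the ten pending jobs listed above
info = {
    jobid: {"State": "PENDING", "JobName": "test.sh"}
    for jobid in range(13657860, 13657870)
}

counts = count_states(info)
print(dict(counts))
```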
Create a job that depends on the completion of the previous jobs:
[11]:
jobid = slurm.sbatch("test.sh", account="ch0636", partition="shared", dependency=jobids)
[12]:
slurm.sacct(jobid)
[12]:
| | JobID | Elapsed | NCPUS | NTasks | State | Start | End | JobName |
|---|---|---|---|---|---|---|---|---|
| 0 | 13657870 | 00:00:00 | 0 | <NA> | PENDING | Unknown | Unknown | test.sh |
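On the Slurm command line, such a dependency is written as `--dependency=<type>:<jobid>[:<jobid>...]`. Which dependency type enjoy-slurm uses when given a list is not shown in this tutorial; the sketch below assumes `afterok` (start only after all listed jobs finished successfully) and uses a hypothetical helper, `dependency_string`, purely to illustrate the format.

```python
def dependency_string(jobids, kind="afterok"):
    """Build the value of Slurm's --dependency option (hypothetical helper)."""
    jobids = list(jobids)
    if not jobids:
        raise ValueError("at least one job ID is required")
    # Slurm separates the dependency type and the job IDs with colons
    return kind + ":" + ":".join(str(jobid) for jobid in jobids)


dep = dependency_string(range(13657860, 13657870))
print(dep)
```

Other dependency types defined by Slurm, such as `afterany` or `afternotok`, can be passed as `kind` in this sketch.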