Submitting a job to a slurm queue

From HPC Guide
Revision as of 15:13, 13 March 2023

==submit commands==

'''sbatch''' - submits a batch script

'''salloc''' - submits an interactive job - allocates the requested resources, but does not itself start work on the node(s)

'''srun''' - submits an interactive job with MPI (a "job step")

'''sattach''' - connects stdin/stdout/stderr to an existing job (or job step)
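sattach, listed above, is the one command without an example later on this page. A minimal sketch, assuming standard Slurm behavior: job steps are addressed as <code>&lt;jobid&gt;.&lt;stepid&gt;</code>, and the ids used here are placeholders.

```shell
# Compose a job-step id of the form <jobid>.<stepid> (ids are examples).
jobid=45002
stepid=0
step="${jobid}.${stepid}"
echo "sattach ${step}"
# On a live cluster, attaching to the running step would be:
#   sattach 45002.0
```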

So, for example, you may submit a job with the command:

<pre>
sbatch script.sh
</pre>
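A minimal sketch of what such a script might contain, assuming standard Slurm batch-script conventions; the job name and option values are examples, not taken from this guide. The <code>#SBATCH</code> lines pass the same options to sbatch that could be given on the command line.

```shell
# Create a minimal job script (all names and values are illustrative).
cat > script.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=example
#SBATCH --ntasks=1
#SBATCH --time=10
srun hostname
EOF
# Submit it on a cluster with:
#   sbatch script.sh
```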

===Examples===

'''sbatch'''

<pre>
sbatch --ntasks=1 --time=10 pre_process.bash   (time is in minutes, so 10 minutes)
Submitted batch job 45001
sbatch --ntasks=128 --time=60 --depend=45001 do_work.bash
Submitted batch job 45002
sbatch --ntasks=1 --time=30 --depend=45002 post_process.bash
Submitted batch job 45003
</pre>
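The chain above copies job ids by hand. A hedged sketch of scripting it, using two standard Slurm options not shown on this page: <code>--parsable</code> (sbatch prints only the job id) and the long-form <code>--dependency=afterok:&lt;id&gt;</code> (start only if the named job finished successfully). The real sbatch call is commented out because it needs a live cluster; the id is a stand-in.

```shell
# On a real cluster, capture the id of the first job like this:
#   jid=$(sbatch --parsable --ntasks=1 --time=10 pre_process.bash)
jid=45001                             # stand-in id for illustration
dep="--dependency=afterok:${jid}"     # run only if job $jid succeeds
echo "sbatch --ntasks=128 --time=60 ${dep} do_work.bash"
```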

'''srun'''

<pre>
srun --ntasks=2 --label hostname   (--label prefixes each output line with the task id)
0:compute-0-1
1:compute-0-1
</pre>

Using 2 nodes:

<pre>
srun --nodes=2 --exclusive --label hostname
0:compute-0-1
1:compute-0-2
</pre>

Opening a bash shell:

<pre>
srun --ntasks=56 --pty bash
[dolevg@compute-0-12 beta16.dvory]$....
</pre>

Specifying a compute node (one that is available):

<pre>
srun --ntasks=56 -p gcohen_2018 --nodelist="compute-0-12" --pty bash
</pre>

'''salloc'''

<pre>
salloc --ntasks=4 --time=10 bash
salloc: Granted job allocation 45000
</pre>

This gives us a bash prompt; the SLURM environment variables show the allocation, including the allocated nodes:

<pre>
env | grep SLURM
SLURM_JOBID=45000
SLURM_NPROCS=4
SLURM_JOB_NODELIST=compute-0-1,compute-0-2
</pre>

Note that the shell itself still runs on the login node:

<pre>
hostname
powerlogin
</pre>

Job steps launched with srun inside the allocation run on the allocated nodes:

<pre>
srun --label hostname
0:compute-0-1
1:compute-0-1
2:compute-0-2
3:compute-0-2
exit   (terminates the shell and releases the allocation)
</pre>

==info commands==

'''sinfo''' -- to see all queues (partitions)

'''squeue''' -- to see all jobs

'''scontrol show partition''' -- to see all partitions

'''scontrol show job <number>''' -- to see a job's attributes
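squeue also accepts a printf-style <code>--format</code> string (a standard Slurm option not covered above) to control which columns are shown. A sketch under that assumption; the format specifiers below (%i jobid, %P partition, %j name, %u user, %t state, %M elapsed time, %D node count) are from the standard squeue format set, and the squeue call itself is commented because it needs a live cluster.

```shell
# Build a squeue format string: jobid, partition, name, user, state,
# elapsed time, node count (widths are illustrative).
fmt="%.10i %.12P %.20j %.8u %.2t %.10M %.6D"
echo "squeue --format=\"${fmt}\" -u \$USER"
# On a live cluster:
#   squeue --format="$fmt" -u $USER
```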