Tutorial: Using Slurm to run MPI jobs

posted Oct 6, 2016, 9:01 PM by SRGICS UPLB   [ updated Oct 16, 2016, 9:00 PM ]
Quick start:

First, learn to use screen.
  1. ssh -o UserKnownHostsFile=/dev/null <username>@<cluster ip>
  2. wget https://github.com/srg-ics-uplb/peak-two-cloud/raw/master/slurm/slurm_vcluster_config/submit.sh
  3. wget https://github.com/srg-ics-uplb/peak-two-cloud/raw/master/slurm/slurm_vcluster_config/hello.c
  4. mpicc -o hello.exe hello.c
  5. sbatch submit.sh ./hello.exe
  6. cat output.txt
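The hello.c fetched in step 3 is an MPI "hello world". The actual file is at the GitHub URL above; the sketch below is an assumption about what a minimal version looks like, not the repository's code:

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);               /* start the MPI runtime */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank); /* this task's id */
    MPI_Comm_size(MPI_COMM_WORLD, &size); /* total number of tasks */

    printf("Hello from task %d of %d\n", rank, size);

    MPI_Finalize();                       /* shut down MPI cleanly */
    return 0;
}
```

Compile it with mpicc as in step 4; when run with --ntasks=4, each of the four tasks prints one line.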

Sample job script (submit.sh):

#!/bin/bash

#the name of the job
#SBATCH --job-name=mpijob

#the stdout output of the job
#SBATCH --output=output.txt

#available nodes
#SBATCH --nodes=4

#number of tasks
#SBATCH --ntasks=4

#time limit (D-HH:MM), terminate the job after two minutes
#if it is not yet done
#SBATCH -t 0-0:02

#note: #SBATCH directives must come before the first executable line,
#so the argument check goes here, after them
if [ "$#" -ne 1 ]; then
    echo "Usage: $0 <executable>"
    exit 1
fi

#the executable, taken from the command line
EXEC=$1

#number of nodes to use in the run
NODES=4

mpiexec -np $NODES -f ../nodes.txt $EXEC
The script above describes a job that runs a compiled MPI program (here, hello.exe) on the cluster. The important parameters to change are --job-name, --output, and the executable passed on the command line. Their values should describe your program.

View node information
  • sinfo -N -l
Submit a job
  • sbatch submit.sh ./hello.exe
Check status of a job
  • squeue
Check job details
  • scontrol show job <id>
View result
  • cat output.txt
Cancel a job
  • scancel <job id>
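The commands above can be chained into a small script that submits a job and waits for it to finish. This is a sketch: the job-id parsing assumes sbatch's standard "Submitted batch job <id>" message, and the filenames are the ones from the quick start.

```shell
#!/bin/sh
# Poll squeue until the given job id no longer appears in the queue.
# squeue -h suppresses the header; -j restricts output to one job.
wait_for_job() {
    jobid=$1
    while squeue -h -j "$jobid" 2>/dev/null | grep -q .; do
        sleep 5
    done
}

# Usage on the cluster:
#   jobid=$(sbatch submit.sh ./hello.exe | awk '{print $4}')
#   wait_for_job "$jobid"
#   cat output.txt
```

Running the job this way inside screen lets you disconnect while waiting without losing the session.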

More commands