Tutorial: Sharing a vcluster-deployed MPI cluster through Slurm

posted Oct 4, 2016, 1:49 AM by SRGICS UPLB   [ updated Nov 2, 2017, 3:49 AM ]
The main objective of vcluster is to provide on-demand provisioning of MPI clusters. However, sharing a single cluster among several users can be more efficient in some cases, such as training or teaching. This tutorial describes how to share a vcluster-deployed cluster using Slurm.
  1. Create an MPI cluster using vcluster with 3 slave nodes. Name the cluster cmsc180.
  2. On the master node
    1. screen -S slurm
    2. wget https://github.com/srg-ics-uplb/peak-two-cloud/raw/master/slurm/slurm_vcluster_config/setup-master.sh
    3. wget https://github.com/srg-ics-uplb/peak-two-cloud/raw/master/slurm/slurm_vcluster_config/setup-slave.sh
    4. wget https://github.com/srg-ics-uplb/peak-two-cloud/raw/master/slurm/slurm_vcluster_config/adduser.sh
    5. wget https://github.com/srg-ics-uplb/peak-two-cloud/raw/master/slurm/slurm_vcluster_config/addmany.sh
    6. chmod 755 *.sh
    7. sudo ./setup-master.sh
    8. #create users.txt with one user per line: the username and the password, separated by a space
    9. sudo ./addmany.sh users.txt

  3. On each slave node (ssh from the master)
    1. sudo ./addmany.sh users.txt #ignore the warnings
    2. sudo ./setup-slave.sh
  4. Test
    1. #ssh to the master as user01 or any other user listed in users.txt
    2. sinfo -N -l
  5. When the cluster reboots, resume the nodes
    1. sudo scontrol update NodeName=cmsc180-slave-3 State=Resume #repeat for each slave node
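
The users.txt file from step 2 and the resume command from step 5 can be sketched as two small shell helpers. This is only a minimal sketch and is not part of the vcluster or peak-two-cloud scripts. The function names make_users_file and resume_commands, the user01-style usernames, and the changemeNN placeholder passwords are hypothetical. It also assumes that the slave nodes are named <cluster>-slave-1 through <cluster>-slave-N, which the tutorial only shows for cmsc180-slave-3.

```shell
#!/bin/sh
# Sketch only: user names, passwords, and the node-name pattern are
# assumptions for illustration, not values taken from the vcluster scripts.

# Write COUNT lines of "username password" (user01, user02, ...) to FILE.
make_users_file() {
    file=$1
    count=$2
    : > "$file"                      # create or truncate the file
    i=1
    while [ "$i" -le "$count" ]; do
        # Placeholder password; replace it with a real one before use.
        printf 'user%02d changeme%02d\n' "$i" "$i" >> "$file"
        i=$((i + 1))
    done
}

# Print one scontrol resume command per slave node, assuming the nodes
# are named CLUSTER-slave-1 .. CLUSTER-slave-COUNT.
resume_commands() {
    cluster=$1
    count=$2
    i=1
    while [ "$i" -le "$count" ]; do
        echo "sudo scontrol update NodeName=${cluster}-slave-${i} State=Resume"
        i=$((i + 1))
    done
}
```

For example, after sourcing these functions, make_users_file users.txt 3 writes three users, and resume_commands cmsc180 3 prints the three resume commands so they can be reviewed before being run on the master. The helpers only print the scontrol commands and do not call Slurm themselves, so the sketch can be tried on a machine without Slurm installed.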