Running MPI jobs #

Quick hint: replace mpiexec with mpirun.
Please consult the InfiniBand status page for known problems.

Please use the mpirun tool to run MPI jobs through the Slurm management system. By default, mpirun runs your program on all allocated processes and cores. You can run a specific number of processes with the -n np parameter, e.g.:

mpirun -n 8 ./mpiprog8.exe
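
For batch jobs, you typically wrap the mpirun call in an sbatch script. Below is a minimal sketch; the partition name and executable name are taken from the examples on this page and may need to be adjusted for your cluster and allocation.

#!/bin/bash

#SBATCH --job-name=mpiprog
#SBATCH --output=mpiprog.out
#SBATCH --partition=x6core      # adjust to a partition available to you
#SBATCH --time=5
#SBATCH --ntasks=8              # 8 MPI processes

# mpirun starts one process per allocated task,
# so the -n parameter can be omitted here
mpirun ./mpiprog8.exe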

Intel MPI #

The mpiexec command will not work with Intel MPI; please use mpirun instead.
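
If the Intel MPI environment is not already set up in your session, source the Intel scripts (the same ones used for compilation below) before calling mpirun:

. /opt/intel/composer_xe_2015/bin/compilervars.sh intel64
. /opt/intel/impi_5.0.3/bin64/mpivars.sh
mpirun -n 8 ./mpiprog8.exe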

OpenMPI #

You can also use OpenMPI with the GCC compilers:

module load gnu openmpi
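
To verify which compiler and MPI implementation the module has placed on your PATH, you can, for example, run:

which mpicc mpirun      # should point to the OpenMPI installation
mpicc --version         # reports the underlying gcc version
mpirun --version        # reports the OpenMPI version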

Hybrid MPI+OpenMP jobs #

An example of a hybrid MPI+OpenMP program, hybrid.c:

#include <stdio.h>
#include <omp.h>
#include "mpi.h"

int main(int argc, char *argv[]) {
  int numprocs, rank, namelen;
  char processor_name[MPI_MAX_PROCESSOR_NAME];
  int iam = 0, np = 1;

  MPI_Init(&argc, &argv);
  MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Get_processor_name(processor_name, &namelen);

  /* Each OpenMP thread reports its MPI rank and thread id */
  #pragma omp parallel default(shared) private(iam, np)
  {
    np = omp_get_num_threads();
    iam = omp_get_thread_num();
    printf("Node: %12s    rank: %2d/%2d    thread_id: %2d/%2d\n",
           processor_name, rank, numprocs, iam, np);
  }

  MPI_Finalize();
  return 0;
}

Use the following commands to compile the code with Intel MPI:

. /opt/intel/composer_xe_2015/bin/compilervars.sh intel64
. /opt/intel/impi_5.0.3/bin64/mpivars.sh
mpiicc -qopenmp hybrid.c -o hybrid

Use the following commands to compile the code with OpenMPI and gcc compilers:

module load gnu openmpi
mpicc -fopenmp hybrid.c -o hybrid

We will use the following Slurm batch script test-hybrid.sh:

#!/bin/bash

# A hybrid MPI+OpenMP example

#SBATCH --job-name=hybrid
#SBATCH --output=hybrid.out
#SBATCH --partition=x6core
#SBATCH --time=5

####### 4 MPI ranks
#SBATCH --ntasks=4

####### 6 OMP threads per MPI rank
#SBATCH --cpus-per-task=6

## load MPI libs if needed
## for Intel:
# . /opt/intel/composer_xe_2015/bin/compilervars.sh intel64
# . /opt/intel/impi_5.0.3/bin64/mpivars.sh

## for OpenMPI:
# module load gnu openmpi


## set the number of OMP threads to the value of --cpus-per-task
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK

mpirun ./hybrid

Slurm propagates the user's environment variables to the compute nodes, so you can skip loading the MPI modules in the batch script if they are already loaded in your session.

We recommend explicitly setting OMP_NUM_THREADS to the value of $SLURM_CPUS_PER_TASK. In most cases this is not strictly necessary, since each MPI process is bound to specific CPU cores and OpenMP picks the number of threads automatically.

Enqueue the job:

sbatch test-hybrid.sh
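
While the job is queued or running you can monitor it, and inspect the output once it finishes, for example:

squeue -u $USER        # show your queued and running jobs
cat hybrid.out         # view the results after the job completes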

The result will be in the hybrid.out file. Example output:

Node:      cl1n031    rank:  1/ 4    thread_id:  0/ 6
Node:      cl1n031    rank:  1/ 4    thread_id:  1/ 6
Node:      cl1n031    rank:  1/ 4    thread_id:  5/ 6
Node:      cl1n031    rank:  1/ 4    thread_id:  3/ 6
Node:      cl1n031    rank:  0/ 4    thread_id:  5/ 6
Node:      cl1n031    rank:  0/ 4    thread_id:  0/ 6
Node:      cl1n031    rank:  0/ 4    thread_id:  4/ 6
Node:      cl1n031    rank:  1/ 4    thread_id:  2/ 6
Node:      cl1n031    rank:  1/ 4    thread_id:  4/ 6
Node:      cl1n031    rank:  0/ 4    thread_id:  1/ 6
Node:      cl1n031    rank:  0/ 4    thread_id:  2/ 6
Node:      cl1n031    rank:  0/ 4    thread_id:  3/ 6
Node:      cl1n032    rank:  2/ 4    thread_id:  0/ 6
Node:      cl1n032    rank:  2/ 4    thread_id:  2/ 6
Node:      cl1n032    rank:  2/ 4    thread_id:  1/ 6
Node:      cl1n032    rank:  2/ 4    thread_id:  4/ 6
Node:      cl1n032    rank:  2/ 4    thread_id:  5/ 6
Node:      cl1n032    rank:  2/ 4    thread_id:  3/ 6
Node:      cl1n032    rank:  3/ 4    thread_id:  1/ 6
Node:      cl1n032    rank:  3/ 4    thread_id:  0/ 6
Node:      cl1n032    rank:  3/ 4    thread_id:  5/ 6
Node:      cl1n032    rank:  3/ 4    thread_id:  4/ 6
Node:      cl1n032    rank:  3/ 4    thread_id:  3/ 6
Node:      cl1n032    rank:  3/ 4    thread_id:  2/ 6

The job used two nodes (cl1n031 and cl1n032), with two MPI processes per node. Each MPI process used 6 OpenMP threads.