MATLAB Parallel Computing Toolbox on Cheyenne

The MATLAB Parallel Computing Toolbox allows you to run MATLAB code in parallel across multiple workers, which are analogous to MPI tasks or OpenMP threads. Additionally, NCAR has a license for the MATLAB Parallel Server (MPS) – formerly the Distributed Computing Server – which allows you to run a MATLAB script using workers from multiple compute nodes via a batch job you submit with PBS.

Cluster profiles

Using this parallelism requires some setup, particularly if you want to use the parallel server. The toolbox expects you to create and use “cluster profiles” that manage either node-local tasks or batch-scheduler tasks. We have created a cluster profile for PBS on Cheyenne for all versions of MATLAB starting with R2020a.

You can import an existing cluster profile using the wizard in the graphical interface, or you can do it programmatically as follows.

At the MATLAB command line, enter the following lines to import the R2020a profile:

chey2020a = parallel.importProfile('/glade/u/apps/opt/matlab/R2020a/ncar/cheyenne-R2020a');

You only need to import the profile once; MATLAB will remember it in future sessions. If you anticipate using a parallel server profile frequently, you may want to make one your default parallel profile as shown here:

parallel.defaultClusterProfile(chey2020a);

Running a simple parallel code using the toolbox

In this example, a simple code uses multiple workers on a single node to compute a sum in parallel. Here is the sample code:

parallel_sum.m:
function s = parallel_sum(N)
  s = 0;
  parfor i = 1:N
    s = s + i;
  end

  fprintf('Sum of numbers from 1 to %d is %d.\n', N, s);
end

This function is executed by a MATLAB script that reads a few parameters from the user environment. For single-node parallel jobs, this example uses the “local” cluster profile that is available by default when using the toolbox.

run_local.m:
% Start local cluster and submit job with custom number of workers
c = parcluster('local')
j = c.batch(@parallel_sum, 1, {100}, 'pool', ...
    str2num(getenv('NUMWORKERS')));

% Wait for the job to finish, then get output
wait(j);
diary(j);
exit;
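
The job above calls parallel_sum with N = 100, so the diary output should report a sum of 5050. As a quick sanity check that does not need MATLAB, the following sketch adds the same terms in a plain shell loop and compares the result with the closed form N(N+1)/2. The variable names are illustrative only and are not used by the toolbox.

```shell
#!/bin/bash
N=100

# Accumulate 1..N one term at a time, as the parfor reduction does
loop_sum=0
for (( i = 1; i <= N; i++ )); do
  loop_sum=$(( loop_sum + i ))
done

# Closed form N(N+1)/2 for comparison
closed_form=$(( N * (N + 1) / 2 ))

echo "loop: $loop_sum, closed form: $closed_form"
```

Both values come out as 5050, which is the number to look for in local.log once the batch job finishes.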

Do not run the toolbox on login nodes; excessive CPU usage will result in your script being terminated. Instead, use a PBS batch job as in the following example to run a single-node cluster on a compute node. This PBS job also specifies the number of workers, which corresponds to the number of CPUs requested in the job script.

submit_local.pbs:
#!/bin/bash
#PBS -N matlab_pct
#PBS -A <PROJECT>
#PBS -l walltime=05:00
#PBS -q share
#PBS -j oe
#PBS -o local.log
#PBS -l select=1:ncpus=4:mpiprocs=4

# This script is designed to be run locally using threads
# and should only ever request a single node!

module load matlab

# MATLAB relies on SHELL variable, which gets modified by PBS
export SHELL=$PBS_O_SHELL

# Derive the number of workers to use in the toolbox run script
export NUMWORKERS=$(wc -l $PBS_NODEFILE | cut -d' ' -f1)

SECONDS=0
matlab -nosplash -nodesktop -r "run_local"
echo "Time elapsed = $SECONDS s"
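
The NUMWORKERS line in the job script counts the lines in the PBS node file, which holds one entry per requested MPI rank. The sketch below imitates that step outside of PBS with a mock node file, so you can see what value the MATLAB script will receive. The host name "node001" and the helper count_workers are made up for illustration; inside a real job, PBS provides $PBS_NODEFILE for you.

```shell
#!/bin/bash
# Mock node file: with select=1:ncpus=4:mpiprocs=4, PBS writes four
# lines naming the same host. "node001" is a hypothetical host name.
mock_nodefile=$(mktemp)
for rank in 1 2 3 4; do
  echo "node001"
done > "$mock_nodefile"

# Hypothetical helper: redirecting the file into wc prints only the
# count (no file name); tr strips any padding some wc versions add.
count_workers() {
  wc -l < "$1" | tr -d ' '
}

NUMWORKERS=$(count_workers "$mock_nodefile")
echo "NUMWORKERS=$NUMWORKERS"

rm -f "$mock_nodefile"
```

With the four-line mock file this prints NUMWORKERS=4, matching the mpiprocs=4 request in submit_local.pbs.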

Using the parallel server to span multiple nodes

The configuration above limits your job to the CPUs on a single node; on Cheyenne this means 36 workers, or 72 if you use hyperthreads. However, you can use the parallel server to span multiple nodes. In this configuration, MATLAB itself submits a job to the batch scheduler and uses an internal MPI library to enable communication between remote workers. Here again, use a MATLAB script to set up your parallel cluster:

run_server.m:
% Start PBS cluster and submit job with custom number of workers
c = parcluster(getenv('CLUSTERNAME'));

% Pool workers = nodes * tasks-per-node - 1, because one worker
% is reserved to run the batch function itself
jNodes = getenv('MPSNODES');
jTasks = getenv('MPSTASKS');
jWorkers = str2num(jNodes) * str2num(jTasks) - 1;
jAccount = getenv('MPSACCOUNT');
jQueue = getenv('MPSQUEUE');
jWalltime = getenv('MPSWALLTIME');

c.ResourceTemplate = append('-l select=', jNodes, ':ncpus=', jTasks, ...
    ':mpiprocs=', jTasks);
c.SubmitArguments = append('-A ', jAccount, ' -q ', jQueue, ...
    ' -l walltime=', jWalltime);
c.JobStorageLocation = append(getenv('PWD'), '/output');

% Output cluster settings
c

% Submit job to batch scheduler (PBS)
j = batch(c, @parallel_sum, 1, {100}, 'pool', jWorkers);

% Wait for job to finish and get output
wait(j);
diary(j);
exit;

Finally, execute a driver script that configures the user environment and runs the MATLAB script.

submit_server.sh:
#!/bin/bash

# This doesn't need to run on a batch node... we can simply schedule
# the parallel job via the login node

module rm ncarenv
module load matlab

mkdir -p output

export MPSNODES=2
export MPSTASKS=4
export MPSACCOUNT=<PROJECT>
export MPSQUEUE=share
export MPSWALLTIME=300
export CLUSTERNAME="cheyenne-R2020a"
SECONDS=0
matlab -nosplash -nodesktop -r "run_server"
echo "Time elapsed = $SECONDS s"
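
Before launching MATLAB, it can help to preview what the settings in the driver script turn into. The sketch below rebuilds, in plain shell, the same strings that run_server.m assembles for ResourceTemplate and SubmitArguments, along with the pool size. The names RESOURCE_TEMPLATE, SUBMIT_ARGUMENTS, and WORKERS are illustrative and are not read by MATLAB; <PROJECT> remains a placeholder for your project code.

```shell
#!/bin/bash
# Same values as submit_server.sh; <PROJECT> stays a placeholder
MPSNODES=2
MPSTASKS=4
MPSACCOUNT='<PROJECT>'
MPSQUEUE=share
MPSWALLTIME=300

# Mirror the string building done in run_server.m
RESOURCE_TEMPLATE="-l select=${MPSNODES}:ncpus=${MPSTASKS}:mpiprocs=${MPSTASKS}"
SUBMIT_ARGUMENTS="-A ${MPSACCOUNT} -q ${MPSQUEUE} -l walltime=${MPSWALLTIME}"

# nodes * tasks-per-node - 1, as computed for jWorkers
WORKERS=$(( MPSNODES * MPSTASKS - 1 ))

echo "ResourceTemplate: $RESOURCE_TEMPLATE"
echo "SubmitArguments:  $SUBMIT_ARGUMENTS"
echo "Pool workers:     $WORKERS"
```

For the values shown, this requests two chunks of four CPUs each and a pool of seven workers. A bare number for walltime is interpreted by PBS as seconds, so 300 corresponds to five minutes.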