Starting jobs on Casper nodes

Interactive jobs | Batch jobs | Compiling your code

Updated 11/7/2018

Cheyenne users submit jobs to the Casper cluster with the open-source Slurm Workload Manager. Procedures for starting both interactive jobs and batch jobs are described below.

Code to be run on Casper should be compiled on Casper nodes. (See compiling your code below.)

Begin by logging in on Cheyenne.


Interactive jobs

Using execdav

Run the execdav script/command to start an interactive job. Invoking it without an argument will start an interactive shell on the first available DAV node. The default wall-clock time is 6 hours.

The execdav command has these optional arguments:

  • -a project_code (defaults to value of DAV_PROJECT)
  • -t time (minutes:seconds or hours:minutes:seconds; defaults to 6 hours)
  • -n number_of_cores (defaults to 1 core: -n 1)
    • The interactive login shell itself occupies 1 core (or "slot"), so include that when deciding how many cores to request with -n.
  • -m nG
    • Use this if you want to specify how much memory to use per node, from 1 to 1100 gigabytes.
      Example: -m 300G
    • If you do not specify memory per node, the default memory available is 1.87G per core that you request.
  • -C constraint
    • Options include skylake, gpu, gp100, v100, x11. (See table below.)
      Example: -C v100

To specify which project code to charge for your CPU time, set the DAV_PROJECT environment variable as shown below before invoking execdav.

export DAV_PROJECT=UABC0001    # bash
setenv DAV_PROJECT UABC0001    # tcsh
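
As an illustration, a hypothetical invocation combining the options above might look like the following, which requests four cores for two hours on a node with a V100 GPU and 64 GB of memory. The project code and values are placeholders to adapt to your own work.

execdav -a UABC0001 -n 4 -t 02:00:00 -m 64G -C v100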

* * *

Using exechpss

The exechpss command is used to initiate HSI and HTAR file transfers. See examples in Managing files with HSI and Using HTAR to transfer files.


Batch jobs

Prepare a batch script by following one of the examples below. Be aware that the system does not import your Cheyenne environment, so make sure your script loads the software modules that you will need to run the job.

Basic Slurm commands

When your script is ready, run sbatch to submit the job.

sbatch script_name

To check on your job's progress, run squeue.

squeue -u $USER

To get a detailed status report, run scontrol show job followed by the job number.

scontrol show job nnn

To kill a job, run scancel with the job number.

scancel nnn
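
Put together, a typical sequence might look like this sketch; the script name and job ID are placeholders.

sbatch casper_job.sh          # prints "Submitted batch job 1234567"
squeue -u $USER               # check the job's progress
scontrol show job 1234567     # detailed status report
scancel 1234567               # cancel the job if it is no longer needed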

Setting constraints and reserving consumable resources

Batch scripts can include constraints that request nodes having certain features, and they can reserve consumable resources. Setting constraints and reserving resources produce different behavior.

  • If you constrain your job to GPUs, you are simply asking the scheduler to place the job on a node that has GPUs.
  • If you reserve a number of GPUs, your job will have exclusive access to those GPUs.

Minimizing constraints and reservations when possible decreases the length of time your job waits in the queue. When you do use them, make sure that your feature constraints and resource reservations don't conflict. (For example, don't constrain your job to nodes with GP100 GPUs while reserving V100 GPUs.)

Examples of node constraints/features

#SBATCH -C skylake
#SBATCH -C x11
#SBATCH -C v100

Example reserving consumable resources

Use the --gres option to reserve specific resources, such as GPUs. This Slurm directive would reserve two V100 GPUs:

#SBATCH --gres=gpu:v100:2
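
Constraints and reservations can also be combined. The following sketch, using feature and GPU names from the table below, would constrain the job to a node that has four V100 GPUs and reserve two of them:

#SBATCH -C 4xv100
#SBATCH --gres=gpu:v100:2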

Node count   Constraints/features          Consumable resources
12           skylake                       (none)
8            skylake, gpu, x11, gp100      gpu:gp100:1
2            skylake, gpu, v100, 4xv100    gpu:v100:[1-4]
2            skylake, gpu, v100, 8xv100    gpu:v100:[1-8]

Wall-clock

The wall-clock limit on the Casper cluster is 24 hours.

Specify the wall-clock time your job needs as in the examples below, which use the hours:minutes:seconds format; it can be shortened to minutes:seconds.

NVMe node-local storage

Casper nodes each have 2 TB of local NVMe solid-state disk (SSD) storage. Some of that space is used to augment memory and reduce the likelihood of jobs failing because of excessive memory use.

NVMe storage can also be used while a job is running. (Recommended only for I/O-intensive jobs.) Data stored in /local_scratch/$SLURM_JOB_ID are deleted when the job ends.

To use this disk space while your job is running, include the following in your batch script after customizing as needed.

### Copy input data to NVMe (can check that it fits first using "df -h")
cp -r /glade/scratch/$USER/input_data /local_scratch/$SLURM_JOB_ID

### Run script to process data (NCL example takes input and output paths as command line arguments)
ncl proc_data.ncl /local_scratch/$SLURM_JOB_ID/input_data /local_scratch/$SLURM_JOB_ID/output_data

### Move output data before the job ends and your output is deleted
mv /local_scratch/$SLURM_JOB_ID/output_data /glade/scratch/$USER/

Script examples

The examples below show how to create a script for running an MPI job. Additional script examples are available on related documentation pages.

For tcsh users

Insert your own project code where indicated and customize other settings as needed for your own job.

#!/bin/tcsh
#SBATCH -J job_name
#SBATCH -n 8
#SBATCH --ntasks-per-node=4
#SBATCH --mem=8G
#SBATCH -t 00:60:00
#SBATCH -A project_code
#SBATCH -p dav
#SBATCH -e job_name.err.%J
#SBATCH -o job_name.out.%J

setenv TMPDIR /glade/scratch/$USER/temp
mkdir -p $TMPDIR

module purge
module load gnu ncarenv ncarcompilers
module load openmpi

srun ./mpihello

For bash users

Insert your own project code where indicated and customize other settings as needed for your own job.

#!/bin/bash -l
#SBATCH -J job_name
#SBATCH -n 8
#SBATCH --ntasks-per-node=4
#SBATCH --mem=8G
#SBATCH -t 00:60:00
#SBATCH -A project_code
#SBATCH -p dav
#SBATCH -e job_name.err.%J
#SBATCH -o job_name.out.%J

export TMPDIR=/glade/scratch/$USER/temp
mkdir -p $TMPDIR

module purge
module load gnu ncarenv ncarcompilers
module load openmpi

srun ./mpihello

Compiling your code

CISL recommends using the default Intel, GNU or PGI compilers for parallel programs.

  1. Load the compiler.
  2. Load the openmpi module if you plan to use MPI.
  3. Compile your code as you usually do. 

Serial programs can use any compiler.
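
For example, after starting an interactive job on a Casper node with execdav, an MPI program could be built with the same modules the batch scripts above load. This is a minimal sketch; the source file name mpihello.c is a placeholder for your own code.

module purge
module load gnu ncarenv ncarcompilers
module load openmpi
mpicc -o mpihello mpihello.c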