Submitting jobs with PBS

Users schedule jobs to run on the Cheyenne supercomputer by submitting them through the PBS Pro workload management system. (To submit jobs to run on Geyser or Caldera, see Starting Geyser and Caldera jobs from Cheyenne.)

Batch jobs

To submit a batch job to the Cheyenne queues, use the qsub command followed by the name of your PBS batch script file.

qsub script_name

Alternative - qcmd: For running resource-intensive tasks such as CESM and WRF builds or other compiles with three or more threads, CISL provides qcmd. This script starts a non-interactive job on a single batch node in the Cheyenne "regular" queue with a default wall-clock time of 1 hour. The first CESM example below uses the default settings. The second shows how to specify a longer wall-clock time if needed.

qcmd -- ./case.build
qcmd -l walltime=12:00:00 -- ./case.build

Output from the build script is printed to the user's terminal window.

Also see Specify a project code below.

Interactive jobs

To start an interactive job on Cheyenne, use the qsub command as in this example, which includes a select statement.

qsub -I -l select=1:ncpus=36:mpiprocs=36 -l walltime=01:00:00 -q regular -A project_code

Alternative - qinteractive: You can start an interactive job in the share queue for one hour by executing the CISL "qinteractive" script. The script is available if the ncarenv environment module is loaded, as it is by default.

qinteractive

You can also customize the job by including additional arguments, as in this example:

qinteractive -l select=2:ncpus=10:mpiprocs=10

Caution: See Using MPT in Cheyenne share queue if you intend to run MPI jobs interactively in that queue.

Specify a project code

If you can charge your work to more than one project, specify which one to use by setting your PBS_ACCOUNT environment variable to the appropriate project code; otherwise, the system will choose one of your project codes at random. For convenience, you can set the variable in your shell startup files.
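
For example, you can set the variable in tcsh or bash as follows (the project code shown is a placeholder; substitute one of your own):

setenv PBS_ACCOUNT UABC0001
export PBS_ACCOUNT=UABC0001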


Resource requests and restrictions

Some resource requests are specified "per chunk" in a select statement. On the Cheyenne system, a chunk generally means a node, and your select statement defines what resources you need on each node. The first select statement in the interactive example above requests one chunk of resources and specifies the number of CPUs and MPI processes on the node.

Per-chunk resource example | Type | Description
-l select=1:mpiprocs=36 | Request | Specify the number of MPI tasks to start on the node with the loaded MPI library
-l select=1:ncpus=36 | Request | Specify the number of CPUs to use on the node
-l select=1:ompthreads=36 | Request | Specify the number of OpenMP threads to start on the node (defaults to ncpus if not set explicitly)
-l select=1:vmem=1GB | Restriction | Limit the amount of virtual memory available to all concurrent processes on the node
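
Several per-chunk resources are often combined in a single select statement. As a sketch, a hypothetical hybrid MPI/OpenMP job might request two nodes with 6 MPI tasks per node and 6 OpenMP threads per task:

#PBS -l select=2:ncpus=36:mpiprocs=6:ompthreads=6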


Other resources are global, which means they apply to all of the processes and threads in your job rather than to individual chunks. Cheyenne jobs require a walltime request in the format shown, but resource restrictions are optional. If a global resource restriction is exceeded during program execution, the job is terminated.

Global resource example | Type | Description
-l cput=10:00 | Restriction | Limit the amount of CPU time the job can use across all of its CPUs
-l file=1GB | Restriction | Limit the maximum size of any single file that the job can create
-l pcput=10:00 | Restriction | Limit the amount of CPU time that any single process in the job can use
-l pmem=2GB | Restriction | Limit the amount of physical memory that any single process can use
-l pvmem=2GB | Restriction | Limit the amount of virtual memory that any single process can use
-l walltime=01:00:00 | Request | Specify the amount of wall-clock time (hours:minutes:seconds) the job can use, up to the queue limit
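
These limits are combined with the required walltime request as needed. For example, a set of hypothetical directives requesting two hours of wall-clock time and capping the size of any single file the job creates at 2 GB could look like this:

#PBS -l walltime=02:00:00
#PBS -l file=2GB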


Job scripts

Create your job script after carefully considering the system's usable memory and your program's memory requirements, which queue to specify, wall-clock time, and other parameters:

Script size – Your PBS job script can be no larger than 10 MB. Larger scripts, such as those that encode decadal climate data, should be modified to invoke a separate data encoding script.

Usable memory – The system's usable memory and how to specify your memory requirements in a job script are discussed below. To determine how much memory your program requires to run, see Checking memory use.

Job queues – Determine which queue is most appropriate for your job and provide accurate wall-clock times to help fit your job into the earliest possible run opportunity.

Redirecting output – If your job will create large stdout or stderr files, redirect the output to a file in order to avoid job failure, which can result from overflowing the spool directory of the job's first node. MPI jobs are more likely than others to fail this way because their numerous tasks can greatly increase the size of stdout and stderr. You can redirect output from the job's executable command, as shown in the example below, by appending > output_file (using your own name for the output file).

This is an example of a basic PBS script for running an MPI job. Each line that begins with #PBS is a PBS directive. Customize this example and other sample scripts on this page by substituting your own job name, project code, queue specification, email address, and so on where indicated.

#!/bin/tcsh
#PBS -N job_name
#PBS -A project_code
#PBS -l walltime=01:00:00
#PBS -q queue_name
#PBS -j oe
#PBS -m abe
#PBS -M your_email_address
#PBS -l select=2:ncpus=36:mpiprocs=36

### Set TMPDIR as recommended
setenv TMPDIR /glade/scratch/$USER/temp
mkdir -p $TMPDIR

### Run the executable
mpiexec_mpt dplace -s 1 ./executable_name.exe > output_file

The select statement directive in the example indicates that the job requires:

  • 2 chunks of resources (on non-shared Cheyenne queues, this is 2 nodes)
  • 36 cores or individual processors (ncpus=36) on each node
  • 36 MPI processes (mpiprocs=36) on each node.

CISL recommends that users set TMPDIR when running batch jobs on the Cheyenne compute nodes, as explained here: Storing temporary files with TMPDIR.

The mpiexec_mpt command is a wrapper script that sends information from PBS to the HPE Message Passing Toolkit (MPT), which includes the HPE MPI library. With some manual configuration, you can use mpirun instead. See Intel MPI and Open MPI.

Running an MPI job with MPT in the Cheyenne "share" queue requires a different script than what is used for other queues. See Using MPT in Cheyenne share queue.

The HPE dplace command is used on the last line to pin processes to the CPUs, preventing them from migrating and adversely affecting performance. The dplace command and the omplace wrapper script are also used in other CISL script examples. To learn about using omplace and dplace for more precise control of process placement, see Using omplace and dplace and the commands' man pages.

Specifying memory requirements

Select statements can also specify memory requirements. Cheyenne jobs run by default on nodes that have 64 GB of total memory (45 GB usable). To run a job on the 128-GB nodes (109 GB usable), include mem=109GB in your select statement.

#PBS -l select=2:ncpus=36:mpiprocs=36:mem=109GB

Loading modules in a batch script

Users sometimes need to execute module commands from within a batch job—to load an application, for example, or to load or remove other modules.

To ensure that the module commands are available, insert the appropriate line below in your batch script before any module commands.

In a tcsh script:

source /etc/profile.d/modules.csh

In a bash script:

source /etc/profile.d/modules.sh

Once that line is included, you can run module purge if necessary and then load just the modules needed to establish the software environment that your job requires.
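
For example, a tcsh batch script might include lines like these after sourcing the initialization file (the module names are illustrative; load whatever your job actually requires):

module purge
module load ncarenv intel mpt netcdf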

Using a login environment on batch nodes

It is possible to use the login environment for some jobs that you run on the Cheyenne system's exclusive-use batch nodes. For example, you might want to compile your code in the system's "regular" queue rather than the "share" queue, which limits jobs to using no more than 18 cores. Another job might need libraries that are available by default only on the login nodes while also using more than 18 cores.

To run such a job on the batch nodes but using the login node environment, include this resource request as a PBS directive in your job script, or add it to your qsub command:

-l inception=login
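
For example, it can appear as a directive in the script or be added when submitting (script_name is a placeholder for your own script file):

#PBS -l inception=login
qsub -l inception=login script_name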

Managing and monitoring jobs

This information has been moved to a separate page.