Using omplace options | Using dplace and placement files

The omplace command is an HPE wrapper script for the HPE dplace tool, which controls process placement. It pins processes and threads to specific CPUs so they cannot migrate. For many Cheyenne users, running omplace as shown in these PBS Pro scripts is sufficient to ensure that processes do not migrate among CPUs and degrade performance.
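For reference, a minimal sketch of a job script that runs an executable under omplace in this default way. The job name, project code, queue, and executable name are placeholders; substitute your own.

#!/bin/bash
#PBS -N placement_example
#PBS -A PROJECT0001
#PBS -q regular
#PBS -j oe
#PBS -l walltime=00:10:00
#PBS -l select=1:ncpus=36:mpiprocs=36

# Launch one rank per requested CPU; omplace generates the
# placement file and invokes dplace automatically.
mpiexec_mpt omplace ./executable_name.exe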

Users who need more precise control of process placement can use a number of omplace options or the dplace command itself. The Cheyenne node configuration diagram below is intended to help you visualize the placement examples that follow.

Cheyenne dual-socket node configuration

[Diagram: Cheyenne dual-socket node configuration]
This illustration shows the configuration of a Cheyenne node, where each of two sockets has 18 cores for a total of 36 physical cores per node. Hyper-threading is enabled, so there are 72 logical cores. The processor IDs (0/36, 1/37, and so on) indicate how the physical CPU numbers correspond to logical CPU numbers. The cores on a socket share the L3 cache, while each core has its own L2 cache.
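If you want to confirm this mapping on a node, the standard util-linux lscpu utility reports it directly; logical CPUs N and N+36 should show the same core ID. A quick check:

# List each logical CPU with its physical core and socket.
# On a Cheyenne node, CPU 0 and CPU 36 share a CORE value,
# CPU 1 and CPU 37 share the next, and so on.
lscpu --extended=CPU,CORE,SOCKET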

Using omplace options: -vv, -c cpulist

If you use omplace without explicitly defining CPU placement, the script invokes dplace for you and generates a placement file dynamically.

-vv

Use the -vv option on your command line to include the placement file in your job output for reference. (The -v option will include portions of the file.)

mpiexec_mpt omplace -vv ./executable_name.exe

Here is an example of output from omplace -vv:

omplace information:
 MPI type is HPE MPI, 36 threads, thread model is intel
 placement file /tmp/omplace.file.ktrout.72921:
 fork skip=0 exact cpu=0-71:36
 thread oncpu=0 cpu=1-35 noplace=1 exact
 thread oncpu=36 cpu=37-71 noplace=1 exact

This is the select statement that was used in the job script:

#PBS -l select=1:ncpus=36:mpiprocs=36
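The CPU lists in this file, and in the -c option described next, use the dplace cpulist syntax, in which a strided range m-n:s selects every s-th CPU from m through n. A few readings of the lists above, offered as an interpretation per the dplace(1) and dplace(5) man pages rather than as file content:

0-71:36    the range 0-71 taken at a stride of 36; that is, CPUs 0 and 36
1-35       CPUs 1 through 35
0-71:2     every second CPU from 0 through 70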

-c cpulist

Include this option on your command line with omplace to specify a list or range of CPUs to be used. The CPUs can be specified in several ways: as individual CPU numbers, comma-separated lists, ranges, or strided ranges.

Examples:

mpiexec_mpt omplace -c 0-3,18-21 ./executable_name.exe
mpiexec_mpt omplace -c 0,17-18,35 ./executable_name.exe
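The CPU list should account for every process and thread being placed. To pair the first command above with eight MPI ranks, one per listed CPU, a matching select statement might look like the following sketch; the rank count of eight is an assumption chosen to match the eight CPUs in the list.

#PBS -l select=1:ncpus=36:mpiprocs=8

mpiexec_mpt omplace -c 0-3,18-21 ./executable_name.exe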

See the omplace(1) man page on Cheyenne for additional information.


Using dplace and placement files

Advanced users may decide to create their own placement files for special situations. Here are two placement files and associated select statements, followed by an example of how to specify the file on your command line.

Example 1

Placement file for 18 MPI ranks x 2 OpenMP threads per node, 36 threads in total. Although hyper-threading is enabled on Cheyenne nodes, it is not used here because only 36 threads are running, one per physical core.

fork skip=0  exact cpu=0-71:2 
thread oncpu=0 cpu=1 noplace=1  exact
thread oncpu=2 cpu=3 noplace=1  exact
thread oncpu=4 cpu=5 noplace=1  exact
thread oncpu=6 cpu=7 noplace=1  exact
thread oncpu=8 cpu=9 noplace=1  exact
thread oncpu=10 cpu=11 noplace=1  exact
thread oncpu=12 cpu=13 noplace=1  exact
thread oncpu=14 cpu=15 noplace=1  exact
thread oncpu=16 cpu=17 noplace=1  exact
thread oncpu=18 cpu=19 noplace=1  exact
thread oncpu=20 cpu=21 noplace=1  exact
thread oncpu=22 cpu=23 noplace=1  exact
thread oncpu=24 cpu=25 noplace=1  exact
thread oncpu=26 cpu=27 noplace=1  exact
thread oncpu=28 cpu=29 noplace=1  exact
thread oncpu=30 cpu=31 noplace=1  exact
thread oncpu=32 cpu=33 noplace=1  exact
thread oncpu=34 cpu=35 noplace=1  exact

To run a job using that placement file on a single node, you would use this select statement in your PBS job script:

#PBS -l select=1:ncpus=36:mpiprocs=18:ompthreads=2
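Placement files like this are repetitive, so it can be convenient to generate them with a short shell loop instead of writing each line by hand. A minimal sketch that reproduces the file above; the output file name my_placement.txt is arbitrary.

{
  # One fork line, then one thread line per even-numbered CPU 0-34:
  # each MPI rank runs on an even CPU and its OpenMP partner thread
  # runs on the adjacent odd CPU.
  echo "fork skip=0 exact cpu=0-71:2"
  for cpu in $(seq 0 2 34); do
    echo "thread oncpu=$cpu cpu=$((cpu + 1)) noplace=1 exact"
  done
} > my_placement.txt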

Example 2

Placement file for 18 MPI ranks x 4 OpenMP threads per node, 72 threads in total. Hyper-threading is used here because 72 threads run on each node, one per logical core.

fork skip=0  exact cpu=0-71:4 
thread oncpu=0 cpu=1-3 noplace=1  exact
thread oncpu=4 cpu=5-7 noplace=1  exact
thread oncpu=8 cpu=9-11 noplace=1  exact
thread oncpu=12 cpu=13-15 noplace=1  exact
thread oncpu=16 cpu=17-19 noplace=1  exact
thread oncpu=20 cpu=21-23 noplace=1  exact
thread oncpu=24 cpu=25-27 noplace=1  exact
thread oncpu=28 cpu=29-31 noplace=1  exact
thread oncpu=32 cpu=33-35 noplace=1  exact
thread oncpu=36 cpu=37-39 noplace=1  exact
thread oncpu=40 cpu=41-43 noplace=1  exact
thread oncpu=44 cpu=45-47 noplace=1  exact
thread oncpu=48 cpu=49-51 noplace=1  exact
thread oncpu=52 cpu=53-55 noplace=1  exact
thread oncpu=56 cpu=57-59 noplace=1  exact
thread oncpu=60 cpu=61-63 noplace=1  exact
thread oncpu=64 cpu=65-67 noplace=1  exact
thread oncpu=68 cpu=69-71 noplace=1  exact

To run a job using that placement file on a single node, you would use this select statement in your PBS job script:

#PBS -l select=1:ncpus=72:mpiprocs=18:ompthreads=4

Submitting the job

To use your own placement file when you submit a job, include the dplace -p option and the file name on the command line in your PBS job script.

mpiexec_mpt dplace -p filename ./executable_name.exe
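Putting the pieces together for Example 2, a complete job script might look like the following sketch. The job name, project code, queue, placement file name, and executable are placeholders.

#!/bin/bash
#PBS -N hybrid_18x4
#PBS -A PROJECT0001
#PBS -q regular
#PBS -j oe
#PBS -l walltime=01:00:00
#PBS -l select=1:ncpus=72:mpiprocs=18:ompthreads=4

# Place 18 ranks x 4 threads according to the Example 2 file.
mpiexec_mpt dplace -p ./my_placement.txt ./executable_name.exe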

See these man pages on Cheyenne for additional details and usage examples: dplace(1), dplace(5).