Casper

The Casper cluster is a new, heterogeneous system of specialized data analysis and visualization resources and large-memory, multi-GPU nodes. Casper is the successor to the Geyser and Caldera clusters, which have been in service since 2012 and will be decommissioned by the end of 2018.

NCAR's new Casper system, procured from PCPC Direct, Ltd., consists of 24 Supermicro nodes featuring Intel Skylake processors.

  • 20 Supermicro SuperWorkstation nodes will be used for data analysis and visualization jobs. Each node has 36 cores and 384 GB of memory. Eight of the 20 nodes also feature an NVIDIA GPU.
  • 4 additional nodes feature large-memory, dense GPU configurations to support explorations in machine learning (ML) and deep learning (DL) in atmospheric and related sciences.

See the hardware table below for more detailed specifications.

Job scheduler: Users run jobs on the Casper cluster by logging in to Cheyenne and submitting them with the Slurm Workload Manager.
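
As a rough illustration of the submission step described above, the following sketch writes a minimal Slurm batch script and submits it only if the sbatch command is available. The file name, job name, and account code (PROJECT0000) are hypothetical placeholders, and the script sets no Casper-specific options such as a partition or memory request. See the Casper job documentation for the settings that actually apply.

```shell
#!/bin/bash
# Sketch only: the file name, job name, and account code are placeholders.
job_script="casper_job.sh"

# Write a minimal batch script. The quoted 'EOF' keeps the shell from
# expanding anything inside the here-document.
cat > "$job_script" <<'EOF'
#!/bin/bash
#SBATCH --job-name=casper_test
#SBATCH --account=PROJECT0000
#SBATCH --ntasks=1
#SBATCH --time=00:10:00
#SBATCH --output=casper_test.%j.out

# Report which node the job ran on.
hostname
EOF

# Submit only where the Slurm client tools are installed.
if command -v sbatch >/dev/null 2>&1; then
    sbatch "$job_script"
else
    echo "sbatch not found; wrote $job_script without submitting it"
fi
```

The %j in the output file name is Slurm's placeholder for the job ID, so each run writes to its own log file.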

Operating system: CentOS 7


Transition from using Geyser and Caldera

Geyser and Caldera users can prepare to run jobs on Casper nodes by taking these steps:

  • Review the documentation:
      • Starting jobs on Casper nodes
      • Starting TurboVNC on Casper nodes
      • Compiling GPU code on Casper
      • Compiling multi-GPU MPI/CUDA code on Casper
  • Create or revise job scripts for use on the desired nodes. Casper job scripts are similar to those for Geyser and Caldera.
  • Recompile your codes on the new system. See Compiling code.
  • Register for upcoming training events when they are publicized in the CISL Daily Bulletin.

Casper, Cheyenne, Geyser, and Caldera all mount the central GLADE file systems, so you can analyze your data files in place without sending large amounts of data across a network or creating copies in multiple locations. You do not need to move your files to work on Casper rather than on Geyser or Caldera.


Hardware

Data Analysis & Visualization nodes

20 Supermicro 7049GP-TRT SuperWorkstation nodes

  • 384 GB DDR4-2666 memory per node
  • Two 18-core 2.3-GHz Intel Xeon Gold 6140 (Skylake) processors per node
  • 2 TB local NVMe solid-state disk
  • Mellanox VPI EDR InfiniBand dual-port interconnect (one port configured for FDR and one as 100 GbE)
  • Intel 10 Gb dual-port Ethernet
  • NVIDIA Quadro GP100 GPU on each of 8 nodes

Machine Learning/Deep Learning nodes

2 Supermicro 1029GQ-TVRT SuperServer nodes

  • 768 GB DDR4-2666 memory per node
  • Two 18-core 2.3-GHz Intel Xeon Gold 6140 (Skylake) processors per node
  • 2 TB local NVMe solid-state disk
  • Mellanox VPI EDR InfiniBand dual-port interconnect (one port configured for FDR and one as 100 GbE)
  • Intel 10 Gb dual-port Ethernet
  • NVIDIA Tesla V100 SXM2 GPUs with NVLink

2 Supermicro 4029GP-TVRT SuperServer nodes

  • 1152 GB DDR4-2666 memory per node
  • Two 18-core 2.3-GHz Intel Xeon Gold 6140 (Skylake) processors per node
  • 2 TB local NVMe solid-state disk
  • Mellanox VPI EDR InfiniBand dual-port interconnect (one port configured for FDR and one as 100 GbE)
  • Intel 10 Gb dual-port Ethernet
  • NVIDIA Tesla V100 SXM2 GPUs with NVLink
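
For a sense of scale, the per-node figures listed above can be added up to give cluster-wide totals. The short sketch below does that arithmetic in the shell. The variable names are illustrative only, and the totals are derived solely from the node counts, core counts, and memory sizes listed in this section.

```shell
#!/bin/bash
# Node counts and per-node memory (GB) taken from the hardware list.
dav_nodes=20;      dav_mem_gb=384       # data analysis & visualization
ml_small_nodes=2;  ml_small_mem_gb=768  # 1029GQ-TVRT ML/DL nodes
ml_large_nodes=2;  ml_large_mem_gb=1152 # 4029GP-TVRT ML/DL nodes

# Every node has two 18-core processors.
cores_per_node=$((2 * 18))

total_nodes=$((dav_nodes + ml_small_nodes + ml_large_nodes))
total_cores=$((total_nodes * cores_per_node))
total_mem_gb=$((dav_nodes * dav_mem_gb
              + ml_small_nodes * ml_small_mem_gb
              + ml_large_nodes * ml_large_mem_gb))

echo "nodes: $total_nodes"
echo "cores: $total_cores"
echo "memory (GB): $total_mem_gb"
```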

License use guidelines

The CISL user community shares a limited number of licenses for running MATLAB, MATLAB Toolboxes, and some other applications.

Follow these guidelines to ensure fair access for all users:

  • Avoid monopolizing these licenses.
  • If you need to use multiple licenses at one time, be considerate of others and finish your session as quickly as possible.
  • Close applications when you are done to free up licenses for others to use.

CISL reserves the right to kill jobs/tasks of users who monopolize these licenses.

To see how many licenses are being used, run licstats at your command line.

licstats

Run it with option -h for additional information.

licstats -h

MATLAB alternative - Octave

Many MATLAB codes run with very little or no modification under Octave, a free interactive data analysis software package with syntax and functionality that are very similar to MATLAB's. Because Octave requires no license, we encourage MATLAB users to try it, particularly those who have long-running MATLAB jobs. Octave usually runs slower than MATLAB, by an amount that depends on how compute-intensive the work is, but it may be suitable for most data analysis work, and you won't risk having jobs killed because of a lack of licenses.

To use Octave interactively, start an interactive job and load the module.

module load octave

Run octave to start the command line interface, or run the following command to use the GUI.

octave --force-gui
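
For long-running work it may be more convenient to run a script non-interactively than to keep a session open. The sketch below writes a small script that uses only syntax shared by MATLAB and Octave, then runs it with Octave if the command is available. The file name column_means.m and its contents are hypothetical examples. The sketch also assumes that Octave's --no-gui and --quiet options behave on this system as they do in standard Octave releases.

```shell
#!/bin/bash
# Sketch only: the script name and its contents are illustrative.
script="column_means.m"

# A tiny script that is valid in both MATLAB and Octave.
cat > "$script" <<'EOF'
data = [1 2 3; 4 5 6];
disp(mean(data));
EOF

# Load the Octave module where the module command is defined.
if command -v module >/dev/null 2>&1; then
    module load octave
fi

# Run the script without the GUI or startup banner if Octave is installed.
if command -v octave >/dev/null 2>&1; then
    octave --no-gui --quiet "$script"
else
    echo "octave not found; wrote $script without running it"
fi
```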