Casper

The Casper cluster is a heterogeneous system of specialized data analysis and visualization resources and large-memory, multi-GPU nodes. Casper is the successor to the Geyser and Caldera clusters, which were decommissioned at the end of 2018.

NCAR's Casper system comprises 28 Supermicro nodes featuring Intel Skylake processors.

  • 20 Supermicro SuperWorkstation nodes are used for data analysis and visualization jobs. Each node has 36 cores and up to 384 GB of memory. Eight of the nodes also have an NVIDIA Quadro GP100 GPU.
  • 6 additional nodes feature large-memory, dense GPU configurations to support explorations in machine learning (ML) and deep learning (DL) in atmospheric and related sciences.
  • 2 serve as login nodes.

See the Hardware section below for detailed specifications.
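As a quick cross-check of the node counts above, the following Python sketch totals the compute resources from the figures listed in the Hardware section. The NodeGroup class and the totals function are illustrative names, not part of any NCAR tool. Login nodes are counted but otherwise excluded because their specifications are not given here, and the memory total is an upper bound because the data analysis nodes have "up to" 384 GB.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeGroup:
    """One homogeneous group of Casper compute nodes (illustrative type)."""
    name: str
    nodes: int
    cores_per_node: int
    max_mem_gb: int
    gpus_per_group: int


# Figures copied from the Hardware section of this page.
GROUPS = [
    # 8 of the 20 data analysis nodes carry one Quadro GP100 each.
    NodeGroup("DAV", nodes=20, cores_per_node=2 * 18, max_mem_gb=384, gpus_per_group=8),
    NodeGroup("ML/DL, 4 V100", nodes=2, cores_per_node=2 * 18, max_mem_gb=768, gpus_per_group=2 * 4),
    NodeGroup("ML/DL, 8 V100", nodes=4, cores_per_node=2 * 18, max_mem_gb=1152, gpus_per_group=4 * 8),
]
LOGIN_NODES = 2  # specifications not listed on this page


def totals(groups):
    """Sum nodes, cores, maximum memory, and GPUs over all groups."""
    return {
        "nodes": sum(g.nodes for g in groups),
        "cores": sum(g.nodes * g.cores_per_node for g in groups),
        "max_mem_gb": sum(g.nodes * g.max_mem_gb for g in groups),
        "gpus": sum(g.gpus_per_group for g in groups),
    }


summary = totals(GROUPS)
print(summary)
print("all nodes including login:", summary["nodes"] + LOGIN_NODES)
```

Under these figures the 26 compute nodes plus the 2 login nodes account for the 28 nodes quoted above.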

Job scheduler: Users submit jobs to run on Casper nodes with the Slurm Workload Manager as documented here.
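To illustrate what a Slurm submission can look like, here is a minimal Python sketch that assembles a batch script as text. The helper name build_batch_script and the example values are hypothetical. The #SBATCH options shown are standard Slurm options, but partition names, account codes, GPU resource names, and limits are site-specific and are not stated on this page, so treat them as placeholders and check the Casper job documentation for the real values.

```python
def build_batch_script(job_name, ntasks, mem_gb, walltime, command,
                       gpus=0, partition=None, account=None):
    """Return the text of a single-node Slurm batch script.

    Hypothetical helper for illustration; it only builds a string and
    does not contact the scheduler.
    """
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        "#SBATCH --nodes=1",
        f"#SBATCH --ntasks={ntasks}",
        f"#SBATCH --mem={mem_gb}G",
        f"#SBATCH --time={walltime}",  # HH:MM:SS
    ]
    if gpus:
        # Generic GPU request; a site may require a typed form instead.
        lines.append(f"#SBATCH --gres=gpu:{gpus}")
    if partition:
        lines.append(f"#SBATCH --partition={partition}")
    if account:
        lines.append(f"#SBATCH --account={account}")
    lines += ["", command]
    return "\n".join(lines) + "\n"


# Example values are assumptions, not Casper defaults.
script = build_batch_script(
    job_name="train_example",
    ntasks=4,
    mem_gb=100,
    walltime="01:00:00",
    command="python train.py",
    gpus=4,
)
print(script)
```

The resulting text can be saved to a file and passed to sbatch, or piped to sbatch on standard input.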

Operating system: CentOS 7.8


Hardware

Data Analysis & Visualization nodes

20 Supermicro 7049GP-TRT SuperWorkstation nodes
Up to 384 GB DDR4-2666 memory per node
2 x 18-core 2.3-GHz Intel Xeon Gold 6140 (Skylake) processors per node
2 TB local NVMe Solid State Disk
1 Mellanox ConnectX-4 100Gb Ethernet connection (GLADE, Campaign Storage, external connectivity)
1 Mellanox ConnectX-6 HDR100 InfiniBand link
1 NVIDIA Quadro GP100 16 GB PCIe GPU on each of 8 nodes

Machine Learning/Deep Learning nodes

2 Supermicro SuperServer nodes with 4 V100 GPUs each
768 GB DDR4-2666 memory per node
2 x 18-core 2.3-GHz Intel Xeon Gold 6140 (Skylake) processors per node
2 TB local NVMe Solid State Disk
1 Mellanox ConnectX-4 100Gb Ethernet connection (GLADE, Campaign Storage, external connectivity)
2 Mellanox ConnectX-6 HDR200 InfiniBand adapters; each CPU socket has an HDR100 link
4 NVIDIA Tesla V100 32GB SXM2 GPUs with NVLink

4 Supermicro SuperServer nodes with 8 V100 GPUs each
1152 GB DDR4-2666 memory per node
2 x 18-core 2.3-GHz Intel Xeon Gold 6140 (Skylake) processors per node
2 TB local NVMe Solid State Disk
1 Mellanox ConnectX-4 100Gb Ethernet connection (GLADE, Campaign Storage, external connectivity)
2 Mellanox ConnectX-6 HDR200 InfiniBand adapters; each CPU socket has an HDR200 link (operating at the maximum capability of the PCIe 3.0 x16 slot, ~128 Gb/s)
8 NVIDIA Tesla V100 32GB SXM2 GPUs with NVLink
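
The ~128 Gb/s figure quoted for the HDR200 links on the 8-GPU nodes can be reproduced with simple PCIe arithmetic. The sketch below assumes the standard PCIe 3.0 parameters (8 GT/s per lane, 128b/130b encoding) and the nominal 200 Gb/s HDR InfiniBand rate. It ignores protocol overhead beyond line encoding, so it gives an upper bound rather than a measured throughput.

```python
# Standard PCIe 3.0 parameters (not specific to Casper).
PCIE3_GT_PER_LANE = 8.0      # signalling rate per lane, GT/s
PCIE3_ENCODING = 128 / 130   # 128b/130b line-encoding efficiency
LANES = 16                   # x16 slot

HDR200_GBPS = 200.0          # nominal HDR InfiniBand link rate, Gb/s

# Raw signalling rate of the slot: the "~128 Gb/s" quoted above.
raw_gbps = PCIE3_GT_PER_LANE * LANES

# Usable rate after line encoding, slightly below the raw rate.
usable_gbps = raw_gbps * PCIE3_ENCODING

# The adapter cannot move data faster than the slot feeding it.
effective_gbps = min(HDR200_GBPS, usable_gbps)

print(f"raw slot rate:    {raw_gbps:.1f} Gb/s")
print(f"after encoding:   {usable_gbps:.2f} Gb/s")
print(f"effective limit:  {effective_gbps:.2f} Gb/s")
```

Under these assumptions the slot, not the 200 Gb/s adapter, is the limiting factor, which is consistent with the note on the HDR200 line above.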