Machine learning / deep learning

3/25/2021: This was revised to reflect changes in how to start a Casper job.

CISL provides several libraries for users' machine learning and deep learning (ML/DL) work on Casper nodes.

These libraries have been compiled from source to use native CUDA (GPU) and MPI libraries, which gives them capabilities beyond those of the prebuilt distributions available for download online. The ML/DL libraries are installed in the NCAR Package Library (NPL) versions built for Python 3.7.9.

Users load the libraries by activating the NPL.

The libraries available are:

Starting a job

ML/DL workloads typically target NVIDIA Tesla V100 GPUs. To start an interactive job on a Casper node with a V100 GPU, run the execcasper command with the ngpus=# resource (where # is the number of GPUs) and the gpu_type=v100 resource set as shown in this documentation.
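As a rough illustration of how the two resources fit together, the following Python sketch assembles an execcasper argument list. The helper name build_execcasper_command is hypothetical, and the PBS-style "-l" flags, the select=1 chunk layout, and the ncpus default are assumptions not confirmed by this page; only ngpus=# and gpu_type=v100 come from the text above. Check the linked Casper documentation for the exact syntax before using it.

```python
import shlex
from typing import List


def build_execcasper_command(ngpus: int, gpu_type: str = "v100", ncpus: int = 1) -> List[str]:
    """Assemble an argument list that requests GPUs on a single Casper node.

    Hypothetical helper for illustration only.
    """
    if ngpus < 1:
        raise ValueError("ngpus must be at least 1 for a GPU job")
    if ncpus < 1:
        raise ValueError("ncpus must be at least 1")
    # ASSUMPTION: execcasper accepts PBS-style "-l" resource requests and a
    # "select=" chunk specification. Only ngpus and gpu_type come from the page.
    return [
        "execcasper",
        "-l", "select=1:ncpus={}:ngpus={}".format(ncpus, ngpus),
        "-l", "gpu_type={}".format(gpu_type),
    ]


if __name__ == "__main__":
    # Print the command for review instead of running it.
    command = build_execcasper_command(ngpus=1)
    print(" ".join(shlex.quote(arg) for arg in command))
```

Printing the command rather than executing it keeps the sketch safe to run anywhere; paste the result into a login-node shell only after confirming the flags against the Casper documentation.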

Then load the modules you need, including Python version 3.7.9, and activate the NPL.
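Once the NPL is active, a short check like the one below can confirm that the session is using the expected interpreter and that the libraries you plan to use can be found. This is a minimal sketch using only the Python standard library. The function names are hypothetical, and the module names in the example call ("tensorflow", "torch") are placeholders rather than a statement of what the NPL contains.

```python
import importlib.util
import sys
from typing import Dict, Iterable, Tuple


def python_matches(expected: Tuple[int, int, int] = (3, 7, 9)) -> bool:
    """Return True if the running interpreter has the expected version."""
    return tuple(sys.version_info[:3]) == expected


def find_libraries(names: Iterable[str]) -> Dict[str, bool]:
    """Map each top-level module name to whether Python can locate it.

    find_spec locates a top-level module without importing it, so the
    check stays fast even for large GPU libraries.
    """
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    print("Python 3.7.9 active:", python_matches())
    # ASSUMPTION: placeholder module names; substitute the libraries you need.
    for name, found in find_libraries(["tensorflow", "torch"]).items():
        print("{}: {}".format(name, "found" if found else "not found"))
```

A "not found" result usually means the NPL has not been activated in the current shell or that a different Python module is loaded.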