Tutorial: Using Dask on HPC Systems

Feb. 6, 2023

1:00 – 5:00 pm MST

NCAR's Mesa Lab and Virtual

CISL’s Consulting Services Group and the NCAR Earth System Data Science (ESDS) Initiative are planning a half-day tutorial for users who want to use Dask effectively on HPC resources such as Casper and Cheyenne. The four-hour tutorial will be split into two sections: the first covers topics for novice Dask users, and the second covers intermediate usage on HPC systems and the associated best practices. The knowledge areas covered include (but are not limited to):

  • Beginner section
    • High-level collections including dask.array and dask.dataframe 
    • Distributed Dask clusters using HPC job schedulers
    • Earth Science data analysis using Dask with Xarray
    • Using the Dask dashboard to understand your computation 
  • Intermediate section
    • Optimizing the number of workers and memory allocation
    • Choosing appropriate chunk shapes and sizes for Dask collections
    • Querying resource usage and debugging errors

The tutorial will take place from 1 to 5 p.m. MST on February 6 and can be attended either virtually or in person at the NCAR Mesa Lab. If you are interested in participating in one or both sections, please use this form to register by noon MST on February 3.

This tutorial is open to participants who are not UCAR staff. If you don't have access to the HPC systems, you may not be able to follow along with every part of the tutorial, but you are welcome to join and listen in, as the information may still be useful!

Contact

Please direct questions/comments about this page to: