SIParCS 2023 - Yuta Norden

Yuta Norden, University of Hawaii Manoa

Reproducible and Scalable Analysis of Remote Sensing Data in the Cloud Using Xarray

Recorded Talk

Science today requires software that enables expressive and easily parallelized workflows on gigabyte- to petabyte-sized datasets. Xarray is an actively developed open-source library that gives scientists a powerful interface for parallelized computation on multi-dimensional raster datasets (e.g., image stacks), which are prevalent across scientific domains. Modern workflows can leverage Xarray to analyze massive cloud-hosted archives such as NASA’s Earth observation archive. Over the summer, the student will learn skills that are key to practicing open science and that transfer to both academic and non-academic career paths. The student will collaborate with a team of research scientists, data scientists, and software developers to produce publicly accessible tutorials that leverage Xarray for scientific analysis of cloud-hosted remote sensing data; contribute to multiple open-source geoscientific Python projects (particularly Xarray and rioxarray) as well as general open-source tools such as Jupyter Book; use cloud-based datasets and computational resources; gain experience with collaborative software development workflows via GitHub; and learn about the technical components of reproducible computational workflows, including testing, continuous integration, and sharing data and results.

Mentors: Julia Kent, Deepak Cherian (CGD), Scott Henderson (External), Jessica Scheick (External)

Slides and poster