IMAGe Brown Bag- Statistical-based Compression of Climate Model Output

04/06/2016 - 12:00pm to 1:00pm
ML- Chapman Room

Stefano Castruccio
Newcastle University

Wednesday, April 6, 2016
Mesa Lab, Chapman Room
(Bring your lunch)

An active area of research at NCAR focuses on how to compress climate model output to mitigate the storage pressure from the ever-increasing quantity of data produced by modern ensembles. Current research mostly relies on lossy algorithms, applied both to the CESM1 ensemble for the CMIP5 experiment and, most recently, to the Large Ensemble Project. These algorithms are based on truncation of significant digits in the output, achieve a compression rate of at most 1:5, and introduce errors that have been shown to be indistinguishable from the internal variability of an ensemble for many quantities.
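As a rough illustration of what truncating significant digits means, here is a minimal Python sketch. It is not the algorithm used at NCAR, which the abstract does not specify; the function name `truncate_significant` and the parameter `digits` are hypothetical and chosen only for this example.

```python
import numpy as np

def truncate_significant(x, digits):
    """Keep only the first `digits` significant decimal digits of each value.

    Illustrative only: a hypothetical helper, not a production compressor.
    """
    x = np.asarray(x, dtype=np.float64)
    out = np.zeros_like(x)
    nonzero = x != 0
    # Decimal exponent of each nonzero value, e.g. 2 for 287.456.
    exponent = np.floor(np.log10(np.abs(x[nonzero])))
    # Shift so the digits to keep sit left of the decimal point.
    scale = 10.0 ** (digits - 1 - exponent)
    # Drop the remaining digits (truncate toward zero), then shift back.
    out[nonzero] = np.trunc(x[nonzero] * scale) / scale
    return out
```

The discarded digits cannot be recovered, which is what makes the scheme lossy. The truncated values carry less information, so they would typically be passed to a conventional compressor afterwards.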

In this talk, we propose a different, statistics-based approach to data compression that achieves compression rates one to two orders of magnitude higher than the lossy approach. The key idea is to fit a space-time statistical model to the variability around the mean in an initial-condition ensemble, and to regard the parameter estimates as the compressed data. I will discuss the advantages and disadvantages of this approach, as well as further extensions that tailor compression schemes to preserve only some local (but crucial) properties of the original output.
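The key idea can be sketched in a few lines of Python under strong simplifying assumptions. The sketch below treats a single location, so the ensemble is an array of shape (members, time steps), and it uses a first-order autoregressive model, AR(1), as a stand-in for the space-time model in the talk, whose actual form the abstract does not give. The names `compress` and `emulate` are hypothetical.

```python
import numpy as np

def compress(ensemble):
    """Summarize an (n_members, n_time) ensemble by its mean and AR(1) parameters.

    Assumption: an AR(1) model in time is a toy stand-in for a space-time model.
    """
    ensemble = np.asarray(ensemble, dtype=np.float64)
    mean = ensemble.mean(axis=0)        # ensemble mean at each time step
    resid = ensemble - mean             # variability around the mean
    prev, curr = resid[:, :-1], resid[:, 1:]
    # Least-squares estimate of the AR(1) coefficient, pooled over members.
    phi = np.sum(prev * curr) / np.sum(prev ** 2)
    # Standard deviation of the one-step innovations.
    sigma = np.std(curr - phi * prev)
    return {"mean": mean, "phi": phi, "sigma": sigma}

def emulate(params, n_members, rng):
    """Simulate surrogate members from the stored parameters.

    These are new draws from the fitted model, not the original runs.
    Assumes |phi| < 1 so that the stationary variance exists.
    """
    mean, phi, sigma = params["mean"], params["phi"], params["sigma"]
    n_time = mean.shape[0]
    resid = np.zeros((n_members, n_time))
    # Start each member from the stationary AR(1) distribution.
    resid[:, 0] = rng.normal(0.0, sigma / np.sqrt(1.0 - phi ** 2), size=n_members)
    for t in range(1, n_time):
        resid[:, t] = phi * resid[:, t - 1] + rng.normal(0.0, sigma, size=n_members)
    return mean + resid
```

In this toy setting, a 20-member ensemble with 500 time steps (10,000 values) is replaced by 502 stored numbers: the mean series plus `phi` and `sigma`. That figure only illustrates the mechanism and says nothing about the rates reported in the talk. What comes back from `emulate` is a statistically similar ensemble, not a reconstruction of the individual runs.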