SIParCS 2015 - Mittapalli

Shreya Mittapalli, New Jersey Institute of Technology

Lossy Compression of Structured Scientific Data Sets

(Slides) (Recorded Talk)

Like the more widely studied Fourier transforms, wavelet transforms can be used to approximate signals or functions as a superposition of simpler basis functions. Because of their multi-resolution and information compaction properties, wavelets are widely used for lossy compression in numerous consumer multimedia applications (e.g., images, music, and video). Recently, researchers have been investigating whether lossy compression may have a role in addressing the problems of big scientific data sets. Many choices exist when selecting compression parameters. We have been attempting to determine experimentally the optimal parameter choices for compressing numerical simulation data using wavelets. To that end, we analyzed three data sets: two WRF simulations of hurricanes (Katrina and Sandy) and a Taylor-Green turbulence simulation. We constructed a Python framework that allowed us to vary compression parameters such as wavelet type and block size, and we plotted the lmax (maximum pointwise) and RMSE errors, computed against the original data, at different compression ratios. Our goal is to minimize distortion for a desired output file size while also constraining the computational cost of compressing and decompressing the data.
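The kind of experiment described above can be illustrated with a minimal sketch. This is not the project's framework: it assumes a one-dimensional orthonormal Haar wavelet (chosen only because it is simple to write out), a synthetic sine signal standing in for simulation data, and compression by keeping only the largest-magnitude coefficients. The function names (`haar_forward`, `haar_inverse`, `compress`, `rmse`, `lmax`) are hypothetical, and block size is not modeled here.

```python
import math

def haar_forward(signal):
    """Multi-level orthonormal Haar transform; length must be a power of two."""
    coeffs = list(signal)
    n = len(coeffs)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    s = math.sqrt(2.0)
    while n > 1:
        half = n // 2
        approx = [(coeffs[2 * i] + coeffs[2 * i + 1]) / s for i in range(half)]
        detail = [(coeffs[2 * i] - coeffs[2 * i + 1]) / s for i in range(half)]
        coeffs[:n] = approx + detail  # averages first, then differences
        n = half
    return coeffs

def haar_inverse(coeffs):
    """Invert haar_forward by rebuilding from the coarsest level up."""
    out = list(coeffs)
    s = math.sqrt(2.0)
    n = 1
    while n < len(out):
        merged = []
        for a, d in zip(out[:n], out[n:2 * n]):
            merged.append((a + d) / s)
            merged.append((a - d) / s)
        out[:2 * n] = merged
        n *= 2
    return out

def compress(signal, ratio):
    """Keep roughly len(signal)/ratio largest coefficients, zero the rest,
    and return the reconstructed (lossy) signal."""
    coeffs = haar_forward(signal)
    keep = max(1, int(len(coeffs) / ratio))
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    kept = set(order[:keep])
    truncated = [c if i in kept else 0.0 for i, c in enumerate(coeffs)]
    return haar_inverse(truncated)

def rmse(a, b):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

def lmax(a, b):
    """Maximum pointwise absolute error."""
    return max(abs(x - y) for x, y in zip(a, b))

# Synthetic stand-in for simulation data (an assumption, not the WRF or
# Taylor-Green data sets).
signal = [math.sin(2 * math.pi * i / 64) + 0.25 * math.sin(2 * math.pi * i / 8)
          for i in range(256)]

results = {}
for ratio in (1, 4, 16, 64):
    approx = compress(signal, ratio)
    results[ratio] = (rmse(signal, approx), lmax(signal, approx))
    print(f"ratio {ratio:3d}:1  RMSE={results[ratio][0]:.4e}  lmax={results[ratio][1]:.4e}")
```

Because the transform is orthonormal, discarding more coefficients can only increase the RMSE, so the printed table gives a small error-versus-compression-ratio curve of the sort that would be plotted. A real study would swap in other wavelet families, multi-dimensional transforms, and blocked data.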

Mentors: John Clyne and Alan Norton, CISL