SIParCS 2018 - Julia Piscioniere
Evaluating the Performance of Large Scale Data Assimilation in Modern Geophysical Models
This project focused on optimizing DART, the Data Assimilation Research Testbed. Data assimilation combines observations with a numerical model to produce more accurate forecasts. The original DART code processes one observation at a time, updating the model state after each observation. The optimized code uses a graph coloring algorithm to sort observations into independent groups (colors). Because the observations within a group are independent of one another, they can be processed together, which reduces communication time. My project was to compare the two codes: we added timers to both to measure how the graph code scales relative to the original. I ran the comparisons on Cheyenne using a bash script that looped over different parameter values, ran DART with each set of parameters, and wrote out the timer files. The graph code, using pre-calculated colors, is consistently faster than the original code. It also scales well up to 256 nodes on a one-degree CAM (Community Atmosphere Model) case, performs better with larger numbers of ensemble members, and reduces broadcast time by about 90% on average.
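The grouping idea can be illustrated with a minimal sketch. It is not DART's implementation: the function names, the greedy coloring strategy, and the rule that two observations "conflict" (for example, because they would update overlapping parts of the model state) are assumptions chosen only for illustration. Observations that share a color have no conflicts with each other, so each color can be handled as one batch.

```python
def greedy_color(num_obs, conflicts):
    """Give each observation the smallest color not used by its neighbors.

    `conflicts` is a list of (a, b) pairs of observation indices that are
    assumed to be unable to share a batch (hypothetical conflict rule).
    """
    neighbors = {i: set() for i in range(num_obs)}
    for a, b in conflicts:
        neighbors[a].add(b)
        neighbors[b].add(a)

    colors = {}
    for obs in range(num_obs):
        # Colors already taken by neighbors that have been colored so far.
        used = {colors[n] for n in neighbors[obs] if n in colors}
        color = 0
        while color in used:
            color += 1
        colors[obs] = color
    return colors


def group_by_color(colors):
    """Collect observation indices into one list per color, in color order."""
    groups = {}
    for obs, color in colors.items():
        groups.setdefault(color, []).append(obs)
    return [groups[c] for c in sorted(groups)]


# Hypothetical example: five observations, three conflicting pairs.
conflicts = [(0, 1), (1, 2), (3, 4)]
colors = greedy_color(5, conflicts)
groups = group_by_color(colors)
print(groups)  # [[0, 2, 3], [1, 4]]
```

In this toy case the five observations fall into two batches rather than five sequential steps, which is the kind of reduction in per-observation communication the graph code aims for.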
Mentors: John Dennis, Brian Dobbins