WRF Performance and Scaling Assessment

08/02/2013 - 3:25am to 3:45am
ML - Main Seminar Room
Christopher Kruse


Benchmarking and scaling assessments of the Weather Research and Forecasting (WRF) model were performed on the Yellowstone supercomputer at the NCAR-Wyoming Supercomputing Center using 512 to 64K cores.  A large test case simulating Hurricane Katrina at 1-km resolution, with nearly 0.4 billion grid cells, served as the benchmark workload.  Hybrid MPI-OpenMP parallelization of WRF was tested by varying the numbers of MPI tasks and OpenMP threads.  The Intel MPI Library and IBM Parallel Environment MPI implementations were also tested and compared.  Simulation speed (simulated time/wall clock time) was found to scale nearly linearly through 16K cores, with appreciable gains continuing at core counts beyond 16K.  While compute time decreased with increasing core counts, the time to complete operations involving disk I/O (e.g., processing of initial and boundary conditions, writing output) under default I/O settings increased with core count, overwhelming the gains in simulation speed at 2K cores for this case.  Asynchronous I/O (quilting) and splitting of input and output files were explored to overcome these limitations.
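
The metric used in the abstract, simulation speed (simulated time/wall clock time), and the way I/O time can offset compute gains can be illustrated with a minimal Python sketch. The function names and the notion of "parallel efficiency" relative to a baseline core count are assumptions introduced here for illustration; they are not taken from the talk, and no measured values from the study are reproduced.

```python
def simulation_speed(simulated_seconds, wall_clock_seconds):
    """Simulation speed as defined in the abstract: simulated time / wall clock time."""
    return simulated_seconds / wall_clock_seconds


def effective_speed(simulated_seconds, compute_seconds, io_seconds):
    """Simulation speed when wall clock time is split into compute and disk I/O.

    The split into two additive terms is an illustrative assumption.
    """
    return simulated_seconds / (compute_seconds + io_seconds)


def parallel_efficiency(base_cores, base_speed, cores, speed):
    """Ratio of achieved speedup to ideal (linear) speedup over a baseline run.

    A value of 1.0 means perfectly linear scaling from the baseline.
    """
    achieved = speed / base_speed
    ideal = cores / base_cores
    return achieved / ideal


if __name__ == "__main__":
    # Placeholder numbers chosen only to exercise the functions, not benchmark results.
    print(simulation_speed(3600, 600))
    print(effective_speed(3600, 600, 300))
    print(parallel_efficiency(512, 2.0, 1024, 4.0))
```

Under this framing, compute time that shrinks with core count raises simulation speed, while an I/O term that grows with core count lowers the effective speed, which is the trade-off the abstract describes for default I/O settings.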