SIParCS 2015 - Marcus

Kyle Marcus, University at Buffalo

Improving the Thread Scalability of the Community Atmosphere Model

(Slides) (Recorded Talk n/a)

Future computer system architectures will introduce changes on many levels, especially in thread scalability. The next generation of supercomputers will support very large thread counts, and software will need to be modified to take advantage of them. The upcoming Intel Haswell enterprise chips will include up to 18 cores (36 threads with hyper-threading), and the Intel Knights Landing (KNL) accelerator will include 72 cores with four threads per core. A node with 4 Haswell sockets and 4 KNLs would provide an immense number of threads (18*2*4 + 72*4*4 = 1296 threads)! This Intel hardware will be used to build the next supercomputers, possibly including the successor to Yellowstone. With this motivation, the project goal this summer was to improve the thread scalability of the Community Atmosphere Model code (specifically HOMME) in preparation for running on future architectures. Using OpenMP, the scalability of the vertical threading sections was analyzed and improved where needed. The results showed that thread scalability with OpenMP can be improved by collapsing loops and combining parallel sections.
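
The two techniques named in the results, collapsing loops and combining parallel sections, can be illustrated with a minimal C sketch. This is not code from HOMME: the array shape, the constants NLEV and NP, and both function names are hypothetical and were chosen only for illustration. The first function opens two parallel regions and threads only the outer loop, so at most NLEV threads can do useful work. The second opens one region and collapses all three loops, which exposes NLEV*NP*NP iterations to the threads.

```c
#define NLEV 26 /* hypothetical number of vertical levels */
#define NP 4    /* hypothetical number of points per element edge */

/* Baseline: two separate parallel regions, each threading only the
 * outer (vertical) loop. Parallelism is limited to NLEV iterations and
 * the thread team is forked and joined twice. */
void update_separate(double t[NLEV][NP][NP], double dt[NLEV][NP][NP],
                     double scale)
{
    #pragma omp parallel for
    for (int k = 0; k < NLEV; k++)
        for (int j = 0; j < NP; j++)
            for (int i = 0; i < NP; i++)
                t[k][j][i] += dt[k][j][i];

    #pragma omp parallel for
    for (int k = 0; k < NLEV; k++)
        for (int j = 0; j < NP; j++)
            for (int i = 0; i < NP; i++)
                t[k][j][i] *= scale;
}

/* Sketch of the improved form: a single parallel region containing both
 * loop nests, each collapsed so that all NLEV*NP*NP iterations are
 * shared among the threads. The implicit barrier at the end of the
 * first "omp for" ensures the additions finish before scaling starts. */
void update_combined(double t[NLEV][NP][NP], double dt[NLEV][NP][NP],
                     double scale)
{
    #pragma omp parallel
    {
        #pragma omp for collapse(3)
        for (int k = 0; k < NLEV; k++)
            for (int j = 0; j < NP; j++)
                for (int i = 0; i < NP; i++)
                    t[k][j][i] += dt[k][j][i];

        #pragma omp for collapse(3)
        for (int k = 0; k < NLEV; k++)
            for (int j = 0; j < NP; j++)
                for (int i = 0; i < NP; i++)
                    t[k][j][i] *= scale;
    }
}
```

Both functions compute the same result. Compiled without OpenMP support, the pragmas are ignored and the code runs serially; with OpenMP enabled (for example `-fopenmp` with GCC), the combined version gives a large thread team more iterations to share and pays the region startup cost once instead of twice.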

Mentors: John Dennis and Ben Jamroz, CISL TDD