Power-efficient computing produces more science per watt

By Brian Bevirt
05/26/2017 - 3:30pm

This article is part of a CISL News series describing the many ways CISL improves modeling beyond providing supercomputing systems and facilities. These articles briefly describe CISL modeling projects and how they benefit the research community.

Optimization improvements on both platforms
This chart shows the impact of ASAP code optimizations on the energy usage of the HOMME dynamical core at a production resolution (100-kilometer grid spacing with 70 vertical levels in the atmosphere) running a WACCM-like configuration on NCAR’s Cheyenne supercomputer and the Cori supercomputer at NERSC. The simulation rate is shown on the X axis and the number of megajoules per simulated year on the Y axis. Optimization reduces the energy usage on both platforms by about a factor of two versus the original code.
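As a rough illustration of what the chart’s vertical axis measures, energy per simulated year is average power draw multiplied by the wall-clock time needed to simulate one year, so doubling the simulation rate at the same power draw halves the energy cost. The short sketch below works through that arithmetic with hypothetical numbers; they are not measurements from Cheyenne or Cori.

    #include <stdio.h>

    int main(void) {
        /* Hypothetical values for illustration only -- not measurements. */
        double nodes          = 1000.0;  /* nodes used by the run              */
        double watts_per_node = 350.0;   /* average power draw per node (W)    */
        double sim_years_per_day = 5.0;  /* simulation rate: simulated years
                                            per wall-clock day                 */

        /* Wall-clock seconds needed to simulate one year. */
        double seconds_per_sy = 86400.0 / sim_years_per_day;

        /* Energy = power x time, converted from joules to megajoules. */
        double mj_per_sy = nodes * watts_per_node * seconds_per_sy / 1.0e6;

        printf("Energy per simulated year: %.1f MJ\n", mj_per_sy);
        return 0;
    }

With these assumed numbers the run costs about 6,000 MJ per simulated year; a code change that doubles the simulation rate at the same power draw cuts that figure in half.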

CISL’s Application Scalability and Performance (ASAP) group is working to optimize supercomputer applications on current and future computer architectures. After studying a large number of small pieces of code from the Community Earth System Model (CESM), the Model for Prediction Across Scales (MPAS), and the Weather Research and Forecasting (WRF) model, the ASAP group consistently observed great potential for significantly reducing execution times. “We typically see that it is possible to reduce the execution time by a factor of two,” reports ASAP group head John Dennis. “Code optimization not only reduces the time to complete a simulation, but it also reduces the cost to perform science. Because the cost of electricity is a significant component of the cost of operating a massively parallel computer like Cheyenne, code optimization makes it possible to do science faster and cheaper.”

Supercomputing applications consist of collections of mathematical expressions organized into algorithms that model the behavior of a physical system. The ASAP group is working to optimize the parts of the CESM that reproduce the behavior of the most important components of the Earth’s climate system: atmosphere, oceans, land surface, land ice, and sea ice.

Typically, the software and hardware used to model the Earth’s physical systems evolve together. In the late 1980s and early 1990s, most of the models’ codes operated on vector operands and executed on vector processors, which apply an identical set of instructions to multiple floating-point values at once. The shift to scalar processors in the mid-1990s forced NCAR modeling applications to evolve to a new scalar design. This hardware and software evolution has continued, and ironically, the ASAP group now spends much of its time optimizing code by re-introducing vector constructs into application codes to take advantage of the current generation of vector-capable processors.
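As a simple sketch of what re-introducing vector constructs can look like in practice (this is illustrative C, not code from CESM, MPAS, or WRF), a loop that walks through memory with unit stride over a fixed inner dimension lets the compiler issue one vector instruction that operates on several floating-point values at once, whereas indirect or strided access forces it back to scalar instructions.

    /* Illustrative only -- not actual model code.  The second routine
     * touches memory with unit stride over a fixed inner dimension,
     * which lets the compiler generate vector instructions that update
     * several values per instruction. */

    #define NCOL 16   /* hypothetical inner (column) dimension */

    /* Hard for the compiler to vectorize: indirect, scattered access. */
    void update_scalar(double *q, const double *tend, const int *idx,
                       double dt, int n) {
        for (int i = 0; i < n; i++)
            q[idx[i]] += dt * tend[idx[i]];
    }

    /* Vector-friendly: contiguous, stride-1 access over the inner loop. */
    void update_vector(double q[][NCOL], const double tend[][NCOL],
                       double dt, int nlev) {
        for (int k = 0; k < nlev; k++)
            for (int i = 0; i < NCOL; i++)   /* unit-stride inner loop */
                q[k][i] += dt * tend[k][i];
    }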

The ASAP group has been optimizing CESM codes by modifying the way they use instructions to access processor caches and main memory. An example of how the ASAP staff’s code modifications are producing “more science per watt” is the optimization of the implicit chemistry solver used within the Whole Atmosphere Community Climate Model (WACCM). In collaboration with members of NCAR’s Atmospheric Chemistry Observations and Modeling laboratory, the group resurrected vector code that had not been used for 15 years, then made some seemingly minor changes to the way in which variables are declared. The revised code was integrated back into the CESM2 code base and will be used during the first very long, resource-intensive CMIP6 runs. These optimizations will produce a 20% reduction in the time to perform this first simulation.
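The article does not show the specific WACCM solver changes, but the hypothetical example below illustrates how a declaration-level change (promising the compiler that arrays do not alias, and fixing a short inner extent at compile time) can unlock vectorization and better cache reuse without altering the numerical work.

    /* Hypothetical illustration of a declaration-level change; the actual
     * WACCM chemistry solver modifications are not shown here. */

    /* Before: pointer arguments may alias and the extent is unknown at
     * compile time, so the compiler must generate cautious scalar code. */
    void axpy_before(double *y, const double *x, double a, int n) {
        for (int i = 0; i < n; i++)
            y[i] += a * x[i];
    }

    /* After: 'restrict' promises no aliasing, and the fixed extent is
     * known at compile time, so the loop can be unrolled and vectorized
     * over short, contiguous, cache-resident operands. */
    enum { VLEN = 4 };  /* hypothetical fixed inner extent */

    void axpy_after(double * restrict y, const double * restrict x, double a) {
        for (int i = 0; i < VLEN; i++)
            y[i] += a * x[i];
    }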

Optimization improves performance and scalability
This chart shows test results from running ASAP code optimizations in the HOMME dynamical core at high resolution (12-kilometer grid spacing with 128 vertical levels in the atmosphere) running the NGGPS benchmark on three supercomputers with three different Intel processor technologies. Results for Cheyenne (Broadwell) are shown in red, Cori (Knights Landing, the newest technology) in green, and Edison (Ivy Bridge, the oldest technology) in blue. The number of nodes (tightly networked groups of processors) used is shown on a logarithmic scale along the X axis, and the simulation rate, in seconds of computer time per two hours of simulated atmospheric dynamics, is shown on a logarithmic scale along the Y axis. The lines indicate that the optimized HOMME code scales well (faster simulation rates as node count increases) on all three processor types, up to the maximum number of nodes on each system.
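One common way to quantify “scales well” from timings like these is parallel efficiency: the runtime at a reference node count multiplied by that node count, divided by the same product at a larger node count. The sketch below computes it with hypothetical timings; these are not the values plotted in the chart.

    #include <stdio.h>

    int main(void) {
        /* Hypothetical timings (seconds of computer time per two hours of
         * simulated dynamics) at increasing node counts. */
        int    nodes[] = { 256, 512, 1024, 2048 };
        double secs[]  = { 80.0, 42.0, 23.0, 13.0 };
        int n = sizeof(nodes) / sizeof(nodes[0]);

        /* Strong-scaling efficiency relative to the smallest run:
         * efficiency = (t_ref * n_ref) / (t * n).  Values near 1.0 mean
         * the code keeps getting faster in proportion to the nodes added. */
        for (int i = 0; i < n; i++) {
            double eff = (secs[0] * nodes[0]) / (secs[i] * nodes[i]);
            printf("%5d nodes: %6.1f s, efficiency %.2f\n",
                   nodes[i], secs[i], eff);
        }
        return 0;
    }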

ASAP has been collaborating for several years with engineers from Intel Corporation, NERSC, and NCAR’s Climate and Global Dynamics Laboratory (CGD) to optimize the spectral element dynamical core on the Knights Landing (KNL) processor, a Xeon Phi platform. The spectral element dynamical core is the default dynamical core used by the Community Atmosphere Model (CAM) for all high-resolution configurations. The KNL processor provides high-bandwidth on-package memory and is designed for energy-efficient computing. The collaboration with NERSC, Intel, and CGD has produced some very impressive results in both absolute performance and energy usage: compared to NCAR’s Cheyenne supercomputer, the KNL platform achieves the best execution rate for a challenging benchmark while using half the electricity.

Energy efficiency is not a new concept in computing or code optimization. In the past, engineers had to make trade-offs between energy efficiency and high performance. However, this use of KNL illustrates that we can have both energy efficiency and high performance on the same platform.

“I really would love to see the NCAR computing community start to think more about the energy efficiency of their scientific applications,” said John Dennis. “If you look at the ways NCAR employees get to work, it seems like half of the staff use either a Toyota Prius, Nissan Leaf, a bicycle, the NCAR shuttle, or their feet. There really is a strong energy efficiency ethos here with regard to personal transportation. This ethos has not yet translated to the efficiency of the science applications that run on NCAR supercomputers. While there are always exceptions, the NCAR science applications are more analogous to giant SUVs. I hope to see the day when it is part of the NCAR culture to have energy-efficient science applications that rival our energy-efficient modes of transportation.”