Performance Analysis and Optimization of Scientific Applications on Extreme-Scale Computer Systems


Bernd Mohr

The number of processor cores available in high-performance computing systems is steadily increasing. In the November 2012 list of the TOP500 supercomputers, only three systems have fewer than 4,096 processor cores, the average is almost 30,000 cores (an increase of 12,000 in just one year), and even the median system size is already over 15,300 cores. While these machines promise ever more compute power and memory capacity for tackling today's complex simulation problems, they also force application developers to greatly enhance the scalability of their codes in order to exploit these resources. To better support developers in the porting and tuning process, many parallel tools research groups have already started to scale their methods, techniques, and tools to extreme processor counts.

In this talk, we survey existing performance analysis and optimization tools, covering both profiling and tracing techniques; report on our experience using them in extreme-scale environments; review methods and techniques that already work at scale as well as promising new ones; and discuss strategies for addressing unsolved issues and problems. The performance instrumentation and measurement package Score-P and the parallel performance analysis toolset Scalasca will be discussed in greater detail.
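
To make the measurement workflow concrete, the following minimal C/MPI sketch marks a user-defined region with the Score-P user instrumentation API; the program, its region name, and the loop are illustrative examples, not taken from the talk. The build and run commands in the leading comment follow the documented Score-P compiler-wrapper and Scalasca convenience commands.

/* Minimal sketch: manual Score-P user instrumentation of an MPI code.
 * Build (assumes Score-P is installed):  scorep --user mpicc -o demo demo.c
 * Profile run under Scalasca:            scalasca -analyze mpirun -np 4 ./demo
 * Trace run (for wait-state analysis):   SCOREP_ENABLE_TRACING=true \
 *                                        scalasca -analyze -t mpirun -np 4 ./demo
 */
#include <mpi.h>
#include <stdio.h>
#include <scorep/SCOREP_User.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Mark a code region so it appears as a distinct entry in the
     * Score-P profile and in Scalasca's trace analysis report. */
    SCOREP_USER_REGION_DEFINE(compute_region);
    SCOREP_USER_REGION_BEGIN(compute_region, "compute",
                             SCOREP_USER_REGION_TYPE_COMMON);

    double local = 0.0, global = 0.0;
    for (int i = 0; i < 1000000; ++i)   /* stand-in for real work */
        local += 1.0 / (1.0 + i + rank);

    SCOREP_USER_REGION_END(compute_region);

    /* Collective communication: time that ranks spend waiting here is
     * what Scalasca's wait-state analysis attributes to load imbalance. */
    MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("result = %f\n", global);

    MPI_Finalize();
    return 0;
}

After a run, the resulting experiment directory can be inspected with scalasca -examine, which opens the analysis report in the Cube browser; automatic compiler instrumentation (omitting --user and the macros) is an equally valid starting point.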