The number of processor cores
available in high-performance computing systems is steadily increasing.
In the November 2012 list of the TOP500 supercomputers, only three
systems have fewer than 4,096 processor cores, and the average is almost
30,000 cores, an increase of 12,000 in just one year. Even the
median system size is already over 15,300 cores. While these machines
promise ever more compute power and memory capacity to tackle today's
complex simulation problems, they force application developers to
greatly enhance the scalability of their codes in order to exploit
this potential. To better support developers in porting and tuning their
applications, many parallel-tools research groups have already started
to scale their methods, techniques, and tools to extreme processor counts.
In this talk, we survey existing performance analysis and optimization
tools covering both profiling and tracing techniques, report on our
experience using them in extreme-scale environments, review methods
and techniques that already work at this scale alongside promising new
ones, and discuss strategies for addressing the issues that remain
unsolved. The performance instrumentation and measurement package
Score-P and the parallel performance analysis toolset Scalasca will be
discussed in greater detail.
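
To illustrate the kind of instrumentation such tools rely on, the sketch
below shows how a developer might manually annotate an MPI code with
Score-P user regions. This example is not taken from the talk: the
relax() kernel, the file names, and the process counts are hypothetical,
while the region macros, the scorep compiler wrapper, and the scalasca
convenience commands come from the Score-P and Scalasca packages; exact
usage may vary between versions.

```c
/*
 * Minimal sketch: manual Score-P user-region instrumentation of a toy
 * MPI code. Build with the Score-P compiler wrapper, e.g.:
 *   scorep --user mpicc -o toy toy.c
 */
#include <mpi.h>
#include <stdio.h>
#include <scorep/SCOREP_User.h>

/* Hypothetical compute kernel standing in for real application work. */
static double relax(double x, int iters)
{
    SCOREP_USER_REGION_DEFINE(relax_region)
    SCOREP_USER_REGION_BEGIN(relax_region, "relax",
                             SCOREP_USER_REGION_TYPE_COMMON)
    for (int i = 0; i < iters; ++i)
        x = 0.5 * (x + 2.0 / x);   /* dummy Newton iteration */
    SCOREP_USER_REGION_END(relax_region)
    return x;
}

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);   /* MPI calls are intercepted automatically */

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double local = relax(1.0 + rank, 1000000);
    double global = 0.0;
    MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("result: %f\n", global);

    MPI_Finalize();
    return 0;
}
```

A typical Scalasca workflow on top of such a measurement would run the
instrumented binary under the measurement nexus, e.g.
`scalasca -analyze mpirun -np 1024 ./toy`, and then inspect the
resulting experiment directory with `scalasca -examine`; command names
and options depend on the installed Score-P and Scalasca versions.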