Performance tools
Contents: Demonstration of performance tools with simple examples
CUDA
Visual profiler: nvvp (make prof)
Memory checker: cuda-memcheck (make mem)
OpenACC (PGI)
Profiling: pgcollect, pgprof (make prof)
Debugger: pgdbg
Tutorials:
Rob Farber: profiling and improving
Mark Harris: Part 3
Hints by PGI
OpenACC performance tuning and profiling by Cray [p.1-p.10]
NVIDIA talk: Profiling and Tuning OpenACC Code
Scalasca: free performance tool for parallel computing.