Ph.D.-Seminar:
High Performance Computing I (WS 20/21)
Under
Construction
- Contents:
- We will start with an introduction to the basic principles and
algorithms of parallel computing and then transfer selected
algorithms to many-core architectures. We will focus first on
NVIDIA GPUs using CUDA, including multi-GPU setups. Students
will compare the performance of the algorithms on the GPU with
their performance on multi-core CPUs using OpenMP. Recent compilers for
OpenACC that support GPU programming will be available for testing and
benchmarking. A similar programming scheme can be used for the many-core
accelerator card Xeon Phi with the Intel compilers and OpenMP 4.0.
- A technical report has to be submitted by the end of the term.
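As a starting point for the CPU side of the GPU/CPU comparison described above, a minimal sketch of a serial and an OpenMP inner product is given here. It is an illustration only, not course material: the function names `dot_serial`, `dot_omp` and the timing helper `seconds` are hypothetical, and the inner product is chosen merely as a simple example of a reduction.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Serial reference: inner product of two equally sized vectors.
double dot_serial(const std::vector<double>& x, const std::vector<double>& y)
{
    double sum = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        sum += x[i] * y[i];
    }
    return sum;
}

// OpenMP version: each thread accumulates a private partial sum and the
// reduction clause combines the partial sums at the end of the loop.
// If OpenMP is not enabled at compile time, the pragma is ignored and
// the loop simply runs serially.
double dot_omp(const std::vector<double>& x, const std::vector<double>& y)
{
    double sum = 0.0;
    const long long n = static_cast<long long>(x.size());
    #pragma omp parallel for reduction(+ : sum)
    for (long long i = 0; i < n; ++i) {
        sum += x[i] * y[i];
    }
    return sum;
}

// Hypothetical timing helper: wall-clock seconds needed for one call of f.
template <class F>
double seconds(F&& f)
{
    const auto t0 = std::chrono::steady_clock::now();
    f();
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(t1 - t0).count();
}
```

A call such as `seconds([&]{ r = dot_omp(x, y); })` from a `main` function returns the elapsed time of one run. With GCC, OpenMP is enabled by the `-fopenmp` flag. Timings are only meaningful for vectors large enough to outweigh the thread start-up cost.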
Course
Material: Follow the link.
See templates.
Compilers:
- GCC/ICC installation on
Windows/Linux
- Environment
variables for compilers and git on mephisto,
gpu11, with NVIDIA and PGI
compilers
- check out the git repository
- Lecturer:
- Prof.
Gundolf
Haase, Heinrichstr. 36, Zi 506, Tel. 5178,
Appointments: Friday 14:00 - 15:30 in Heinrichstr. 36, SR
11.32
- Projects:
- Hardware
(login from outside KFU
only via VPN):
- TBA
- Remote login to servers:
- A VPN connection to KFU is required: install the
AnyConnect software via the VPN Service
(server: https://univpn.uni-graz.at;
login: KFU e-mail)
- Linux: use
ssh -X 143.50.47.xxx
to connect to the
compute server
- Windows: Install WinSSHTerm
with a guided installation of further packages (putty, winscp,
X-Server)
- Material
for CUDA:
- Getting Started
with CUDA (pdf)
- first simple code incl. makefile
- Code in mephisto for CUDA/PGI-OpenACC (add-on for ~/.bashrc)
- Material
for OpenACC (PGI):
Material for CUDA:
- NVIDIA: Hardware
Donation Program
- Online
course CUDA
- CUDA Toolkit Documentation: all
- CUDA Toolkit/SDK 6.5 Download,
Documentation,
- CUDA 6.5: blog
by Mark Harris, Performance report, CUDA
7.0 (1,2),
- new: Tutorials
on CUDA, OpenCL, Thrust, Nsight, PGI
- NVIDIA, CUDA,
OpenCL,
OpenCL for
NVIDIA
- CUDA Programming: Getting Started,
Guide,
Reduction in CUDA
- Slides by M.Liebmann (1,
2, 3, 4, 5, 6)
- AMD, Radeon: Developer
Center
- List
of GPU-accelerated libraries; Thrust
1.7 (C++ STL in CUDA),
ppt
- GPGPU.org
- Software/Compiler/Hardware:
- FLAMEGPU: Flexible Large
Scale Agent Modelling Environment for the GPU
- Nvidia: Pascal
with 3840 cores
- OpenACC (Cray,
NVIDIA, PGI, CAPS), Quick
Ref
- CUBLAS,
CUFFT,
CUSPARSE,
CURAND, Thrust
1.7, CuSolver,
- LAPACK on GPU (Info):
cuLA
- PGI-Compiler
with CUDA pragmas [prices]
- CUDA programs also run on CPUs
- HMPP
workbench with pragmas for CUDA/OpenCL [prices]
- OP2
project by Mike Giles,
great Course
by Mike Giles (see also the guest talks)
- PETSc
on GPU
- LIBJACKET,
C++/C library for GPU computing (Download)
- Kepler-GK110: 1,
2,
3,
4
- Tesla K80: 1,
; Tesla K20: 1,
2,
3,
Top
500
- AMD: FirePro
S1000, Kaveri
(856 GFLOPS, update),
hUMA,
APU13,
- NVIDIA and Unified Memory Access [Nov
15, 2013]
- Intel® Xeon® Phi: 1,
2,
3,
4,
Stampede
- Further
Links
- gpgpu.org, gpucomputing.net
- MultiCoreInfo, GPGPU
- Comparison
GPU/CPU
- New: NVIDIA OpenCL 1.0
(download),
- New: Intel OpenCL SDK
1.1
- New: GTC
[30.09.2009, 01.10.2009,
Fermi
in
c't 22/09], NEXUS
(visual studio based) , Radeon
HD
5800 [23.09.2009], comparison
- Tesla with Fermi
[16.11.2009], GF100
[18.01.2010], Tesla
C2050, Quadro
6000
- Vienna supercomputer
[VSC-2] [Nov. 27,
2009], login to Tesla in Vienna
- Chinese
GPU-supercomputers [May 31, 2010], nebulae,
auto-tuning,
- Aubrey
Isle / Knights Ferry co-processor card by Intel [June 1, 2010;
c't 13/2010, p.20], comparison
with Fermi, Compiler,
1
TFLOP DGEMM [Nov. 16, 2011]
- AMD Llano
[Oct. 19, 2010], current
[June 2011]
- NVIDIA GPUs in servers
by
Cray [Sept. 22, 2010],
- next generation: Kepler
and
Maxwell [Sept. 22, 2010]
- Chinese Tianhe-1A
at rank 1 in the Top 500 [Oct. 28, 2010]: 14336
Xeon + 7168 Tesla (2.5 PFLOPS, 4.04 MW) located at NSC
in Tianjin (see Spiegel,
Heise,
nvidia).
More
details.
- BlueGene/Q with 17 cores (Heise)
- Mathematica 8
supports GPU computing
- low-energy supercomputer at the University of Frankfurt (Heise)
- Top 500 [June 2011] (heise)
- Oak
Ridge plans with 18000 Kepler-GPUs [Oct. 11, 2011]; Titan
[Oct. 30, 2012]
- MS-AMP
[Feb 2012]
- Cray
XC30 using Xeon Phi [Nov. 9, 2012]; Titan
- NVIDIA: Volta
(1 TB/s bandwidth)
- Qarnot Computing
-
21.10.2020