Response 3
This commit is contained in:
parent 3a69c0494f
commit c21fee7862
2 changed files with 35 additions and 0 deletions
BIN
sheet3/gh_code_Schmidt.pdf
Normal file
Binary file not shown.
35
sheet3/gh_response.txt
Normal file
@@ -0,0 +1,35 @@
* output.txt contains the results of runs in directories
* [345], [6] add matrix dimension to output
* very low GFLOP rates in [345]

* in [6]: GiByte/s too high on my workstation for MB = 1700, since it exceeds the peak bandwidth of 47.68 GB/s
===== Benchmark B =====
1.7e+07
bytes: 2.31472e+07
Timing in sec. : 5.15627e-05
GFLOPS : 104.398
GiByte/s : 418.083
Did the compiler optimize away the loop in (A) and (B)?
==> No. The 1700 x 1700 matrix occupies about 22 MiB (1700 * 1700 * 8 bytes = 23.12e6 bytes) < 32 MB L3 cache (with 8 * 0.5 MB L2 cache), so the data is served from cache rather than from main memory.
A vector occupies 13 kB, so (in theory) 2 vectors fit into the 32 kB L1 cache (per core).
* [6] using MB = 1700*5:
===== Benchmark B =====
8.5e+07
bytes: 5.78136e+08
Timing in sec. : 0.023487
GFLOPS : 5.72981
GiByte/s : 22.9246
* clang-tidy *.cpp -checks=llvm-*,-llvm-header-guard -header-filter=.* -enable-check-profile -extra-arg="-std=c++17" -extra-arg="-fopenmp" -- *.cpp > gh_codecheck.txt
* [7] use larger matrices (2000x2000 or similar)
* see also annotated PDF