Response 3

2025-11-12 18:02:36 +01:00 · c21fee7862
commit c21fee7862
parent 3a69c0494f
2 changed files with 35 additions and 0 deletions
--- a/sheet3/gh_code_Schmidt.pdf
+++ b/sheet3/gh_code_Schmidt.pdf
--- a/sheet3/gh_response.txt
+++ b/sheet3/gh_response.txt
@@ -0,0 +1,35 @@
 * output.txt contains the results of runs in directories
 * [345], [6]: add the matrix dimension to the output
 * [345]: very low GFLOPS rates
 * [6]: GiByte/s too high on my workstation for MB = 1700, since the value exceeds the peak bandwidth of 47.68 GB/s
 ===== Benchmark B =====
 1.7e+07
 bytes: 2.31472e+07
 Timing in sec. : 5.15627e-05
 GFLOPS         : 104.398
 GiByte/s       : 418.083
  Has the compiler optimized the loop away in (A) and (B)?
  ==> No: the 1700 x 1700 matrix occupies about 22 MiB (1700*1700*8 bytes), which is less than the 32 MB L3 cache (with 8*0.5 MB L2 cache).
      A vector occupies 13 kB, so (theoretically) 2 vectors fit into one 32 kB L1 cache (per core).
 * [6] using MB = 1700*5:
 ===== Benchmark B =====
 8.5e+07
 bytes: 5.78136e+08
 Timing in sec. : 0.023487
 GFLOPS         : 5.72981
 GiByte/s       : 22.9246
 * clang-tidy *.cpp -checks=llvm-*,-llvm-header-guard -header-filter=.* -enable-check-profile -extra-arg="-std=c++17" -extra-arg="-fopenmp" -- *.cpp > gh_codecheck.txt
 * [7]: use larger matrices (2000x2000 or similar)
 * see also annotated PDF