-------------- Task 1 --------------
// ---- Speedup for scalar ----
// Threads | Time   | Speedup
//       1 | 0.032s | 1
//       2 | 0.022s | 1.45
//       4 | 0.021s | 1.52
//       8 | 0.022s | 1.45
//      16 | 0.022s | 1.45

-------------- Task 2 --------------
2 threads have been started.
Minimum:    1.000000
Maximum:    1000.000000
Arithmetic: 498.184000
Geometric:  364.411859
Harmonic:   95.685690
Deviation:  287.905085
Execution time: 0.000251

-------------- Task 3 --------------
4 threads have been started.
single_goldbach(k = 694) = 19
Decompositions for k = 694:
3 + 691, 11 + 683, 17 + 677, 41 + 653, 47 + 647, 53 + 641, 101 + 593,
107 + 587, 131 + 563, 137 + 557, 173 + 521, 191 + 503, 227 + 467,
233 + 461, 251 + 443, 263 + 431, 293 + 401, 311 + 383, 347 + 347

count_goldbach(n = 10000):   k = 9240,    decompositions = 329,   time elapsed: 0.965918 milliseconds
count_goldbach(n = 100000):  k = 99330,   decompositions = 2168,  time elapsed: 22.705675 milliseconds
count_goldbach(n = 400000):  k = 390390,  decompositions = 7094,  time elapsed: 290.159072 milliseconds
count_goldbach(n = 1000000): k = 990990,  decompositions = 15594, time elapsed: 2079.505610 milliseconds
count_goldbach(n = 2000000): k = 1981980, decompositions = 27988, time elapsed: 25809.757501 milliseconds

Should be:
k              = 9240, 99330, 390390, 990990, 1981980, 9699690
decompositions = 329, 2168, 7094, 15594, 27988, 124180
-> Not much faster?

-------------- Task 4 --------------
32 threads have been started.
----- Benchmark (B) -----
Memory allocated  : 0.715 GByte
Duration per loop : 0.022 sec
GFLOPS            : 8.173
GiByte/s          : 32.700
-------------------------
----- Benchmark (C) -----
Memory allocated  : 0.026 GByte
Duration per loop : 0.088 sec
GFLOPS            : 21.138
GiByte/s          : 0.296
-------------------------
----- Benchmark (D) -----
Memory allocated  : 0.015 GByte
Duration per loop : 0.069 sec
GFLOPS            : 5.389
GiByte/s          : 0.216
-------------------------

Speedup: summation
k \ threads |      1|      2|      3|      4|      5|      6|      7|      8|      9|     10|     11|     12|     13|     14|     15|     16|
          3 | 1.0000| 0.0245| 0.0346| 0.1735| 0.0904| 0.1119| 0.0694| 0.0997| 0.0862| 0.0296| 0.0260| 0.0466| 0.0621| 0.0549| 0.0297| 0.0007|
          4 | 1.0000| 0.0107| 0.0655| 0.1648| 0.1524| 0.1390| 0.0914| 0.0590| 0.0689| 0.0616| 0.0291| 0.0441| 0.0319| 0.0587| 0.0012| 0.0010|
          5 | 1.0000| 0.3598| 0.0411| 0.1168| 0.1475| 0.0992| 0.0765| 0.0928| 0.0822| 0.0927| 0.0352| 0.0602| 0.0550| 0.0674| 0.0044| 0.0010|
          6 | 1.0000| 0.4073| 0.0473| 0.2216| 0.2124| 0.2095| 0.1657| 0.1580| 0.1480| 0.1397| 0.1399| 0.1358| 0.1267| 0.0465| 0.0805| 0.0996|
          7 | 1.0000| 0.7302| 0.6892| 0.7880| 0.6080| 0.7542| 0.7682| 0.5308| 0.5688| 0.5979| 0.5030| 0.4456| 0.3850| 0.2684| 0.4358| 0.0447|
          8 | 1.0000| 2.6435| 3.3757| 2.4117| 3.4190| 4.7954| 3.3526| 2.8954| 4.7947| 4.8622| 2.6566| 4.3404| 5.0582| 4.2450| 3.4107| 3.8654|
          9 | 1.0000| 2.1727| 2.3811| 3.0651| 5.3540| 4.5753| 4.8977| 7.9730| 7.8982| 6.6374|10.6732|12.1762| 4.4168| 2.5704| 3.6872| 3.3859|

Speedup: scalar
k \ threads |      1|      2|      3|      4|      5|      6|      7|      8|      9|     10|     11|     12|     13|     14|     15|     16|
          3 | 1.0000| 0.2678| 0.5313| 0.5062| 0.3329| 0.2381| 0.2406| 0.2091| 0.1866| 0.0588| 0.1120| 0.0787| 0.1135| 0.0037| 0.0171| 0.0005|
          4 | 1.0000| 0.6541| 0.4114| 0.3849| 0.3311| 0.2969| 0.2423| 0.1231| 0.1788| 0.2430| 0.1339| 0.1497| 0.0807| 0.1912| 0.1223| 0.0009|
          5 | 1.0000| 0.3351| 0.4708| 0.3720| 0.3325| 0.3002| 0.2844| 0.2314| 0.2237| 0.2303| 0.1854| 0.1978| 0.1521| 0.1317| 0.1633| 0.0013|
          6 | 1.0000| 0.6177| 0.6917| 0.5664| 0.5029| 0.4694| 0.4497| 0.4150| 0.3407| 0.3572| 0.2950| 0.2419| 0.0765| 0.1299| 0.2160| 0.1146|
          7 | 1.0000| 1.2031| 1.7445| 1.2619| 1.9429| 2.0026| 1.8708| 1.7237| 1.3480| 1.3512| 0.6757| 0.8699| 1.0127| 0.7778| 0.8944| 1.1033|
          8 | 1.0000| 2.0439| 3.1783| 2.4128| 3.9504| 4.2336| 3.4501| 5.3944| 5.9051| 4.9925| 2.5838| 5.1547| 6.1956| 6.1280| 5.2520| 5.2100|
          9 | 1.0000| 1.4889| 1.6942| 1.7735| 1.5782| 1.6020| 1.7236| 1.9074| 1.8863| 1.9077| 1.6988| 1.5458| 1.3436| 1.1935| 1.3728| 1.2548|

-------------- Task 5 --------------
parallelized functions:
    CRS_MATRIX.CalculateLaplace()
    mesh.SetValues()
    CRS_MATRIX.ApplyDirichletBC()
    for JacobiSolve(): vdaxpy, vddiv, dscapr, CRS_MATRIX.Defect

make ./main.GCC_ -n X^2

######################################################
There are 9 processes running.
Intervalls: 300 x 300
Start Jacobi solver for 10201 d.o.f.s
aver. Jacobi rate : 0.997922 (1000 iter)
final error: 0.124971 (rel) 0.000194029 (abs)
JacobiSolve: timing in sec. : 0.028673
ASCI file square_100.txt opened
17361 2 34320 3
Start Jacobi solver for 17361 d.o.f.s
aver. Jacobi rate : 0.998401 (1000 iter)
final error: 0.201744 (rel) 0.000265133 (abs)
JacobiSolve: timing in sec. : 0.057275

######################################################
There are 1 processes running.
Intervalls: 100 x 100
Start Jacobi solver for 10201 d.o.f.s
aver. Jacobi rate : 0.997922 (1000 iter)
final error: 0.124971 (rel) 0.000194029 (abs)
JacobiSolve: timing in sec. : 0.080427
ASCI file square_100.txt opened
17361 2 34320 3
Start Jacobi solver for 17361 d.o.f.s
aver. Jacobi rate : 0.998401 (1000 iter)
final error: 0.201744 (rel) 0.000265133 (abs)
JacobiSolve: timing in sec. : 0.193201