Summary of your results?!

g++ -O3 -fopenmp mainEx1.cpp mylib.cpp -o dotprod

I added Makefile, CLANG_default.mk, and GCC_default.mk to your repository.

Linux:
Using g++
> make run EX=Ex5
Using clang++
> make run EX=Ex5 COMPILER=CLANG_

Added #include // GH: transform() in mylib.pp:2

1: No scheduling tested. reduction_vec_append() is implemented but never tested.
2: mainEx2.cpp: no parallelization at all, neither OpenMP nor C++ execution policies.
3: mainEx3.cpp: nested parallelization in count_goldbach(), single_goldbach(). Did that pay off?!
4: Try collapse(2) in bench_funcs.cpp:75 (faster in my code for Mat-Mat-Mult).
5: Why not use my provided 2D FEM code? That was already required in exercise 3. You coded only 1D FEM, which is quite simple to parallelize because of the very simple matrix pattern.
jacobi_par_parallel(), build_fem_system_atomic(): OK
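
Regarding item 4: a minimal sketch of what collapse(2) could look like for a Mat-Mat-Mult. This is not the code in bench_funcs.cpp; the function name matmul_collapse and the row-major flat-vector storage are assumptions made only for illustration. collapse(2) needs the i and j loops to be perfectly nested, i.e. no statements between the two loop headers.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not taken from bench_funcs.cpp):
// returns C = A * B for n x n matrices stored row-major in flat vectors.
std::vector<double> matmul_collapse(const std::vector<double>& A,
                                    const std::vector<double>& B,
                                    int n)
{
    std::vector<double> C(static_cast<std::size_t>(n) * n, 0.0);

    // collapse(2) fuses the i and j loops into a single iteration space
    // of n*n iterations, which is then distributed over the threads.
    // Without -fopenmp the pragma is ignored and the loop runs serially.
    #pragma omp parallel for collapse(2)
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            double sum = 0.0;  // declared inside the loop, so private per iteration
            for (int k = 0; k < n; ++k) {
                sum += A[i * n + k] * B[k * n + j];
            }
            C[i * n + j] = sum;  // each (i,j) is written by exactly one iteration
        }
    }
    return C;
}
```

Compile with g++ -O3 -fopenmp as above. Whether collapse(2) pays off depends on n and the number of threads, so time it against the plain parallel for on the outer loop.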