rm -f *.exe *.o
gcc -O3 -c -o mysecond.o mysecond.c
gcc -O3 -c mysecond.c
gfortran -O3 -DSTREAM_ARRAY_SIZE=80000000 -DNTIMES=20 -c stream.f
gfortran -O3 stream.o mysecond.o -o stream_f.exe
gcc -O3 -DSTREAM_ARRAY_SIZE=80000000 -DNTIMES=20 stream.c -o stream_c.exe
gcc -O3 -DUNIX flops.c -o flops.exe
./stream_c.exe
-------------------------------------------------------------
STREAM version $Revision: 5.10 $
-------------------------------------------------------------
This system uses 8 bytes per array element.
-------------------------------------------------------------
Array size = 80000000 (elements), Offset = 0 (elements)
Memory per array = 610.4 MiB (= 0.6 GiB).
Total memory required = 1831.1 MiB (= 1.8 GiB).
Each kernel will be executed 20 times.
 The *best* time for each kernel (excluding the first iteration)
 will be used to compute the reported bandwidth.
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 46252 microseconds.
   (= 46252 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function    Best Rate MB/s  Avg time     Min time     Max time
Copy:           28478.6     0.047858     0.044946     0.054333
Scale:          20551.4     0.066044     0.062283     0.077807
Add:            22534.2     0.089671     0.085204     0.099586
Triad:          22709.5     0.088864     0.084546     0.098536
-------------------------------------------------------------
Solution Validates: avg error less than 1.000000e-13 on all three arrays
-------------------------------------------------------------
./flops.exe

   FLOPS C Program (Double Precision), V2.0 18 Dec 1992

   Module     Error        RunTime      MFLOPS
                            (usec)
     1      4.0146e-13      0.0021   6622.7552
     2     -1.4166e-13      0.0006  12723.3419
     3      4.7184e-14      0.0027   6253.2599
     4     -1.2557e-13      0.0026   5758.6323
     5     -1.3800e-13      0.0051   5740.4851
     6      3.2380e-13      0.0051   5674.2511
     7     -8.4583e-11      0.0031   3827.0478
     8      3.4867e-13      0.0053   5610.0203

   Iterations      =  512000000
   NullTime (usec) =     0.0000
   MFLOPS(1)       =  9507.3864
   MFLOPS(2)       =  5042.7572
   MFLOPS(3)       =  5597.4972
   MFLOPS(4)       =  5766.1547
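
As a cross-check on the STREAM output, the "Best Rate MB/s" column can be recomputed from the "Min time" column. STREAM counts two arrays of traffic per element for Copy and Scale and three for Add and Triad, and reports MB as 10^6 bytes. The sketch below is not part of the benchmark; the names (ARRAY_SIZE, best_rate_mb_s, and so on) are illustrative only, and the inputs are the values printed in this run.

```python
# Illustrative recomputation of STREAM "Best Rate MB/s" from "Min time".
# All names here are hypothetical helpers, not part of stream.c.

ARRAY_SIZE = 80_000_000      # from -DSTREAM_ARRAY_SIZE=80000000
BYTES_PER_ELEMENT = 8        # "This system uses 8 bytes per array element."

# Arrays read or written per element by each kernel.
ARRAYS_TOUCHED = {"Copy": 2, "Scale": 2, "Add": 3, "Triad": 3}

# "Min time" column (seconds) copied from the run above.
MIN_TIME = {
    "Copy": 0.044946,
    "Scale": 0.062283,
    "Add": 0.085204,
    "Triad": 0.084546,
}


def best_rate_mb_s(kernel: str) -> float:
    """Return bandwidth in MB/s (1 MB = 10**6 bytes) for one kernel."""
    bytes_moved = ARRAYS_TOUCHED[kernel] * BYTES_PER_ELEMENT * ARRAY_SIZE
    return 1.0e-6 * bytes_moved / MIN_TIME[kernel]


if __name__ == "__main__":
    for name in ("Copy", "Scale", "Add", "Triad"):
        print(f"{name + ':':<8}{best_rate_mb_s(name):>12.1f} MB/s")
```

Because the printed minimum times are rounded to six decimal places, the recomputed rates can differ from the reported ones in the last digit.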