---------------------------------- 1. ----------------------------------

rm -f *.exe *.o
gcc -O3 -c -o mysecond.o mysecond.c
gcc -O3 -c mysecond.c
gfortran -O3 -DSTREAM_ARRAY_SIZE=80000000 -DNTIMES=20 -c stream.f
gfortran -O3 stream.o mysecond.o -o stream_f.exe
gcc -O3 -DSTREAM_ARRAY_SIZE=80000000 -DNTIMES=20 stream.c -o stream_c.exe
gcc -O3 -DUNIX flops.c -o flops.exe
flops.c: In function ‘main’:
flops.c:231:4: warning: implicit declaration of function ‘dtime’ [-Wimplicit-function-declaration]
  231 |    dtime(TimeArray);
      |    ^~~~~
flops.c: At top level:
flops.c:723:1: warning: return type defaults to ‘int’ [-Wimplicit-int]
  723 | dtime(p)
      | ^~~~~
./stream_c.exe
-------------------------------------------------------------
STREAM version $Revision: 5.10 $
-------------------------------------------------------------
This system uses 8 bytes per array element.
-------------------------------------------------------------
Array size = 80000000 (elements), Offset = 0 (elements)
Memory per array = 610.4 MiB (= 0.6 GiB).
Total memory required = 1831.1 MiB (= 1.8 GiB).
Each kernel will be executed 20 times.
 The *best* time for each kernel (excluding the first iteration)
 will be used to compute the reported bandwidth.
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 79294 microseconds.
   (= 79294 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function    Best Rate MB/s  Avg time     Min time     Max time
Copy:           26720.6     0.057416     0.047903     0.098979
Scale:          17008.6     0.087616     0.075256     0.133899
Add:            19169.9     0.113818     0.100157     0.177676
Triad:          19144.9     0.111248     0.100288     0.170877
-------------------------------------------------------------
Solution Validates: avg error less than 1.000000e-13 on all three arrays
-------------------------------------------------------------
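
The STREAM figures above can be cross-checked by hand. The following is a minimal sketch, not part of the benchmark: it assumes STREAM's usual counting convention (Copy and Scale touch two arrays per iteration, Add and Triad touch three; 1 MB = 10^6 bytes) and takes the element size, array size, and "Min time" values from the log. The helper names `array_mib` and `best_rate_mb_s` are made up for illustration.

```python
# Values taken from the log above.
BYTES_PER_ELEMENT = 8
ARRAY_SIZE = 80_000_000

# Assumption: arrays read plus arrays written per kernel (STREAM convention).
ARRAYS_TOUCHED = {"Copy": 2, "Scale": 2, "Add": 3, "Triad": 3}

# "Min time" column from the log, in seconds (already rounded to 6 digits).
MIN_TIME = {"Copy": 0.047903, "Scale": 0.075256, "Add": 0.100157, "Triad": 0.100288}


def array_mib():
    """Size of one array in MiB (2**20 bytes)."""
    return BYTES_PER_ELEMENT * ARRAY_SIZE / 2**20


def best_rate_mb_s(kernel):
    """Bandwidth in MB/s (10**6 bytes/s) from the kernel's best time."""
    bytes_moved = ARRAYS_TOUCHED[kernel] * BYTES_PER_ELEMENT * ARRAY_SIZE
    return 1.0e-6 * bytes_moved / MIN_TIME[kernel]


if __name__ == "__main__":
    print(f"Memory per array: {array_mib():.1f} MiB")
    print(f"Total (3 arrays): {3 * array_mib():.1f} MiB")
    for kernel in ARRAYS_TOUCHED:
        print(f"{kernel + ':':<8}{best_rate_mb_s(kernel):10.1f} MB/s")
```

Because the logged times are rounded to six decimal places, the recomputed rates agree with the "Best Rate MB/s" column only to within about 0.1 MB/s, not digit for digit.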
./flops.exe

   FLOPS C Program (Double Precision), V2.0 18 Dec 1992

   Module     Error        RunTime      MFLOPS
                            (usec)
     1      4.0146e-13      0.0023   6183.8896
     2     -1.4166e-13      0.0006  11377.1999
     3      4.7184e-14      0.0034   5031.5222
     4     -1.2557e-13      0.0031   4841.2566
     5     -1.3800e-13      0.0056   5208.6586
     6      3.2380e-13      0.0053   5484.2426
     7     -8.4583e-11      0.0031   3832.8326
     8      3.4867e-13      0.0055   5410.0375

   Iterations      =  512000000
   NullTime (usec) =     0.0000
   MFLOPS(1)       =  8055.7366
   MFLOPS(2)       =  4732.2658
   MFLOPS(3)       =  5164.0037
   MFLOPS(4)       =  5257.0181