152 lines
5.4 KiB
Text
152 lines
5.4 KiB
Text
-------------------------------------------------------------------------
|
|
|
|
Revisions as of Thu, Jan 17, 2013 3:50:01 PM
|
|
|
|
Version 5.10 of stream.c has been released.
|
|
This version includes improved validation code and will automatically
|
|
use 64-bit array indices on 64-bit systems to allow very large arrays.
|
|
|
|
-------------------------------------------------------------------------
|
|
|
|
Revisions as of Thu Feb 19 08:16:57 CST 2009
|
|
|
|
Note that the codes in the "Versions" subdirectory should be
|
|
considered obsolete -- the versions of stream.c and stream.f
|
|
in this main directory include the OpenMP directives and structure
|
|
for creating "TUNED" versions.
|
|
|
|
Only the MPI version in the "Versions" subdirectory should be
|
|
of any interest, and I have not recently checked that version for
|
|
errors or compliance with the current versions of stream.c and
|
|
stream.f.
|
|
|
|
I added a simple Makefile to this directory. It works under Cygwin
|
|
on my Windows XP box (using gcc and g77).
|
|
|
|
A user suggested a sneaky trick for "mysecond.c" -- instead of using
|
|
the #ifdef UNDERSCORE to generate the function name that the Fortran
|
|
compiler expects, the new version simply defines both "mysecond()"
|
|
and "mysecond_()", so it should automagically link with most Fortran
|
|
compilers.
|
|
|
|
-------------------------------------------------------------------------
|
|
|
|
Revisions as of Wed Nov 17 09:15:37 CST 2004
|
|
|
|
The most recent "official" versions have been renamed "stream.f" and
|
|
"stream.c" -- all other versions have been moved to the "Versions"
|
|
subdirectory.
|
|
|
|
The "official" timer (was "second_wall.c") has been renamed "mysecond.c".
|
|
This is embedded in the C version ("stream.c"), but still needs to be
|
|
externally linked to the FORTRAN version ("stream.f").
|
|
|
|
-------------------------------------------------------------------------
|
|
|
|
Revisions as of Tue May 27 11:51:23 CDT 2003
|
|
|
|
Copyright and License info added to stream_d.f, stream_mpi.f, and
|
|
stream_tuned.f
|
|
|
|
|
|
-------------------------------------------------------------------------
|
|
|
|
Revisions as of Tue Apr 8 10:26:48 CDT 2003
|
|
|
|
I changed the name of the timer interface from "second" to "mysecond"
|
|
and removed the dummy argument in all versions of the source code (but
|
|
not the "Contrib" versions).
|
|
|
|
|
|
-------------------------------------------------------------------------
|
|
|
|
Revisions as of Mon Feb 25 06:48:14 CST 2002
|
|
|
|
Added an OpenMP version of stream_d.c, called stream_d_omp.c. This is
|
|
still not up to date with the Fortran version, which includes error
|
|
checking and advanced data flow to prevent overoptimization, but it is
|
|
a good start....
|
|
|
|
|
|
-------------------------------------------------------------------------
|
|
|
|
Revisions as of Tue Jun 4 16:31:31 EDT 1996
|
|
|
|
I have fixed an "off-by-one" error in the RMS time calculation in
|
|
stream_d.f. This was already corrected in stream_d.c. No results are
|
|
invalidated, since I use minimum time instead of RMS time anyway....
|
|
|
|
-------------------------------------------------------------------------
|
|
|
|
Revisions as of Fri Dec 8 14:49:56 EST 1995
|
|
|
|
I have renamed the timer routines to:
|
|
second_cpu.c
|
|
second_wall.c
|
|
second_cpu.f
|
|
|
|
All have a function interface named 'second' which returns a double
|
|
precision floating point number. It should be possible to link
|
|
second_wall.c with stream_d.f without too much trouble, though the
|
|
details will depend on your environment.
|
|
|
|
If anyone builds versions of these timers for machines running the
|
|
Macintosh O/S or DOS/Windows, I would appreciate getting a copy.
|
|
|
|
To clarify:
|
|
* For single-user machines, the wallclock timer is preferred.
|
|
* For parallel machines, the wallclock timer is required.
|
|
* For time-shared systems, the cpu timer is more reliable,
|
|
though less accurate.
|
|
|
|
|
|
-------------------------------------------------------------------------
|
|
|
|
Revisions as of Wed Oct 25 09:40:32 EDT 1995
|
|
|
|
(1) NOTICE to C users:
|
|
|
|
stream_d.c has been updated to version 4.0 (beta), and
|
|
should be functionally identical to stream_d.f
|
|
|
|
Two timers are provided --- second_cpu.c and second_wall.c
|
|
second_cpu.c measures cpu time, while second_wall.c measures
|
|
elapsed (real) time.
|
|
|
|
For single-user machines, the wallclock timer is preferred.
|
|
For parallel machines, the wallclock timer is required.
|
|
For time-shared systems, the cpu timer is more reliable,
|
|
though less accurate.
|
|
|
|
(2) cstream.c has been removed -- use stream_d.c
|
|
|
|
(3) stream_wall.f has been removed --- to do parallel aggregate
|
|
bandwidth runs, comment out the definition of FUNCTION SECOND
|
|
in stream_d.f and compile/link with second_wall.c
|
|
|
|
(4) stream_offset has been deprecated. It is still here
|
|
and usable, but stream_d.f is the "standard" version.
|
|
There are easy hooks in stream_d.f to change the
|
|
array offsets if you want to.
|
|
|
|
(5) The rules of the game are clarified as follows:
|
|
|
|
The reference case uses array sizes of 2,000,000 elements
|
|
and no additional offsets. I would like to see results
|
|
for this case.
|
|
|
|
But, you are free to use any array size and any offset
|
|
you want, provided that the arrays are each bigger than
|
|
the last-level of cache. The output will show me what
|
|
parameters you chose.
|
|
|
|
I expect that I will report just the best number, but
|
|
if there is a serious discrepancy between the reference
|
|
case and the "best" case, I reserve the right to report
|
|
both.
|
|
|
|
Of course, I also reserve the right to reject any results
|
|
that I do not trust....
|
|
--
|
|
John D. McCalpin, Ph.D.
|
|
john@mccalpin.com
|