- $T_1$ - system time of process $A$ on one processor
- $T_p$ - system time of process $A$ on $p$ processors

$$ S(p) = \frac{T_1}{T_p} \qquad (3.1) $$

A speedup of $S(p) = p$ is desirable, i.e., $p$ processes are $p$ times faster than one.
The size of the problem that has to be solved remains constant, i.e.,
the more processors are used, the less each individual processor has
to do.
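The definition (3.1) amounts to a simple ratio of run times. A minimal sketch in Python, with purely hypothetical timing values:

```python
def speedup(t1, tp):
    """Speedup S(p) = T_1 / T_p, Eq. (3.1)."""
    return t1 / tp

# Hypothetical timings: 100 s on one processor, 12.5 s on p = 10 processors.
t1, tp, p = 100.0, 12.5, 10
print(speedup(t1, tp))  # 8.0 -- below the desirable ideal S(p) = p = 10
```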
Ex.: Speedup-Worker
One worker digs a pit of a given size in one hour.
How long does it take 1000 workers to perform the same work?
The workers obstruct each other, and too high an organization
overhead is necessary
$\Longrightarrow$ very little speedup.
Remark: The above theorem was stated as a remark by Gene Amdahl in 1967.
Interpretation of AMDAHL's Theorem
(very old advertisement of IBM)

1% sequential part in the algorithm:

  p      |   10    100    1000    10000
  S(p)   |    9     50      91       99
The use of more than 100 processors appears senseless, i.e.,
the wages of 1000 workers are 10 times higher than
the wages of 100 workers, but
the work done did not even double.
Discouraging!
End of lecture?
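The numbers in the table above follow from the standard textbook form of Amdahl's law, $S(p) = 1/(s + (1-s)/p)$ with sequential fraction $s$ (the formula itself is assumed here, not quoted from this section). A small Python check:

```python
def amdahl_speedup(s, p):
    """Amdahl's law: S(p) = 1 / (s + (1 - s)/p), s = sequential fraction."""
    return 1.0 / (s + (1.0 - s) / p)

s = 0.01  # 1% sequential part
for p in (10, 100, 1000, 10000):
    print(p, round(amdahl_speedup(s, p)))  # prints 9, 50, 91, 99 in turn
# For p -> infinity the speedup saturates at 1/s = 100.
```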
Is AMDAHL's Theorem suitable for us ?
Certain algorithms achieve even a speedup higher than $p$
on smaller processor numbers (superlinear speedup)!
This is caused by a more intensive use of the cache memory on the
processors when the size of the local problem gets smaller.
AMDAHL's Theorem does not take into account any time for
communication, so that the real behavior of the speedup is even worse,
see Fig. 3.16.
Figure 3.16:
Several Speedups
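The degrading effect of communication time can be illustrated with a toy model; the overhead term $c \log_2 p$ below is an assumption chosen purely for illustration, not a model from the lecture:

```python
import math

def amdahl(s, p):
    # Pure Amdahl speedup for sequential fraction s (no communication).
    return 1.0 / (s + (1.0 - s) / p)

def speedup_with_comm(s, p, c):
    # Illustrative assumption: communication adds a term c*log2(p)
    # to the normalized parallel run time.
    return 1.0 / (s + (1.0 - s) / p + c * math.log2(p))

s, c = 0.01, 0.002
for p in (10, 100, 1000, 10000):
    print(p, round(amdahl(s, p), 1), round(speedup_with_comm(s, p, c), 1))
# The curve with communication stays below Amdahl's bound and, for large p,
# even decreases again -- consistent with the remark that the real
# speedup behavior is worse than Amdahl's prediction.
```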
Consequence: The attainable speedup depends on the problem size.
Ex.:
If each worker digs a separate pit of the same size, then 1000 workers
reach nearly a 1000 times higher performance during that hour. They all
work on one huge project.
A scaling of the problem size with the number of processors
is a way out to overcome AMDAHL's Theorem.
Scaleup: Which increase in problem size is obtained on the parallel
computer, in comparison to the sequential case, with comparable
computing time?
Scaled Speedup:
Under the assumption that the parallel part of the algorithm is optimal
with respect to the problem size (i.e., the parallel fraction scales
perfectly with the number of processors) and introducing the notations
- $T_p$ - system time on a parallel computer
- $T_1$ - system time on a sequential computer
we can define

$$ S_s(p) = \frac{T_1}{T_p} = s + (1 - s)\,p , $$

where $s$ denotes the sequential fraction of the algorithm.
Interpretation: 1% sequential part $\Longrightarrow$ $S_s(p) \approx 0.99\,p$,
because the share of the serial part decreases with the problem size.
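Assuming the standard Gustafson form of the scaled speedup, $S_s(p) = s + (1-s)\,p$ (a textbook formula, used here as a sketch), the 1% example works out as:

```python
def scaled_speedup(s, p):
    """Scaled speedup (Gustafson): S_s(p) = s + (1 - s) * p."""
    return s + (1.0 - s) * p

s = 0.01  # 1% sequential part
for p in (10, 100, 1000):
    print(p, round(scaled_speedup(s, p), 2))
# Grows nearly linearly, ~0.99*p, instead of saturating at 1/s = 100
# as the unscaled Amdahl speedup does.
```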
This theoretical forecast is confirmed in practice.
Remarks on Speedup/Scaleup
(Scaled) Speedup may not be the only criterion for the
usefulness of an algorithm; e.g., one can see in Fig. 3.17 that
$S$(Gauß elimination) $>$ $S$(pcg)  (pcg = preconditioned cg)
BUT $T$(Gauß elimination) $>$ $T$(pcg) !!
If one compares algorithms implemented on
different parallel machines, the costs of those machines should also
be taken into account, e.g., in cents per unknown variable.
Figure 3.17:
system times for Gauß, cg, pcg [Dr. Pester, Chemnitz]