
1.1 Why should one use parallel computers?

Micro-electronics/Technology
$\downarrow$
Ever higher chip complexity.
Ever faster clock frequencies.
$\downarrow$
Today every PC exceeds the performance of former mainframes!

But:

The signal speed is finite ($3\cdot 10^8 m/s$). Therefore, in 1994 the individual processor units of the fastest Cray were arranged in a circle, so that messages could be transmitted within one clock cycle (cycle time $\approx 1\cdot 10^{-9} s$). Shrinking the chip structures further, from currently approx. $0.25 \mu m \;=\;2.5\cdot 10^{-7}m$, is a technological problem (reproduction of chip masks). Besides quantum effects, the atomic size ($\approx 10^{-10} m$) will be the ultimate lower limit.
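As a rough check of the scales quoted above (a back-of-the-envelope estimate that simply multiplies the two numbers given in the text; the symbols $d$, $c$ and $t$ are notation introduced here), the distance a signal can cover within one clock cycle is

$d = c \cdot t \approx 3\cdot 10^8 \, m/s \;\cdot\; 1\cdot 10^{-9} \, s = 0.3 \, m ,$

so processor units that exchange messages within one cycle have to sit within a few tens of centimetres of each other. In the same spirit, a structure of $2.5\cdot 10^{-7} m$ spans only about $2.5\cdot 10^{-7} m \,/\, 10^{-10} m = 2500$ atomic diameters.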

Current arithmetic performance (as of III/98) of processors from INTEL, SUN, SGI, HP, and DEC:

Processor (Company)	SpecInt95	SpecFloat95
Pentium-II/400 (INTEL)	16	13
Ultra60/130 (SUN)	16	23
Octane 250 (SGI)	15	24
HP9000/580 (HP)	17	28
Alpha4100 5/600 (DEC)	19	29

The development of very fast computer systems, i.e., processor + memory + I/O, becomes ever more difficult and expensive. With the necessary chip complexity, the ''chance'' of production or design errors inevitably rises as well (incorrect division on Intel processors, overheating of processors at SGI).

At the same time, the demands of science and engineering grow faster than the available performance.

As soon as the required arithmetic performance is available, the user (mathematician, physicist, engineer) changes a parameter such as $\varepsilon$ in order to be dissatisfied with the current computer once again.

$\Downarrow$
high arithmetic performance
high storage capacity
fast data access and analysis (graphics)

Some typical consumers of arithmetic performance :

Physics	$\longrightarrow$	Reentry into the terrestrial atmosphere
	$\Longrightarrow$	Boltzmann equations with 7 degrees of freedom
Chemistry	$\longrightarrow$	Combustion (in engines, etc.)
	$\Longrightarrow$	Huge systems ( $200 \ldots 100\,000$ ) of ordinary differential equations
Meteorology	$\longrightarrow$	Global weather forecast / climate modelling
	$\Longrightarrow$	# observation points $\approx$ forecast duration
Mechanics	$\longrightarrow$	Simulation of crash tests, elastic-plastic deformations
	$\Longrightarrow$	Huge (non-linear) system of equations in each time step
CFD	$\longrightarrow$	Wind tunnel simulation, design, turbulent fluids
	$\Longrightarrow$	coupled systems of (non-linear) non-symmetric differential equations in 3D

The examples above make no claim to completeness; similarly challenging examples can be found in almost every technical field of application. Once the arithmetic performance is sufficient to solve the direct problem, one immediately wants to solve the corresponding inverse problem or to optimize certain parameters with respect to an objective.

The classical VON-NEUMANN architecture is not sufficient.
$\downarrow$
Acceleration by means of parallel processing.

Parallel approaches in VON-NEUMANN arithmetic units

I/O, floating-point, and integer arithmetic can be done in parallel.

Instruction Look Ahead (cache of instructions).
Memory interleaving, i.e., separate access to neighboring bytes/bits; e.g., each memory chip stores only one bit of a byte.
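The bit-wise variant of memory interleaving mentioned above can be illustrated by a small sketch. It only models the data layout: chip $k$ holds bit $k$ of every byte, so the eight bits of one byte come from eight different chips. The names `NUM_CHIPS`, `make_memory`, `store_byte` and `load_byte` are hypothetical and chosen for illustration only; real hardware performs the eight accesses simultaneously, which the sequential loop here cannot show.

```python
# Sketch of bit-level memory interleaving (data layout only).
# All names are hypothetical; chip k stores bit k of every byte.

NUM_CHIPS = 8  # one memory chip per bit of a byte


def make_memory(num_bytes):
    """Return NUM_CHIPS bit arrays, each with one cell per byte address."""
    return [[0] * num_bytes for _ in range(NUM_CHIPS)]


def store_byte(chips, address, value):
    """Scatter the 8 bits of value across the chips at the same address."""
    for k in range(NUM_CHIPS):
        chips[k][address] = (value >> k) & 1


def load_byte(chips, address):
    """Gather one bit from every chip and reassemble the byte."""
    value = 0
    for k in range(NUM_CHIPS):
        value |= chips[k][address] << k
    return value


chips = make_memory(4)
store_byte(chips, 2, 0b10110001)
print(load_byte(chips, 2))                        # the stored byte is recovered
print([chips[k][2] for k in range(NUM_CHIPS)])    # bit held by each chip, lowest bit first
```

Since every chip delivers only one bit per access, no single chip limits the transfer of a whole byte.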

Pipelining in instruction scheduling

[Figure: pipelining table over the clock cycles $i$, $i+1$, ...; the last row is "store result".]

In current processors, this pipelining is used in combination with Instruction Look Ahead strategies.
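The effect of pipelining can be sketched with an idealized model. The assumptions are: every stage takes exactly one clock cycle, and there are no stalls or dependencies. The stage names `fetch`, `decode` and `execute` are also assumptions; only "store result" appears in the text above. In this model, $n$ instructions on a $k$-stage pipeline need $k + n - 1$ clocks instead of $n \cdot k$.

```python
# Idealized pipeline model: one clock per stage, no stalls.
# Stage names other than "store result" are assumptions for illustration.

STAGES = ["fetch", "decode", "execute", "store result"]


def cycles_sequential(num_instructions, num_stages=len(STAGES)):
    """Each instruction passes through all stages before the next one starts."""
    return num_instructions * num_stages


def cycles_pipelined(num_instructions, num_stages=len(STAGES)):
    """Once the pipeline is filled, one instruction completes per clock."""
    if num_instructions == 0:
        return 0
    return num_stages + num_instructions - 1


def schedule(num_instructions):
    """Map each clock cycle to the (instruction, stage) pairs active in it."""
    table = {}
    for instr in range(num_instructions):
        for offset, stage in enumerate(STAGES):
            table.setdefault(instr + offset, []).append((instr, stage))
    return table


for clock, active in sorted(schedule(3).items()):
    print(clock, active)

print(cycles_sequential(100), cycles_pipelined(100))
```

For long instruction streams the ratio of the two counts approaches the number of stages, which is the best speed-up this idealized model allows.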

Further increase in performance

A further increase in performance is possible essentially only by use of multiple CPUs.

Theorem of H. Grosch (1953): The costs of a computer system rise only like the square root of its performance.

The above statement is valid only from a technological point of view: the vendors have to distribute the costs for research and development, so that the most recent CPU is quite expensive. On the other hand, computer prices have remained constant for years while the computing power they deliver increases continuously.
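Written as a formula, the theorem above reads as follows. The symbols $C$ for the costs, $P$ for the performance and the constant $k$ are notation introduced here for illustration, not taken from the source.

$C(P) = k \sqrt{P} \quad\Longrightarrow\quad \frac{C(4P)}{C(P)} = \frac{k\sqrt{4P}}{k\sqrt{P}} = 2 ,$

i.e., under this law a system with four times the performance would cost only twice as much.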


Gundolf Haase 2000-03-20