1.1 Why should one use parallel computers?
Micro-electronics/Technology
- Ever higher complexity of the chips.
- Ever faster clock frequencies.
- Every PC today exceeds the performance of former mainframes!
But:
- The signal speed is finite.
  Therefore, in 1994 the individual processor units of the fastest Cray were arranged in a circle so that messages could be transmitted within one clock cycle.
- A further reduction of the chip structure sizes beyond their present scale is a technological problem (reproduction of chip masks).
- Besides quantum effects, the atomic size will be the ultimate lower limit.
Today's status of the arithmetic performance (III/98) with INTEL, SUN, SGI, HP, DEC:
| Processor (Company) | SpecInt95 | SpecFloat95 |
| Pentium-II/400 (INTEL) | 16 | 13 |
| Ultra60/130 (SUN) | 16 | 23 |
| Octane 250 (SGI) | 15 | 24 |
| HP9000/580 (HP) | 17 | 28 |
| Alpha4100 5/600 (DEC) | 19 | 29 |
The development of very fast computer systems, i.e., processor + memory + I/O, becomes ever more difficult and expensive.
With the necessary complexity of the chips, the ''chance'' of production or design errors inevitably rises as well (incorrect division at Intel, overheating of processors at SGI).
At the same time, the demands of science and engineering rise faster than the available performance.
As soon as a required arithmetic performance is delivered, the user {mathematician, physicist, engineer} modifies one or another parameter of the problem, so as to be dissatisfied once again with the current computer.
Required are:
- high arithmetic performance
- high storage capacity
- fast data access and analysis (graphics)
Some typical consumers of arithmetic performance:
| Physics | Reentry into the terrestrial atmosphere |
| | Boltzmann equations with 7 degrees of freedom |
| Chemistry | Combustion (in engines, etc.) |
| | Huge systems of ordinary differential equations |
| Meteorology | Global weather forecast / climate modelling |
| | # observation points, forecast duration |
| Mechanics | Simulation of crash tests, elastic-plastic deformations |
| | Huge (non-linear) system of equations in each time step |
| CFD | Wind tunnel simulation, design, turbulent fluids |
| | Coupled systems of (non-linear) non-symmetric differential equations in 3D |
The examples above do not claim to be complete; similarly challenging examples can be found in almost every technical area of application.
Once the arithmetic performance suffices to solve the direct problem, one immediately wants to solve the corresponding inverse problem or to optimize certain parameters with respect to an objective.
The classical VON-NEUMANN architecture is not sufficient.
Acceleration by means of parallel processing.
- Parallel approaches in VON-NEUMANN arithmetic units
  - I/O, floating-point, and integer arithmetic can be done in parallel.
  - Instruction Look Ahead (cache of instructions).
  - Memory Interleaving, i.e., neighboring bytes/bits are accessed separately, e.g., only one bit of a byte is stored per memory chip.
- Pipelining in instruction scheduling
  In current processors, pipelining is used in combination with Instruction Look Ahead strategies.
- Further increase in performance
  A further increase in performance is possible essentially only by using multiple CPUs.
The above statement is valid only from a technological point of view: the vendors have to spread the costs of research and development, so the most recent CPU is quite expensive.
On the other hand, computer prices have been constant for years while the computing power they deliver increases continuously.
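Two of the intra-processor techniques listed above, memory interleaving and pipelining, can be illustrated with a toy model. The sketch below rests on stated assumptions: it uses address-wise (low-order) interleaving across memory banks rather than the bit-per-chip scheme mentioned in the text, and it models an ideal pipeline without stalls. The function names, the number of banks, and the number of stages are hypothetical and chosen only for the example.

```python
def bank_of(address: int, num_banks: int) -> int:
    """Low-order interleaving: consecutive addresses land in consecutive banks."""
    return address % num_banks


def sequential_cycles(num_instructions: int, num_stages: int) -> int:
    """Without pipelining, every instruction passes all stages before the next starts."""
    return num_instructions * num_stages


def pipelined_cycles(num_instructions: int, num_stages: int) -> int:
    """Ideal pipeline: fill it once, then one instruction completes per cycle."""
    if num_instructions == 0:
        return 0
    return num_stages + num_instructions - 1


# Hypothetical configuration, for illustration only.
NUM_BANKS = 4
NUM_STAGES = 5
NUM_INSTRUCTIONS = 100

# Neighboring addresses hit different banks, so they can be fetched in parallel.
print([bank_of(address, NUM_BANKS) for address in range(8)])

seq = sequential_cycles(NUM_INSTRUCTIONS, NUM_STAGES)
pipe = pipelined_cycles(NUM_INSTRUCTIONS, NUM_STAGES)
print(f"sequential: {seq} cycles, pipelined: {pipe} cycles, speedup: {seq / pipe:.2f}")
```

In this idealized model the speedup approaches the number of pipeline stages for long instruction streams but can never exceed it, which is consistent with the remark that further gains essentially require multiple CPUs.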
Gundolf Haase
2000-03-20