next up previous contents
Next: 2.1.4 Classification by memory Up: 2.1 Classifications Previous: 2.1.2 Classification by hardware   Contents


2.1.3 Classification by Flynn

In 1966, Michael Flynn [Fly66] differentiated parallel computer architectures with respect to data stream and instruction stream.

Table 2.1: Classification by Flynn (Flynn's taxonomy)
Single Multiple  
Instruction Stream  
SISD MISD Single Data
SIMD MIMD Multiple Stream


SISD :
Classical architecture

SIMD :
Parallelism at instruction level
  1. Data are distributed on the processes and one program will be executed in equal steps. Often quite simple processors are used but in a large amount (processor arrays), e.g.: CM2, MasPar.
    A typical problem on those computers consists in the efficient handling of alternatives in the code :

    \begin{figure}\begin{center}
\unitlength0.05\textwidth
\begin{picture}(8,6)
\...
...b]{yes}} \put(6,4.2){\makebox(0,0)[b]{no}}
\end{picture}\end{center}\end{figure}

    \begin{figure}[H]
\begin{minipage}[t]{0.45\textwidth}
If the test ''?'' result...
...} \put(9.2,0.5){\makebox(0,0)[l]{t}}
\end{picture}
\end{minipage} \end{figure}
  2. In difference to a sequential computer, a vector unit uses pipelining such that the same sequence of instructions will be repeated on $ n$ data sets. A typical operation is the SAXPY-routine $  \underline{y} := \underline{y} + {\alpha} \underline{x} $, see Fig. 2.1 where a vector  $ \underline{x}$ will be scaled by a scalar and added to a  $ \underline{y}$. Here, the pipelining consists in loading operands, multiplication, addition and storing of the result. All these operations will be executed on special vector registers. After a startup phase, the vector unit produces one component of the result per clock although the sequential/scalar execution would require much more clocks.

    Figure 2.1: SAXPY in a vector unit
    \begin{figure}\mbox{}\hfill
\unitlength0.026\textwidth
\begin{picture}(11,22)(0...
...t(7.2,2){\makebox(0,0){$\underline{y}$}}
\end{picture}\hfill\mbox{}
\end{figure}


    For achieving the full performance of the vector unit all vector registers have to be filled all the time. If this feeding of the vector registers stops then the vector unit works just as a scalar unit and therefore the performance drops by a factor of 50-200. The additional vector  $ \underline{\alpha}$ appearing in Fig. 2.1 is just a filling of the vector pipe with the constant scalar $ \alpha$. This will be done automatically by the compiler to handle $  \underline{y} := \underline{y} + \underline{\alpha}^T \circledast \underline{x} $ efficiently.

    From the point of a mathematician, writing a code for a vector unit is ''only'' sequential programming without any parallelism inside.


MIMD :
This is parallelism at the level of program execution, i.e., each processor runs its own code. Usually, those codes do not run fully independent so that we have to distinguish between :
  1. competitive processes (shared resources)
  2. communicating processes (data stream, data exchange)
Differentiate between
Program (static) :
Exact notation of an algorithm.

Process (dynamic) :
Sequence of actions with only one assumption: a positive program speed


\unitlength0.03\textwidth
\mbox{}\hfill
%% begin\{picture\}(5,12)
% Box [-1,1]x[...
...(Theorem of conservation of difficulties)
\end{minipage}\hfill\mbox{}\ [1ex]
%



Quite often, each process runs the same code on different data sets.
$ \Longrightarrow$ Class SPMD is a sub-class of MIMD

SPMD :
Single Program Multiple Data


next up previous contents
Next: 2.1.4 Classification by memory Up: 2.1 Classifications Previous: 2.1.2 Classification by hardware   Contents
Gundolf Haase 2000-03-20