Next: 5.3 -Jacobi iteration Up: 5.2 GMRES solver Previous: 5.2.1 Sequential algorithm   Contents


5.2.2 Parallel GMRES

In algorithm 5.3 we have to distinguish not only between the functionals $ \underline{f}$, $ \underline{r}$ and the state variables $ \underline{u}$, $ \underline{w}^k$, but we also have to handle the vectors  $ \{z_i\}_{i=\overline{1,k+1}}$, $ \{s_i\}_{i=\overline{1,k+1}}$, $ \{c_i\}_{i=\overline{1,k+1}}$ and the matrix  $ \{h_{i,j}\}_{i,j=\overline{1,k}}$, which do not fit into that scheme. Therefore, we will treat them just like scalar values in the parallelization.

Our strategy for the parallelization of GMRES without preconditioning ($ C=I$) is the following.

•
$ \underline{f}$, $ \underline{r}$ $ \rightarrow$ $ \underline{{\ensuremath{\color{green}{\sf f}}}}$, $ \underline{{\ensuremath{\color{green}{\sf r}}}}$ are stored in distributed form.
•
$ \underline{u}$, $ \underline{w}^k$ $ \rightarrow$ $ \underline{{\ensuremath{\color{red}\mathfrak{u}}}}$, $ \underline{{\ensuremath{\color{red}\mathfrak{w}}}}^k$ are stored in accumulated form.
•
Store the scalar values $ \{z_i\}_{i=\overline{1,k+1}}$, $ \{s_i\}_{i=\overline{1,k+1}}$, $ \{c_i\}_{i=\overline{1,k+1}}$ and $ \{h_{i,j}\}_{i,j=\overline{1,k}}$ redundantly on each processor.
$ \Longrightarrow$
Inner products require only a minimum of communication (4.7); the matrix-times-vector operation requires no communication at all (4.8).
!!
The assignment $ \underline{{\ensuremath{\color{red}\mathfrak{w}}}} := \underline{{\ensuremath{\color{green}{\sf r}}}}$ involves one change of vector type via accumulation (4.5). Here we need communication between all processes sharing the corresponding data.
•
All operations on the scalar values $ c_i$, $ s_i$, $ z_i$ and $ h_{i,j}$ are performed locally, and thus redundantly, on each processor.
!!
Nearly all DAXPY operations combine the same vector types (and scalar values), with the exception of the operation

$\displaystyle \underline{{\ensuremath{\color{green}{\sf r}}}} \;:=\; \underline{{\ensuremath{\color{green}{\sf r}}}} \,-\, h_{i,k} \cdot \underline{{\ensuremath{\color{red}\mathfrak{w}}}}^i \enspace , $

which combines different vector types. Changing an accumulated vector  $ \underline{{\ensuremath{\color{red}\mathfrak{w}}}}^i$ into a distributed one can be done without any communication (4.6)!
$ \Longrightarrow$
No DAXPY-operation requires communication.
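The storage strategy above can be sketched in a few lines. The following toy example (my own construction, not from the text: two subdomains sharing one interface node, illustrative variable names) simulates the two vector types and the two key operations: an inner product between a distributed and an accumulated vector, which needs only one ALL_REDUCE on a single number, and the communication-free conversion of an accumulated vector into a distributed one by dividing shared entries by their multiplicity — one common convention for such a splitting.

```python
import numpy as np

# Two subdomains; global nodes 0,1,2; node 1 lies on the interface.
loc2glob = [np.array([0, 1]), np.array([1, 2])]  # local-to-global maps
mult = np.array([1, 2, 1])  # multiplicity: #subdomains sharing each node

# Accumulated vector w: each subdomain holds the true global values.
w_glob = np.array([3.0, 5.0, 7.0])
w_acc = [w_glob[m] for m in loc2glob]

# Distributed vector r: the subdomain contributions sum to the global values
# (splitting shared entries by multiplicity is one admissible choice).
r_glob = np.array([2.0, 4.0, 6.0])
r_dist = [r_glob[m] / mult[m] for m in loc2glob]

# Inner product (r, w): purely local dot products, then one ALL_REDUCE
# on a single real number (simulated here by sum()).
local_dots = [r_dist[p] @ w_acc[p] for p in range(2)]
dot_par = sum(local_dots)          # equals the sequential r_glob @ w_glob

# Accumulated -> distributed conversion: divide each entry by its node's
# multiplicity -- entirely local, no communication at all.
w_dist = [w_acc[p] / mult[loc2glob[p]] for p in range(2)]
```

Accumulating `w_dist` back (adding the contributions of all subdomains sharing a node) recovers `w_glob`, which is exactly the communication step hidden in the accumulation (4.5).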

[Algorithm 5.4: Parallel GMRES]
The $ k^{\text{th}}$ iteration of algorithm 5.4 requires $ (k+1)$ ALL_REDUCE operations on one real number each, and one vector accumulation. Due to the change of type of vector  $ \underline{{\ensuremath{\color{red}\mathfrak{w}}}}^i$ we have to perform $ k\cdot n$ additional multiplications ($ n$ denotes the length of vector  $ \underline{{\ensuremath{\color{red}\mathfrak{w}}}}$).

The parallelization of GMRES(m), which restarts after $ m$ iterations, can be done in the same way as above.
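For reference, the sequential skeleton being parallelized can be sketched as follows. This is a minimal GMRES(m) with restart and without preconditioning ($C=I$); it is my own illustrative sketch, not the algorithm of the text: the small least-squares problem is solved with `numpy.linalg.lstsq` instead of Givens rotations ($c_i$, $s_i$), and all names are assumptions.

```python
import numpy as np

def gmres_restart(A, f, u0, m=5, tol=1e-10, max_restarts=50):
    """GMRES(m): restart after m Arnoldi steps; no preconditioning."""
    u = u0.copy()
    n = len(f)
    for _ in range(max_restarts):
        r = f - A @ u
        beta = np.linalg.norm(r)
        if beta < tol:
            return u
        W = np.zeros((n, m + 1))          # orthonormal Krylov basis w^1..w^{k+1}
        H = np.zeros((m + 1, m))          # Hessenberg matrix h_{i,j}
        W[:, 0] = r / beta
        k = m
        for j in range(m):
            w = A @ W[:, j]               # matrix-times-vector
            for i in range(j + 1):
                H[i, j] = W[:, i] @ w     # j+1 inner products in step j
                w -= H[i, j] * W[:, i]    # the DAXPY  r := r - h_{i,k} w^i
            H[j + 1, j] = np.linalg.norm(w)
            if H[j + 1, j] < 1e-14:       # lucky breakdown: exact solve below
                k = j + 1
                break
            W[:, j + 1] = w / H[j + 1, j]
        # minimize || beta*e1 - H y || over the Krylov space
        e1 = np.zeros(k + 1)
        e1[0] = beta
        y, *_ = np.linalg.lstsq(H[:k + 1, :k], e1, rcond=None)
        u += W[:, :k] @ y                 # update after each restart cycle
    return u
```

In the parallel version, only the marked lines touch other processors: each inner product costs one ALL_REDUCE on a single real number, the initial residual needs one accumulation, and everything operating on $H$, $y$ and the small least-squares problem runs redundantly on every processor.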


Gundolf Haase 2000-03-20