Next: 5.3 -Jacobi iteration Up: 5.2 GMRES solver Previous: 5.2.1 Sequential algorithm   Contents


5.2.2 Parallel GMRES

In algorithm 5.3 we have to distinguish not only between the functionals $ \underline{f}$, $ \underline{r}$ and the state variables $ \underline{u}$, $ \underline{w}^k$, but we also have to handle the vectors  $ \{z_i\}_{i=\overline{1,k+1}}$, $ \{s_i\}_{i=\overline{1,k+1}}$, $ \{c_i\}_{i=\overline{1,k+1}}$ and the matrix  $ \{h_{i,j}\}_{i,j=\overline{1,k}}$, which do not fit into that scheme. Therefore, we will treat them just like scalar values in the parallelization.

Our strategy for the parallelization of GMRES without preconditioning ($ C=I$) is the following.

•
$ \underline{f}$, $ \underline{r}$ $ \rightarrow$ $ \underline{{\ensuremath{\color{green}{\sf f}}}}$, $ \underline{{\ensuremath{\color{green}{\sf r}}}}$ are stored in distributed form.
•
$ \underline{u}$, $ \underline{w}^k$ $ \rightarrow$ $ \underline{{\ensuremath{\color{red}\mathfrak{u}}}}$, $ \underline{{\ensuremath{\color{red}\mathfrak{w}}}}^k$ are stored in accumulated form.
•
Store the scalar values $ \{z_i\}_{i=\overline{1,k+1}}$, $ \{s_i\}_{i=\overline{1,k+1}}$, $ \{c_i\}_{i=\overline{1,k+1}}$ and $ \{h_{i,j}\}_{i,j=\overline{1,k}}$ redundantly on each processor.
$ \Longrightarrow$
Inner products require only a minimum of communication (4.7); the matrix-times-vector operation requires no communication at all (4.8).
!!
The assignment $ \underline{{\ensuremath{\color{red}\mathfrak{w}}}} := \underline{{\ensuremath{\color{green}{\sf r}}}}$ involves one change of vector type via accumulation (4.5). Here we need communication between all processes sharing the corresponding data.
•
All operations on the scalar values $ c_i$, $ s_i$, $ z_i$ and $ h_{i,j}$ are performed locally, and thus redundantly, on each processor.
!!
Nearly all DAXPY operations combine the same vector types (and scalar values), with the exception of the operation

$\displaystyle \underline{{\ensuremath{\color{green}{\sf r}}}} \;:=\; \underline{{\ensuremath{\color{green}{\sf r}}}} \,-\, h_{i,k} \cdot \underline{{\ensuremath{\color{red}\mathfrak{w}}}}^i \enspace , $

which combines different vector types. Changing an accumulated vector  $ \underline{{\ensuremath{\color{red}\mathfrak{w}}}}^i$ into a distributed one can be done without any communication (4.6)!
$ \Longrightarrow$
No DAXPY-operation requires communication.
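The storage strategy above can be sketched in a few lines. The following toy example (my own construction, not from the text: two subdomains sharing one interface node, illustrative variable names) simulates the two vector types and the two key operations: an inner product between a distributed and an accumulated vector, which needs only one ALL_REDUCE on a single number, and the communication-free conversion of an accumulated vector into a distributed one by dividing shared entries by their multiplicity — one common convention for such a splitting.

```python
import numpy as np

# Two subdomains; global nodes 0,1,2; node 1 lies on the interface.
loc2glob = [np.array([0, 1]), np.array([1, 2])]  # local-to-global maps
mult = np.array([1, 2, 1])  # multiplicity: #subdomains sharing each node

# Accumulated vector w: each subdomain holds the true global values.
w_glob = np.array([3.0, 5.0, 7.0])
w_acc = [w_glob[m] for m in loc2glob]

# Distributed vector r: the subdomain contributions sum to the global values
# (splitting shared entries by multiplicity is one admissible choice).
r_glob = np.array([2.0, 4.0, 6.0])
r_dist = [r_glob[m] / mult[m] for m in loc2glob]

# Inner product (r, w): purely local dot products, then one ALL_REDUCE
# on a single real number (simulated here by sum()).
local_dots = [r_dist[p] @ w_acc[p] for p in range(2)]
dot_par = sum(local_dots)          # equals the sequential r_glob @ w_glob

# Accumulated -> distributed conversion: divide each entry by its node's
# multiplicity -- entirely local, no communication at all.
w_dist = [w_acc[p] / mult[loc2glob[p]] for p in range(2)]
```

Accumulating `w_dist` back (adding the contributions of all subdomains sharing a node) recovers `w_glob`, which is exactly the communication step hidden in the accumulation (4.5).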

[Algorithm 5.4: Parallel GMRES]
The $ k^{\text{th}}$ iteration of algorithm 5.4 requires $ (k+1)$ ALL_REDUCE operations on one real number each, and one vector accumulation. Due to the change of type of vector  $ \underline{{\ensuremath{\color{red}\mathfrak{w}}}}^i$ we have to perform $ k\cdot n$ additional multiplications ($ n$ denotes the length of vector  $ \underline{{\ensuremath{\color{red}\mathfrak{w}}}}$).

The parallelization of GMRES(m), which restarts after $ m$ iterations, can be done in the same way as above.
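For reference, the sequential skeleton being parallelized can be sketched as follows. This is a minimal GMRES(m) with restart and without preconditioning ($C=I$); it is my own illustrative sketch, not the algorithm of the text: the small least-squares problem is solved with `numpy.linalg.lstsq` instead of Givens rotations ($c_i$, $s_i$), and all names are assumptions.

```python
import numpy as np

def gmres_restart(A, f, u0, m=5, tol=1e-10, max_restarts=50):
    """GMRES(m): restart after m Arnoldi steps; no preconditioning."""
    u = u0.copy()
    n = len(f)
    for _ in range(max_restarts):
        r = f - A @ u
        beta = np.linalg.norm(r)
        if beta < tol:
            return u
        W = np.zeros((n, m + 1))          # orthonormal Krylov basis w^1..w^{k+1}
        H = np.zeros((m + 1, m))          # Hessenberg matrix h_{i,j}
        W[:, 0] = r / beta
        k = m
        for j in range(m):
            w = A @ W[:, j]               # matrix-times-vector
            for i in range(j + 1):
                H[i, j] = W[:, i] @ w     # j+1 inner products in step j
                w -= H[i, j] * W[:, i]    # the DAXPY  r := r - h_{i,k} w^i
            H[j + 1, j] = np.linalg.norm(w)
            if H[j + 1, j] < 1e-14:       # lucky breakdown: exact solve below
                k = j + 1
                break
            W[:, j + 1] = w / H[j + 1, j]
        # minimize || beta*e1 - H y || over the Krylov space
        e1 = np.zeros(k + 1)
        e1[0] = beta
        y, *_ = np.linalg.lstsq(H[:k + 1, :k], e1, rcond=None)
        u += W[:, :k] @ y                 # update after each restart cycle
    return u
```

In the parallel version, only the marked lines touch other processors: each inner product costs one ALL_REDUCE on a single real number, the initial residual needs one accumulation, and everything operating on $H$, $y$ and the small least-squares problem runs redundantly on every processor.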


Gundolf Haase 2000-03-20