

5.4.4.3 Variant 3 : Gauß-Seidel block iteration

In variant 2 we replaced (5.3b) and (5.3d) with a method of slower convergence. Here, we solve the corresponding block systems in these equations directly.

A parallel solution with the accumulated matrices $\mathfrak{K}_E$ and $\mathfrak{K}_V$ is only feasible if they possess the block structure required in Sec. 4.3.1 (again, the mesh conditions have to be fulfilled). Then the submatrices $\mathfrak{K}_{E,j}$ from $\mathfrak{K}_E = \mathrm{blockdiag}\{\mathfrak{K}_{E,j}\}_{j=\overline{1,\mathrm{NumEdges}}}$ are tridiagonal matrices for edges with linear FE test functions. Their local inversion can be done easily by the Thomas (progonka) algorithm. The matrix $\mathfrak{K}_V$ is usually diagonal.
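As a serial illustration, here is a minimal sketch of the Thomas (progonka) algorithm for one tridiagonal edge block; the function name and array layout are our own choices, not part of Alg. 5.11:

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system K u = rhs in O(n) operations.

    Row i of K reads: lower[i]*u[i-1] + diag[i]*u[i] + upper[i]*u[i+1],
    where lower[0] and upper[n-1] are unused.
    """
    n = len(diag)
    c = [0.0] * n                      # modified upper diagonal
    d = [0.0] * n                      # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):              # forward elimination
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    u = [0.0] * n                      # back substitution
    u[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        u[i] = d[i] - c[i] * u[i + 1]
    return u
```

Since each edge block $\mathfrak{K}_{E,j}$ is owned in accumulated form, this solve needs no communication.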
Between the blocks ``$V$'', ``$E$'', ``$I$'' and inside the subdomains (block ``$I$'') a Gauß-Seidel iteration is performed. Due to the partial use of direct solvers, the convergence rate is slightly better than in a Gauß-Seidel iteration.
[Algorithm 5.11: Parallel Gauß-Seidel block iteration.]
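The sweep over the blocks ``$V$'', ``$E$'', ``$I$'' can be pictured, in a serial setting, as a generic block Gauß-Seidel iteration. The following sketch is illustrative only: the index splitting, the dense block solves, and all names are our assumptions, not Alg. 5.11 itself (in the parallel algorithm the $E$ blocks would be solved by the Thomas algorithm and the data accumulated between processes):

```python
import numpy as np

def block_gauss_seidel(K, f, blocks, sweeps=50):
    """One-process block Gauss-Seidel: K full matrix, f right-hand side,
    blocks a list of index arrays (e.g. for V, E, I)."""
    u = np.zeros_like(f)
    all_idx = np.arange(len(f))
    for _ in range(sweeps):
        for idx in blocks:
            rest = np.setdiff1d(all_idx, idx)
            # residual restricted to the current block, using the
            # latest values of all other blocks
            r = f[idx] - K[np.ix_(idx, rest)] @ u[rest]
            # direct solve of the block system
            u[idx] = np.linalg.solve(K[np.ix_(idx, idx)], r)
    return u
```

Each block system is solved exactly with the most recent values of the other blocks, which is what gives the slightly better convergence rate compared with the Jacobi-type coupling of variant 2.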
The accumulation of cross-point data (``$V$'') and interface data (``$E$'') is usually performed separately, so we need exactly the same amount of communication as in the $\omega$-Jacobi iteration (Alg. 5.6). On the other hand, the accumulation of the tridiagonal matrix $\mathfrak{K}_E$ in the setup of the iteration is three times as expensive as the usual accumulation of the main diagonal alone.
The accumulated matrices $\mathfrak{K}_E$ and $\mathfrak{K}_V$ have to be stored additionally, because the distributed matrices $\mathsf{K}_E$ and $\mathsf{K}_V$ are still needed in the matrix-times-vector operation.

When the iteration is used as a smoother with a fixed number of iterations, the inner product no longer has to be computed. This saves the ALL_REDUCE operation in the parallel code, and the vectors $\underline{\mathsf{r}}$ and $\underline{\mathfrak{w}}$ can be stored in the same place.


Exercise 13:
Find a modification of Alg. 5.11 without the double storage of $K_E$ and $K_V$!


Gundolf Haase 2000-03-20