

5.4.4.3 Variant 3 : Gauß-Seidel block iteration

In variant 2 we replaced (5.3b) and (5.3d) with a method of slower convergence. Here, we solve the corresponding block systems in these equations directly.

A parallel solution with the accumulated matrices $\mathfrak{K}_E$ and $\mathfrak{K}_V$ is only feasible if they possess the block structure required in Sec. 4.3.1 (again, the mesh conditions have to be fulfilled). Then the submatrices $\mathfrak{K}_{E,j}$ from $\mathfrak{K}_E = \mathrm{blockdiag}\{\mathfrak{K}_{E,j}\}_{j=\overline{1,\mathrm{NumEdges}}}$ are tridiagonal matrices for edges with linear FE test functions. Their local inversion can be done easily by the Thomas (progonka) algorithm. The matrix $\mathfrak{K}_V$ is usually diagonal.
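As a serial illustration, here is a minimal sketch of the Thomas (progonka) algorithm for one tridiagonal edge block; the function name and array layout are our own choices, not part of Alg. 5.11:

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system K u = rhs in O(n) operations.

    Row i of K reads: lower[i]*u[i-1] + diag[i]*u[i] + upper[i]*u[i+1],
    where lower[0] and upper[n-1] are unused.
    """
    n = len(diag)
    c = [0.0] * n                      # modified upper diagonal
    d = [0.0] * n                      # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):              # forward elimination
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    u = [0.0] * n                      # back substitution
    u[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        u[i] = d[i] - c[i] * u[i + 1]
    return u
```

Since each edge block $\mathfrak{K}_{E,j}$ is owned in accumulated form, this solve needs no communication.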
Between the blocks ``$V$'', ``$E$'', ``$I$'' and inside the subdomains (block ``$I$'') a Gauß-Seidel iteration is performed. Due to the partial use of direct solvers, the convergence rate is slightly better than in a Gauß-Seidel iteration.
[Algorithm 5.11: Parallel Gauß-Seidel block iteration.]
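The sweep over the blocks ``$V$'', ``$E$'', ``$I$'' can be pictured, in a serial setting, as a generic block Gauß-Seidel iteration. The following sketch is illustrative only: the index splitting, the dense block solves, and all names are our assumptions, not Alg. 5.11 itself (in the parallel algorithm the $E$ blocks would be solved by the Thomas algorithm and the data accumulated between processes):

```python
import numpy as np

def block_gauss_seidel(K, f, blocks, sweeps=50):
    """One-process block Gauss-Seidel: K full matrix, f right-hand side,
    blocks a list of index arrays (e.g. for V, E, I)."""
    u = np.zeros_like(f)
    all_idx = np.arange(len(f))
    for _ in range(sweeps):
        for idx in blocks:
            rest = np.setdiff1d(all_idx, idx)
            # residual restricted to the current block, using the
            # latest values of all other blocks
            r = f[idx] - K[np.ix_(idx, rest)] @ u[rest]
            # direct solve of the block system
            u[idx] = np.linalg.solve(K[np.ix_(idx, idx)], r)
    return u
```

Each block system is solved exactly with the most recent values of the other blocks, which is what gives the slightly better convergence rate compared with the Jacobi-type coupling of variant 2.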
The accumulation of cross-point data (``$V$'') and interface data (``$E$'') is usually performed separately, so we need exactly the same amount of communication as in the $\omega$-Jacobi iteration (Alg. 5.6). On the other hand, the accumulation of the tridiagonal matrix $\mathfrak{K}_E$ in the setup of the iteration is three times as expensive as the usual accumulation of the main diagonal alone.
The accumulated matrices $\mathfrak{K}_E$ and $\mathfrak{K}_V$ have to be stored additionally, because the distributed matrices $\mathsf{K}_E$ and $\mathsf{K}_V$ are still needed in the matrix-times-vector operation.

When the iteration is used as a smoother with a fixed number of iterations, the inner product no longer has to be computed. This saves the ALL_REDUCE operation in the parallel code, and the vectors $\underline{\mathsf{r}}$ and $\underline{\mathfrak{w}}$ can be stored in the same place.


Exercise 13:
Find a modification of Alg. 5.11 without the double storage of $K_E$ and $K_V$!


Gundolf Haase 2000-03-20