
5.7.2.4 Coarse grid solver

The main difficulty in parallelizing a multilevel or multigrid algorithm lies in solving the coarse grid system exactly:

$\displaystyle \sum_{s=1}^{P} A_{s,1}^T\, \mathsf{K}_{s,1}\, A_{s,1}\, \underline{\mathsf{u}}_1 \;=\; \sum_{s=1}^{P} A_{s,1}^T\, \underline{\mathsf{f}}_{s,1} \enspace .$ (5.10)
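The assembly in (5.10) can be made concrete with a small NumPy sketch. The example below is hypothetical: a 1D coarse grid with 3 global nodes split into 2 subdomains, where $A_{s,1}$ is the boolean connectivity matrix of subdomain $s$ and $K_{s,1}$, $\underline{f}_{s,1}$ are its local stiffness matrix and load vector.

```python
import numpy as np

# Hypothetical 1D coarse grid: 3 global nodes, 2 subdomains.
# A_s maps global to local dofs (boolean connectivity); K_s is the
# local 1D Laplacian stiffness matrix; f_s is the local load vector.
A = [np.array([[1., 0., 0.],
               [0., 1., 0.]]),
     np.array([[0., 1., 0.],
               [0., 0., 1.]])]
K_local = [np.array([[1., -1.],
                     [-1., 1.]]) for _ in range(2)]
f_local = [np.array([0.5, 0.5]) for _ in range(2)]

# Accumulated coarse matrix and right-hand side as in (5.10):
K1 = sum(A_s.T @ K_s @ A_s for A_s, K_s in zip(A, K_local))
f1 = sum(A_s.T @ f_s for A_s, f_s in zip(A, f_local))

print(K1)   # [[ 1. -1.  0.] [-1.  2. -1.] [ 0. -1.  1.]]
print(f1)   # [0.5 1.  0.5]
```

The subdomain contributions overlap only in the shared node, where they are summed, exactly the accumulation the parallel variants below must realize by communication.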

The following approaches are common:
  1. Direct solver - sequential

    The coarsest grid is so small that one processor  $ \ensuremath{\mathbb{P}}_0$ can store it in addition to its own data.
    $ \bullet$ Accumulate  $ \mathsf{K}_1$ (REDUCE).
    $ \bullet$ Accumulate  $ \underline{\mathsf{f}}_1$ (REDUCE).
    $ \bullet$ Processor  $ \ensuremath{\mathbb{P}}_0$ solves $ \mathfrak{K}_1\cdot\underline{\mathfrak{u}}_1  =  \underline{\mathfrak{f}}_1$ .
    $ \bullet$ Distribute $ \underline{\mathfrak{u}}_1$ to all other processors (SCATTER).

    This algorithm is purely sequential!
  2. Direct solver - parallel

    We store parts of the accumulated matrix on all processors and solve the coarse grid system by a direct parallel solver. Here, load imbalances will appear.
  3. Iterative solver - parallel

    Here, no accumulation of $ \mathsf{K}_1$ and $ \underline{\mathsf{f}}_1$ is necessary, but in each iteration at least one accumulation has to be performed (see 5.1 - 5.6).

    Caution: If CG-like methods are used as coarse grid solvers, then a symmetric multigrid operator cannot be achieved. To preserve the symmetry one should combine, e.g., a Gauß-Seidel forward iteration with the corresponding backward iteration (SSOR).
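The sequential direct solver (option 1) can be sketched as a serial simulation: the MPI REDUCE and the final distribution are mimicked by Python sums and copies, and all matrices are hypothetical stand-ins for the distributed coarse grid data.

```python
import numpy as np

# Serial simulation of option 1 on an assumed 2-processor setting.
# Each "processor" stores a distributed part K_s, f_s of the coarse system;
# the parts sum to the SPD tridiagonal matrix tridiag(-1, 2, -1).
K_parts = [np.array([[2., -1., 0.], [-1., 1., 0.], [0., 0., 0.]]),
           np.array([[0., 0., 0.], [0., 1., -1.], [0., -1., 2.]])]
f_parts = [np.array([1.0, 0.5, 0.0]),
           np.array([0.0, 0.5, 1.0])]

# Steps 1+2: accumulate K_1 and f_1 on processor P_0 (REDUCE).
K1 = sum(K_parts)
f1 = sum(f_parts)

# Step 3: P_0 solves the accumulated system by a direct method.
u1 = np.linalg.solve(K1, f1)

# Step 4: distribute u_1 to all other processors (SCATTER).
u_on_proc = [u1.copy() for _ in K_parts]

print(u1)  # [1.5 2.  1.5]
```

While $\mathbb{P}_0$ performs step 3, all other processors idle, which is exactly why this variant is purely sequential.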

The (usually poor) ratio between communication and arithmetic performance of parallel machines results in a bad parallel efficiency of the coarse grid solver. This can be observed when using a W-cycle on 32 or more processors, because that multigrid cycle visits the coarser grids more often than a V-cycle does.
One idea to overcome this bottleneck is to distribute the coarser grids over fewer than $ P$ processors. The disadvantage of this approach is the additional communication between grid levels $ \ell_0$ and  $ \ell_0 + 1$ in the intergrid transfer.
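The scaling argument can be illustrated with a crude cost model. All constants and the quadratic work estimate below are purely illustrative assumptions, not measurements of any machine:

```python
import math

# Toy cost model (hypothetical constants) for why the coarse grid solve
# scales badly: the arithmetic per processor shrinks with P, while the
# start-up (latency) cost of the global reductions grows with log2(P).
T_FLOP = 1.0e-9      # assumed time per floating point operation
T_LATENCY = 1.0e-5   # assumed message start-up time

def coarse_solve_efficiency(n_coarse, P):
    """Crude efficiency estimate for one coarse grid iteration on P procs."""
    t_arith = n_coarse**2 * T_FLOP / P      # distributed matrix work
    t_comm = math.log2(P) * T_LATENCY       # tree-based REDUCE/SCATTER
    return t_arith / (t_arith + t_comm)

print(round(coarse_solve_efficiency(100, 2), 3))   # 0.333
print(round(coarse_solve_efficiency(100, 32), 4))  # 0.0062
```

Under these assumptions the efficiency collapses around 32 processors, matching the observation for the W-cycle above; restricting the coarser grids to fewer processors shrinks the $\log_2 P$ communication term at the price of extra intergrid transfer.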

Communication in the coarse grid solver!

Gundolf Haase 2000-03-20