Next: 5.4.4.2 Variant 2 : Gauß-Seidel-Jacobi
Up: 5.4.4 Parallel algorithms
Previous: 5.4.4 Parallel algorithms
  Contents
5.4.4.1 Variant 1 : formal transformation
Writing the update step in Alg. 5.7 using parallel data types
results in :
Algorithm 5.9 seems to be perfect at the first glance
however an efficient parallel implementation is nearly impossible.
The numbering of unknowns like in Sec. 4.3.1
produces a structure of matrices and vectors as presented
in (4.2).
For a detailed analysis, we split the calculations in the update
step of Alg. 5.9 according to the block structure of
vectors and matrices.
 |
(5.3a) |
Sum
means that
in the update of each single component
we have to accumulate the appropriate component of the residuum.
The amount of data to transfer is the same as in the update step of
the
-Jacobi iteration (
) but here
we have to perform
single communications with additional
startups.
Assuming the ratio
we have :
The ratio of the communication behavior between both iterations is
i.e. the parallel efficiency of the Gauß-Seidel iteration with
update step (5.3) will be much lower than in a
Jacobi iteration.
By taking advantage of the neighborhood relations in the domain
decomposition, the communication ratio
can be reduced to
the maximal number of coupled unknowns on one interface edge/face.
For an efficient parallelization of (5.3) the requirements
3a and 3a on the FE-mesh have to be fulfilled, see
also Fig. 5.4. If we cannot fulfill these conditions
then we have to apply (5.3b) and (5.3d)
purely sequentially for all
!
In that case a red-black numbering of the interfaces is advisable
which is not non-trivial for complex domains.
Next: 5.4.4.2 Variant 2 : Gauß-Seidel-Jacobi
Up: 5.4.4 Parallel algorithms
Previous: 5.4.4 Parallel algorithms
  Contents
Gundolf Haase
2000-03-20