next up previous contents
Next: 7. Conservation laws Up: 6.4 FFT Previous: 6.4.1 1d Fourier analysis   Contents


6.4.2 The FFT idea and its parallelization

The FFT idea regarded here is called also radix-2 FFT.

We investigate the matrix  $ \ensuremath{{\cal F}}$ from (6.8) for $ n=4$ .

$\displaystyle \ensuremath{{\cal F}}_4 \;=\;
\begin{bmatrix}
1 & 1 & 1 & 1 \  ...
...ga^2 & \omega^4 & \omega^6 \  1 & \omega^3 & \omega^6 & \omega^9
\end{bmatrix}$

Because of $ \; \omega^s = \omega^{s \mod 4} \;$ and $ \; \omega_{n=4} = e^{-\frac{2\pi}{4} i} = - i \;$ we have

$\displaystyle \ensuremath{{\cal F}}_4 \;=\; \begin{bmatrix}1 & 1 & 1 & 1 \  1 ...
... & -i & -1 & i \  1 & -1 & 1 & -1 \  1 & i & -1 & -i \end{bmatrix} \enspace .$ (6.9)

A rearranging of the entries in (6.9) by means of the $ 4\times 4$ permutation matrix

\begin{figure}%%
\begin{eqnarray}\Pi_4 &=& \begin{bmatrix}1 & 0 & 0 & 0 \\ 0 & 0...
...-1 \\ 1 & -1 &\vert& i & -i
\end{bmatrix} \enspace .
\end{eqnarray}\end{figure}


From $ \; \ensuremath{{\cal F}}_2 = \begin{bmatrix}1 & 1 \  1 & -1 \end{bmatrix} \;$ and the definition $ \; \Omega_2 := \begin{bmatrix}1 & 0 \  0 & -i \end{bmatrix} =$   diag$ \{1,\omega_{n=2}\} \;$ follows immediately

$\displaystyle \ensuremath{{\cal F}}_4 \Pi_4 \:\;=\;\; \begin{bmatrix}\ensuremat...
...remath{{\cal F}}_2 & -\Omega_2 \ensuremath{{\cal F}}_2 \end{bmatrix} \enspace ,$ (6.10)

i.e., the Fourier analysis for $ n=4$ is defined by the Fourier analysis for $ n=2$.

Generally :

For $ n=2m$ we have

$\displaystyle \ensuremath{{\cal F}}_n \Pi_n \;=\; \begin{bmatrix}\ensuremath{{\...
...remath{{\cal F}}_m & -\Omega_m \ensuremath{{\cal F}}_m \end{bmatrix} \enspace ,$ (6.11)

with $ \Pi_n$ as an even-odd permutation which rearranges the vector components

$\displaystyle \underline{y}\;:=\; \Pi_n^T \underline{x} \quad\Leftrightarrow\qu...
..._j & {\scriptstyle j=\overline{1,n-1:2} \makebox[0pt]{}} \end{bmatrix} \enspace$ (6.12)

and $ \Omega_m  := $   diag$ \{ \omega_{n=2m}^k,
 {\scriptstyle \overline{k=0,m-1}} \} $.
If $ n=2^P$ holds, then formula (6.13) can be applied recursively $ P-1$ times so that the FFT will be reduced to $ P-1$ rearrangements of vectors with only little additional arithmetics.


FFT with $ n=8$

In the remaining section, $ x(0,7:2)$ denotes that subvector of $ x$ which contains from element 0 to element $ 7$ each second component, i.e., $ x(0,7:2) = [x_j]_{j=\overline{0,7:2} \makebox[0pt]{}}$

Figure 6.7: Split of subscripts
\begin{figure}\unitlength0.07\textwidth
\mbox{}\hfill
\protect\begin{picture}(...
...){$x \scriptstyle (0,7:1)$}}
\protect
\end{picture} \hfill\mbox{}
\end{figure}



The rearrangement in Fig. 6.7 results in a vector $ [x_0, x_4, x_2, x_6, x_1, x_5, x_3, x_7]^T$.

Fig. 6.8 presents the so called butterfly algorithm which is a parallel realization of the Fourier transformation  $ \ensuremath{{\cal F}}$ (arithmetics and rearranging) on a distributed memory computer. Therein one sees the good parallelization properties of the Fourier transformation, however the resulting components are present unordered and their rearranging requires additional communication.

Figure 6.8: $\textstyle \parbox{10cm}{Flow chart on 4 processors
\ <I>
The marked bits of ...
...opriate bit
for the next butterfly operation.
[Dr. Pester Chemnitz].
</I>}$


Looking back on the example at the beginning of this section 6.4, we see that we have to solve the equation

$\displaystyle \tfrac{1}{n} \ensuremath{{\cal F}}\Lambda \ensuremath{{\cal F}}\e...
...th{{\cal F}}\cdot \underline{u} \;=\; \ensuremath{{\cal F}}\cdot \underline{f} $

instead of the discrete equation  $ \Lambda \underline{u} = \underline{f}$ :

a)  Fourier analysis :   $ \underline{\gamma}  :=  \tfrac{1}{n} \ensuremath{{\cal F}}\underline{f}$   $ \underline{\gamma}$ rearranged
b)  Operator  $ \Lambda^{-1}$ :   $ \beta_i  :=  \frac{\gamma_i}{\lambda_i}$   $ \underline{\beta}$ rearranged
c)  Fourier synthesis :   $ \underline{u}  :=  \ensuremath{{\cal F}}\underline{\beta} $   $ \underline{u}$ in correct order !!


Since the Fourier transformation had to be used twice, the components of the solution are present again in correct order in the end!

$ \Longrightarrow$
The combination of Fourier analysis and synthesis can be parallelized very well.

next up previous contents
Next: 7. Conservation laws Up: 6.4 FFT Previous: 6.4.1 1d Fourier analysis   Contents
Gundolf Haase 2000-03-20