Combining shared memory and distributed memory computation.
OpenMP: Quick reference, home page, tutorial (LLNL).

Compiling code:
> mpicxx [compiler options] -fopenmp skalar.cpp -o main.GCC_
> mpirun -np 4 ./main.GCC_
> mpirun -np 4 --hostfile my_hostfile ./main.GCC_
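The hostfile passed to mpirun lists the machines on which processes may be started. A minimal sketch of what `my_hostfile` could contain (host names and slot counts are illustrative, not from the original):

```
# my_hostfile -- one machine per line;
# "slots" caps how many MPI processes mpirun may place on that host
node01 slots=2
node02 slots=2
```

With `-np 4` and this hostfile, mpirun would place two processes on each of the two hosts.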
Changing the underlying compiler for Open MPI (briefly):
> export OMPI_CXX="icpc -openmp"
Parallelizing the inner product with OpenMP+MPI:
Original code for the inner product:

double scalar(const int N, const double x[], const double y[])
{
  double sum = 0.0;
  for (int i = 0; i < N; ++i) {
    sum += x[i] * y[i];
  }
  return sum;
}
int main()
{
...
double s = scalar(n,a,b);
...
}
#include <mpi.h>
// local inner product, OpenMP-parallel
// (the loop variable i is declared in the for statement and is implicitly
//  private, so no private(i) clause is needed -- or even allowed -- here)
double scalar(const int N, const double x[], const double y[])
{
  double sum = 0.0;
  #pragma omp parallel for shared(x,y) schedule(static) reduction(+:sum)
  for (int i = 0; i < N; ++i) {
    sum += x[i] * y[i];
  }
  return sum;
}

// MPI inner product: combine the process-local results
double scalar(const int n, const double x[], const double y[], const MPI_Comm icomm)
{
  const double s = scalar(n, x, y);   // call the local inner product
  double sg;
  MPI_Allreduce(&s, &sg, 1, MPI_DOUBLE, MPI_SUM, icomm);
  return sg;
}
int main(int argc, char* argv[])
{
...
MPI_Init(&argc,&argv);
...
double s = scalar(n,a,b,MPI_COMM_WORLD);
...
MPI_Finalize();
...
}
Compiling with GCC and PGI:
> mpicxx [compiler options] -fopenmp skalar.cpp -o main.GCC_
> pgc++ -Mmpi=mpich -fast -mp skalar.cpp -o main.PGI_
Each MPI process spawns OMP_NUM_THREADS OpenMP threads, set either via the environment
> export OMP_NUM_THREADS=2
or in the code via
omp_set_num_threads(2);
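Putting the pieces together, a typical hybrid launch might look as follows (the process and thread counts are illustrative; 4 processes with 2 threads each occupy 8 cores in total):

```
> export OMP_NUM_THREADS=2
> mpirun -np 4 ./main.GCC_
```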