Performance Aware Software Design for a Large Scale Lattice Boltzmann Framework


M. Bauer, H. Köstler, C. Feichtinger, J. Habich, U. Rüde

WaLBerla (Widely applicable Lattice-Boltzmann from Erlangen) is a massively parallel software framework for simulating a wide range of physical phenomena. In this talk we present the software design and performance optimization techniques used in the framework, focusing on how to reconcile conflicting goals: high node performance and scalability on the one hand, and clean software structure, maintainability, and flexibility on the other.

We first present the single-core optimization techniques and software design concepts used for the LBM kernels in waLBerla. The single-core performance results are evaluated using the ECM (Execution-Cache-Memory) machine model.

We then study the scalability of the code on current high performance computing clusters. The test platforms are the BlueGene/Q system "JUQUEEN" at the Jülich Supercomputing Centre and the Intel Xeon E5-2680 based petascale system "SuperMUC" in Munich.

Time permitting, we will also discuss a variant of waLBerla suited for GPGPUs and clusters with GPU acceleration, including performance results on Tsubame 2.0.