Advanced partitioning and communication strategies for the efficient parallelization of the multilevel fast multipole algorithm
Large-scale electromagnetics problems can be solved efficiently with the multilevel fast multipole algorithm (MLFMA), which reduces the complexity of matrix-vector multiplications required by iterative solvers from O(N²) to O(N log N). Parallelization of MLFMA on distributed-memory architectures enables fast and accurate solutions of radiation and scattering problems discretized with millions of unknowns using limited computational resources. Recently, we developed a hierarchical partitioning strategy, which provides an efficient parallelization of MLFMA, allowing for the solution of very large problems involving hundreds of millions of unknowns. In this strategy, both clusters (sub-domains) of the multilevel tree structure and their samples are partitioned among processors, which leads to improved load-balancing. We also show that communications between processors are reduced and the communication time is shortened, compared to previous parallelization strategies in the literature. On the other hand, improved partitioning of the tree structure complicates the arrangement of communications between processors. In this paper, we discuss communications in detail when MLFMA is parallelized using the hierarchical partitioning strategy. We present well-organized arrangements of communications in order to maximize the efficiency offered by the improved partitioning. We demonstrate the effectiveness of the resulting parallel implementation on a very large scattering problem involving a conducting sphere discretized with 375 million unknowns. ©2010 IEEE.
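The idea of partitioning both clusters and their samples can be illustrated with a minimal Python sketch. It is not the implementation described in the paper. The function name `hierarchical_partition`, the factor-of-four change in cluster and sample counts between levels, and the rule of halving the cluster partitions while doubling the sample partitions at each higher level are all assumptions made here for illustration, as is the power-of-two processor count.

```python
def hierarchical_partition(num_levels, num_procs, clusters_lowest, samples_lowest):
    """Sketch of a per-level partitioning plan (illustrative assumptions only).

    Assumed, not taken from the abstract:
      - going up one level, the cluster count drops by about 4x and the
        number of samples per cluster grows by about 4x;
      - going up one level, cluster partitions are halved and sample
        partitions are doubled, so their product stays equal to num_procs.
    """
    if num_procs < 1 or num_procs & (num_procs - 1):
        raise ValueError("this sketch assumes a power-of-two processor count")

    plan = []
    clusters, samples = clusters_lowest, samples_lowest
    cluster_parts, sample_parts = num_procs, 1
    for level in range(num_levels):
        plan.append({
            "level": level,
            "cluster_parts": cluster_parts,
            "sample_parts": sample_parts,
            # Ceiling division: largest share held by any one processor.
            "local_clusters": -(-clusters // cluster_parts),
            "local_samples": -(-samples // sample_parts),
        })
        clusters = max(1, clusters // 4)
        samples *= 4
        if cluster_parts > 1:
            cluster_parts //= 2
            sample_parts *= 2
    return plan


if __name__ == "__main__":
    for row in hierarchical_partition(4, 8, 4096, 32):
        print(row)
```

Under these assumptions, the product of local clusters and local samples stays the same at every level, which is one way to picture the improved load-balancing. The sketch says nothing about the arrangement of communications, which is the subject of the paper.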