Advanced partitioning and communication strategies for the efficient parallelization of the multilevel fast multipole algorithm