Efficient overlapped FFT algorithms for hypercube-connected multicomputers
Parallel Algorithms and Applications
91 - 110
Item Usage Stats
MetadataShow full item record
In this work, we propose parallel FFT algorithms, for medium-to-coarse grain hypercube-connected multicomputers, which are more elegant and efficient than the existing ones. The proposed algorithms achieve perfect load-balance for the efficient simplified-butterfly scheme, minimize the communication overhead by decreasing both the number and the volume of concurrent communications. Communication and computation cannot be overlapped easily due to the strong data dependencies in the FFT algorithm. In this paper, we propose a restructuring for the FFT algorithm which enables overlapping each communication with one fifth of the local computations involved in a stage. Two of the proposed parallel FET algorithms achieve overlapping by exploiting this restructuring while using the efficient table-lookup scheme for complex coefficients. The proposed algorithms are implemented on an Intel’s 32-node iPSC/2 hypercube multicomputer. High efficiency values are obtained even for small size FFT problems. © 1994, Taylor & Francis Group, LLC. All rights reserved.
Overlapping communication and computation
Perfect load balance