Browsing by Subject "Distributed Memory"


Open Access
Hierarchical parallelization of the multilevel fast multipole algorithm (MLFMA)
(IEEE, 2013) Gürel, Levent; Ergül, Özgür
Due to its O(N log N) complexity, the multilevel fast multipole algorithm (MLFMA) is one of the most prized algorithms of computational electromagnetics and certain other disciplines. Various implementations of this algorithm have been used for rigorous solutions of large-scale scattering, radiation, and miscellaneous other electromagnetics problems involving 3-D objects with arbitrary geometries. Parallelization of MLFMA is crucial for solving real-life problems discretized with hundreds of millions of unknowns. This paper presents the hierarchical partitioning strategy, which provides a very efficient parallelization of MLFMA on distributed-memory architectures. We discuss the advantages of the hierarchical strategy over previous approaches and demonstrate the improved efficiency on scattering problems discretized with millions of unknowns. © 1963-2012 IEEE.
Open Access
Object-space parallel polygon rendering on hypercubes
(Pergamon Press, 1998) Kurç, T. M.; Aykanat, Cevdet; Özgüç, B.
This paper presents algorithms for object-space parallel polygon rendering on hypercube-connected multicomputers. A modified scanline z-buffer algorithm is proposed for the local rendering phase. The proposed algorithm avoids message fragmentation by efficiently packing local foremost pixels in consecutive memory locations, and it eliminates the initialization of the scanline z-buffer for each scanline. Several algorithms, utilizing different communication strategies and topological embeddings, are proposed for global z-buffering of local foremost pixels during the pixel merging phase. A performance comparison of these pixel merging algorithms is presented, based on the communication overhead incurred in each scheme. Two adaptive screen subdivision heuristics are proposed for load balancing in the pixel merging phase. These heuristics utilize the distribution of foremost pixels on the screen for the subdivision. Experimental results obtained on an Intel iPSC/2 hypercube multicomputer and a Parsytec CC system are presented. Rendering rates of 300K-700K triangles per second are attained on 16 processors of the Parsytec CC system in the rendering of datasets from the publicly available SPD database. © 1998 Elsevier Science Ltd. All rights reserved.
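
The pixel merging idea mentioned in the abstract above can be illustrated with a minimal sequential sketch. This is not the paper's algorithm: it ignores the hypercube communication strategies, the packing of pixels into consecutive memory, and the load-balancing heuristics. The names `merge_foremost`, `p0`, and `p1` are hypothetical, and the sketch assumes that a smaller depth value means a pixel is closer to the viewer.

```python
from typing import Dict, Iterable, Tuple

# (depth, color); assumption: smaller depth = closer to the viewer
Pixel = Tuple[float, str]
# Maps a screen coordinate (x, y) to one processor's local foremost pixel
LocalBuffer = Dict[Tuple[int, int], Pixel]


def merge_foremost(buffers: Iterable[LocalBuffer]) -> LocalBuffer:
    """Combine per-processor foremost pixels, keeping the nearest at each (x, y)."""
    merged: LocalBuffer = {}
    for local in buffers:
        for xy, (depth, color) in local.items():
            current = merged.get(xy)
            # Keep the incoming pixel if the location is empty or it is nearer
            if current is None or depth < current[0]:
                merged[xy] = (depth, color)
    return merged


# Two hypothetical processors, each holding the foremost pixels of its own polygons
p0 = {(0, 0): (5.0, "red"), (1, 0): (2.0, "red")}
p1 = {(0, 0): (3.0, "blue"), (2, 0): (7.0, "blue")}
result = merge_foremost([p0, p1])
```

In this toy input both processors cover pixel (0, 0), so the merge keeps the blue pixel at depth 3.0; the other two pixels are each covered by only one processor and pass through unchanged.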