Browsing by Subject "Parallel Processing Systems"

Now showing 1 - 3 of 3

Open Access
Algorithms for efficient vectorization of repeated sparse power system network computations
(IEEE, 1995) Aykanat, Cevdet; Özgü, Ö.; Güven, N.
Standard sparsity-based algorithms used in power system appllcations need to be restructured for efficient vectorization due to the extremely short vectors processed. Further, intrinsic architectural features of vector computers such as chaining and sectioning should also be exploited for utmost performance. This paper presents novel data storage schemes and vectorization alsorim that resolve the recurrence problem, exploit chaining and minimize the number of indirect element selections in the repeated solution of sparse linear system of equations widely encountered in various power system problems. The proposed schemes are also applied and experimented for the vectorization of power mismatch calculations arising in the solution phase of FDLF which involves typical repeated sparse power network computations. The relative performances of the proposed and existing vectorization schemes are evaluated, both theoretically and experimentally on IBM 3090ArF.
Open Access
Image-space decomposition algorithms for sort-first parallel volume rendering of unstructured grids
(Springer, 2000) Kutluca, H.; Kurç, T. M.; Aykanat, Cevdet
Twelve adaptive image-space decomposition algorithms are presented for sort-first parallel direct volume rendering (DVR) of unstructured grids on distributed-memory architectures. The algorithms are presented under a novel taxonomy based on the dimension of the screen decomposition, the dimension of the workload arrays used in the decomposition, and the scheme used for workload-array creation and querying the workload of a region. For the 2D decomposition schemes using 2D workload arrays, a novel scheme is proposed to query the exact number of screen-space bounding boxes of the primitives in a screen region in constant time. A probe-based chains-on-chains partitioning algorithm is exploited for load balancing in optimal 1D decomposition and iterative 2D rectilinear decomposition (RD). A new probe-based optimal 2D jagged decomposition (OJD) is proposed which is much faster than the dynamic-programming based OJD scheme proposed in the literature. The summed-area table is successfully exploited to query the workload of a rectangular region in constant time in both OJD and RD schemes for the subdivision of general 2D workload arrays. Two orthogonal recursive bisection (ORB) variants are adapted to relax the straight-line division restriction in conventional ORB through using the medians-of-medians approach on regular mesh and quadtree superimposed on the screen. Two approaches based on the Hilbert space-filling curve and graph-partitioning are also proposed. An efficient primitive classification scheme is proposed for redistribution in 1D, and 2D rectilinear and jagged decompositions. The performance comparison of the decomposition algorithms is modeled by establishing appropriate quality measures for load-balancing, amount of primitive replication and parallel execution time. The experimental results on a Parsytec CC system using a set of benchmark volumetric datasets verify the validity of the proposed performance models. The performance evaluation of the decomposition algorithms is also carried out through the sort-first parallelization of an efficient DVR algorithm.
Open Access
Iterative methods based on splittings for stochastic automata networks
(1998) Uysal, E.; Dayar T.
This paper presents iterative methods based on splittings (Jacobi, Gauss-Seidel, Successive Over Relaxation) and their block versions for Stochastic Automata Networks (SANs). These methods prove to be better than the power method that has been used to solve SANs until recently. With the help of three examples we show that the time it takes to solve a system modeled as a SAN is still substantial and it does not seem to be possible to solve systems with tens of millions of states on standard desktop workstations with the current state of technology. However, the SAN methodology enables one to solve much larger models than those could be solved by explicitly storing the global generator in the core of a target architecture especially if the generator is reasonably dense. © 1998 Elsevier Science B.V. All rights reserved.