Browsing by Subject "Load balance"
Item, Open Access
Locality-aware and load-balanced static task scheduling for MapReduce (Elsevier, 2018)
Selvitopu, Oğuz; Demirci, Gündüz Vehbi; Türk, Ata; Aykanat, Cevdet
Task scheduling for MapReduce jobs has been an active area of research, with the objective of decreasing the amount of data transferred during the shuffle phase by exploiting data locality. In the literature, generally only the scheduling of reduce tasks is considered, under the assumption that the scheduling of map tasks is already determined by the input data placement. However, in cloud or HPC deployments of MapReduce, the input data is located in remote storage, and scheduling map tasks gains importance. Here, we propose models for simultaneous scheduling of map and reduce tasks in order to improve data locality and balance the processors' loads in both map and reduce phases. Our approach is based on graph and hypergraph models that correctly encode the interactions between map and reduce tasks. Partitions produced by these models are decoded to schedule map and reduce tasks. A two-constraint formulation utilized in these models enables balancing processors' loads in both map and reduce phases. The partitioning objective in the hypergraph models correctly encapsulates the minimization of data transfer when a local combine step is performed prior to shuffle, whereas the partitioning objective in the graph models does the same when a local combine is not performed. We show the validity of our scheduling on the MapReduce parallelizations of two important kernel operations, sparse matrix–vector multiplication (SpMV) and generalized sparse matrix–matrix multiplication (SpGEMM), which are widely encountered in big data analytics and scientific computations. Compared to random scheduling, our models lead to tremendous savings in data transfer, reducing data traffic in the shuffle phase from several hundreds of megabytes to just a few megabytes and consequently yielding speedups of up to 2.6x and 4.2x for SpMV and SpGEMM, respectively.

Item, Open Access
Partitioning functions for stateful data parallelism in stream processing (Association for Computing Machinery, 2014)
Gedik, B.
In this paper we study partitioning functions for stream processing systems that employ stateful data parallelism to improve application throughput. In particular, we develop partitioning functions that are effective under workloads where the domain of the partitioning key is large and its value distribution is skewed. We define various desirable properties for partitioning functions, ranging from balance properties such as memory, processing, and communication balance, to structural properties such as compactness and fast lookup, to adaptation properties such as fast computation and minimal migration. We introduce a partitioning function structure that is compact and develop several associated heuristic construction techniques that exhibit good balance and low migration cost under skewed workloads. We provide experimental results that compare our partitioning functions to more traditional approaches, such as uniform and consistent hashing, under different workload and application characteristics, and show superior performance.

Item, Open Access
Subdivision of 3D space based on the graph partitioning for parallel ray tracing (Springer, 1994)
İşler, V.; Aykanat, Cevdet; Özgüç, Bülent; Brunet, P.; Jansen, F. W.
One approach to parallel ray tracing is to subdivide the 3D space into rectangular volumes and assign the object descriptions in each volume, together with their related computations, to a different processor. The subdivision process is critical in reducing the interprocessor communication overhead and maintaining the load balance among the processors of a multicomputer. In this paper, after a brief overview of parallel ray tracing, a heuristic is proposed to subdivide the 3D space by converting the problem into a graph partitioning problem. The proposed algorithm tries to minimize the communication cost while maintaining a load balance among processors.
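
The trade-off named in the last abstract, minimizing communication cost while keeping processor loads balanced, can be illustrated with a small sketch. It is not the heuristic from any of the listed papers. It only evaluates a given assignment of graph vertices to processors. The function name `evaluate_partition`, the input layout, and the toy graph are all assumptions made for illustration.

```python
def evaluate_partition(weights, edges, part, num_parts):
    """Return (cut_cost, imbalance) for a vertex-to-processor assignment.

    weights: dict vertex -> computational load (hypothetical input)
    edges:   dict (u, v) -> communication volume incurred if u and v
             end up on different processors (hypothetical input)
    part:    dict vertex -> processor index in range(num_parts)
    """
    # Sum the computational load assigned to each processor.
    loads = [0.0] * num_parts
    for vertex, weight in weights.items():
        loads[part[vertex]] += weight

    # Communication cost: total volume of edges whose endpoints are separated.
    cut_cost = sum(vol for (u, v), vol in edges.items() if part[u] != part[v])

    # Imbalance: how far the heaviest processor exceeds the average load.
    average = sum(loads) / num_parts
    imbalance = max(loads) / average - 1.0
    return cut_cost, imbalance


# Toy graph: four units of work on two processors (illustrative values only).
weights = {"a": 4, "b": 2, "c": 3, "d": 3}
edges = {("a", "b"): 5, ("b", "c"): 1, ("c", "d"): 4, ("a", "d"): 2}

good = {"a": 0, "b": 0, "c": 1, "d": 1}
poor = {"a": 0, "c": 0, "b": 1, "d": 1}

good_cut, good_imbalance = evaluate_partition(weights, edges, good, 2)
poor_cut, poor_imbalance = evaluate_partition(weights, edges, poor, 2)
```

On this toy input the first assignment separates only the two lightest edges and splits the load evenly, while the second cuts every edge and overloads one processor. A partitioning heuristic searches for assignments of the first kind.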