Browsing by Subject "CUDA"


Open Access
Improving the performance of similarity joins using graphics processing unit
(2012) Korkmaz, Zeynep
The similarity join is an important operation in data mining and is used in many applications across varying domains. A similarity join operator takes one or two sets of data points and outputs the pairs of points whose distance in the data space is within a certain threshold value, ε. The baseline nested loop approach computes the distances between all pairs of objects. For large sets of objects, where the nested loop paradigm yields prohibitively long query times, accelerating this operator becomes more important. The computing capability of recent GPUs, exposed through a general-purpose parallel computing architecture (CUDA), has attracted many researchers. With this motivation, we propose two similarity join algorithms for the Graphics Processing Unit (GPU). To exploit the advantages of general-purpose GPU computing, we first propose an improved nested loop join algorithm (GPU-INLJ) for the specific environment of the GPU. We also present a partitioning-based join algorithm (KMEANS-JOIN) that guarantees each partition can be joined independently without missing any join pair. Our experiments demonstrate massive performance gains and the suitability of our algorithms for large datasets.
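The baseline nested loop approach mentioned in this abstract can be sketched on the CPU as follows. This is a minimal illustration, not the thesis's GPU-INLJ or KMEANS-JOIN code; the function name `nested_loop_similarity_join`, the Euclidean distance, and the self-join form over a single point set are assumptions made for the example.

```python
import math
from typing import List, Sequence, Tuple


def nested_loop_similarity_join(
    points: Sequence[Sequence[float]], eps: float
) -> List[Tuple[int, int]]:
    """Return index pairs (i, j), i < j, whose Euclidean distance is <= eps.

    Hypothetical helper: a plain O(n^2) self-join used only to illustrate
    the baseline that GPU approaches aim to accelerate.
    """
    pairs = []
    for i in range(len(points)):
        # Start at i + 1 so each unordered pair is compared exactly once.
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= eps:
                pairs.append((i, j))
    return pairs


sample = [(0.0, 0.0), (0.5, 0.0), (3.0, 4.0), (3.0, 4.4)]
print(nested_loop_similarity_join(sample, eps=0.6))
```

Every distance test is independent of the others, which is what makes this loop nest a natural candidate for one-thread-per-pair or one-thread-per-point GPU parallelization.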
Open Access
Massively parallel mapping of next generation sequence reads using GPU
(2012) Korkmaz, Mustafa
High-throughput sequencing (HTS) methods have already started to fundamentally revolutionize the area of genome research through low-cost, high-throughput genome sequencing. However, the sheer size of the data imposes various computational challenges. For example, on the Illumina HiSeq2000, each run produces over 7-8 billion short reads and over 600 Gb of base pairs of sequence data in less than 10 days. For most applications, analysis of HTS data starts with read mapping, i.e., finding the locations of these short sequence reads in a reference genome assembly. The similarity between two sequences can be determined by computing their optimal global alignment using a dynamic programming method called the Needleman-Wunsch algorithm. The Needleman-Wunsch algorithm is widely used in hash-based DNA read mapping algorithms because of its guaranteed sensitivity. However, the quadratic time complexity of this algorithm makes it highly time-consuming and the main bottleneck in analysis. In addition to this drawback, the short length of reads (~100 base pairs) and the large size of mammalian genomes (3.1 Gbp for human) worsen the situation by requiring several hundreds to tens of thousands of Needleman-Wunsch calculations per read. The fastest approach proposed so far avoids Needleman-Wunsch and maps the data described above in 70 CPU days with lower sensitivity. More sensitive mapping approaches are even slower. We propose that efficient parallel implementations of string comparison will dramatically improve the running time of this process. With this motivation, we propose to develop enhanced algorithms to exploit the parallel architecture of GPUs.
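The quadratic cost of the Needleman-Wunsch algorithm referred to in this abstract is visible in a short sketch of its scoring recurrence. The function name `needleman_wunsch_score` and the scoring values (match +1, mismatch -1, linear gap -1) are assumptions chosen for illustration; the thesis may use a different scoring scheme.

```python
def needleman_wunsch_score(a: str, b: str, match: int = 1,
                           mismatch: int = -1, gap: int = -1) -> int:
    """Optimal global alignment score of a and b (score only, no traceback).

    Hypothetical scoring parameters. Runs in O(len(a) * len(b)) time and
    keeps only two rows of the dynamic programming table.
    """
    # Row 0: aligning the empty prefix of a against prefixes of b costs gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning a prefix of a against the empty prefix of b.
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            curr[j] = max(diag, up, left)
        prev = curr
    return prev[len(b)]


print(needleman_wunsch_score("ACGT", "AGT"))
```

Each read-versus-candidate comparison fills its own table, so many such tables can be computed concurrently, which is the kind of parallelism the abstract proposes to exploit on GPUs.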
Open Access
Particle based modeling and simulation of natural phenomena
(2010) Bayraktar, Serkan
This thesis is about modeling and simulation of fluids and cloth-like deformable objects using the physically-based simulation paradigm. Simulated objects are modeled with particles, and their interaction with each other and the environment is defined by particle-to-particle forces. We propose several improvements over the existing particle simulation techniques. Neighbor search algorithms are crucial for the performance, efficiency, and robustness of a particle system. We present a sorting-based neighbor search method that operates on a uniform grid and is parallelizable. We improve upon the existing fluid surface generation methods so that our method captures surface details better, since we consider the relative position of fluid particles to the fluid surface. We investigate several alternative particle interaction schemes (i.e., Smoothed Particle Hydrodynamics, the Discrete Element Method, and the Lennard-Jones potential) for the purpose of defining fluid-fluid, fluid-cloth, and fluid-boundary interaction forces. We also propose a practical way to simulate knitwear and its interaction with fluids. We employ capillary pressure–based forces to simulate the absorption of fluid particles by knitwear. We also propose a method to simulate the flow of miscible fluids. Our particle simulation system is implemented to exploit the parallel computing capabilities of commodity computers. Specifically, we implemented the proposed methods on multicore CPUs and programmable graphics boards. The experiments show that our method is computationally efficient and produces realistic results.
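The general idea of a sorting-based neighbor search on a uniform grid can be sketched as below. This is a generic 2D illustration under stated assumptions, not the method developed in the thesis: the helper names `build_sorted_grid` and `neighbors`, the binary-search lookup, and the requirement that the search radius not exceed the cell size are all choices made for the example.

```python
import math
from bisect import bisect_left, bisect_right


def build_sorted_grid(points, cell):
    """Sort particle indices by their integer grid-cell coordinates."""
    keyed = sorted(
        ((math.floor(x / cell), math.floor(y / cell)), i)
        for i, (x, y) in enumerate(points)
    )
    keys = [k for k, _ in keyed]    # sorted cell keys, one per particle
    order = [i for _, i in keyed]   # particle indices in the same order
    return keys, order


def neighbors(points, keys, order, cell, q, radius):
    """Indices of particles within radius of particle q (assumes radius <= cell)."""
    qx, qy = points[q]
    cx, cy = math.floor(qx / cell), math.floor(qy / cell)
    found = []
    # With radius <= cell, all neighbors lie in the 3x3 block around q's cell.
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            key = (cx + dx, cy + dy)
            # Particles of one cell are contiguous in the sorted arrays.
            lo, hi = bisect_left(keys, key), bisect_right(keys, key)
            for i in order[lo:hi]:
                if i != q and math.dist(points[i], points[q]) <= radius:
                    found.append(i)
    return sorted(found)


pts = [(0.1, 0.1), (0.9, 0.2), (1.2, 0.1), (5.0, 5.0)]
keys, order = build_sorted_grid(pts, cell=1.0)
print(neighbors(pts, keys, order, cell=1.0, q=1, radius=1.0))
```

Sorting places the particles of each cell in one contiguous range, so a query touches only a few short ranges instead of every particle; both the sort and the per-particle queries map well to parallel hardware.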
Open Access
Read mapping methods optimized for multiple GPGPU
(2016-07) Nouri, Azita
The DNA sequence alignment problem can be broadly defined as the character-level comparison of DNA sequences obtained from one or more samples against a database of reference (i.e., consensus) genome sequences of the same or a similar species. High-throughput sequencing (HTS) technologies were introduced in 2006, and the latest iterations of HTS technologies are able to read the genome of a human individual in just three days for a cost of ~$1,000. HTS technologies produce massive numbers of reads of different sizes, and they also present a computational problem, since the analysis of HTS data requires the comparison of >1 billion short (100 characters, or base pairs) "reads" against a very long (3 billion base pairs) reference genome. Since DNA molecules are composed of two opposing strands (i.e., two complementary strings), the number of required comparisons is doubled. Mapping this volume of different-size short reads therefore presents a difficult and important challenge in terms of execution time and scalability. Instead of calculating billions of local alignments of short versus long sequences using a quadratic-time algorithm, heuristics are applied to speed up the process. First, partial sequence matches, called "seeds", are quickly found using either the Burrows-Wheeler Transform (BWT) followed by the Ferragina-Manzini (FM) index, or a simple hash table. Next, the candidate locations are verified using a dynamic programming alignment algorithm that calculates the Levenshtein edit distance (mismatches, insertions, and deletions relative to the reference), which runs in quadratic time. Although these heuristics are substantially faster than local alignment, because of the repetitive nature of the human genome they often require hundreds of verification runs per read, imposing a heavy computational burden.
However, all of these billions of alignments are independent of each other, so the read mapping problem is embarrassingly parallel. In this thesis we propose novel algorithms that are optimized for multiple general-purpose graphics processing units (GPGPUs) to accelerate the read mapping procedure beyond the capabilities of algorithmic improvements that only use CPUs. We distribute the read mapping workload across the massively parallel architecture of GPGPUs to perform millions of alignments simultaneously, using a single GPGPU or many, together with multi-core CPUs. Our aim is to reduce the need for large-scale clusters or cloud platforms to a single server with advanced parallel processing units.
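The verification step described in this abstract relies on the Levenshtein edit distance. A minimal CPU sketch of that quadratic-time dynamic program follows; the function name `levenshtein` is an assumption, and this is an illustration of the standard recurrence rather than the thesis's GPGPU implementation.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b, counting mismatches, insertions, deletions.

    Standard O(len(a) * len(b)) dynamic program that keeps two rows.
    """
    # Distance from the empty prefix of a to each prefix of b.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # match (0) or mismatch (1)
            ))
        prev = curr
    return prev[-1]


print(levenshtein("ACGT", "AGT"))
```

Because each read-versus-candidate verification fills its own table, the hundreds of verification runs per read have no dependencies on one another, which is why the workload distributes naturally across GPU threads.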
Open Access
Realistic modeling of spectator behavior for soccer videogames with CUDA
(2011) Yılmaz, E.; Molla, E.; Yıldız, C.; İşler, V.
Soccer has always been one of the most popular videogame genres. When designing a soccer game, designers tend to focus on the game field and gameplay because of limited computational resources, and thus the modeling of virtual spectators receives less attention. In this study we present a novel approach to the modeling of spectator behavior, which treats each spectator as a unique individual. We also propose an independent software layer for sport-based games that obtains the game status from the game engine via a simple messaging protocol and computes the spectator behavior accordingly. The result is returned to the game engine, to be used in the animation and rendering of the spectators. Additionally, we offer a customizable spectator knowledge base in well-structured XML to minimize coding effort while generating individualized behavior. The employed AI is based on fuzzy inference. To meet the additional computational demand of realistic spectator behavior, we use GPU parallel computing with CUDA. © 2011 Elsevier Ltd. All rights reserved.
Open Access
Simulation of a flowing snow avalanche using molecular dynamics
(TÜBİTAK, 2014) Güçer, Denizhan; Özgüç, Halil Bülent
This paper presents an approach for the modeling and simulation of a flowing snow avalanche, which is formed of dry and liquefied snow that slides down a slope, using molecular dynamics and the discrete element method. A particle system is utilized as a base method for the simulation and marching cubes with real-time shaders are employed for rendering. A uniform grid-based neighbor search algorithm is used for collision detection for interparticle and particle-terrain interactions. A mass-spring model of the collision resolution is employed to mimic the compressibility of the snow and particle attraction forces are put into use between the particles and terrain surface. In order to achieve greater performance, general purpose GPU language and multithreaded programming are utilized for collision detection and resolution. The results are displayed with different combinations of rendering methods for the realistic representation of the flowing avalanche.
Open Access
Simulation of a flowing snow avalanche using molecular dynamics
(2010) Güçer, Denizhan
This thesis presents an approach for the modeling and simulation of a flowing snow avalanche, which is formed of dry and liquefied snow that slides down a slope, using molecular dynamics and the discrete element method. A particle system is utilized as a base method for the simulation and marching cubes with real-time shaders are employed for rendering. A uniform grid-based neighbor search algorithm is used for collision detection for inter-particle and particle-terrain interactions. A mass-spring model of collision resolution is employed to mimic the compressibility of snow, and particle attraction forces are put into use between particles and the terrain surface. In order to achieve greater performance, general purpose GPU language and multi-threaded programming are utilized for collision detection and resolution. The results are displayed with different combinations of rendering methods for the realistic representation of the flowing avalanche.