Browsing by Subject "Hardware"

Restricted
20. yüzyıl sonlarında Türkiye'de yazılım sektörü ve Akınsoft [The software industry in Turkey in the late 20th century and Akınsoft]
(Bilkent University, 2019) Sakınoğlu, Bedirhan; Öztürk, Bulut; Okkalı, Deniz; Tereci, Mert; Şatır, Muhammed Maruf; Mehder, Serkan
The software industry, which began to spread in the early 1960s, drew the attention of developed and developing countries by enabling other sectors to operate more efficiently and by creating a new market in the electronic environment. By offering high profit margins, reducing production costs in other sectors, and enabling more efficient use of resources, it became a cornerstone of technological development. Although Turkey was slow to enter the software industry, from the 1990s onward it began to benefit from the sector's various advantages and to take part in international competition. Because this growth was built on the capital and incentives of foreign software firms, it did not take place on a local scale; nevertheless, local firms such as Akınsoft, with a local vision and long-term strategic planning, managed to secure an important position in the sector. Akınsoft is a local software company founded on 12 April 1995 by Dr. Özgün Akın with personal capital. On 4 December 1996, under Dr. Özgün Akın, it withdrew from the hardware sector and directed its goals entirely toward software. From that date on, Akınsoft has offered the software it produces with its own resources for sale in both domestic and foreign markets, contributing locally generated added value to the national economy. While Akınsoft has generally produced software aimed at making business, education, and social life easier, it began R&D work in robotics in 2009 and founded Akınrobotics in 2015. In producing its robots, Akınrobotics has adopted a perspective that takes public needs into account and has succeeded in realizing Akınsoft's vision of advanced technology.
Open Access
Architectural requirements for energy efficient execution of graph analytics applications
(IEEE, 2015-11) Özdal, Muhammet Mustafa; Yeşil, Şerif; Kim, T.; Ayupov, A.; Burns, S.; Öztürk, Özcan
Intelligent data analysis has become more important in the last decade especially because of the significant increase in the size and availability of data. In this paper, we focus on the common execution models and characteristics of iterative graph analytics applications. We show that the features that improve work efficiency can lead to significant overheads on existing systems. We identify the opportunities for custom hardware implementation, and outline the desired architectural features for energy efficient computation of graph analytics applications. © 2015 IEEE.
Open Access
Emerging accelerator platforms for data centers
(IEEE, 2017-12-04) Özdal, Muhammet Mustafa
CPU and GPU platforms may not be the best options for many emerging compute patterns, which has led to a new breed of accelerator platforms. This article gives a comprehensive overview with a focus on commercial platforms.
Open Access
Energy efficient architecture for graph analytics accelerators
(IEEE, 2016-06) Özdal, Muhammet Mustafa; Yeşil, Şerif; Kim, T.; Ayupov, A.; Greth, J.; Burns, S.; Öztürk, Özcan
Specialized hardware accelerators can significantly improve the performance and power efficiency of compute systems. In this paper, we focus on hardware accelerators for graph analytics applications and propose a configurable architecture template that is specifically optimized for iterative vertex-centric graph applications with irregular access patterns and asymmetric convergence. The proposed architecture addresses the limitations of the existing multi-core CPU and GPU architectures for these types of applications. The SystemC-based template we provide can be customized easily for different vertex-centric applications by inserting application-level data structures and functions. After that, a cycle-accurate simulator and RTL can be generated to model the target hardware accelerators. In our experiments, we study several graph-parallel applications, and show that the hardware accelerators generated by our template can outperform a 24-core high-end server CPU system by up to 3x in terms of performance. We also estimate the area requirement and power consumption of these hardware accelerators through physical-aware logic synthesis, and show up to 65x better power consumption with significantly smaller area. © 2016 IEEE.
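The idea of a reusable template that is customized by "inserting application-level data structures and functions" can be illustrated in software. The following is only a minimal Python sketch of a generic vertex-centric loop with a worklist, not the authors' SystemC template; the names `run_vertex_centric`, `propagate`, `combine`, and `shortest_paths` are hypothetical, and single-source shortest paths is chosen here purely as an example application.

```python
import math
from collections import deque


def run_vertex_centric(graph, initial, propagate, combine, active):
    """Generic iterative vertex-centric loop (illustrative only).

    graph:     dict mapping each vertex to a list of (neighbor, weight);
               every neighbor must also be a key of the dict.
    initial:   function vertex -> starting value        (application-specific)
    propagate: function (value, weight) -> candidate    (application-specific)
    combine:   function (old, candidate) -> new value   (application-specific)
    active:    iterable of vertices that start out active
    """
    values = {v: initial(v) for v in graph}
    worklist = deque(active)
    queued = set(worklist)
    while worklist:
        u = worklist.popleft()
        queued.discard(u)
        for v, weight in graph[u]:
            new_value = combine(values[v], propagate(values[u], weight))
            # Only vertices whose value changed are reactivated, so vertices
            # that have already converged do no further work.
            if new_value != values[v]:
                values[v] = new_value
                if v not in queued:
                    worklist.append(v)
                    queued.add(v)
    return values


def shortest_paths(graph, source):
    """Single-source shortest paths expressed through the generic loop."""
    return run_vertex_centric(
        graph,
        initial=lambda v: 0 if v == source else math.inf,
        propagate=lambda dist, weight: dist + weight,
        combine=min,
        active=[source],
    )


graph = {"a": [("b", 1), ("c", 4)], "b": [("c", 2)], "c": []}
print(shortest_paths(graph, "a"))  # {'a': 0, 'b': 1, 'c': 3}
```

The worklist is one simple way to model asymmetric convergence in software: each vertex stops being processed as soon as its value stabilizes, independently of the others.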
Open Access
An FPGA implementation architecture for decoding of polar codes
(IEEE, 2011) Pamuk, Alptekin
Polar codes are a class of codes versatile enough to achieve the Shannon bound in a large array of source and channel coding problems. For that reason, it is important to have efficient implementation architectures for polar codes in hardware. Motivated by this fact, we propose a belief propagation (BP) decoder architecture for an increasingly popular hardware platform: the Field Programmable Gate Array (FPGA). The proposed architecture supports any code rate and is quite flexible in terms of hardware complexity and throughput. The architecture can also be extended to support multiple block lengths without significantly increasing the hardware complexity. Moreover, various schedulers can be adapted into the proposed architecture so that list decoding techniques can be used with a single block. Finally, the proposed architecture is compared with a convolutional turbo code (CTC) decoder for WiMAX taken from a Xilinx Product Specification, and polar codes are seen to be superior to CTC codes in both hardware complexity and throughput. © 2011 IEEE.
Open Access
GenASM: a high-performance, low-power approximate string matching acceleration framework for genome sequence analysis
(IEEE Computer Society, 2020) Şenol-Çalı, D.; Kalsi, G. S.; Bingöl, Zülal; Fırtına, C.; Subramanian, L.; Kim, J. S.; Ausavarungnirun, R.; Alser, M.; Gomez-Luna, J.; Boroumand, A.; Nori, A.; Scibisz, A.; Subramoney, S.; Alkan, Can; Ghose, S.; Mutlu, Onur
Genome sequence analysis has enabled significant advancements in medical and scientific areas such as personalized medicine, outbreak tracing, and the understanding of evolution. To perform genome sequencing, devices extract small random fragments of an organism's DNA sequence (known as reads). The first step of genome sequence analysis is a computational process known as read mapping. In read mapping, each fragment is matched to its potential location in the reference genome with the goal of identifying the original location of each read in the genome. Unfortunately, rapid genome sequencing is currently bottlenecked by the computational power and memory bandwidth limitations of existing systems, as many of the steps in genome sequence analysis must process a large amount of data. A major contributor to this bottleneck is approximate string matching (ASM), which is used at multiple points during the mapping process. ASM enables read mapping to account for sequencing errors and genetic variations in the reads. We propose GenASM, the first ASM acceleration framework for genome sequence analysis. GenASM performs bitvector-based ASM, which can efficiently accelerate multiple steps of genome sequence analysis. We modify the underlying ASM algorithm (Bitap) to significantly increase its parallelism and reduce its memory footprint. Using this modified algorithm, we design the first hardware accelerator for Bitap. Our hardware accelerator consists of specialized systolic-array-based compute units and on-chip SRAMs that are designed to match the rate of computation with memory capacity and bandwidth, resulting in an efficient design whose performance scales linearly as we increase the number of compute units working in parallel. We demonstrate that GenASM provides significant performance and power benefits for three different use cases in genome sequence analysis. First, GenASM accelerates read alignment for both long reads and short reads. For long reads, GenASM outperforms state-of-the-art software and hardware accelerators by 116× and 3.9×, respectively, while reducing power consumption by 37× and 2.7×. For short reads, GenASM outperforms state-of-the-art software and hardware accelerators by 111× and 1.9×. Second, GenASM accelerates pre-alignment filtering for short reads, with 3.7× the performance of a state-of-the-art pre-alignment filter, while reducing power consumption by 1.7× and significantly improving the filtering accuracy. Third, GenASM accelerates edit distance calculation, with 22-12501× and 9.3-400× speedups over the state-of-the-art software library and FPGA-based accelerator, respectively, while reducing power consumption by 548-582× and 67×. We conclude that GenASM is a flexible, high-performance, and low-power framework, and we briefly discuss four other use cases that can benefit from GenASM.
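For readers unfamiliar with the bitvector-based approach, the following minimal Python sketch shows the classic software form of Bitap for approximate matching with up to k edits (in the style of Wu and Manber). It is the baseline idea only, not GenASM's modified algorithm or its hardware design; the function name `bitap_search` and its return format are assumptions made for illustration.

```python
def bitap_search(text, pattern, k=0):
    """Return (end_index, errors) for every position in text where the
    pattern matches a substring ending there with at most k edits
    (substitutions, insertions, deletions). Illustrative sketch only."""
    m = len(pattern)
    if m == 0:
        raise ValueError("pattern must be non-empty")
    full = (1 << m) - 1          # keeps bitvectors m bits wide
    accept = 1 << (m - 1)        # bit set when the whole pattern has matched

    # masks[ch] has bit i set when pattern[i] == ch
    masks = {}
    for i, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, 0) | (1 << i)

    # state[d], bit i: pattern[:i+1] matches a suffix of the text read so
    # far with at most d errors. Before any text, d deletions cover d chars.
    state = [((1 << d) - 1) & full for d in range(k + 1)]

    hits = []
    for end, ch in enumerate(text):
        mask = masks.get(ch, 0)
        prev_old = state[0]                       # state[d-1] before update
        state[0] = ((state[0] << 1) | 1) & mask   # exact-match transition
        for d in range(1, k + 1):
            cur_old = state[d]
            match = ((cur_old << 1) | 1) & mask
            substitute = (prev_old << 1) | 1      # replace a pattern char
            insert = prev_old                     # extra char in the text
            delete = (state[d - 1] << 1) | 1      # skip a pattern char
            state[d] = (match | substitute | insert | delete) & full
            prev_old = cur_old
        for d in range(k + 1):                    # report the fewest errors
            if state[d] & accept:
                hits.append((end, d))
                break
    return hits


print(bitap_search("abcdef", "cde"))        # [(4, 0)]
print(bitap_search("abcdef", "cxe", k=1))   # [(4, 1)]
```

Each text character costs only a handful of shifts, ANDs, and ORs per error level, which is the property that makes the approach attractive for hardware acceleration.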
Open Access
Hardware accelerator design for data centers
(IEEE, 2016-11) Yeşil, Şerif; Özdal, Muhammet Mustafa; Kim, T.; Ayupov, A.; Burns, S.; Öztürk, Özcan
As the size of available data is increasing, it is becoming inefficient to scale the computational power of traditional systems. To overcome this problem, customized application-specific accelerators are becoming integral parts of modern system on chip (SOC) architectures. In this paper, we summarize existing hardware accelerators for data centers and discuss the techniques to implement and embed them along with the existing SOCs. © 2015 IEEE.
Open Access
The state of the art in mobile graphics research
(Institute of Electrical and Electronics Engineers, 2008) Capin, T.; Pulli, K.; Akenine-Möller, T.
High-quality computer graphics let mobile-device users access more compelling content. Still, the devices' limitations and requirements differ substantially from those of a PC. This survey of mobile graphics research describes current solutions in terms of specialized hardware (including 3D displays), rendering and transmission, visualization, and user interfaces. © 2008 IEEE.
Open Access
A template-based design methodology for graph-parallel hardware accelerators
(IEEE, 2017-05) Ayupov, A.; Yeşil, Şerif; Özdal, Muhammet Mustafa; Kim, T.; Burns, S.; Öztürk, Özcan
Graph applications have been gaining importance in the last decade due to emerging big data analytics problems such as Web graphs, social networks, and biological networks. For these applications, traditional CPU and GPU architectures suffer in terms of performance and power consumption due to irregular communications, random memory accesses, and load balancing problems. It has been shown that specialized hardware accelerators can achieve much better power and energy efficiency compared to the general purpose CPUs and GPUs. In this paper, we present a template-based methodology specifically targeted for hardware accelerator design of big-data graph applications. Important architectural features that are key for energy efficient execution are implemented in a common template. The proposed template-based methodology is used to design hardware accelerators for different graph applications with little effort. Compared to an application-specific high-level synthesis methodology, we show that the proposed methodology can generate hardware accelerators with up to 18× better energy efficiency and requires less design effort.
Open Access
A unified graphics rendering pipeline for autostereoscopic rendering
(IEEE, 2007-05) Kalaiah, A.; Çapin, Tolga K.
Autostereoscopic displays require rendering a scene from multiple viewpoints. The architecture of current-generation graphics processors is still grounded in the historic evolution of monoscopic rendering. In this paper, we present a novel programmable rendering pipeline that renders to multiple viewpoints in a single pass. Our approach leverages the computational and memory fetch coherence of rendering to multiple viewpoints to achieve significant speedup. We present an emulation of the principles of our pipeline using current-generation GPUs and present a quantitative estimate of the benefits of our approach. We make a case for the new rendering pipeline by demonstrating its benefits for a range of applications such as autostereoscopic rendering and shadow map computation for a scene with multiple light sources. © 2007 IEEE.