Browsing by Keywords "Computer architecture"
-
Adaptive compute-phase prediction and thread prioritization to mitigate memory access latency
(ACM, 2014-06)The full potential of chip multiprocessors remains unexploited due to the thread-oblivious memory access schedulers used in off-chip main memory controllers. This is especially pronounced in embedded systems due to ... -
Adaptive routing framework for network on chip architectures
(ACM, 2016-01)In this paper we suggest and demonstrate the idea of applying multiple routing algorithms during the execution of a real application mapped on a Network-on-Chip (NoC). Traffic pattern of a real application may change during ... -
Application-specific heterogeneous network-on-chip design
(Oxford University Press, 2014)As a result of increasing communication demands, application-specific and scalable Network-on-Chips (NoCs) have emerged to connect processing cores and subsystems in Multiprocessor System-on-Chips. A challenge in ... -
Auto-tuning similarity search algorithms on multi-core architectures
(2013)In recent times, large high-dimensional datasets have become ubiquitous. Video and image repositories, financial, and sensor data are just a few examples of such datasets in practice. Many applications that use such datasets ... -
Big-data streaming applications scheduling based on staged multi-armed bandits
(Institute of Electrical and Electronics Engineers, 2016)Several techniques have been recently proposed to adapt Big-Data streaming applications to existing many core platforms. Among these techniques, online reinforcement learning methods have been proposed that learn how to ... -
Code scheduling for optimizing parallelism and data locality
(Springer, 2010-08-09)As chip multiprocessors proliferate, programming support for these devices is likely to receive a lot of attention in the near future. Parallelism and data locality are two critical issues in a chip multiprocessor environment. ... -
A data-level parallel linear-quadratic penalty algorithm for multicommodity network flows
(Association for Computing Machinery, 1994)We describe the development of a data-level, massively parallel software system for the solution of multicommodity network flow problems. Using a smooth linear-quadratic penalty (LQP) algorithm we transform the multicommodity ... -
Deploy-DDS: Tool framework for supporting deployment architecture of data distribution service based systems
(ACM, 2014-08)Data Distribution Service (DDS) is the Object Management Group's (OMG) new standard middleware after Common Object Request Broker Architecture (CORBA), which is becoming increasingly popular. One of the important problems ... -
An efficient computation model for coarse grained reconfigurable architectures and its applications to a reconfigurable computer
(IEEE, 2010-07)The mapping of high level applications onto the coarse grained reconfigurable architectures (CGRA) is usually performed manually by using graphical tools or when automatic compilation is used, some restrictions are imposed ... -
Efficient parallel spatial subdivision algorithm for object-based parallel ray tracing
(Pergamon Press, 1994)Parallel ray tracing of complex scenes on multicomputers requires the distribution of both computation and scene data to the processors. This is carried out during preprocessing and usually consumes too much time and memory. ... -
Efficient vectorization of forward/backward substitutions in solving sparse linear equations
(IEEE, 1994)Vector processors have promised an enormous increase in computing speed for computationally intensive and time-critical power system problems which require the repeated solution of sparse linear equations. Due to short ... -
Emerging accelerator platforms for data centers
(IEEE, 2017-12-04)CPU and GPU platforms may not be the best options for many emerging compute patterns, which led to a new breed of emerging accelerator platforms. This article gives a comprehensive overview with a focus on commercial platforms. -
Energy efficient architecture for graph analytics accelerators
(IEEE, 2016-06)Specialized hardware accelerators can significantly improve the performance and power efficiency of compute systems. In this paper, we focus on hardware accelerators for graph analytics applications and propose a configurable ... -
Energy reduction in 3D NoCs through communication optimization
(Springer Wien, 2015)Network-on-Chip (NoC) architectures and three-dimensional (3D) integrated circuits have been introduced as attractive options for overcoming the barriers in interconnect scaling while increasing the number of cores. Combining ... -
Exploiting locality in sparse matrix-matrix multiplication on many-core architectures
(IEEE Computer Society, 2017)Exploiting spatial and temporal localities is investigated for efficient row-by-row parallelization of general sparse matrix-matrix multiplication (SpGEMM) operation of the form C = AB on many-core architectures. Hypergraph ... -
Fundamentals of optical interconnections-a review
(IEEE, 1997-06)We review some of the relatively fundamental work in the area of optically interconnected digital computing systems. We cover comparisons of optical interconnections with other interconnection media in terms of energy and ... -
Implications of non-volatile memory as primary storage for database management systems
(IEEE, 2017)Traditional Database Management System (DBMS) software relies on hard disks for storing relational data. Hard disks are cheap, persistent, and offer huge storage capacities. However, data retrieval latency for hard disks ... -
Integrating platform selection rules in the model driven architecture approach
(Springer, Berlin, Heidelberg, 2005)A key issue in the MDA approach is the transformation of platform independent models to platform specific models. Before transforming to a platform specific model, however, it is necessary to select the appropriate platform. ... -
Locality-aware parallel sparse matrix-vector and matrix-transpose-vector multiplication on many-core processors
(Institute of Electrical and Electronics Engineers, 2016)Sparse matrix-vector and matrix-transpose-vector multiplication (SpMMTV) repeatedly performed as z ← A^T x and y ← A z (or y ← A w) for the same sparse matrix A is a kernel operation widely used in various iterative solvers. ... -