Adaptive compute-phase prediction and thread prioritization to mitigate memory access latency
Author
Aktürk, İsmail
Öztürk, Özcan
Date
2014-06
Source Title
ACM International Conference Proceeding Series
Publisher
ACM
Pages
1 - 4
Language
English
Type
Conference Paper
Item Usage Stats
161
views
104
downloads
Abstract
The full potential of chip multiprocessors remains unexploited due to the thread-oblivious memory access schedulers used in off-chip main memory controllers. This is especially pronounced in embedded systems due to limitations in memory. We propose an adaptive compute-phase prediction and thread prioritization algorithm for memory access scheduling in embedded chip multiprocessors. The proposed algorithm efficiently categorizes threads based on execution characteristics and provides fine-grained prioritization that makes it possible to differentiate threads and prioritize their memory access requests accordingly. Threads in the compute phase are prioritized over threads in the memory phase. Furthermore, threads in the compute phase are prioritized among themselves based on their potential to make more progress in execution. Compared to the prior works First-Ready First-Come First-Serve (FR-FCFS) and Compute-phase Prediction with Writeback-Refresh Overlap (CP-WO), the proposed algorithm reduces the execution time of the generated workloads by up to 23.6% and 12.9%, respectively. Copyright 2014 ACM.
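The two-level ordering described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how phases are predicted or how progress potential is measured, so the `in_compute_phase` flag, the `progress_potential` score, and the oldest-first tie-break are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass
class Thread:
    tid: int
    in_compute_phase: bool     # assumed output of some phase predictor
    progress_potential: float  # hypothetical score; higher = more expected progress


@dataclass
class Request:
    tid: int      # thread that issued the memory request
    arrival: int  # arrival order; smaller means older


def priority_key(req, threads):
    """Sort key: compute-phase threads first, then higher potential, then older request."""
    t = threads[req.tid]
    phase_rank = 0 if t.in_compute_phase else 1
    # Potential only differentiates compute-phase threads (assumption).
    potential_rank = -t.progress_potential if t.in_compute_phase else 0.0
    return (phase_rank, potential_rank, req.arrival)


def schedule(requests, threads):
    """Return pending requests in the order this sketch would serve them."""
    return sorted(requests, key=lambda r: priority_key(r, threads))


threads = {
    0: Thread(0, in_compute_phase=False, progress_potential=0.0),
    1: Thread(1, in_compute_phase=True, progress_potential=0.2),
    2: Thread(2, in_compute_phase=True, progress_potential=0.9),
}
requests = [Request(0, 0), Request(1, 1), Request(2, 2), Request(1, 3)]
order = [(r.tid, r.arrival) for r in schedule(requests, threads)]
```

In this toy input, the requests of thread 2 (compute phase, highest assumed potential) are served first, followed by those of thread 1, with the memory-phase thread 0 last even though its request is the oldest.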
Keywords
Computer architecture
Embedded systems
Scheduling
Chip multiprocessor
Embedded chips
Main memory
Memory access
Memory access latency
Memory access scheduling
Off-chip
Prioritization
Forecasting
Permalink
http://hdl.handle.net/11693/27152
Published Version (Please cite this version)
http://dx.doi.org/10.1145/2613908.2613919
Collections
Related items
Showing items related by title, author, creator and subject.
- Using data compression for increasing memory system utilization
Ozturk, O.; Kandemir, M.; Irwin, M. J. (Institute of Electrical and Electronics Engineers, 2009-06)
The memory system presents one of the critical challenges in embedded system design and optimization. This is mainly due to the ever-increasing code complexity of embedded applications and the exponential increase seen in ...
- A high-performance hybrid memory architecture for embedded CMPs using a convex optimization model
Onsori, Salman; Asad, Arghavan; Raahemifar, K.; Fathy, M. (IEEE, 2015-11)
In this article, we present a convex optimization model to design a stacked hybrid memory system for 3D embedded chip-multiprocessors (eCMP). Our convex model optimizes numbers and placement of SRAM and STT-RAM memories ...
- A decoupled local memory allocator
Diouf, B.; Hantaş, C.; Cohen, A.; Özturk, Ö.; Palsberg, J. (Association for Computing Machinery, 2013)
Compilers use software-controlled local memories to provide fast, predictable, and power-efficient access to critical data. We show that the local memory allocation for straight-line, or linearized programs is equivalent ...