Adaptive compute-phase prediction and thread prioritization to mitigate memory access latency
ACM International Conference Proceeding Series
1 - 4
Item Usage Stats
MetadataShow full item record
The full potential of chip multiprocessors remains unex- ploited due to the thread oblivious memory access sched- ulers used in off-chip main memory controllers. This is especially pronounced in embedded systems due to limita- Tions in memory. We propose an adaptive compute-phase prediction and thread prioritization algorithm for memory access scheduling for embedded chip multiprocessors. The proposed algorithm eficiently categorize threads based on execution characteristics and provides fine-grained priori- Tization that allows to differentiate threads and prioritize their memory access requests accordingly. The threads in compute phase are prioritized among the threads in mem- ory phase. Furthermore, the threads in compute phase are prioritized among themselves based on the potential of mak- ing more progress in their execution. Compared to the prior works First-Ready First-Come First-Serve (FR-FCFS) and Compute-phase Prediction with Writeback-Refresh Overlap (CP-WO), the proposed algorithm reduces the execution time of the generated workloads up to 23.6% and 12.9%, respectively. Copyright 2014 ACM.
Memory access latency
Memory access scheduling
Published Version (Please cite this version)http://dx.doi.org/10.1145/2613908.2613919
Showing items related by title, author, creator and subject.
Ozturk, O.; Kandemir, M.; Irwin, M. J. (Institute of Electrical and Electronics Engineers, 2009-06)The memory system presents one of the critical challenges in embedded system design and optimization. This is mainly due to the ever-increasing code complexity of embedded applications and the exponential increase seen in ...
Onsori, Salman; Asad, Arghavan; Raahemifar, K.; Fathy, M. (IEEE, 2015-11)In this article, we present a convex optimization model to design a stacked hybrid memory system for 3D embedded chip-multiprocessors (eCMP). Our convex model optimizes numbers and placement of SRAM and STT-RAM memories ...
Diouf, B.; Hantaş, C.; Cohen, A.; Özturk, Ö.; Palsberg, J. (Association for Computing Machinery, 2013)Compilers use software-controlled local memories to provide fast, predictable, and power-efficient access to critical data. We show that the local memory allocation for straight-line, or linearized programs is equivalent ...