Compiler directed network-on-chip reliability enhancement for chip multiprocessors

Date
2010-04
Authors
Ozturk, O.
Kandemir, M.
Irwin, M. J.
Narayanan, S. H. K.
Source Title
ACM SIGPLAN Notices
Print ISSN
1523-2867
Publisher
Association for Computing Machinery
Volume
45
Issue
4
Pages
85 - 94
Language
English
Type
Article
Abstract

Chip multiprocessors (CMPs) are expected to be the building blocks for future computer systems. While architecting these emerging CMPs is a challenging problem on its own, programming them is even more challenging. As the number of cores accommodated in CMPs increases, network-on-chip (NoC) type communication fabrics are expected to replace traditional point-to-point buses. Most prior software-related work targeting CMPs focuses on performance and power. However, as technology scales, the components of a CMP are increasingly exposed to both transient and permanent hardware failures. This paper presents and evaluates a compiler-directed, power- and performance-aware reliability enhancement scheme for NoC-based CMPs. The proposed scheme improves on-chip communication reliability by duplicating messages traveling across CMP nodes such that, for each original message, its duplicate uses a different set of communication links as much as possible (to satisfy the performance constraint). In addition, our approach tries to reuse communication links across the different phases of the program to maximize link shutdown opportunities for the NoC (to satisfy the power constraint). Our results show that the proposed approach is very effective in improving on-chip network reliability without causing excessive power or performance degradation. In our experiments, we also evaluate the performance-oriented and energy-oriented versions of our compiler-directed reliability enhancement scheme and compare them to two purely hardware-based fault-tolerant routing schemes. © 2010 ACM.
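
The idea of sending a duplicate over different links can be illustrated with a minimal sketch. It assumes a 2D mesh NoC and uses dimension-ordered XY routing for the original message and YX routing for the duplicate; both the topology and the routing choice are assumptions made here for illustration and are not taken from the paper, whose compiler analysis is not described in this record. The function names are hypothetical.

```python
def xy_route(src, dst):
    """Directed links of a route that moves along X first, then Y."""
    (x, y), (dx, dy) = src, dst
    links = []
    while x != dx:
        nx = x + (1 if dx > x else -1)
        links.append(((x, y), (nx, y)))
        x = nx
    while y != dy:
        ny = y + (1 if dy > y else -1)
        links.append(((x, y), (x, ny)))
        y = ny
    return links


def yx_route(src, dst):
    """Directed links of a route that moves along Y first, then X."""
    (x, y), (dx, dy) = src, dst
    links = []
    while y != dy:
        ny = y + (1 if dy > y else -1)
        links.append(((x, y), (x, ny)))
        y = ny
    while x != dx:
        nx = x + (1 if dx > x else -1)
        links.append(((x, y), (nx, y)))
        x = nx
    return links


def shared_links(route_a, route_b):
    """Links used by both routes; an empty set means link-disjoint."""
    return set(route_a) & set(route_b)


original = xy_route((0, 0), (2, 1))
duplicate = yx_route((0, 0), (2, 1))
print(shared_links(original, duplicate))
```

When source and destination differ in both dimensions, the two routes share no link, so a single link failure cannot corrupt both copies. When they lie in the same row or column, the two routes coincide, which is one reason a scheme can only separate the copies "as much as possible" without taking longer detours.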

Keywords
Chip multiprocessors, Compiler, NoC, Reliability, Building blocks, Experimentation, Management, Design, Performance