Show simple item record

dc.contributor.author: Hong, S. [en_US]
dc.contributor.author: Narayanan, S. H. K. [en_US]
dc.contributor.author: Kandemir, M. [en_US]
dc.contributor.author: Özturk, Özcan [en_US]
dc.coverage.spatial: Nice, France
dc.date.accessioned: 2016-02-08T12:28:08Z
dc.date.available: 2016-02-08T12:28:08Z
dc.date.issued: 2009-04 [en_US]
dc.identifier.uri: http://hdl.handle.net/11693/28728
dc.description: Date of Conference: 20-24 April, 2009
dc.description: Conference name: DATE '09 Proceedings of the Conference on Design, Automation and Test in Europe
dc.description.abstract: With the increasing scaling of manufacturing technology, process variation is a phenomenon that has become more prevalent. As a result, in the context of Chip Multiprocessors (CMPs) for example, it is possible that identically-designed processor cores on the chip have non-identical peak frequencies and power consumptions. To cope with such a design, each processor can be assumed to run at the frequency of the slowest processor, resulting in wasted computational capability. This paper considers an alternate approach and proposes an algorithm that intelligently maps (and remaps) computations onto available processors so that each processor runs at its peak frequency. In other words, by dynamically changing the thread-to-processor mapping at runtime, our approach allows each processor to maximize its performance, rather than simply using chip-wide lowest frequency amongst all cores and highest cache latency. Experimental evidence shows that, as compared to a process variation agnostic thread mapping strategy, our proposed scheme achieves as much as 29% improvement in overall execution latency, average improvement being 13% over the benchmarks tested. We also demonstrate in this paper that our savings are consistent across different processor counts, latency maps, and latency distributions. © 2009 EDAA. [en_US]
dc.language.iso: English [en_US]
dc.source.title: DATE '09 Proceedings of the Conference on Design, Automation and Test in Europe [en_US]
dc.relation.isversionof: https://doi.org/10.1109/DATE.2009.5090776
dc.subject: Cache latency [en_US]
dc.subject: Chip Multiprocessor [en_US]
dc.subject: Computational capability [en_US]
dc.subject: Experimental evidence [en_US]
dc.subject: Manufacturing technologies [en_US]
dc.subject: Mapping strategy [en_US]
dc.subject: Overall execution [en_US]
dc.subject: Peak frequencies [en_US]
dc.subject: Process variation [en_US]
dc.subject: Processor cores [en_US]
dc.subject: Runtimes [en_US]
dc.subject: Design [en_US]
dc.subject: Electric power utilization [en_US]
dc.subject: Microprocessor chips [en_US]
dc.subject: Multiprocessing systems [en_US]
dc.subject: Systems analysis [en_US]
dc.subject: Mapping [en_US]
dc.title: Process variation aware thread mapping for chip multiprocessors [en_US]
dc.type: Conference Paper [en_US]
dc.department: Department of Computer Engineering
dc.citation.spage: 821 [en_US]
dc.citation.epage: 826 [en_US]
dc.identifier.doi: 10.1109/DATE.2009.5090776
dc.publisher: IEEE

