Adaptive prefetching for shared cache based chip multiprocessors

dc.citation.epage: 778
dc.citation.spage: 773
dc.contributor.author: Kandemir, M.
dc.contributor.author: Zhang, Y.
dc.contributor.author: Öztürk, Özcan
dc.coverage.spatial: Nice, France
dc.date.accessioned: 2016-02-08T12:28:11Z
dc.date.available: 2016-02-08T12:28:11Z
dc.date.issued: 2009-04
dc.department: Department of Computer Engineering
dc.description: Date of Conference: 20-24 April, 2009
dc.description: Conference name: DATE '09 Proceedings of the Conference on Design, Automation and Test in Europe
dc.description.abstract: Chip multiprocessors (CMPs) present a unique scenario for software data prefetching with subtle tradeoffs between memory bandwidth and performance. In a shared L2 based CMP, multiple cores compete for the shared on-chip cache space and limited off-chip pin bandwidth. Purely software based prefetching techniques tend to increase this contention, leading to degradation in performance. In some cases, prefetches can become harmful by kicking out useful data from the shared cache whose next usage is earlier than the prefetched data, and the fraction of such harmful prefetches usually increases when we increase the number of cores used for executing a multi-threaded application code. In this paper, we propose two complementary techniques to address the problem of harmful prefetches in the context of shared L2 based CMPs. These techniques, namely, suppressing select data prefetches (if they are found to be harmful) and pinning select data in the L2 cache (if they are found to be frequent victim of harmful prefetches), are evaluated in this paper using two embedded application codes. Our experiments demonstrate that these two techniques are very effective in mitigating the impact of harmful prefetches, and as a result, we extract significant benefits from software prefetching even with large core counts. © 2009 EDAA.
dc.description.provenance: Made available in DSpace on 2016-02-08T12:28:11Z (GMT). No. of bitstreams: 1 bilkent-research-paper.pdf: 70227 bytes, checksum: 26e812c6f5156f83f0e77b261a471b5a (MD5) Previous issue date: 2009
dc.identifier.doi: 10.1109/DATE.2009.5090768
dc.identifier.uri: http://hdl.handle.net/11693/28731
dc.language.iso: English
dc.publisher: IEEE
dc.relation.isversionof: https://doi.org/10.1109/DATE.2009.5090768
dc.source.title: Proceedings - Design, Automation and Test in Europe, DATE'09
dc.subject: Chip Multiprocessor
dc.subject: Embedded application
dc.subject: L2 Cache
dc.subject: Large core
dc.subject: Memory bandwidths
dc.subject: Multi-threaded application
dc.subject: Off-chip
dc.subject: On-chip cache
dc.subject: Prefetches
dc.subject: Prefetching
dc.subject: Prefetching techniques
dc.subject: Shared cache
dc.subject: Software data
dc.subject: Software-based
dc.subject: Computer software
dc.subject: Microprocessor chips
dc.subject: Systems analysis
dc.subject: Multiprocessing systems
dc.title: Adaptive prefetching for shared cache based chip multiprocessors
dc.type: Conference Paper
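
As an illustration of the abstract's definition, the following Python sketch classifies a prefetch as harmful when the block it evicts is reused before the prefetched block is first used, and then derives candidates for suppression and pinning from simple counts. It is not the paper's implementation: the access-trace representation, the function names (`next_use`, `is_harmful`, `decide`), and the `threshold` parameter are all assumptions made for illustration.

```python
from collections import Counter


def next_use(trace, start, block):
    """Return the index of the first access to `block` at or after `start`, or None."""
    for i in range(start, len(trace)):
        if trace[i] == block:
            return i
    return None


def is_harmful(trace, now, prefetched, victim):
    """A prefetch is harmful if the evicted block is needed before the prefetched one."""
    victim_use = next_use(trace, now, victim)
    prefetched_use = next_use(trace, now, prefetched)
    if victim_use is None:
        return False  # the victim is never reused, so evicting it costs nothing
    if prefetched_use is None:
        return True   # the victim is reused but the prefetched block never is
    return victim_use < prefetched_use


def decide(events, trace, threshold=2):
    """Pick blocks whose prefetches to suppress and blocks to pin.

    `events` is a list of (time, prefetched_block, victim_block) tuples.
    `threshold` is a hypothetical cutoff, not a value taken from the paper.
    """
    harmful_by_prefetch = Counter()
    harmful_by_victim = Counter()
    for now, prefetched, victim in events:
        if is_harmful(trace, now, prefetched, victim):
            harmful_by_prefetch[prefetched] += 1
            harmful_by_victim[victim] += 1
    suppress = {b for b, n in harmful_by_prefetch.items() if n >= threshold}
    pin = {b for b, n in harmful_by_victim.items() if n >= threshold}
    return suppress, pin


# Toy shared-cache access trace (block names are made up).
trace = ["A", "B", "A", "C", "B", "A"]
events = [(0, "C", "A"), (1, "C", "A")]
suppress, pin = decide(events, trace)
```

In this toy trace, both prefetches of block C evict block A shortly before A is accessed again, so C becomes a suppression candidate and A a pinning candidate. A real system would have to estimate next-use distances rather than read them from a complete trace.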

Files

Original bundle

Name: Adaptive prefetching for shared cache based chip multiprocessors.pdf
Size: 282.92 KB
Format: Adobe Portable Document Format
Description: Full printable version