Adaptive prefetching for shared cache based chip multiprocessors

Date

2009-04

Editor(s)

Advisor

Supervisor

Co-Advisor

Co-Supervisor

Instructor

Source Title

Proceedings -Design, Automation and Test in Europe, DATE'09

Print ISSN

Electronic ISSN

Publisher

IEEE

Volume

Issue

Pages

773 - 778

Language

English

Type

Conference Paper

Journal Title

Journal ISSN

Volume Title

Series

Abstract

Chip multiprocessors (CMPs) present a unique scenario for software data prefetching with subtle tradeoffs between memory bandwidth and performance. In a shared L2 based CMP, multiple cores compete for the shared on-chip cache space and limited off-chip pin bandwidth. Purely software based prefetching techniques tend to increase this contention, leading to degradation in performance. In some cases, prefetches can become harmful by kicking out useful data from the shared cache whose next usage is earlier than the prefetched data, and the fraction of such harmful prefetches usually increases when we increase the number of cores used for executing a multi-threaded application code. In this paper, we propose two complementary techniques to address the problem of harmful prefetches in the context of shared L2 based CMPs. These techniques, namely, suppressing select data prefetches (if they are found to be harmful) and pinning select data in the L2 cache (if they are found to be frequent victim of harmful prefetches), are evaluated in this paper using two embedded application codes. Our experiments demonstrate that these two techniques are very effective in mitigating the impact of harmful prefetches, and as a result, we extract significant benefits from software prefetching even with large core counts. © 2009 EDAA.

Course

Other identifiers

Book Title

Degree Discipline

Degree Level

Degree Name

Citation

Published Version (Please cite this version)