      Optimizing shared cache behavior of chip multiprocessors

      Author(s)
      Kandemir, M.
      Muralidhara, S. P.
      Narayanan, S. H. K.
      Zhang, Y.
      Öztürk, Özcan
      Date
      2009-12
      Source Title
      MICRO 42 Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture, 2009
      Publisher
      ACM
      Pages
      505 - 516
      Language
      English
      Type
      Conference Paper
      Abstract
      One of the critical problems associated with emerging chip multiprocessors (CMPs) is the management of on-chip shared cache space. Unfortunately, single-processor-centric data locality optimization schemes may not work well in the CMP case, as data accesses from multiple cores can create conflicts in the shared cache space. The main contribution of this paper is a compiler-directed code restructuring scheme for enhancing locality of shared data in CMPs. The proposed scheme targets the last-level shared cache that exists in many commercial CMPs and has two components: allocation, which determines the set of loop iterations assigned to each core, and scheduling, which determines the order in which the iterations assigned to a core are executed. Our scheme restructures the application code such that the different cores operate on shared data blocks at the same time, to the extent allowed by data dependencies. This helps to reduce reuse distances for the shared data and improves on-chip cache performance. We evaluated our approach using the Splash-2 and Parsec applications through both simulations and experiments on two commercial multi-core machines. Our experimental evaluation indicates that the proposed data locality optimization scheme reduces inter-core conflict misses in the shared cache by 67% on average when both allocation and scheduling are used. Also, the execution time improvements we achieve (29% on average) are very close to the optimal savings that could be achieved using a hypothetical scheme. Copyright 2009 ACM.
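
      The two components named in the abstract can be illustrated with a toy model. The sketch below is not the paper's algorithm: it assumes every loop iteration touches exactly one data block, that all iterations are independent (no data dependencies), and that cores advance in lockstep so the shared-cache trace is a round-robin interleaving. The helper names (`chunked`, `block_aligned`, `interleave`, `mean_reuse_distance`) are hypothetical and exist only for this illustration.

```python
from collections import defaultdict


def chunked(iterations, cores):
    """Baseline allocation: give each core one contiguous chunk, original order."""
    size = -(-len(iterations) // cores)  # ceiling division
    return [iterations[c * size:(c + 1) * size] for c in range(cores)]


def block_aligned(iterations, block_of, cores):
    """Toy allocation + scheduling (assumes independent iterations).

    Allocation: iterations that touch the same block are dealt out across cores.
    Scheduling: every core visits the blocks in the same (sorted) order, so
    cores tend to work on the same block at the same time.
    """
    by_block = defaultdict(list)
    for i in iterations:
        by_block[block_of[i]].append(i)
    per_core = [[] for _ in range(cores)]
    for block in sorted(by_block):
        for k, i in enumerate(by_block[block]):
            per_core[k % cores].append(i)
    return per_core


def interleave(per_core):
    """Round-robin merge of per-core schedules into one shared-cache trace."""
    trace = []
    for step in range(max(len(q) for q in per_core)):
        for q in per_core:
            if step < len(q):
                trace.append(q[step])
    return trace


def mean_reuse_distance(trace, block_of):
    """Mean number of distinct other blocks touched between two consecutive
    accesses to the same block (a simple reuse-distance measure)."""
    blocks = [block_of[i] for i in trace]
    last_pos, distances = {}, []
    for pos, b in enumerate(blocks):
        if b in last_pos:
            distances.append(len(set(blocks[last_pos[b] + 1:pos])))
        last_pos[b] = pos
    return sum(distances) / len(distances) if distances else 0.0


if __name__ == "__main__":
    # Hypothetical access pattern: iteration i touches block block_of[i].
    block_of = [0, 1, 2, 3, 3, 2, 1, 0]
    iterations = list(range(len(block_of)))
    for name, plan in (("chunked", chunked(iterations, 2)),
                       ("block-aligned", block_aligned(iterations, block_of, 2))):
        print(name, plan, mean_reuse_distance(interleave(plan), block_of))
```

      For this made-up access pattern on two cores, the mean reuse distance drops from 1.5 under chunked allocation to 0.0 under the block-aligned plan, because both cores touch each block in the same step. A real compiler pass would also have to respect data dependencies, which this sketch ignores.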
      Keywords
      Algorithm
      Experimentation
      Processors
      Design styles
      Memory structure
      Performance
      Programming language
      Computer software
      Design
      Experiments
      Linguistics
      Microprocessor chips
      Multiprocessing systems
      Optimization
      Program compilers
      Cache memory
      Compilers
      Permalink
      http://hdl.handle.net/11693/28639
      Published Version (Please cite this version)
      http://dx.doi.org/10.1145/1669112.1669176
      Collections
      • Department of Computer Engineering 1435