General reuse-centric CNN accelerator
Abstract
Reuse-centric CNN acceleration speeds up CNN inference by reusing computations for similar neuron vectors in a CNN’s input layer or activation maps. This new optimization paradigm is, however, largely limited by the overhead of neuron vector similarity detection, an important step in reuse-centric CNN. This thesis presents the first in-depth exploration of architectural support for reuse-centric CNN. It proposes a hardware accelerator that improves neuron vector similarity detection and reduces the energy consumption of reuse-centric CNN inference. The accelerator is implemented to support a wide variety of network settings with a banked memory subsystem. Design exploration is performed through RTL simulation and synthesis on an FPGA platform. When integrated into Eyeriss, the accelerator can potentially improve performance by up to 7.75X. Furthermore, it can make similarity detection up to 95.46% more energy-efficient, and it can accelerate the convolutional layer by up to 3.63X compared to the software-based implementation running on the CPU.
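
To make the idea of computation reuse concrete, the following is a minimal software sketch, not the accelerator design described in the thesis. It assumes that similarity detection is done with sign-based random-projection hashing, which the abstract does not specify. The names `reuse_conv`, `signature`, `num_bits`, and `seed` are hypothetical and chosen only for illustration. Neuron vectors that fall into the same hash bucket are treated as similar, the dot products with the filters are computed once per bucket on the bucket centroid, and the result is copied to every member.

```python
import random
from collections import defaultdict


def dot(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def signature(vec, hyperplanes):
    """Hash a neuron vector to a tuple of sign bits (assumed similarity test)."""
    return tuple(dot(vec, h) >= 0.0 for h in hyperplanes)


def reuse_conv(neuron_vectors, filters, num_bits=8, seed=0):
    """Compute filter responses once per cluster of similar neuron vectors.

    Returns (outputs, num_clusters). outputs[i] holds one value per filter
    for neuron vector i; num_clusters is the number of distinct buckets,
    i.e. how many times the filter dot products were actually evaluated.
    """
    rng = random.Random(seed)
    dim = len(neuron_vectors[0])
    # Random hyperplanes used only for hashing (illustrative assumption).
    hyperplanes = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                   for _ in range(num_bits)]

    # Step 1: similarity detection, grouping vector indices by signature.
    clusters = defaultdict(list)
    for idx, vec in enumerate(neuron_vectors):
        clusters[signature(vec, hyperplanes)].append(idx)

    # Step 2: compute on each cluster centroid and reuse for all members.
    outputs = [None] * len(neuron_vectors)
    for members in clusters.values():
        centroid = [sum(neuron_vectors[i][d] for i in members) / len(members)
                    for d in range(dim)]
        result = [dot(centroid, f) for f in filters]
        for i in members:
            outputs[i] = list(result)
    return outputs, len(clusters)
```

In this sketch, identical neuron vectors always share a bucket, so their outputs are exact; vectors that are merely close share the centroid's result, which trades accuracy for fewer dot products. The hashing loop in step 1 corresponds to the similarity detection overhead that the abstract identifies as the limiting factor.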