General reuse-centric CNN accelerator
Embargo Lift Date: 2021-08-09
Reuse-centric CNN acceleration speeds up CNN inference by reusing computations for similar neuron vectors in a CNN's input layer or activation maps. This new paradigm of optimization is, however, largely limited by the overhead of neuron vector similarity detection, an important step in reuse-centric CNN. This thesis presents the first in-depth exploration of architectural support for reuse-centric CNN. It proposes a hardware accelerator that improves neuron vector similarity detection and reduces the energy consumption of reuse-centric CNN inference. The accelerator uses a banked memory subsystem and is implemented to support a wide variety of network settings. Design exploration is performed through RTL simulation and synthesis on an FPGA platform. When integrated into Eyeriss, the accelerator can potentially provide performance improvements of up to 7.75X. Furthermore, compared to a software-based implementation running on the CPU, it can make similarity detection up to 95.46% more energy-efficient and accelerate the convolutional layer by up to 3.63X.
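The reuse idea described above can be illustrated with a minimal software sketch. The abstract does not say which similarity detection scheme the thesis uses, so this example assumes a sign-of-random-projection hash as a stand-in; the function names (`matvec`, `signature`, `reuse_matmul`) and the `num_bits` and `seed` parameters are hypothetical and chosen only for illustration. Neuron vectors that hash to the same signature are treated as similar, their centroid is multiplied by the weights once, and that result is reused for every vector in the group.

```python
import random


def matvec(vec, weights):
    """Multiply a length-K vector by a K x M weight matrix (list of rows)."""
    cols = len(weights[0])
    return [sum(vec[k] * weights[k][m] for k in range(len(vec)))
            for m in range(cols)]


def signature(vec, planes):
    """Assumed similarity hash: one sign bit per random hyperplane."""
    return tuple(sum(v * p for v, p in zip(vec, plane)) >= 0.0
                 for plane in planes)


def reuse_matmul(rows, weights, num_bits=8, seed=0):
    """Approximate rows x weights, computing one product per group of similar rows.

    Returns (outputs, number_of_groups). Outputs are exact only when all
    rows in a group are identical; otherwise the centroid's result is used.
    """
    rng = random.Random(seed)
    dim = len(rows[0])
    planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
              for _ in range(num_bits)]

    # Similarity detection: bucket row indices by hash signature.
    buckets = {}
    for i, row in enumerate(rows):
        buckets.setdefault(signature(row, planes), []).append(i)

    # Computation reuse: one matrix-vector product per bucket.
    out = [None] * len(rows)
    for members in buckets.values():
        centroid = [sum(rows[i][k] for i in members) / len(members)
                    for k in range(dim)]
        result = matvec(centroid, weights)
        for i in members:
            out[i] = list(result)
    return out, len(buckets)
```

In this sketch the number of matrix-vector products equals the number of buckets rather than the number of neuron vectors, which is where the savings come from. The bucketing loop is the similarity detection step whose overhead the abstract identifies as the limiting factor.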