High level synthesis based FPGA implementation of Matricized Tensor Times Khatri-Rao Product to accelerate canonical polyadic decomposition

Date

2019-10

Editor(s)

Advisor

Aykanat, Cevdet

Supervisor

Co-Advisor

Co-Supervisor

Instructor

Source Title

Print ISSN

Electronic ISSN

Publisher

Volume

Issue

Pages

Language

English

Journal Title

Journal ISSN

Volume Title

Series

Abstract

Tensor factorization has many applications such as network anomaly detection, structural damage detection and music genre classification. Most time consuming part of the CPD-ALS based tensor factorization is the Matricized Tensor Times Khatri-Rao Product (MTTKRP). In this thesis, the goal was to show that an FPGA implementation of the MTTKRP kernel can be comparable with the state of the art software implementations. To achieve this goal, a flat design consisting of a single loop is developed using Vivado HLS. In order to process the large tensors with the limited BRAM capacity of the FPGA board, a tiling methodology with optimized processing order is introduced. It has been shown that tiling has a negative impact on the general performance because of increasing DRAM access per subtensor. On the other hand, with the minimum tiling possible to process the tensors, the FPGA implementation achieves up to 3.40 speedup against the single threaded software.

Course

Other identifiers

Book Title

Degree Discipline

Computer Engineering

Degree Level

Master's

Degree Name

MS (Master of Science)

Citation

Published Version (Please cite this version)