Exploiting architectural features of a computer vision platform towards reducing memory stalls

Mustafa, Naveed Ul; O’Riordan, M. J.; Rogers, S.; Öztürk, Özcan

buir.contributor.author	Mustafa, Naveed Ul
buir.contributor.author	O’Riordan, M. J.
buir.contributor.author	Öztürk, Özcan
dc.citation.epage	870	en_US
dc.citation.issueNumber	4	en_US
dc.citation.spage	853	en_US
dc.citation.volumeNumber	17	en_US
dc.contributor.author	Mustafa, Naveed Ul	en_US
dc.contributor.author	O’Riordan, M. J.	en_US
dc.contributor.author	Rogers, S.	en_US
dc.contributor.author	Öztürk, Özcan	en_US
dc.date.accessioned	2021-02-19T10:25:11Z
dc.date.available	2021-02-19T10:25:11Z
dc.date.issued	2020
dc.department	Department of Computer Engineering	en_US
dc.description.abstract	Computer vision applications are becoming increasingly popular in embedded systems such as drones, robots, tablets, and mobile devices. These applications are both compute and memory intensive, with memory-bound stalls (MBS) making up a significant part of their execution time. For maximum reduction in memory stalls, compilers need to consider the architectural details of a platform and utilize its hardware components efficiently. In this paper, we propose a compiler optimization for a vision-processing system that reduces MBS through classification of memory references. As the proposed optimization is based on the architectural features of a specific platform, i.e., Myriad 2, it can only be applied to other platforms with similar architectural features. The optimization consists of two steps: affinity analysis and affinity-aware instruction scheduling. We suggest two different approaches for affinity analysis, i.e., source code annotation and automated analysis. We use the LLVM compiler infrastructure to implement the proposed optimization. Applying the annotation-based approach to a memory-intensive program reduces stall cycles by 67.44%, leading to a 25.61% improvement in execution time. We use 11 different image-processing benchmarks to evaluate the automated analysis approach. Experimental results show that classification of memory references reduces stall cycles, on average, by 69.83%. As all benchmarks are both compute and memory intensive, we achieve an improvement in execution time of up to 30%, with a modest average of 5.79%.	en_US
dc.description.sponsorship	This work is supported by European Union’s Horizon2020 research and innovation programme under grant agreement number 687698 and Ph.D. scholarship from Higher Education Commission (HEC) of Pakistan awarded to Naveed Ul Mustafa.	en_US
dc.identifier.doi	10.1007/s11554-018-0830-8	en_US
dc.identifier.issn	1861-8200	en_US
dc.identifier.uri	http://hdl.handle.net/11693/75485	en_US
dc.language.iso	English	en_US
dc.publisher	Springer	en_US
dc.relation.isversionof	https://dx.doi.org/10.1007/s11554-018-0830-8	en_US
dc.source.title	Journal of Real-Time Image Processing	en_US
dc.subject	Computer vision	en_US
dc.subject	Compiler optimization	en_US
dc.subject	Execution time	en_US
dc.subject	Memory bound stalls	en_US
dc.title	Exploiting architectural features of a computer vision platform towards reducing memory stalls	en_US
dc.type	Article	en_US

Files

Original bundle

Name: Exploiting_architectural_features_of_a_computer_vision_platform_towards_reducing_memory_stalls.pdf
Size: 1.57 MB
Format: Adobe Portable Document Format

Collections

Scholarly Publications - Computer Engineering