SiameseFuse: A computationally efficient and a not-so-deep network to fuse visible and infrared images

Özer, S.; Ege, Mert; Özkanoglu, Mehmet Akif

buir.contributor.author	Ege, Mert
buir.contributor.author	Özkanoglu, Mehmet Akif
buir.contributor.orcid	Ege, Mert|0000-0001-9060-290X
buir.contributor.orcid	Özkanoglu, Mehmet Akif|0000-0003-2581-9525
dc.citation.epage	12-108712	en_US
dc.citation.spage	1-108712	en_US
dc.citation.volumeNumber	129	en_US
dc.contributor.author	Özer, S.
dc.contributor.author	Ege, Mert
dc.contributor.author	Özkanoglu, Mehmet Akif
dc.date.accessioned	2023-02-15T10:17:36Z
dc.date.available	2023-02-15T10:17:36Z
dc.date.issued	2022-04-22
dc.department	Department of Computer Engineering	en_US
dc.description.abstract	Recent developments in pattern analysis have motivated many researchers to develop deep-learning-based solutions for various image processing applications. Fusing multi-modal images is one such application area, where the goal is to combine the different information coming from different modalities in a more visually meaningful and informative way. For that purpose, it is important to first extract salient features from each modality and then fuse them as efficiently and informatively as possible. Recent literature on fusing multi-modal images reports multiple deep solutions that combine visible (RGB) and infrared (IR) images. In this paper, we study the performance of various deep solutions available in the literature while seeking an answer to the question: “Do we really need deeper networks to fuse multi-modal images?” To answer that question, we introduce a novel architecture based on Siamese networks to fuse visible (RGB) images with infrared (IR) images. We present an extensive analysis of increasing the number of layers in the architecture to see whether using deeper networks (or adding layers) significantly improves the performance of our proposed solution. We report state-of-the-art results on visually fusing given visible and IR image pairs in multiple performance metrics, while requiring the fewest trainable parameters. Our experimental results suggest that shallow networks (such as the solutions proposed in this paper) can fuse visible and IR images as well as the deep networks previously proposed in the literature (we were able to reduce the total number of trainable parameters by up to 96.5%, comparing 2,625 trainable parameters to 74,193 trainable parameters).	en_US
dc.embargo.release	2024-04-22
dc.identifier.doi	10.1016/j.patcog.2022.108712	en_US
dc.identifier.eissn	1873-5142	en_US
dc.identifier.issn	0031-3203	en_US
dc.identifier.uri	http://hdl.handle.net/11693/111322	en_US
dc.language.iso	English	en_US
dc.publisher	Elsevier BV	en_US
dc.relation.isversionof	https://doi.org/10.1016/j.patcog.2022.108712	en_US
dc.source.title	Pattern Recognition	en_US
dc.subject	Multi-temporal fusion	en_US
dc.subject	Efficient learning	en_US
dc.subject	Multi-modal fusion	en_US
dc.title	SiameseFuse: A computationally efficient and a not-so-deep network to fuse visible and infrared images	en_US
dc.type	Article	en_US

Files

Original bundle

Name: SiameseFuse_A_computationally_efficient_and_a_not-so-deep_network_to_fuse_visible_and_infrared_images.pdf
Size: 1.99 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.69 KB
Format: Item-specific license agreed upon to submission

Collections

Scholarly Publications - Computer Engineering