      SiameseFuse: A computationally efficient and a not-so-deep network to fuse visible and infrared images

      Embargo Lift Date: 2024-04-22
      Author(s)
      Özer, S.
      Ege, Mert
      Özkanoglu, Mehmet Akif
      Date
      2022-04-22
      Source Title
      Pattern Recognition
      Print ISSN
      0031-3203
      Electronic ISSN
      1873-5142
      Publisher
      Elsevier BV
      Volume
      129
      Pages
      1-12 (article no. 108712)
      Language
      English
      Type
      Article
      Abstract
      Recent developments in pattern analysis have motivated many researchers to develop deep learning based solutions for various image processing applications. Fusing multi-modal images is one such application area, where the goal is to combine information coming from different modalities in a more visually meaningful and informative way. For that purpose, it is important to first extract salient features from each modality and then fuse them as efficiently and informatively as possible. Recent literature on fusing multi-modal images reports multiple deep solutions that combine visible (RGB) and infrared (IR) images. In this paper, we study the performance of various deep solutions available in the literature while seeking an answer to the question: “Do we really need deeper networks to fuse multi-modal images?” To answer that question, we introduce a novel architecture based on Siamese networks to fuse visible (RGB) images with IR images. We present an extensive analysis of increasing the number of layers in the architecture to see whether using deeper networks (or adding layers) significantly improves the performance of our proposed solution. We report state-of-the-art results on visually fusing given visible and IR image pairs in multiple performance metrics, while requiring the fewest trainable parameters. Our experimental results suggest that shallow networks (such as the ones proposed in this paper) can fuse visible and IR images as well as the deep networks previously proposed in the literature: we were able to reduce the total number of trainable parameters by up to 96.5% (2,625 trainable parameters compared to 74,193).
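      The parameter figures quoted in the abstract can be checked with a few lines of arithmetic. The sketch below reproduces the 96.5% reduction from the two counts given above, and then illustrates why a Siamese design keeps the count low: both input branches share one set of weights, so the encoder is counted once. The layer sizes in `hypothetical_encoder` and the helper names are assumptions made for illustration only; they are not taken from the paper.

```python
def conv2d_params(in_ch, out_ch, kernel, bias=True):
    """Trainable parameters in one 2-D convolution layer."""
    return in_ch * out_ch * kernel * kernel + (out_ch if bias else 0)


def reduction_percent(small, large):
    """Percentage by which `small` is lower than `large`."""
    return 100.0 * (1.0 - small / large)


# Figures quoted in the abstract.
PROPOSED_PARAMS = 2_625
BASELINE_PARAMS = 74_193
print(round(reduction_percent(PROPOSED_PARAMS, BASELINE_PARAMS), 1))

# Hypothetical two-layer encoder as (in_channels, out_channels, kernel_size).
# These sizes are illustrative assumptions, not the published architecture.
hypothetical_encoder = [(1, 8, 3), (8, 8, 3)]

# Siamese: visible and IR branches reuse the same weights, counted once.
shared = sum(conv2d_params(i, o, k) for i, o, k in hypothetical_encoder)
# Two independent branches would need a separate copy of the weights each.
unshared = 2 * shared
print(shared, unshared)
```

      Under these assumed layer sizes, weight sharing halves the encoder's parameter count relative to two independent branches; the actual layer configuration is described in the published version.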
      Keywords
      Multi-temporal fusion
      Efficient learning
      Multi-modal fusion
      Permalink
      http://hdl.handle.net/11693/111322
      Published Version (Please cite this version)
      https://doi.org/10.1016/j.patcog.2022.108712
      Collections
      • Department of Computer Engineering