Browsing by Subject "Fine-grained object recognition"
Now showing 1 - 2 of 2
Item Open Access
Weakly supervised approaches for image classification in remote sensing and medical image analysis (Bilkent University, 2020-12) Aygüneş, Bulut

Weakly supervised learning (WSL) aims to utilize data with imprecise or noisy annotations to solve various learning problems. We study WSL approaches in two domains: remote sensing and medical image analysis. For remote sensing, we focus on the multisource fine-grained object recognition problem, which aims to classify an object into one of many similar subcategories. The task involves images where an object with a given class label is known to be present, but without any knowledge of its exact location. We approach this problem from a WSL perspective and propose a method using a single-source deep instance attention model with parallel branches for joint localization and classification of objects. We then extend this model to a multisource setting where a reference source, assumed to have no location uncertainty, is used to aid the fusion of multiple sources. We show that all four proposed fusion strategies, operating at the probability, logit, feature, and pixel levels, provide higher accuracies than the state-of-the-art. We also provide an in-depth comparison by evaluating each model at various parameter complexity settings, where the increased model capacity yields a further improvement over the default capacity setting. For medical image analysis, we study breast cancer classification on regions of interest (ROIs) of arbitrary shapes and sizes from breast biopsy whole slides. The typical solution to this problem is to aggregate the classification results of fixed-size patches cropped from the ROIs to obtain image-level classification scores.
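The instance attention idea described in the abstract above can be illustrated with a minimal NumPy sketch: one branch scores each candidate region per class, a parallel branch produces attention weights that localize the object, and the attention-weighted sum yields an image-level prediction trainable from the image-level label alone. The feature dimension, region count, and linear branch weights here are illustrative assumptions, not the authors' actual architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def instance_attention(features, W_cls, W_att):
    """Two parallel branches over candidate regions:
    a classification branch scores each region per class,
    an attention branch weights regions (soft localization)."""
    cls_scores = softmax(features @ W_cls, axis=-1)  # (n_regions, n_classes)
    att = softmax(features @ W_att, axis=0)          # (n_regions, 1), sums to 1 over regions
    return (att * cls_scores).sum(axis=0)            # image-level class probabilities

rng = np.random.default_rng(0)
feats = rng.normal(size=(9, 16))   # 9 candidate regions, 16-dim features (hypothetical)
W_cls = rng.normal(size=(16, 5))   # 5 hypothetical subcategories
W_att = rng.normal(size=(16, 1))
probs = instance_attention(feats, W_cls, W_att)
```

Because the attention weights and each region's class distribution both sum to one, the fused output is itself a valid probability distribution over subcategories.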
We first propose a generic methodology to incorporate local inter-patch context through a graph convolutional network (GCN) that propagates information over neighboring patches in a progressive manner towards classifying the whole ROI. Experiments on a challenging data set for a 3-class ROI-level classification task, together with comparisons against several baseline approaches, show that the proposed model, which incorporates spatial context, performs better than commonly used fusion rules. Second, we revisit the WSL framework from our remote sensing experiments and apply it to a 4-class ROI classification problem. We propose a new training methodology tailored for this WSL task that combines the patches and labels from pairs of ROIs to exploit the instance attention model's capability to learn from samples with multiple labels, which results in superior performance over several baselines.

Item Open Access
Weakly supervised instance attention for multisource fine-grained object recognition with an application to tree species classification (Elsevier BV, 2021-06) Aygüneş, Bulut; Cinbiş, R. G.; Aksoy, Selim

Multisource image analysis that leverages complementary spectral, spatial, and structural information benefits fine-grained object recognition, which aims to classify an object into one of many similar subcategories. However, for multisource tasks that involve relatively small objects, even the smallest registration errors can introduce high uncertainty in the classification process. We approach this problem from a weakly supervised learning perspective in which the input images correspond to larger neighborhoods around the expected object locations, where an object with a given class label is present in the neighborhood without any knowledge of its exact location.
The proposed method uses a single-source deep instance attention model with parallel branches for joint localization and classification of objects, and extends this model to a multisource setting where a reference source that is assumed to have no location uncertainty is used to aid the fusion of multiple sources at four different levels: probability level, logit level, feature level, and pixel level. We show that all levels of fusion provide higher accuracies than the state-of-the-art, with the best-performing method, feature-level fusion, reaching 53% accuracy for the recognition of 40 different types of trees, an improvement of 5.7% over the best-performing baseline when RGB, multispectral, and LiDAR data are used. We also provide an in-depth comparison by evaluating each model at various parameter complexity settings, where the increased model capacity results in a further improvement of 6.3% over the default capacity setting.
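The feature-level fusion strategy the abstract identifies as best-performing can be sketched as follows: each location-uncertain source is attention-pooled over its candidate regions into a single feature vector, the reference source (no location uncertainty) contributes its feature directly, and the concatenated representation is classified jointly. This is a minimal sketch under assumed shapes; the source names, feature dimensions, and the single linear classifier are illustrative, not the published model. The 40-class output matches the tree species count reported in the abstract.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def fuse_features(ref_feat, src_feats, src_att_logits, W_cls):
    """Feature-level multisource fusion: attention-pool each
    uncertain source over its candidate regions, concatenate with
    the reference source's feature, then classify jointly."""
    pooled = [
        (softmax(a, axis=0) * f).sum(axis=0)       # (n_regions, d) -> (d,)
        for f, a in zip(src_feats, src_att_logits)
    ]
    fused = np.concatenate([ref_feat] + pooled)    # joint representation
    return softmax(fused @ W_cls)                  # class probabilities

rng = np.random.default_rng(1)
ref = rng.normal(size=8)                            # e.g. the reference source
srcs = [rng.normal(size=(9, 8)) for _ in range(2)]  # two uncertain sources, 9 regions each
atts = [rng.normal(size=(9, 1)) for _ in range(2)]  # per-region attention logits
probs = fuse_features(ref, srcs, atts, rng.normal(size=(24, 40)))
```

Fusing at the feature level lets the classifier see all sources jointly before any per-source decision is made, which is one plausible reason this level outperforms probability- or logit-level fusion in the reported experiments.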