Focal modulation network for lung segmentation in chest X-ray images

buir.contributor.author: Öztürk, Şaban
buir.contributor.author: Çukur, Tolga
buir.contributor.orcid: Öztürk, Şaban | 0000-0003-2371-8173
buir.contributor.orcid: Çukur, Tolga | 0000-0002-2296-851X
dc.citation.epage: 1020
dc.citation.spage: 1006
dc.citation.volumeNumber: 31
dc.contributor.author: Öztürk, Şaban
dc.contributor.author: Çukur, Tolga
dc.date.accessioned: 2024-03-15T11:35:45Z
dc.date.available: 2024-03-15T11:35:45Z
dc.date.issued: 2023-08-09
dc.department: Department of Electrical and Electronics Engineering
dc.department: Aysel Sabuncu Brain Research Center (BAM)
dc.description.abstract: Segmentation of lung regions is of key importance for the automatic analysis of Chest X-Ray (CXR) images, which have a vital role in the detection of various pulmonary diseases. Precise identification of lung regions is the basic prerequisite for disease diagnosis and treatment planning. However, achieving precise lung segmentation poses significant challenges due to factors such as variations in anatomical shape and size, the presence of strong edges at the rib cage and clavicle, and overlapping anatomical structures resulting from diverse diseases. Although commonly considered the de facto standard in medical image segmentation, the convolutional UNet architecture and its variants fall short in addressing these challenges, primarily due to their limited ability to model long-range dependencies between image features. While vision transformers equipped with self-attention mechanisms excel at capturing long-range relationships, either a coarse-grained global self-attention or a fine-grained local self-attention is typically adopted for segmentation tasks on high-resolution images to alleviate the quadratic computational cost, at the expense of performance loss. This paper introduces a focal modulation UNet model (FMN-UNet) to enhance segmentation performance by effectively aggregating fine-grained local and coarse-grained global relations at a reasonable computational cost. FMN-UNet first encodes CXR images via a convolutional encoder to suppress background regions and extract latent feature maps at a relatively modest resolution. FMN-UNet then leverages global and local attention mechanisms to model contextual relationships across the images. These contextual feature maps are convolutionally decoded to produce segmentation masks. The segmentation performance of FMN-UNet is compared against state-of-the-art methods on three public CXR datasets (JSRT, Montgomery, and Shenzhen). Experiments on each dataset demonstrate the superior performance of FMN-UNet against baselines.
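The abstract describes a focal modulation operator that combines fine-grained local contexts with a coarse-grained global context at linear (rather than quadratic) cost in the number of positions. The following is a minimal NumPy sketch of that general idea, not the authors' implementation: random matrices stand in for learned projections, a box filter stands in for the depthwise convolutions, and the function and variable names (`focal_modulation`, `box_filter`, `num_levels`) are hypothetical.

```python
import numpy as np

def box_filter(x, k):
    """Depthwise-convolution stand-in: average over a k x k window per channel."""
    H, W, C = x.shape
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    out = np.empty_like(x)
    for i in range(H):
        for j in range(W):
            out[i, j] = xp[i:i + k, j:j + k].mean(axis=(0, 1))
    return out

def focal_modulation(x, num_levels=2, rng=None):
    """Minimal focal modulation over an (H, W, C) feature map.

    Aggregates local contexts over growing windows plus one global context,
    gates them per position, and modulates a per-position query; cost is
    linear in the number of positions, unlike full self-attention.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    H, W, C = x.shape
    # Hypothetical learned projections, drawn randomly for illustration.
    Wq = rng.standard_normal((C, C)) / np.sqrt(C)            # query projection
    Wz = rng.standard_normal((C, C)) / np.sqrt(C)            # context projection
    Wg = rng.standard_normal((C, num_levels + 1)) / np.sqrt(C)  # per-level gates

    q = x @ Wq       # per-position query
    z = x @ Wz       # level-0 context
    gates = x @ Wg   # one gate per focal level (+ one for the global level)

    modulator = np.zeros_like(x)
    for l in range(num_levels):
        z = np.maximum(box_filter(z, 3 + 2 * l), 0)  # fine-grained local window, growing with level
        modulator += gates[..., l:l + 1] * z
    # Coarse-grained global context: average over all spatial positions.
    modulator += gates[..., -1:] * z.mean(axis=(0, 1), keepdims=True)
    return q * modulator  # element-wise modulation of the query

feats = np.random.default_rng(1).standard_normal((8, 8, 16))
out = focal_modulation(feats)
print(out.shape)  # (8, 8, 16)
```

In the paper's architecture this operator would sit between the convolutional encoder and decoder, acting on latent feature maps at modest resolution, which is what keeps the context aggregation affordable.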
dc.identifier.doi: 10.55730/1300-0632.4031
dc.identifier.eissn: 1303-6203
dc.identifier.issn: 1300-0632
dc.identifier.uri: https://hdl.handle.net/11693/114799
dc.language.iso: en
dc.relation.isversionof: https://dx.doi.org/10.55730/1300-0632.4031
dc.rights: CC BY 4.0 DEED (Attribution 4.0 International)
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.source.title: Turkish Journal of Electrical Engineering and Computer Sciences
dc.subject: Focal modulation
dc.subject: Lung segmentation
dc.subject: Chest x-ray
dc.subject: Transformer
dc.subject: Attention
dc.title: Focal modulation network for lung segmentation in chest X-ray images
dc.type: Article

Files

Original bundle
Name: Focal_modulation_network_for_lung_segmentation_in_chest_X-ray_images.pdf
Size: 1.78 MB
Format: Adobe Portable Document Format

License bundle
Name: license.txt
Size: 2.01 KB
Format: Item-specific license agreed upon to submission