Novel deep learning algorithms for multi-modal medical image synthesis
Abstract
Multi-modal medical imaging is a powerful tool for the diagnosis and treatment of various diseases, as it provides complementary information about tissue morphology and function. However, acquiring multiple images from different modalities or contrasts is often impractical or impossible due to factors such as scan time, cost, and patient comfort. Medical image translation has emerged as a promising solution that synthesizes target-modality images from source-modality images. The ability to synthesize unavailable images enhances the ubiquity and utility of multi-modal protocols while decreasing examination costs and exposure to toxicity sources such as ionizing radiation and contrast agents. Existing medical image translation methods predominantly rely on generative adversarial networks (GANs) with convolutional neural network (CNN) backbones. CNNs are designed to perform local processing with compact filters, and this inductive bias limits their contextual sensitivity. Meanwhile, GANs suffer from limited sample fidelity and diversity due to one-shot sampling and implicit characterization of the image distribution. To overcome these challenges with CNN-based GAN models, this thesis first introduces ResViT, which leverages novel aggregated residual transformer (ART) blocks that synergistically fuse representations from convolutional and transformer modules. It then introduces SynDiff, a conditional diffusion model that progressively maps noise and source images onto the target image via large diffusion steps and adversarial projections, capturing a direct correlate of the image distribution and improving sample quality and speed. ResViT provides a unified implementation that avoids the need to rebuild separate synthesis models for varying source-target modality configurations, whereas SynDiff enables unsupervised training on unpaired datasets via a cycle-consistent architecture. ResViT and SynDiff are demonstrated on synthesizing missing sequences in multi-contrast MRI and CT images from MRI, where they show state-of-the-art performance in medical image translation.