Browsing by Subject "Diffusion models"
Item Open Access
Image inpainting with diffusion models and generative adversarial networks (2024-05)
Yıldırım, Ahmet Burak
We present two novel approaches to image inpainting, a task that involves erasing unwanted pixels from images and filling them in a semantically consistent and realistic way. The first approach uses natural language input to determine which object to remove from an image. We construct a dataset named GQA-Inpaint for this task and train a diffusion-based inpainting model on it, which can remove objects from images based on text prompts. The second approach tackles the challenging task of inverting erased images into StyleGAN’s latent space for realistic inpainting and editing. For this task, we propose learning an encoder and a mixing network to combine encoded features of erased images with StyleGAN’s mapped features from random samples. To achieve diverse inpainting results for the same erased image, we combine the encoded features and randomly sampled style vectors via the mixing network. We evaluate our methods with several metrics of model quality and show significant quantitative and qualitative improvements.

Item Open Access
Novel deep learning algorithms for multi-modal medical image synthesis (2023-08)
Dalmaz, Onat
Multi-modal medical imaging is a powerful tool for the diagnosis and treatment of various diseases, as it provides complementary information about tissue morphology and function. However, acquiring multiple images from different modalities or contrasts is often impractical or impossible due to factors such as scan time, cost, and patient comfort. Medical image translation has emerged as a promising solution that synthesizes target-modality images from source-modality images. The ability to synthesize unavailable images enhances the ubiquity and utility of multi-modal protocols while decreasing examination costs and patient exposure to ionizing radiation and contrast agents.
Existing medical image translation methods rely predominantly on generative adversarial networks (GANs) with convolutional neural network (CNN) backbones. CNNs perform local processing with compact filters, and this inductive bias limits their contextual sensitivity. Meanwhile, GANs suffer from limited sample fidelity and diversity due to one-shot sampling and implicit characterization of the image distribution. To overcome these challenges with CNN-based GAN models, this thesis first introduces ResViT, which leverages novel aggregated residual transformer (ART) blocks that synergistically fuse representations from convolutional and transformer modules. It then introduces SynDiff, a conditional diffusion model that progressively maps noise and source images onto the target image via large diffusion steps and adversarial projections, capturing a direct correlate of the image distribution and improving sample quality and speed. ResViT provides a unified implementation that avoids rebuilding separate synthesis models for varying source-target modality configurations, whereas SynDiff enables unsupervised training on unpaired datasets via a cycle-consistent architecture. ResViT and SynDiff were demonstrated on synthesizing missing sequences in multi-contrast MRI and CT images from MRI, where they showed state-of-the-art performance in medical image translation.