Browsing by Subject "Generative adversarial networks"
Now showing 1 - 14 of 14
Item Open Access
Bottleneck sharing generative adversarial networks for unified multi-contrast MR image synthesis (IEEE, 2022-08-29)
Dalmaz, Onat; Sağlam, Baturay; Gönç, Kaan; Dar, Salman U. H.; Çukur, Tolga
Magnetic Resonance Imaging (MRI) is the favored modality in multi-modal medical imaging due to its safety and its ability to acquire various contrasts of the anatomy. Availability of multiple contrasts accumulates diagnostic information and can therefore improve radiological observations. In some scenarios, acquiring all contrasts might be challenging due to reluctant patients and the increased cost of additional scans. In such cases, synthetically obtaining missing MRI pulse sequences from the acquired sequences can be useful for further analyses. Recently introduced Generative Adversarial Network (GAN) models offer state-of-the-art performance in learning MRI synthesis. However, these generative approaches learn a distinct model for each conditional contrast-to-contrast mapping. Learning a distinct synthesis model for each individual task increases time and memory demands due to the increased number of parameters and longer training. To mitigate this issue, we propose a novel unified synthesis model, the bottleneck sharing GAN (bsGAN), to consolidate learning of synthesis tasks in multi-contrast MRI. bsGAN comprises distinct convolutional encoders and decoders for each contrast to increase synthesis performance. A central information bottleneck is employed to distill hidden representations. The bottleneck, based on residual convolutional layers, is shared across contrasts to avoid introducing many learnable parameters. Qualitative and quantitative comparisons on a multi-contrast brain MRI dataset show the effectiveness of the proposed method against existing unified synthesis methods.
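A minimal PyTorch-style sketch of the shared-bottleneck design described above: one convolutional encoder and decoder per contrast surround a single stack of residual blocks that is reused by every source-to-target mapping. The contrast names, channel widths, and block count are illustrative assumptions, not the authors' exact configuration.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.InstanceNorm2d(ch), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1), nn.InstanceNorm2d(ch),
        )

    def forward(self, x):
        return x + self.body(x)

class BottleneckSharingGenerator(nn.Module):
    """One encoder/decoder pair per contrast; a single residual bottleneck shared by all tasks."""
    def __init__(self, contrasts=("T1", "T2", "PD", "FLAIR"), ch=64, n_blocks=6):
        super().__init__()
        self.encoders = nn.ModuleDict({
            c: nn.Sequential(nn.Conv2d(1, ch, 7, padding=3), nn.ReLU(inplace=True),
                             nn.Conv2d(ch, ch * 2, 3, stride=2, padding=1), nn.ReLU(inplace=True))
            for c in contrasts})
        self.decoders = nn.ModuleDict({
            c: nn.Sequential(nn.ConvTranspose2d(ch * 2, ch, 3, stride=2, padding=1, output_padding=1),
                             nn.ReLU(inplace=True), nn.Conv2d(ch, 1, 7, padding=3))
            for c in contrasts})
        # Shared across every source-to-target mapping, so parameters grow with the number of
        # contrasts (encoders/decoders) rather than with the number of task pairs.
        self.shared_bottleneck = nn.Sequential(*[ResidualBlock(ch * 2) for _ in range(n_blocks)])

    def forward(self, x, source, target):
        return self.decoders[target](self.shared_bottleneck(self.encoders[source](x)))

# Example: synthesize a T2 image from a T1 image.
gen = BottleneckSharingGenerator()
t2 = gen(torch.randn(1, 1, 256, 256), source="T1", target="T2")
```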
Item Open Access
Deep learning for accelerated 3D MRI (2021-08)
Özbey, Muzaffer
Magnetic resonance imaging (MRI) offers the flexibility to image a given anatomic volume under a multitude of tissue contrasts. Yet, scan time considerations put stringent limits on the quality and diversity of MRI data. The gold-standard approach to alleviate this limitation is to recover high-quality images from data undersampled across various dimensions, most commonly the Fourier domain or contrast sets. A primary distinction among recovery methods is whether the anatomy is processed per volume or per cross-section. Volumetric models offer enhanced capture of global contextual information, but they can suffer from sub-optimal learning due to elevated model complexity. Cross-sectional models with lower complexity offer improved learning behavior, yet they ignore contextual information across the longitudinal dimension of the volume. Here, we introduce a novel progressive volumetrization strategy for generative models (ProvoGAN) that serially decomposes complex volumetric image recovery tasks into successive cross-sectional mappings task-optimally ordered across individual rectilinear dimensions. ProvoGAN effectively captures global context and recovers fine-structural details across all dimensions, while maintaining low model complexity and improved learning behavior. Comprehensive demonstrations on mainstream MRI reconstruction and synthesis tasks show that ProvoGAN yields superior performance to state-of-the-art volumetric and cross-sectional models.
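The progressive volumetrization idea can be pictured as three 2D stages that each sweep the volume along one rectilinear axis and pass their output to the next stage. The placeholder 2D models and the fixed axis ordering below are assumptions; the described method uses conditional GAN generators per stage and learns a task-optimal ordering.

```python
import torch
import torch.nn as nn

def apply_cross_sectional(volume, model_2d, axis):
    """Run a 2D model over every cross-section of a 3D volume along the given axis.

    volume: (1, 1, D, H, W) tensor; axis in {2, 3, 4} selects the slicing dimension.
    """
    slices = torch.unbind(volume, dim=axis)     # cross-sections of shape (1, 1, ., .)
    mapped = [model_2d(s) for s in slices]      # 2D mapping per cross-section
    return torch.stack(mapped, dim=axis)        # reassemble the volume

# Placeholder 2D generators; in practice each stage would be a conditional GAN generator.
stage_models = {2: nn.Conv2d(1, 1, 3, padding=1),
                3: nn.Conv2d(1, 1, 3, padding=1),
                4: nn.Conv2d(1, 1, 3, padding=1)}

volume = torch.randn(1, 1, 64, 64, 64)          # undersampled / source volume
for axis in (2, 3, 4):                          # e.g., axial -> coronal -> sagittal; ordering is task-optimized
    volume = apply_cross_sectional(volume, stage_models[axis], axis)
```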
Item Open Access
Deep learning for accelerated MR imaging (2021-02)
Dar, Salman Ul Hassan
Magnetic resonance imaging is a non-invasive imaging modality that enables multi-contrast acquisition of an underlying anatomy, thereby providing a wealth of diagnostic information. However, prolonged scan duration may prohibit its practical use. Two mainstream frameworks for accelerating MR image acquisitions are reconstruction and synthesis. In reconstruction, acquisitions are accelerated by undersampling in k-space, followed by reconstruction algorithms. Lately, deep neural networks have offered significant improvements over traditional methods in MR image reconstruction. However, deep neural networks rely heavily on the availability of large datasets, which might not be readily available for some applications. Furthermore, a general caveat of the reconstruction framework is that performance naturally degrades towards higher acceleration factors, where fewer data samples are acquired. In the alternative synthesis framework, acquisitions are accelerated by acquiring a subset of the desired contrasts and recovering the missing contrasts from the acquired ones. Current synthesis methods are primarily based on deep neural networks trained to minimize mean square or mean absolute loss functions, which can bring about loss of intermediate-to-high spatial frequency content in the recovered images. Furthermore, synthesis performance generally relies on similarity in relaxation parameters between source and target contrasts, and large dissimilarities can lead to artifactual synthesis or loss of features. Here, we tackle issues associated with both reconstruction and synthesis approaches. In reconstruction, the data scarcity issue is addressed by pre-training a network on large, readily available datasets and fine-tuning it on just a few samples from the target dataset. In synthesis, the loss of intermediate-to-high spatial frequency content is countered by adding adversarial and high-level perceptual losses on top of the traditional mean absolute error. Finally, a joint reconstruction and synthesis approach is proposed to mitigate the issues associated with both frameworks. Demonstrations on brain MRI datasets of healthy subjects and patients indicate superior performance of the proposed techniques over current state-of-the-art methods.
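A hedged sketch of the kind of composite synthesis objective described above: mean absolute error complemented by adversarial and high-level perceptual terms. The VGG-16 feature layer, loss weights, and generator loss form are assumptions for illustration, not the thesis' exact settings.

```python
import torch
import torch.nn as nn
from torchvision.models import vgg16

class SynthesisLoss(nn.Module):
    """Pixel-wise L1 + adversarial + VGG-based perceptual terms (illustrative weights)."""
    def __init__(self, lambda_pix=100.0, lambda_adv=1.0, lambda_perc=1.0):
        super().__init__()
        self.features = vgg16(weights="IMAGENET1K_V1").features[:16].eval()
        for p in self.features.parameters():
            p.requires_grad_(False)
        self.l1 = nn.L1Loss()
        self.lambdas = (lambda_pix, lambda_adv, lambda_perc)

    def forward(self, fake, real, disc_fake_logits):
        l_pix, l_adv, l_perc = self.lambdas
        pixel = self.l1(fake, real)                                    # traditional mean absolute error
        adversarial = -disc_fake_logits.mean()                         # one common generator loss form
        perceptual = self.l1(self.features(fake.repeat(1, 3, 1, 1)),   # single-channel MR -> 3-channel VGG input
                             self.features(real.repeat(1, 3, 1, 1)))
        return l_pix * pixel + l_adv * adversarial + l_perc * perceptual

# Usage sketch (discriminator and images assumed defined elsewhere):
# criterion = SynthesisLoss()
# loss = criterion(fake_t2, real_t2, discriminator(fake_t2))
```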
Item Open Access
Deep learning for digital pathology (2020-11)
Sarı, Can Taylan
Histopathological examination is today's gold standard for cancer diagnosis and grading. However, this task is time consuming and prone to errors, as it requires detailed visual inspection and interpretation of a histopathological sample provided on a glass slide under a microscope by an expert pathologist. Low-cost, high-technology whole slide digital scanners produced in recent years have eliminated the disadvantages of physical glass slide samples by digitizing histopathological samples and relocating them to digital media. Digital pathology aims to alleviate the problems of traditional examination approaches by providing auxiliary computerized tools that quantitatively analyze digitized histopathological images. Traditional machine learning methods extract handcrafted features from histopathological images and use these features in the design of a classification or segmentation algorithm. The performance of these methods mainly relies on the features they use, and thus their success strictly depends on the ability of these features to quantify the histopathology domain. More recent studies have employed deep architectures to learn expressive and robust features directly from images, avoiding the complex feature extraction procedures of traditional approaches. Although deep learning methods perform well in many classification and segmentation problems, the convolutional neural networks they frequently use require annotated data for training, which makes it difficult to utilize unannotated data, the majority of the available data in the histopathology domain. This thesis addresses the challenges of traditional and deep learning approaches by incorporating unsupervised learning into classification and segmentation algorithms for feature extraction and training regularization in the histopathology domain. As its first contribution, the thesis presents a new unsupervised feature extractor for effective representation and classification of histopathological tissue images. This study introduces a deep belief network to quantize salient subregions, which are identified with domain-specific prior knowledge, by extracting a set of features learned directly from image data in an unsupervised way, and uses the distribution of these quantizations for image representation and classification. As its second contribution, the thesis proposes a new regularization method to train a fully convolutional network for semantic tissue segmentation in histopathological images. This study relies on the benefit of unsupervised learning, in the form of image reconstruction, for network training. To this end, it defines a new embedding, generated by superimposing an input image on its segmentation map, that unites the main supervised task of semantic segmentation with an auxiliary unsupervised task of image reconstruction, and proposes to learn this united task with a generative adversarial network. We compare our classification and segmentation methods with traditional machine learning methods and state-of-the-art deep learning algorithms on various histopathological image datasets. Visual and quantitative results of our experiments demonstrate that the proposed methods learn robust features from histopathological images and provide more accurate results than their counterparts.

Item Open Access
Diverse inpainting and editing with semantic conditioning (2024-09)
Sivük, Hakan
Semantic image editing involves filling in pixels according to a given semantic map, a complex task that demands contextual harmony and precise adherence to the semantic map. Most previous approaches attempt to encode all information from the erased image, but when adding an object such as a car, its style cannot be inferred from the context alone. Models capable of producing diverse results often struggle with smooth integration between the generated and existing parts of the image. Moreover, existing methods lack a mechanism to encode the styles of fully and partially visible objects differently, limiting their effectiveness. In this work, we introduce a framework incorporating a novel mechanism to distinguish between visible and partially visible objects, leading to more consistent style encoding and improved final outputs. Through extensive comparisons with existing conditional image generation and semantic editing methods, our experiments demonstrate that our approach significantly outperforms the state-of-the-art. In addition to improved quantitative results, our method provides greater diversity in outcomes. For code and a demo, please visit our project page at https://github.com/hakansivuk/DivSem.

Item Open Access
Fine detailed texture learning for 3D meshes with generative models (Institute of Electrical and Electronics Engineers, 2023-11-03)
Dündar, Ayşegül; Gao, J.; Tao, A.; Catanzaro, B.
This paper presents a method to achieve fine detailed texture learning for 3D models reconstructed from both multi-view and single-view images. The framework is posed as an adaptation problem and proceeds progressively: in the first stage we focus on learning accurate geometry, whereas in the second stage we focus on learning the texture with a generative adversarial network. The contributions of the paper lie in the generative learning pipeline, where we propose two improvements. First, since the learned textures should be spatially aligned, we propose an attention mechanism that relies on the learnable positions of pixels. Second, since the discriminator receives aligned texture maps, we augment its input with a learnable embedding, which improves the feedback to the generator. We achieve significant improvements on multi-view sequences from the Tripod dataset as well as on the single-view image datasets Pascal 3D+ and CUB. We demonstrate that our method produces superior 3D textured models compared to previous works.

Item Open Access
Image inpainting with diffusion models and generative adversarial networks (2024-05)
Yıldırım, Ahmet Burak
We present two novel approaches to image inpainting, a task that involves erasing unwanted pixels from images and filling them in a semantically consistent and realistic way. The first approach uses natural language input to determine which object to remove from an image. We construct a dataset named GQA-Inpaint for this task and train a diffusion-based inpainting model on it, which can remove objects from images based on text prompts. The second approach tackles the challenging task of inverting erased images into StyleGAN's latent space for realistic inpainting and editing. For this task, we propose learning an encoder and a mixing network to combine encoded features of erased images with StyleGAN's mapped features from random samples. To achieve diverse inpainting results for the same erased image, we combine the encoded features and randomly sampled style vectors via the mixing network. We compare our methods using evaluation metrics that measure the quality of the models and show significant quantitative and qualitative improvements.
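A toy sketch of the inversion idea in the second approach above: an encoder code for the erased image is fused with a randomly sampled, mapped style vector by a small mixing network, so the same erased image can yield diverse completions. All components below (encoder, mapping network, fusion layers, dimensions) are simplified stand-ins, not the thesis' architecture.

```python
import torch
import torch.nn as nn

class MixingNetwork(nn.Module):
    """Fuse the encoder's code for the erased image with a mapped random style vector."""
    def __init__(self, dim=512):
        super().__init__()
        self.fuse = nn.Sequential(nn.Linear(2 * dim, dim), nn.LeakyReLU(0.2), nn.Linear(dim, dim))

    def forward(self, encoded_w, random_w):
        return self.fuse(torch.cat([encoded_w, random_w], dim=-1))

# Hypothetical stand-ins for a pretrained StyleGAN mapping network and the learned encoder.
encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 256 * 256, 512))
mapping = nn.Sequential(nn.Linear(512, 512), nn.LeakyReLU(0.2), nn.Linear(512, 512))
mixer = MixingNetwork()

erased = torch.randn(1, 3, 256, 256)         # image with erased pixels
w_enc = encoder(erased)
for _ in range(3):                           # three diverse completions for the same input
    w_rand = mapping(torch.randn(1, 512))    # mapped features of a random sample
    w_mix = mixer(w_enc, w_rand)             # would be fed to StyleGAN's synthesis network
```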
Item Open Access
Image-to-image translation for face attribute editing with disentangled latent directions (2023-06)
Dalva, Yusuf
We propose an image-to-image translation framework for facial attribute editing with disentangled, interpretable latent directions. The facial attribute editing task faces the challenges of targeted attribute editing with controllable strength and of disentangling attribute representations so that the other attributes are preserved during edits. Toward this goal, inspired by the latent space factorization works on fixed pretrained GANs, we design the attribute editing by latent space factorization, and for each attribute we learn a linear direction that is orthogonal to the others. We train these directions with orthogonality constraints and disentanglement losses. To project images to semantically organized latent spaces, we use an encoder-decoder architecture with attention-based skip connections. We extensively compare with previous image translation algorithms and with editing methods based on pretrained GANs. Our extensive experiments show that our method significantly improves over the state of the art.

Item Open Access
Improving image synthesis quality in multi-contrast MRI using transfer learning via autoencoders (IEEE, 2022-08-29)
Selçuk, Şahan Yoruç; Dalmaz, Onat; Ul Hassan Dar, Salman; Çukur, Tolga
The capacity of magnetic resonance imaging (MRI) to capture several contrasts within a session enables it to obtain increased diagnostic information. However, such multi-contrast MRI exams take a long time to scan, so often only a subset of the essential contrasts is acquired. Synthetic multi-contrast MRI has the potential to improve radiological observations and consequent image analysis activities. Because of their ability to generate realistic results, generative adversarial networks (GANs) have recently been the most popular choice for medical image synthesis. This paper proposes a novel generative adversarial framework to improve image synthesis quality in multi-contrast MRI. Our method uses transfer learning to adapt pre-trained autoencoder networks to the synthesis task, enhancing synthesis quality by initializing the training process with more optimal network parameters. We demonstrate that the proposed method outperforms competing synthesis models by 0.95 dB on average on a well-known multi-contrast MRI dataset.
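A minimal sketch of the transfer-learning recipe described above, under the assumption of a two-stage procedure: an autoencoder is pretrained on reconstruction, and its weights initialize the GAN generator before adversarial fine-tuning for contrast synthesis. Layer shapes and the staging details are illustrative assumptions.

```python
import torch
import torch.nn as nn

def make_autoencoder():
    # Illustrative encoder/decoder shapes; the real networks would be deeper.
    encoder = nn.Sequential(nn.Conv2d(1, 64, 3, stride=2, padding=1), nn.ReLU(),
                            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU())
    decoder = nn.Sequential(nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
                            nn.ConvTranspose2d(64, 1, 4, stride=2, padding=1))
    return encoder, decoder

# Stage 1 (assumed): pretrain the autoencoder on image reconstruction.
enc, dec = make_autoencoder()
# ... self-supervised reconstruction training of enc/dec would happen here ...

# Stage 2: initialize the GAN generator with the pretrained autoencoder weights,
# then fine-tune adversarially for source-to-target contrast synthesis.
gen_enc, gen_dec = make_autoencoder()
gen_enc.load_state_dict(enc.state_dict())
gen_dec.load_state_dict(dec.state_dict())
generator = nn.Sequential(gen_enc, gen_dec)
```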
Item Open Access
Key protected classification for collaborative learning (Elsevier, 2020)
Sarıyıldız, Mert Bülent; Cinbiş, R. G.; Ayday, Erman
Large-scale datasets play a fundamental role in training deep learning models. However, dataset collection is difficult in domains that involve sensitive information. Collaborative learning techniques provide a privacy-preserving solution by enabling training over a number of private datasets that are not shared by their owners. However, it has recently been shown that existing collaborative learning frameworks are vulnerable to an active adversary that runs a generative adversarial network (GAN) attack. In this work, we propose a novel classification model that is resilient against such attacks by design. More specifically, we introduce a key-based classification model and a principled training scheme that protects class scores by using class-specific private keys, which effectively hide the information necessary for a GAN attack. We additionally show how to utilize high-dimensional keys to improve robustness against attacks without increasing model complexity. Our detailed experiments demonstrate the effectiveness of the proposed technique. Source code will be made available at https://github.com/mbsariyildiz/key-protected-classification.
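One way to picture a key-based classification head, under heavy hedging: each class owns a private high-dimensional key vector, and class scores are similarities between the network embedding and those keys, so the scores carry little usable signal without the keys. This is only an illustration of the general idea, not the paper's exact model or training scheme.

```python
import torch
import torch.nn as nn

class KeyProtectedClassifier(nn.Module):
    """Class scores via similarity to class-specific private keys (illustrative sketch)."""
    def __init__(self, backbone, embed_dim=1024, num_classes=10, seed=0):
        super().__init__()
        self.backbone = backbone
        g = torch.Generator().manual_seed(seed)           # stands in for privately generated keys
        keys = torch.randn(num_classes, embed_dim, generator=g)
        self.register_buffer("keys", nn.functional.normalize(keys, dim=1))  # fixed, private, not learned

    def forward(self, x):
        z = nn.functional.normalize(self.backbone(x), dim=1)
        return z @ self.keys.t()                          # per-class scores via key similarity

# Placeholder backbone producing the embedding.
model = KeyProtectedClassifier(nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 1024)))
scores = model(torch.randn(4, 1, 28, 28))
```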
Item Open Access
Learning portrait drawing of face photos from unpaired data with unsupervised landmarks (2023-12)
Taşdemir, Burak
Translating face photos to artistic drawings by hand is a complex task that typically needs the expertise of professional artists. The demand for automating this artistic task is clearly on the rise. Turning a photo into a hand-drawn portrait goes beyond simple transformation: it involves a sophisticated process that highlights key facial features and often omits small details. Thus, designing an effective tool for image conversion involves selectively preserving certain elements of the subject's face. In our study, we introduce a new technique for creating portrait drawings that learns exclusively from unpaired data without the use of extra labels. By utilizing unsupervised learning to extract features, our technique shows a promising ability to generalize across different domains. Our proposed approach integrates an in-depth understanding of images using unsupervised components with the ability to maintain individual identity, which is typically seen in simpler networks. We also present an innovative concept: an asymmetric pose-based cycle consistency loss. This concept introduces flexibility into the traditional cycle consistency loss, which typically expects an original image to be perfectly reconstructed after being converted to a portrait and then reverted. In our comprehensive testing, we evaluate our method with both in-domain and out-of-domain images and benchmark it against the leading methods. Our findings reveal that our approach yields superior results, both numerically and in terms of visual quality, across three different datasets.

Item Open Access
Progressively volumetrized deep generative models for data-efficient contextual learning of MR image recovery (Elsevier BV, 2022-05)
Yurt, Mahmut; Özbey, Muzaffer; Dar, Salman U.H.; Tınaz, Berk; Oğuz, Kader K.; Çukur, Tolga
Magnetic resonance imaging (MRI) offers the flexibility to image a given anatomic volume under a multitude of tissue contrasts. Yet, scan time considerations put stringent limits on the quality and diversity of MRI data. The gold-standard approach to alleviate this limitation is to recover high-quality images from data undersampled across various dimensions, most commonly the Fourier domain or contrast sets. A primary distinction among recovery methods is whether the anatomy is processed per volume or per cross-section. Volumetric models offer enhanced capture of global contextual information, but they can suffer from suboptimal learning due to elevated model complexity. Cross-sectional models with lower complexity offer improved learning behavior, yet they ignore contextual information across the longitudinal dimension of the volume. Here, we introduce a novel progressive volumetrization strategy for generative models (ProvoGAN) that serially decomposes complex volumetric image recovery tasks into successive cross-sectional mappings task-optimally ordered across individual rectilinear dimensions. ProvoGAN effectively captures global context and recovers fine-structural details across all dimensions, while maintaining low model complexity and improved learning behavior. Comprehensive demonstrations on mainstream MRI reconstruction and synthesis tasks show that ProvoGAN yields superior performance to state-of-the-art volumetric and cross-sectional models.

Item Open Access
Style synthesizing conditional generative adversarial networks (2020-01)
Çetin, Yarkın Deniz
Neural style transfer (NST) models aim to transfer a particular visual style to an image while preserving its content using neural networks. Style transfer models that can apply arbitrary styles without requiring style-specific models or architectures are called universal style transfer (UST) models. Typically, a UST model takes a content image and a style image as inputs and outputs the corresponding stylized image. A style image with the required characteristics is therefore needed to facilitate the transfer. However, in practical applications, where the user wants to apply variations of a style class or a mixture of multiple style classes, such style images may be difficult to find or simply non-existent. In this work, we propose a conditional style transfer network that can model multiple style classes. While our model requires training examples (style images) for each class at training time, it does not require any style images at test time. The model implicitly learns the manifold of each style and is able to generate diverse stylization outputs corresponding to a single style class or a mixture of the available style classes. This requires the model to learn one-to-many mappings, from a single input class label to multiple styles. For this reason, we build our model on generative adversarial networks (GANs), which have been shown to generate realistic data in highly complex and multi-modal distributions in numerous domains. More specifically, we design a conditional GAN model that takes a semantic conditioning vector specifying the desired style class(es) and a noise vector as input and outputs the statistics required for applying style transfer. In order to achieve style transfer, we adapt a preexisting encoder-decoder based universal style transfer model. The encoder component extracts convolutional feature maps from the content image. These features are first whitened and then colorized using the statistics of the input style image. The decoder component then reconstructs the stylized image from the colorized features. In our adaptation, instead of using full covariance matrices, we approximate the whitening and coloring transforms using the diagonal elements of the covariance matrices. We then remove the dependence on the input style image by learning to generate the statistics via our GAN model. In our experiments, we use a subset of the WikiArt dataset to train and validate our approach. We demonstrate that our approximation method achieves stylization results similar to the preexisting model but at higher speed and using only a fraction of the target style statistics. We also show that our conditional GAN model leads to successful style transfer results by learning the manifold of styles corresponding to each style class. We additionally show that the GAN model can be used to generate novel style class combinations, which are highly correlated with the corresponding actual stylization results that are not seen during training.
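The diagonal approximation of the whitening and coloring transforms described above reduces to per-channel standardization of the content features followed by rescaling with target statistics, which in this work would be produced by the conditional GAN rather than extracted from a style image. A short sketch under those assumptions:

```python
import torch

def diagonal_wct(content_feats, style_mean, style_std, eps=1e-5):
    """Whitening/coloring with only the diagonal of the covariance matrices.

    content_feats: (N, C, H, W); style_mean, style_std: (N, C), here assumed to come
    from the conditional GAN instead of a reference style image.
    """
    mu = content_feats.mean(dim=(2, 3), keepdim=True)
    sigma = content_feats.std(dim=(2, 3), keepdim=True) + eps
    whitened = (content_feats - mu) / sigma                                            # diagonal whitening
    return whitened * style_std[:, :, None, None] + style_mean[:, :, None, None]       # diagonal coloring

feats = torch.randn(1, 512, 32, 32)                          # encoder features of the content image
gen_mean, gen_std = torch.zeros(1, 512), torch.ones(1, 512)  # would be produced by the conditional GAN
stylized_feats = diagonal_wct(feats, gen_mean, gen_std)      # then passed to the decoder
```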
Item Open Access
VecGAN: Image-to-Image translation with interpretable latent directions (2022-10-21)
Dalva, Yusuf; Dundar, Aysegul; Altındiş, Said Fahri
We propose VecGAN, an image-to-image translation framework for facial attribute editing with interpretable latent directions. The facial attribute editing task faces the challenges of precise attribute editing with controllable strength and preservation of the other attributes of an image. For this goal, we design the attribute editing by latent space factorization, and for each attribute we learn a linear direction that is orthogonal to the others. The other component is the controllable strength of the change, a scalar value. In our framework, this scalar can be either sampled or encoded from a reference image by projection. Our work is inspired by the latent space factorization works on fixed pretrained GANs. However, while those models cannot be trained end-to-end and struggle to edit encoded images precisely, VecGAN is trained end-to-end for the image translation task and successfully edits an attribute while preserving the others. Our extensive experiments show that VecGAN achieves significant improvements over the state of the art for both local and global edits.
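A compact sketch of editing with learned linear directions as described above: one direction per attribute, an orthogonality penalty to keep them disentangled, and a scalar strength that is either sampled or obtained by projecting a reference code. Latent dimensionality and the penalty form are illustrative assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

class LatentDirections(nn.Module):
    """One linear direction per attribute with an orthogonality penalty and scalar strength."""
    def __init__(self, num_attrs=5, dim=256):
        super().__init__()
        self.directions = nn.Parameter(torch.randn(num_attrs, dim) * 0.01)

    def orthogonality_penalty(self):
        d = nn.functional.normalize(self.directions, dim=1)
        gram = d @ d.t()
        return ((gram - torch.eye(gram.size(0))) ** 2).sum()   # push directions toward orthogonality

    def edit(self, z, attr, alpha):
        d = nn.functional.normalize(self.directions[attr], dim=0)
        return z + alpha * d                                   # shift along the attribute direction

    def encode_strength(self, z_ref, attr):
        d = nn.functional.normalize(self.directions[attr], dim=0)
        return z_ref @ d                                       # scalar strength by projection

dirs = LatentDirections()
z = torch.randn(1, 256)                                        # encoded image
z_edited = dirs.edit(z, attr=0, alpha=1.5)                     # sampled / user-chosen strength
alpha_ref = dirs.encode_strength(torch.randn(1, 256), attr=0)  # strength taken from a reference image
```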