Dept. of Computer Engineering - Master's degree

Permanent URI for this collection

https://hdl.handle.net/11693/13861

Embargo
Hardware acceleration for Swin Transformers at the edge
(Bilkent University, 2024-05) Esergün, Yunus
While deep learning models have greatly enhanced visual processing abilities, their implementation in edge environments with limited resources can be challenging due to their high energy consumption and computational requirements. Swin Transformer is a prominent mechanism in computer vision that differs from traditional convolutional approaches. It adopts a hierarchical approach to interpreting images. A common strategy that improves the efficiency of deep learning algorithms during inference is clustering. Locality-Sensitive Hashing (LSH) is a mechanism that implements clustering and leverages the inherent redundancy within Transformers to identify and exploit computational similarities. This thesis introduces a hardware accelerator for Swin Transformer implementation with LSH in edge computing settings. The main goal is to reduce energy consumption while improving performance with custom hardware components. Specifically, our custom hardware accelerator design utilizes LSH clustering in Swin Transformers to decrease the amount of computation required. We tested our accelerator with two different state-of-the-art datasets, namely, ImageNet-1K and CIFAR-100. Our results demonstrate that the hardware accelerator enhances the processing speed of the Swin Transformer when compared to GPU-based implementations. More specifically, our accelerator improves performance by 1.35x while reducing power consumption to 5-6 Watts, compared with 19 Watts in the baseline GPU setting. We observe these improvements with a negligible decrease in model accuracy of less than 1%, confirming the effectiveness of our hardware accelerator design in edge computing environments with limited resources.
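As a rough illustration of how LSH can group similar token vectors so that work is shared within a bucket, the following minimal sketch uses random-hyperplane hashing. The hashing family, the function names, and the toy vectors are assumptions made for illustration; the abstract does not state which LSH scheme the accelerator uses.

```python
import random
from collections import defaultdict


def make_hyperplanes(num_planes, dim, seed=0):
    """Draw random hyperplane normals (assumed scheme: sign random projection)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_planes)]


def lsh_signature(vector, hyperplanes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return tuple(
        1 if sum(v * w for v, w in zip(vector, plane)) >= 0.0 else 0
        for plane in hyperplanes
    )


def cluster_by_signature(vectors, hyperplanes):
    """Group vector indices that share the same signature (hash bucket)."""
    buckets = defaultdict(list)
    for index, vector in enumerate(vectors):
        buckets[lsh_signature(vector, hyperplanes)].append(index)
    return dict(buckets)


# Hypothetical toy "tokens"; a real model would hash attention inputs.
planes = make_hyperplanes(num_planes=8, dim=4)
tokens = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],      # same direction as token 0
    [-1.0, -2.0, -3.0, -4.0],  # opposite direction
]
clusters = cluster_by_signature(tokens, planes)
```

Vectors that point in the same direction always share a bucket under this scheme, so a computation could be carried out once per bucket and reused for its members.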
Embargo
Misinformation detection by leveraging user communities on social media
(Bilkent University, 2024-05) Özçelik, Oğuzhan
Social media platforms have become a primary source of information. However, the spread of misinformation is inevitable due to the ease of creating and sharing malicious content, including fake news. Users of such platforms (e.g., Twitter) often find themselves exposed to similar viewpoints and tend to avoid contrasting opinions, particularly when connected within a community. To investigate this problem, we examine the presence of user communities and leverage them as a tool to detect misinformation on social media. In this thesis, we first collect tweets together with user engagements relevant to recent events between 2020 and 2022. We then construct a human-annotated social media dataset of 5,284 English and 5,064 Turkish tweets with their veracity labels. After the data construction process, we leverage the presence of user communities for misinformation detection on social media. For this purpose, we propose a text similarity-based method that utilizes user-follower interactions within a social network to identify misinformation content. Our method first extracts important textual features of social media posts using contrastive learning. We then measure the similarity for each social media post, based on its relevance to each user in the community. Next, we train a classifier to assess the truthfulness of social media posts using these similarity scores. We evaluate our approach on three social media datasets and compare our method with the state-of-the-art approaches. The experimental results show that contrastive learning and user communities can effectively enhance the detection of misinformation on social media. Our model identifies misinformation content with a consistently high weighted F1 score of over 90% across all datasets, even when employing only a small number of users in communities.
Open Access
Image inpainting with diffusion models and generative adversarial networks
(Bilkent University, 2024-05) Yıldırım, Ahmet Burak
We present two novel approaches to image inpainting, a task that involves erasing unwanted pixels from images and filling them in a semantically consistent and realistic way. The first approach uses natural language input to determine which object to remove from an image. We construct a dataset named GQA-Inpaint for this task and train a diffusion-based inpainting model on it, which can remove objects from images based on text prompts. The second approach tackles the challenging task of inverting erased images into StyleGAN’s latent space for realistic inpainting and editing. For this task, we propose learning an encoder and a mixing network to combine encoded features of erased images with StyleGAN’s mapped features from random samples. To achieve diverse inpainting results for the same erased image, we combine the encoded features and randomly sampled style vectors via the mixing network. We compare our methods with different evaluation metrics that measure the quality of the models and show significant quantitative and qualitative improvements.
Embargo
Augmenting bus factor analysis with visualization
(Bilkent University, 2024-01) Ahmed, Muhammad Umair
‘Bus factor’, also known as ‘truck factor’, is a measure of how vulnerable a software project is based on the minimum number of people who would have to leave the project (be ‘hit by a bus’) for it to stall. There is existing research on how to calculate bus factor for software projects but limited work on visualizing the bus factor. We believe providing visualization along with conventionally provided numerical bus factor results will help decision-makers manage the workload and knowledge distribution across the project and also help in planning and hiring decisions. This thesis proposes, implements, and evaluates a tool named BFViz to visualize bus factor and contributions for software projects from pre-processed Git history. It is a web application that provides a file-browser-like interface with an interactively navigable treemap. Additionally, it has filename-based filtering, individual contribution data for files and folders, and simulation of contributor departure. The tool is validated with a round of four user evaluations where users, ranging from project owners and engineering managers to developers, complete tasks using the tool on an open-source project that they are involved in and provide feedback with a semi-structured interview and a feature ranking activity. The overall task completion rate for the tasks was 79.55%. All case study participants preferred BFViz over text reports to understand bus factor data. The top three features, by mean ranking, were the contributors’ list, the files and folders’ visualization, and the simulation mode.
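To make the definition of bus factor concrete, here is a minimal sketch of one common greedy heuristic: repeatedly remove the contributor who knows the most files until more than half of the files have no knowledgeable contributor left. The heuristic, the 50% threshold, and the `file_owners` input format are assumptions for illustration; the abstract does not say how BFViz computes the value.

```python
def bus_factor(file_owners, stall_ratio=0.5):
    """Greedy estimate of the bus factor.

    file_owners maps each file path to the set of contributors who know it
    (hypothetical input format). Contributors are removed one by one, most
    widespread first, until more than `stall_ratio` of files are orphaned.
    """
    remaining = {path: set(owners) for path, owners in file_owners.items()}
    total = len(remaining)
    departed = 0

    def orphaned():
        return sum(1 for owners in remaining.values() if not owners)

    while total and orphaned() <= stall_ratio * total:
        # Count how many files each remaining contributor covers.
        coverage = {}
        for owners in remaining.values():
            for person in owners:
                coverage[person] = coverage.get(person, 0) + 1
        if not coverage:
            break
        # Sorting first makes ties deterministic (alphabetical order).
        top = max(sorted(coverage), key=coverage.get)
        for owners in remaining.values():
            owners.discard(top)
        departed += 1
    return departed


example = {
    "a.py": {"ann"},
    "b.py": {"ann"},
    "c.py": {"ann", "bob"},
    "d.py": {"bob"},
}
result = bus_factor(example)
```

In this toy project both contributors must leave before more than half of the files are orphaned, so the estimate is 2. Removing contributors from the input and recomputing is one simple way to imitate a departure simulation.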
Open Access
Three-dimensional human texture estimation learning from multi-view images
(Bilkent University, 2023-12) Altındiş, Said Fahri
In the fields of graphics and vision, accurately estimating 3D human texture from a single image is a critical task. This process involves developing a mapping function that transforms input images of humans in various poses into parametric (UV) space, while also effectively inferring the appearance of unseen parts. To enhance the quality of 3D human texture estimation, our study introduces a framework that utilizes deformable convolution for adaptive input sampling. This convolution is uniquely characterized by offsets learned through a sophisticated deep neural network. Additionally, we introduce an innovative cycle consistency loss, which markedly enhances view generalization. Our framework is further refined by incorporating an uncertainty-based, pixel-level image reconstruction loss, aimed at augmenting color accuracy. Through comprehensive comparisons with leading-edge methods, our approach demonstrates notable qualitative and quantitative advancements in the field.
Open Access
Learning portrait drawing of face photos from unpaired data with unsupervised landmarks
(Bilkent University, 2023-12) Taşdemir, Burak
Translating face photos to artistic drawings by hand is a complex task that typically needs the expertise of professional artists. The demand for automating this artistic task is clearly on the rise. Turning a photo into a hand-drawn portrait goes beyond simple transformation. The task involves a sophisticated process that focuses on highlighting key facial features and often omits small details. Thus, designing an effective tool for image conversion involves selectively preserving certain elements of the subject’s face. In our study, we introduce a new technique for creating portrait drawings that learns exclusively from unpaired data without the use of extra labels. By utilizing unsupervised learning to extract features, our technique shows a promising ability to generalize across different domains. Our proposed approach integrates an in-depth understanding of images using unsupervised components and the ability to maintain individual identity, which is typically seen in simpler networks. We also present an innovative concept: an asymmetric pose-based cycle consistency loss. This concept introduces flexibility to the traditional cycle consistency loss, which typically expects an original image to be perfectly reconstructed after being converted to a portrait and then reverted. In our comprehensive testing, we evaluate our method with both in-domain and out-of-domain images and benchmark it against the leading methods. Our findings reveal that our approach yields superior results, both numerically and in terms of visual quality, across three different datasets.
Open Access
HySE: a spring embedder approach for layout of hybrid graphs
(Bilkent University, 2023-09) Islam, Hamza
In recent times, the growth of data has been exponential, making the visual analysis of relational data progressively complex. Presenting such data in a visually appealing manner can help simplify the analysis process. Hybrid graphs, comprising a central directed or hierarchical part and interconnected undirected components, offer a practical structure for representing relational data with varying levels of abstraction while managing its complexity. To comprehend the relationships in data, discover insights, and identify important patterns, a well-optimized graph layout for such graphs is needed. In response, we present HySE (Hybrid Spring Embedder), a novel graph layout algorithm tailored for hybrid graphs. HySE makes use of a holistic approach based on the popular spring embedder to achieve the aesthetics and quality of an optimized force-directed layout, not only on the undirected part of the graph but also on the hierarchy, while maintaining the cohesion between the directed and undirected elements of the graph. The layout algorithm assumes the rank information of directed graph elements is already calculated with one of the popular approaches. Then, it finds appropriate initial positions and uses a force-directed layout technique to integrate the undirected parts into the layout, applying spring forces to model the edges and repulsive electric forces for the nodes. Iteratively, HySE converges to an equilibrium state with minimized energy, resulting in visually pleasing and interpretable layouts for intricate hybrid graphs. Experiments performed on graphs, generated randomly through a well-designed process, validate that HySE performs as well as the state-of-the-art algorithms in terms of quality. It also matches the speed of well-established algorithms on small-to-medium-sized graphs.
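The force model mentioned above (spring forces along edges, repulsive electric forces between nodes) can be sketched as a single generic spring-embedder iteration. This is not HySE itself: it ignores ranks and the hierarchy, and the function name, force constants, and step size are assumptions chosen for illustration.

```python
import math


def layout_step(positions, edges, rest_length=1.0, spring_k=0.1,
                repulsion_k=0.5, step=1.0):
    """Return new positions after one force-directed iteration."""
    forces = {node: [0.0, 0.0] for node in positions}

    def add_force(a, b, magnitude):
        """Push a away from b by `magnitude` (negative pulls) and b the opposite way."""
        dx = positions[a][0] - positions[b][0]
        dy = positions[a][1] - positions[b][1]
        dist = max(math.hypot(dx, dy), 1e-9)
        fx, fy = magnitude * dx / dist, magnitude * dy / dist
        forces[a][0] += fx
        forces[a][1] += fy
        forces[b][0] -= fx
        forces[b][1] -= fy

    # Repulsive "electric" force between every pair of nodes (falls off as 1/d^2).
    nodes = list(positions)
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            dist = max(math.dist(positions[a], positions[b]), 1e-9)
            add_force(a, b, repulsion_k / (dist * dist))

    # Spring force along each edge (Hooke's law around the rest length).
    for a, b in edges:
        dist = math.dist(positions[a], positions[b])
        add_force(a, b, -spring_k * (dist - rest_length))

    return {
        node: (positions[node][0] + step * forces[node][0],
               positions[node][1] + step * forces[node][1])
        for node in positions
    }


stretched = {"a": (0.0, 0.0), "b": (10.0, 0.0)}
after_spring = layout_step(stretched, edges=[("a", "b")])

crowded = {"a": (0.0, 0.0), "b": (0.5, 0.0)}
after_repulsion = layout_step(crowded, edges=[])
```

Repeating `layout_step` until the positions stop changing approximates the equilibrium state in which spring and repulsive forces balance.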
Open Access
Object detection and synthetic infrared image generation for UAV-based aerial images
(Bilkent University, 2023-09) Özkanoğlu, Mehmet Akif
This thesis contains two main works related to aerial image processing. In the first part of this thesis, we present novel approaches to detect objects in aerial images. We introduce a novel object detection algorithm based on CenterNet which, at the time this thesis was written, yielded state-of-the-art results in many metrics on many aerial benchmark datasets. In this part, we study the effect of different loss functions and architectures on the detection performance for objects in aerial images taken by UAVs. We show that our proposed approaches help improve certain aspects of the learning process for detecting objects in aerial images. To train recent deep learning-based supervised object detection algorithms, the availability of annotations is essential. Many algorithms today use both infrared (IR) and visible (RGB) image pairs as input. However, large datasets (such as VisDrone [1] or ImageNet [2]) are typically captured in the visible spectrum. Therefore, a domain transfer-based approach to artificially generate infrared equivalents of the visible images for existing datasets is presented in the second part of this thesis. Such image pairs can then be used to train object detection algorithms for either mode in future work.
Open Access
Affect and personality aware analysis of speech content for automatic estimation of depression severity
(Bilkent University, 2023-09) Gönç, Kaan
The detection of depression has gained a significant amount of scientific attention for its potential in early diagnosis and intervention. In light of this, we propose a novel approach that places exclusive emphasis on textual features for depression severity estimation. The proposed method seamlessly integrates affect (emotion and sentiment) and personality features as distinct yet interconnected modalities within a transformer-based architecture. Our key contribution lies in a masked multimodal joint cross-attention fusion, which adeptly combines the information gleaned from these different text modalities. This fusion approach empowers the model not only to discern subtle contextual cues within textual data but also to comprehend intricate interdependencies between the modalities. A comprehensive experimental evaluation is undertaken to meticulously assess the individual components comprising the proposed architecture, as well as extraneous ones that are not inherent to it. The evaluation additionally includes assessments conducted in a unimodal setting where the impact of each modality is examined individually. The findings derived from these experiments substantiate the self-contained efficacy of our architecture. Furthermore, we explore the significance of individual sentences within speech content, offering valuable insights into the contribution of specific textual cues, and we perform a segmented evaluation of the proposed method for different ranges of depression severity. Finally, we compare our method with existing state-of-the-art studies utilizing different combinations of auditory, visual, and textual features. The final results demonstrate that our method achieves promising results in depression severity estimation, outperforming the other methods.
Open Access
Modeling spatial context in transformer-based whole slide image classification
(Bilkent University, 2023-09) Erkan, Cihan
The common method for histopathology image classification is to sample small patches from the large whole slide images and make predictions based on aggregations of patch representations. Transformer models provide a promising alternative with their ability to capture long-range dependencies of patches and their potential to detect representative regions, thanks to their novel self-attention strategy. However, as sequence-based architectures, transformers are unable to directly capture the two-dimensional nature of images. Modeling the spatial context of an image for a transformer requires two steps. First, the patches of the image are ordered as a one-dimensional sequence; then, the order information is injected into the model. However, commonly used spatial context modeling methods cannot accurately capture the distribution of the patches as they are designed to work on images with a fixed size. We propose novel spatial context modeling methods in an effort to make the model aware of the spatial context of the patches, as neighboring patches usually form diagnostically relevant structures. We achieve that by generating sequences that preserve the locality of the patches. We test the generated sequences by utilizing various information injection strategies. We evaluate the performance of the proposed transformer-based whole slide image classification framework on a lung dataset obtained from The Cancer Genome Atlas. Our experimental evaluations show that the proposed sequence generation method that utilizes space-filling curves to model the spatial context performs better than both baseline and state-of-the-art methods by achieving 87.6% accuracy.
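To illustrate how a space-filling curve can turn a two-dimensional patch grid into a sequence that keeps neighboring patches close together, here is a minimal sketch using the Hilbert curve. The choice of the Hilbert curve, the square power-of-two grid, and the function names are assumptions for illustration; the abstract does not specify which curves the thesis uses or how irregular slide regions are handled.

```python
def hilbert_index(side, x, y):
    """Position of cell (x, y) along a Hilbert curve over a side x side grid.

    `side` must be a power of two. This is the standard iterative
    coordinate-to-distance conversion.
    """
    d = 0
    s = side // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the right orientation.
        if ry == 0:
            if rx == 1:
                x = side - 1 - x
                y = side - 1 - y
            x, y = y, x
        s //= 2
    return d


def order_patches(side):
    """Return all patch coordinates sorted by their Hilbert index."""
    coords = [(x, y) for y in range(side) for x in range(side)]
    return sorted(coords, key=lambda c: hilbert_index(side, c[0], c[1]))


sequence = order_patches(8)
```

Consecutive patches in `sequence` are always grid neighbors, which a plain row-by-row ordering does not guarantee at row boundaries.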
Open Access
Face manipulation detection
(Bilkent University, 2023-09) Nourmohammadi, Sepehr
Advancements in deep learning have facilitated the creation of highly realistic counterfeit human faces, ushering in the era of deepfakes. The potential to generate such convincingly authentic fake content prompts concerns due to the potential harm it could inflict on individuals and societies alike. Current studies predominantly focus on binary approaches that differentiate between real and fake images or videos. However, this approach can be time-consuming, requiring a multitude of diverse fake examples for training. Furthermore, unique deepfake content generated using different models may elude detection, making it challenging to catch all deepfakes. We propose two potential solutions. First, we suggest a one-class classification method, a purist approach that trains solely on real data and tests on both real and fake data. Second, as a non-purist approach, we use a cross-manipulation technique, which applies image manipulations in order to use unseen/unknown manipulated samples during the training of the machine learning model. Efficacy in this process can be achieved by using a combination of different models, which enhances the detection of deepfakes. This is done by merging learning-based systems involving an ℓp-norm constraint with adjustable p-norm rules, thereby providing both sparse and non-sparse solutions to enhance discriminatory information between base learners in ensemble learning. Contrary to conventional subject-independent learning methods employed in deepfake detection, we propose a subject-dependent learning approach. Our preliminary findings suggest that this multifaceted approach can effectively detect deepfakes, demonstrating impressive results on the FaceForensics++ dataset as well as on generic one-class classification datasets, including the UCI and KEEL datasets, in both pure and non-pure approaches.
Open Access
Real image editing with StyleGAN
(Bilkent University, 2023-09) Pehlivan, Hamza
We present a novel image inversion framework and a training pipeline to achieve high-fidelity image inversion with high-quality attribute editing. Inverting real images into StyleGAN’s latent space is an extensively studied problem, yet the trade-off between image reconstruction fidelity and image editing quality remains an open challenge. Low-rate latent spaces are limited in their expressive power for high-fidelity reconstruction. On the other hand, high-rate latent spaces result in degradation in editing quality. In this work, to achieve high-fidelity inversion, we learn residual features in higher latent codes that lower latent codes were not able to encode. This enables preserving image details in reconstruction. To achieve high-quality editing, we learn how to transform the residual features for adapting to manipulations in latent codes. We train the framework to extract residual features and transform them via a novel architecture pipeline and cycle consistency losses. We run extensive experiments and compare our method with state-of-the-art inversion methods. Qualitative metrics and visual comparisons show significant improvements.
Open Access
CMGV: a unified framework for complexity management in graph visualization
(Bilkent University, 2023-08) Zafar, Osama
In today’s era of technological revolution, the sheer volume of data being produced poses a significant challenge for analyzing relational data of such scale, particularly in terms of visual analysis. Graphs provide an effective way of organizing and representing relational data: with nodes representing entities and edges representing relationships, a comprehensive and intuitive view of complex large-scale data is created. A well-represented visualization of complex graphs allows users to understand relationships, uncover new insights, and discover hidden patterns. To this end, we introduce a complexity management framework for effectively analyzing large-scale relational data represented as graphs. Existing methods for managing graph complexity work independently and may lead to inconsistencies and confusion when applied consecutively. The Complexity Management Graph Visualization framework (CMGV) presents a novel approach integrating commonly used complexity management techniques while ensuring the preservation of the user’s mental map through a specialized layout algorithm. The framework introduces an intuitive Graph Complexity Management Model (CMGM) for both graph representation and complexity management. CMGV supports commonly utilized complexity management tasks, including filtering, hiding, showing, collapsing, and expanding graph elements. Importantly, CMGV is designed to be independent of the rendering method and can be seamlessly integrated with different graph rendering libraries. This is possible through an extension that synchronizes the graph models between the rendering library and CMGM. Our experiments performed on randomly generated graphs verify that CMGV flawlessly performs consecutive graph complexity management operations, leaving the user graph intact, and outperforms existing complexity management solutions in terms of both runtime and generally accepted graph layout criteria. It is fast enough to be used in interactive applications with small to medium-sized graphs.
Open Access
Whole genome alignment via Alternating Lyndon Factorization Tree traversal
(Bilkent University, 2023-07) Aydın, Mahmud Sami
The Whole Genome Alignment Problem (WGA) is an important challenge in the field of genomics, especially in the context of pangenome construction. Here we propose a novel indexing structure called the Alternating Lyndon Factorization Tree (ALFTree), which incorporates both spatial and lexicographical information within its nodes. The ALFTree is a powerful tool for WGA, as it can efficiently store and retrieve information about large DNA sequences. We present an algorithm, namely Idoneous, specifically designed to construct the ALFTree from a given DNA sequence. The algorithm works by generating intervals of specific sizes, identifying matches within these intervals, and performing a sanity check through alignment procedures. The algorithm is efficient and scalable, making it a valuable tool for WGA. Some of the key features of the ALFTree are 1) compact and efficient data structure for storing large DNA sequences; 2) efficient retrieval of information about specific regions of a DNA sequence; 3) ability to handle both spatial and lexicographical information; and 4) scalability to large DNA sequences. Our experimental results on different genomes highlight the effects of parameter selections on coverage and identity. Idoneous demonstrates competitive performance in terms of coverage and provides flexibility in adjusting sensitivity and specificity for different alignment scenarios. The ALFTree has the potential to significantly improve the performance of WGA algorithms. We believe that the ALFTree is a valuable contribution to the field of genomics, and we hope that it will be used by researchers to accelerate the pace of discovery.
Open Access
System-on-chip memory design for a domain-specific RISC-V processor
(Bilkent University, 2023-05) Gülgeç, Utku
The use of graph applications is common in many areas; however, irregular and data-driven memory access patterns combined with the large sizes of graph data result in performance loss in general-purpose computing systems. Existing studies proposed hardware accelerators, often implemented on FPGAs, to alleviate performance problems and improve energy efficiency while placing less emphasis on programmability and flexibility. This thesis presents a hardware implementation of a domain-specific processor design for graph applications on a system-on-chip (SoC) platform accompanied by a design of a memory framework. The proposed system architecture improves the micro-architecture of a baseline design, integrates the baseline with an efficient system-on-chip bus communication protocol, and compares alternative memory framework implementations. The hardware is implemented on a state-of-the-art evaluation board. Popular graph benchmarks are used for performance evaluations of the implemented system, and various sensitivity analyses are performed on newly added system parameters to determine the optimal system configuration. An analysis of power consumption and resource utilization is also provided. Overall, average speed-ups vary between 15% and 25% depending on the benchmark and graph data, while on-chip power consumption varies between 3.8 and 4.2 Watts depending on the system clock frequency.
Open Access
Local context based linear text segmentation
(Bilkent University, 2014-02) Erdem, Hayrettin
Understanding the topical structure of text documents is important for effective retrieval and browsing, automatic summarization, and tasks related to identifying, clustering and tracking documents about their topics. Although documents often display structural organization and contain explicit section markers, some lack such properties, thereby revealing the need for topical text segmentation systems. Examples of such documents are speech transcripts and inherently unstructured texts like newspaper columns and blog entries discussing several subjects in a discourse. A novel local-context based approach depending on lexical cohesion is presented for linear text segmentation, which is the task of dividing text into a linear sequence of coherent segments. As the lexical cohesion indicator, the proposed technique exploits relationships among terms induced from a semantic space called HAL (Hyperspace Analogue to Language), which is built by examining co-occurrence of terms through passing a fixed-sized window over text. The proposed algorithm (BTS) iteratively discovers topical shifts by examining the most relevant sentence pairs in a block of sentences considered at each iteration. The technique is evaluated on both error-free speech transcripts of news broadcasts and documents formed by concatenating different topical regions of text. A new corpus for Turkish is automatically built where each document is formed by concatenating different news articles. For performance comparison, two state-of-the-art methods, TextTiling and C99, are leveraged, and the results show that the proposed approach has comparable performance with these two techniques. The results are also statistically validated by applying ANOVA and Tukey post-hoc tests.
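The HAL construction mentioned above (passing a fixed-size window over text and recording term co-occurrences) can be sketched as follows. The sketch uses the usual HAL weighting, in which closer words receive larger weights, and records only preceding-word co-occurrences; the function name, the window size, and the example sentence are assumptions for illustration and do not reproduce the thesis's BTS algorithm.

```python
from collections import defaultdict


def build_hal(tokens, window=3):
    """Build a sparse HAL-style co-occurrence table.

    hal[word][previous_word] accumulates a weight of
    (window - distance + 1) for every previous word found
    within `window` positions before `word`.
    """
    hal = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(tokens):
        for distance in range(1, window + 1):
            j = i - distance
            if j < 0:
                break  # ran off the start of the text
            hal[word][tokens[j]] += window - distance + 1
    return hal


tokens = "the cat sat on the mat".split()
hal = build_hal(tokens, window=2)
```

Each word's row in the table can then serve as its vector in the semantic space, so that related terms can be compared, for example with cosine similarity, when judging lexical cohesion between sentences.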
Open Access
Image-to-image translation for face attribute editing with disentangled latent directions
(Bilkent University, 2023-06) Dalva, Yusuf
We propose an image-to-image translation framework for facial attribute editing with disentangled interpretable latent directions. The facial attribute editing task faces the challenges of targeted attribute editing with controllable strength and of disentanglement in the representations of attributes to preserve the other attributes during edits. For this goal, inspired by the latent space factorization works of fixed pretrained GANs, we design the attribute editing by latent space factorization, and for each attribute, we learn a linear direction that is orthogonal to the others. We train these directions with orthogonality constraints and disentanglement losses. To project images to semantically organized latent spaces, we set an encoder-decoder architecture with attention-based skip connections. We extensively compare with previous image translation algorithms and with works that edit using pretrained GANs. Our extensive experiments show that our method significantly improves over the state of the art.
Open Access
Identification of protein-protein interaction bridges for multiple sclerosis
(Bilkent University, 2022-12) Yazıcı, Gözde
Identifying and prioritizing disease-related proteins is an important scientific problem to understand disease etiology. Network science has become an important discipline to prioritize such proteins. Multiple sclerosis (MS), an autoimmune disease which still cannot be cured, is characterized by a damaging process called demyelination. Demyelination is the destruction of the crucial nerve sheath, myelin, and oligodendrocytes, the cells producing myelin, by immune cells. Identifying the proteins having special features on the network formed by the proteins of oligodendrocyte and immune cells can reveal useful information about the disease. To this end, we investigated the most significant protein pairs for the intra- and intercellular protein networks that we define as bridges among the proteins providing the interaction between the two cells in demyelination. We analyzed two protein networks including the oligodendrocyte and each type of two immune cells, macrophage and T-cell. We developed a model called BriFin that prioritizes contact protein pairs using network analysis techniques and integer programming. We showed that several proteins it prioritized have already been associated with MS in the relevant literature. For the oligodendrocyte-macrophage network, we showed that 77% to 100% of the proteins BriFin detected, depending on the parametrization, are MS-associated. We further experimentally investigated 4 proteins prioritized by BriFin, and observed that the mRNA expression levels of 2 out of these 4 proteins significantly decreased in a group of MS patients. We therefore present a model, BriFin, which can be used to analyze processes where interactions of two cell types play an important role.
Open Access
Impact of code review process smells on code smells
(Bilkent University, 2023-01) Tuna, Erdem
The code review process is conducted by software teams with various motivations. Among other goals, code reviews act as a gatekeeper for software quality. Software quality comprises several aspects, maintainability (i.e., code quality) being one of them. In this study, we explore whether code review process quality (as evidenced by the presence of code review process smells) influences software maintainability (as evidenced by the presence of code smells). In other words, we investigate whether smells in the code review process are related to smells in the code that was reviewed by using correlation analysis. We augment our quantitative analysis with a focus group study to learn practitioners’ opinions. Contrary to our own intuition and that of the practitioners in our focus groups, we found that code review process smells have little to no correlation with the level of code smells. Further investigations revealed that the level of code smells neither increases nor decreases in 8 out of 10 code reviews, regardless of the quality of the code review. We identified multiple potential reasons behind the counter-intuitive results based on our focus group data. Furthermore, practitioners still believe that code reviews are helpful in improving software quality. Our results imply that the community should update our goals for code review practices and reevaluate those practices to align them with more relevant and modern realities.
Open Access
Anatomic context-aware segmentation of organs-at-risk in thorax computed tomography scans
(Bilkent University, 2022-12) Khattak, Haya Shamim Khan
Organ segmentation plays a crucial role in disease diagnosis and radiation therapy planning. Efficient and automated segmentation of the organs-at-risk (OARs) requires immediate attention since manual segmentation is a time-consuming and costly task that is also prone to inter-observer variability. Automatic segmentation of organs-at-risk using deep learning is prone to predicting extraneous regions, especially in apical and basal slices of the organs where the shape is different from the center slices. This thesis presents a novel method to incorporate prior knowledge on shape and anatomical context into deep-learning based organ segmentation. This prior knowledge is quantified using distance transforms that capture characteristics of the shape, location, and relation of the organ position with respect to the surrounding organs. In this thesis, the role of various distance transform maps has been explored to show that using distance transform regression, alone or in conjunction with classification, improves the overall performance of the organ segmentation network. These maps can be the distance between each pixel and the center of the organ, or the closest distance between two organs, such as the esophagus and the spine. When used in a single-task regression model, these distance maps improved the segmentation results. Moreover, when used in a multi-task network with classification being the other task, they acted as regularizers for the classification task and yielded improved segmentations. The experiments were conducted on a computed tomography (CT) thorax dataset of 265 patients, and the organs of interest are the heart, the esophagus, the lungs, and the spine. The results revealed a significant increase in F-scores and a decrease in the Hausdorff distances for the OARs when segmented using the proposed model compared to the baseline network architectures.
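One of the distance maps described above, the distance between each pixel and the center of the organ, can be sketched as follows. The sketch assumes a 2D binary mask given as nested lists, takes the centroid of the mask as the organ center, and uses Euclidean distance in pixel units; these choices and the function names are assumptions for illustration, not details taken from the thesis.

```python
import math


def centroid(mask):
    """Mean (row, column) of all foreground pixels in a binary mask."""
    points = [
        (r, c)
        for r, row in enumerate(mask)
        for c, value in enumerate(row)
        if value
    ]
    if not points:
        raise ValueError("mask contains no foreground pixels")
    count = len(points)
    return (sum(r for r, _ in points) / count,
            sum(c for _, c in points) / count)


def center_distance_map(mask):
    """Euclidean distance from every pixel to the mask centroid."""
    center_r, center_c = centroid(mask)
    return [
        [math.hypot(r - center_r, c - center_c) for c in range(len(row))]
        for r, row in enumerate(mask)
    ]


# Hypothetical 3 x 3 slice with a single foreground pixel in the middle.
organ_mask = [
    [0, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]
distances = center_distance_map(organ_mask)
```

A map like `distances` could serve as a per-pixel regression target, either on its own or alongside the usual per-pixel class labels in a multi-task setup.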