Browsing by Subject "Deep learning"
Now showing 1 - 20 of 117
Item Open Access A diffusion-based reconstruction technique for single pixel camera (IEEE - Institute of Electrical and Electronics Engineers, 2023-08-28) Güven, Baturalp; Güngör, A.; Bahçeci, M. U.; Çukur, Tolga
Single-pixel imaging enables high-resolution imaging through multiple coded measurements based on low-resolution snapshots. To reconstruct a high-resolution image from these coded measurements, an ill-posed inverse problem must be solved. Despite the recent popularity of deep learning-based methods for single-pixel image reconstruction, they fall short in preserving spatial details and achieving stable reconstructions. Diffusion-based methods, which have gained attention in recent years, offer a solution to this problem. In this study, to the best of our knowledge, single-pixel image reconstruction is performed for the first time using a denoising diffusion probabilistic model. The proposed method reconstructs the image by conditioning it towards the least-squares solution while preserving data consistency after unconditional training of the model. The proposed method is compared against existing single-pixel imaging methods, and ablation studies are conducted to demonstrate the contribution of individual model components. The proposed method outperforms competing methods in both quantitative measurements and visual quality.

Item Embargo A new CNN-LSTM architecture for activity recognition employing wearable motion sensor data: enabling diverse feature extraction (Elsevier, 2023-06-28) Koşar, Enes; Barshan, Billur
Extracting representative features to recognize human activities through the use of wearables is an area of ongoing research. While hand-crafted features and machine learning (ML) techniques have been investigated thoroughly in the past, the use of deep learning (DL) techniques is the current trend. Specifically, Convolutional Neural Networks (CNNs), Long Short-Term Memory networks (LSTMs), and hybrid models have been investigated.
We propose a novel hybrid network architecture to recognize human activities through the use of wearable motion sensors and DL techniques. The LSTM and the 2D CNN branches of the model run in parallel and receive the raw signals and their spectrograms, respectively. We concatenate the features extracted at each branch and use them for activity recognition. We compare the classification performance of the proposed network with six commonly used network architectures, three single and three hybrid: 1D CNN, 2D CNN, LSTM, standard 1D CNN-LSTM, the 1D CNN-LSTM proposed by Ordóñez and Roggen, and an alternative 1D CNN-LSTM model. We tune the hyper-parameters of six of the models using Bayesian optimization and test the models on two publicly available datasets. The comparison between the seven networks is based on four performance metrics and complexity measures. Because of the stochastic nature of DL algorithms, we provide the average values and standard deviations of the performance metrics over ten repetitions of each experiment. The proposed 2D CNN-LSTM architecture achieves the highest average accuracies of 95.66% and 92.95% on the two datasets, which are, respectively, 2.45% and 3.18% above those of the 2D CNN model that ranks second. This improvement is a consequence of the proposed model enabling the extraction of a broader range of complementary features that comprehensively represent human activities. We evaluate the complexities of the networks in terms of the total number of parameters, model size, training/testing time, and the number of floating-point operations (FLOPs).
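The two-branch design described above (raw signals into an LSTM, spectrograms into a 2D CNN, features concatenated before classification) can be sketched in NumPy. All layer sizes, weights, and the crude pooling below are illustrative placeholders, not the paper's actual configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d_valid(x, k):
    """Naive 'valid' 2D cross-correlation, standing in for a CNN layer."""
    H, W = x.shape
    kh, kw = k.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def lstm_last_hidden(seq, Wx, Wh, b):
    """Run a single-layer LSTM over seq (T, D) and return the final hidden state."""
    H = Wh.shape[0]
    h, c = np.zeros(H), np.zeros(H)
    sigm = lambda z: 1.0 / (1.0 + np.exp(-z))
    for x_t in seq:
        z = x_t @ Wx + h @ Wh + b          # all four gates computed at once
        i, f, g, o = np.split(z, 4)
        i, f, o = sigm(i), sigm(f), sigm(o)
        c = f * c + i * np.tanh(g)
        h = o * np.tanh(c)
    return h

# Toy inputs: raw tri-axial signal (LSTM branch), its "spectrogram" (CNN branch).
T, D, H = 128, 3, 8
signal = rng.standard_normal((T, D))
spectrogram = rng.standard_normal((16, 16))

Wx = rng.standard_normal((D, 4 * H)) * 0.1
Wh = rng.standard_normal((H, 4 * H)) * 0.1
b = np.zeros(4 * H)

lstm_feat = lstm_last_hidden(signal, Wx, Wh, b)                               # (8,)
cnn_feat = conv2d_valid(spectrogram, rng.standard_normal((3, 3))).max(axis=1)  # (14,)

fused = np.concatenate([lstm_feat, cnn_feat])  # joint feature vector for the classifier head
print(fused.shape)
```

In the actual model each branch would be deeper and the concatenated vector would feed a trained classification layer; the sketch only shows the parallel-branch fusion idea.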
We also compare the results of the proposed network with those of recent related studies that use the same datasets.

Item Open Access A transformer-based prior legal case retrieval method (IEEE - Institute of Electrical and Electronics Engineers, 2023-08-28) Öztürk, Ceyhun Emre; Özçelik, Şemsi Barış; Koç, Aykut
In this work, BERTurk-Legal, a transformer-based language model, is introduced to retrieve prior legal cases. BERTurk-Legal is pre-trained on a dataset from the Turkish legal domain. This dataset does not contain any labels related to the prior court case retrieval task. Masked language modeling is used to train BERTurk-Legal in a self-supervised manner. With zero-shot classification, BERTurk-Legal provides state-of-the-art results on the dataset consisting of legal cases of the Court of Cassation of Turkey. The results of the experiments show the necessity of developing language models specific to the Turkish law domain.

Item Open Access A transformer-based real-time focus detection technique for wide-field interferometric microscopy (IEEE - Institute of Electrical and Electronics Engineers, 2023-08-28) Polat, Can; Güngör, A.; Yorulmaz, M.; Kızılelma, B.; Çukur, Tolga
Wide-field interferometric microscopy (WIM) has been utilized for visualization of individual biological nanoparticles with high sensitivity. However, image quality is strongly affected by the focusing of the image; hence, focus detection has been an active research field within the scope of imaging and microscopy. To tackle this issue, we propose a novel convolution- and transformer-based deep learning technique to detect focus in WIM. The method is compared to other focus detection techniques and obtains higher precision with fewer parameters.
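A transformer-style focus scorer of the kind described (convolutional or patch features followed by self-attention and a scalar regression head) might be sketched as follows; the patch tokenization, dimensions, and random weights are illustrative assumptions, not the published architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention; returns outputs and weights."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)          # row-wise softmax
    return w @ V, w

# Tokenize an interferometric frame into flattened 8x8 patches
# (a toy stand-in for the convolutional stem).
frame = rng.standard_normal((32, 32))
patches = np.array([frame[i:i + 8, j:j + 8].ravel()
                    for i in range(0, 32, 8) for j in range(0, 32, 8)])  # (16, 64)

d = 16
Wq, Wk, Wv = (rng.standard_normal((64, d)) * 0.1 for _ in range(3))
tokens, attn = self_attention(patches, Wq, Wk, Wv)

w_out = rng.standard_normal(d) * 0.1
focus_score = float(tokens.mean(axis=0) @ w_out)   # scalar in-focus estimate
print(round(focus_score, 4))
```

A real system would train these weights against ground-truth focus labels; the sketch only shows why such a model can be light on parameters, since one attention layer is shared across all patch tokens.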
Furthermore, the model achieves real-time focus detection thanks to its low inference time.

Item Open Access Affect and personality aware analysis of speech content for automatic estimation of depression severity (2023-09) Gönç, Kaan
The detection of depression has gained a significant amount of scientific attention for its potential in early diagnosis and intervention. In light of this, we propose a novel approach that places exclusive emphasis on textual features for depression severity estimation. The proposed method seamlessly integrates affect (emotion and sentiment) and personality features as distinct yet interconnected modalities within a transformer-based architecture. Our key contribution lies in a masked multimodal joint cross-attention fusion, which adeptly combines the information gleaned from these different text modalities. This fusion approach empowers the model not only to discern subtle contextual cues within textual data but also to comprehend intricate interdependencies between the modalities. A comprehensive experimental evaluation is undertaken to meticulously assess the individual components comprising the proposed architecture, as well as extraneous ones that are not inherent to it. The evaluation additionally includes assessments conducted in a unimodal setting, where the impact of each modality is examined individually. The findings derived from these experiments substantiate the self-contained efficacy of our architecture. Furthermore, we explore the significance of individual sentences within speech content, offering valuable insights into the contribution of specific textual cues, and we perform a segmented evaluation of the proposed method for different ranges of depression severity. Finally, we compare our method with existing state-of-the-art studies utilizing different combinations of auditory, visual, and textual features.
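The masked cross-attention fusion described above, in which one text modality queries another while masked (e.g., padding) positions are excluded, can be sketched minimally; the modality names, feature sizes, and mask below are made-up illustrations, not the paper's setup:

```python
import numpy as np

rng = np.random.default_rng(0)

def masked_cross_attention(Q_in, KV_in, mask):
    """Queries from one modality attend to another; masked keys get zero weight."""
    d = Q_in.shape[1]
    scores = Q_in @ KV_in.T / np.sqrt(d)
    scores = np.where(mask[None, :], scores, -1e9)   # exclude invalid positions
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)                # row-wise softmax
    return w @ KV_in, w

# Toy sentence-level features for two text-derived modalities.
emotion = rng.standard_normal((5, 16))    # 5 sentences, 16-dim affect features
persona = rng.standard_normal((5, 16))    # 16-dim personality features
mask = np.array([True, True, True, False, False])  # last two positions are padding

fused, w = masked_cross_attention(emotion, persona, mask)
print(fused.shape)
```

Joint fusion over more than two modalities repeats this pattern with each modality taking a turn as the query side, then combines the outputs.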
The final results demonstrate that our method achieves promising performance in depression severity estimation, outperforming the other methods.

Item Open Access Analysis of speech content and voice for deceit detection (2024-09) Eskin, Maria Raluca
Deceptive behavior is part of daily life, often going unrecognized and leading to severe repercussions. With the recent improvements in machine learning, more reliable detection of deceit appears to be possible. Although current visual and multimodal models can identify deception with adequate precision, the individual use of speech content or voice still performs poorly. Therefore, we systematically analyze these essential communication forms, focusing on feature extraction and optimization for deceit detection. To this end, we assess the reliability of employing transformers, spatial and temporal architectures, state-of-the-art pre-trained models, and handcrafted representations to detect deceit patterns. Furthermore, we conduct a thorough analysis to comprehend the distinct properties and discriminative power of the evaluated methods. The results demonstrate that speech content (transcribed text) provides more information than vocal characteristics. In addition, transformer architectures are found to be effective in representation learning and modeling, providing insights into optimal model configurations for deceit detection.

Item Open Access Anatomic context-aware segmentation of organs-at-risk in thorax computed tomography scans (2022-12) Khattak, Haya Shamim Khan
Organ segmentation plays a crucial role in disease diagnosis and radiation therapy planning. Efficient and automated segmentation of the organs-at-risk (OARs) requires immediate attention, since manual segmentation is a time-consuming and costly task that is also prone to inter-observer variability.
Automatic segmentation of organs-at-risk using deep learning is prone to predicting extraneous regions, especially in apical and basal slices of the organs, where the shape differs from the center slices. This thesis presents a novel method to incorporate prior knowledge on shape and anatomical context into deep learning-based organ segmentation. This prior knowledge is quantified using distance transforms that capture characteristics of the shape, location, and relation of the organ position with respect to the surrounding organs. In this thesis, the role of various distance transform maps is explored to show that using distance transform regression, alone or in conjunction with classification, improves the overall performance of the organ segmentation network. These maps can encode the distance between each pixel and the center of the organ, or the closest distance between two organs, such as the esophagus and the spine. When used in a single-task regression model, these distance maps improved the segmentation results. Moreover, when used in a multi-task network with classification as the other task, they acted as regularizers for the classification task and yielded improved segmentations. The experiments were conducted on a computed tomography (CT) thorax dataset of 265 patients, and the organs of interest are the heart, the esophagus, the lungs, and the spine. The results revealed a significant increase in F-scores and a decrease in the Hausdorff distances for the OARs when segmented using the proposed model compared to the baseline network architectures.

Item Open Access Anomaly detection in diverse sensor networks using machine learning (2022-01) Akyol, Ali Alp
Earthquake precursor detection is one of the oldest research areas, with the potential to save human lives. Recent studies have highlighted the fact that strong seismic activities and earthquakes affect the electron distribution of the ionosphere.
These effects are clearly observable in the ionospheric Total Electron Content (TEC), which can be measured by using the satellite position data of the Global Navigation Satellite System (GNSS). In this dissertation, several earthquake precursor detection techniques are proposed, and their precursor detection performances are investigated on TEC data obtained from different sensor networks. First, a model-based earthquake precursor detection technique is proposed to detect precursors of earthquakes with magnitudes greater than 5 in the vicinity of Turkey. Precursor detection and TEC reliability signals are generated by using ionospheric TEC variations. These signals are thresholded to obtain earthquake precursor decisions. Earthquake precursor detections are made by applying the Particle Swarm Optimization (PSO) technique to these precursor decisions. Performance evaluations show that the proposed technique is able to detect 14 out of 23 precursors of earthquakes with magnitudes larger than 5 on the Richter scale, while generating 8 false precursor decisions. Second, a machine learning-based earthquake precursor detection technique, EQ-PD, is proposed to detect precursors of earthquakes with magnitudes greater than 4 in the vicinity of Italy. Spatial and spatio-temporal anomaly detection thresholds are obtained by using the statistics of TEC variation during seismically active times and are applied to a TEC variation-based anomaly detection signal to form precursor decisions. The resulting spatial and spatio-temporal anomaly decisions are fed to a Support Vector Machine (SVM) classifier to generate earthquake precursor detections. When the precursor detection performance of EQ-PD is investigated, it is observed that the technique is able to detect 22 out of 24 earthquake precursors, while generating 13 false precursor decisions during 147 days of no seismic activity.
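The thresholding step common to these pipelines, deriving an anomaly threshold from quiet-time TEC statistics and flagging excursions as candidate precursor decisions, can be illustrated on synthetic data. The series, window length, and 5-sigma threshold are illustrative choices, not the tuned EQ-PD parameters:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic TEC series (TECU): quiet background plus an injected disturbance
# mimicking a pre-seismic anomaly.
tec = rng.normal(20.0, 0.5, 240)
tec[200:205] += 5.0

# Threshold derived from quiet-time statistics only (first 150 samples).
mu, sigma = tec[:150].mean(), tec[:150].std()
decisions = np.abs(tec - mu) > 5.0 * sigma    # binary precursor decisions

print(np.flatnonzero(decisions))
```

In the actual EQ-PD technique such binary decisions, computed spatially and spatio-temporally, are then fed to an SVM classifier rather than being reported directly.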
Last, a deep learning-based earthquake precursor detection technique, DL-PD, is proposed to detect precursors of earthquakes with magnitudes greater than 5.4 in the vicinity of the Anatolia region. The DL-PD technique utilizes a deep neural network with spatio-temporal Global Ionospheric Map (GIM)-TEC data estimation capabilities. A GIM-TEC anomaly score is obtained by comparing GIM-TEC estimates with GIM-TEC recordings. Earthquake precursor detections are generated by thresholding the GIM-TEC anomaly scores. Precursor detection performance evaluations show that DL-PD can detect 5 out of 7 earthquake precursors, while generating 1 false precursor decision during 416 days of no seismic activity.

Item Open Access Applying deep learning in augmented reality tracking (IEEE, 2016-11-12) Akgül, Ömer; Penekli, H. I.; Genç, Y.
An existing deep learning architecture has been adapted to solve the detection problem in camera-based tracking for augmented reality (AR). A known target, in this case a planar object, is rendered under various viewing conditions, including varying orientation, scale, illumination, and sensor noise. The resulting corpus is used to train a convolutional neural network to match given patches in an incoming image. The results show comparable or better performance compared to state-of-the-art methods. The timing performance of the detector needs improvement, but promising results are shown when it is considered in conjunction with the robust pose estimation process. © 2016 IEEE.

Item Open Access Artificial intelligence-based hybrid anomaly detection and clinical decision support techniques for automated detection of cardiovascular diseases and Covid-19 (2023-10) Terzi, Merve Begüm
Coronary artery diseases are the leading cause of death worldwide, and early diagnosis is crucial for timely treatment.
To address this, we present a novel automated artificial intelligence-based hybrid anomaly detection technique composed of various signal processing, feature extraction, supervised, and unsupervised machine learning methods. By jointly and simultaneously analyzing 12-lead electrocardiogram (ECG) and cardiac sympathetic nerve activity (CSNA) data, the technique performs fast, early, and accurate diagnosis of coronary artery diseases. To develop and evaluate the proposed technique, we utilized the fully labeled STAFF III and PTBD databases, which contain 12-lead wideband raw recordings non-invasively acquired from 260 subjects. Using the wideband raw recordings in these databases, we developed a signal processing technique that simultaneously detects the 12-lead ECG and CSNA signals of all subjects. Subsequently, using the pre-processed 12-lead ECG and CSNA signals, we developed a time-domain feature extraction technique that extracts the statistical CSNA and ECG features critical for the reliable diagnosis of coronary artery diseases. Using the extracted discriminative features, we developed a supervised classification technique based on artificial neural networks that simultaneously detects anomalies in the 12-lead ECG and CSNA data. Furthermore, we developed an unsupervised clustering technique based on the Gaussian mixture model and the Neyman-Pearson criterion that performs robust detection of the outliers corresponding to coronary artery diseases. Using this hybrid anomaly detection technique, we have demonstrated a significant association between increases in the amplitude of the CSNA signal and anomalies in the ECG signal during coronary artery diseases.
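For a single Gaussian component, a Neyman-Pearson-flavored outlier rule of the kind described (fix the false-alarm rate on the healthy class, flag anything less likely) reduces to thresholding the Mahalanobis distance at a quantile of the healthy-class scores. The 2-D features, sample sizes, and 5% false-alarm rate below are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)

# Feature vectors from "healthy" recordings (hypothetical 2-D features).
normal = rng.normal(0.0, 1.0, (500, 2))

mu = normal.mean(axis=0)
cov = np.cov(normal.T)
inv = np.linalg.inv(cov)

def mahal2(x):
    """Squared Mahalanobis distance to the fitted healthy-class Gaussian."""
    d = x - mu
    return d @ inv @ d

# Neyman-Pearson-style threshold: admit a fixed 5% false-alarm rate on the
# healthy class; anything beyond it is declared an outlier (disease candidate).
scores = np.array([mahal2(x) for x in normal])
thr = np.quantile(scores, 0.95)

outlier = np.array([6.0, -6.0])
print(mahal2(outlier) > thr)
```

With a full Gaussian mixture the same idea applies to the mixture log-likelihood instead of a single component's distance.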
The automated artificial intelligence-based hybrid anomaly detection technique performed highly reliable detection of coronary artery diseases, with a sensitivity of 98.48%, specificity of 97.73%, accuracy of 98.11%, positive predictive value (PPV) of 97.74%, negative predictive value (NPV) of 98.47%, and F1-score of 98.11%. Hence, it has superior performance compared to the gold-standard diagnostic test, the ECG, in diagnosing coronary artery diseases. Additionally, it outperformed other techniques developed in this study that separately utilize either only CSNA data or only ECG data. Therefore, it significantly increases the detection performance for coronary artery diseases by taking advantage of the diversity in different data types and leveraging their strengths. Furthermore, its performance is comparatively better than that of most previously proposed machine and deep learning methods that exclusively used ECG data to diagnose or classify coronary artery diseases. It also has a very short implementation time, which is highly desirable for real-time detection of coronary artery diseases in clinical practice. The proposed technique may serve as an efficient decision-support system to increase physicians' success in achieving fast, early, and accurate diagnosis of coronary artery diseases. It may be particularly beneficial and valuable for asymptomatic coronary artery disease patients, for whom the diagnostic information provided by the ECG alone is not sufficient to reliably diagnose the disease. Hence, it may significantly improve patient outcomes, enable timely treatments, and reduce the mortality associated with cardiovascular diseases.
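The figures reported above are standard confusion-matrix statistics. A small helper, with illustrative counts rather than the study's data, shows how each is derived:

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard binary diagnostic statistics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),        # recall / true positive rate
        "specificity": tn / (tn + fp),        # true negative rate
        "PPV": tp / (tp + fp),                # positive predictive value (precision)
        "NPV": tn / (tn + fn),                # negative predictive value
        "F1": 2 * tp / (2 * tp + fp + fn),    # harmonic mean of PPV and sensitivity
    }

m = diagnostic_metrics(tp=90, fp=5, fn=10, tn=95)   # illustrative counts only
print({k: round(v, 4) for k, v in m.items()})
```

Accuracy, also reported in the abstract, would simply be (tp + tn) / (tp + fp + fn + tn).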
Secondly, we propose a new automated artificial intelligence-based hybrid clinical decision support technique that jointly analyzes reverse transcriptase-polymerase chain reaction (RT-PCR) curves, thorax computed tomography images, and laboratory data to perform fast and accurate diagnosis of Coronavirus disease 2019 (COVID-19). For this purpose, we retrospectively created the fully labeled Ankara University Faculty of Medicine COVID-19 (AUFM-CoV) database, which contains a wide variety of medical data, including RT-PCR curves, thorax computed tomography images, and laboratory data. The AUFM-CoV is the most comprehensive database that includes thorax computed tomography images of COVID-19 pneumonia (CVP), other viral and bacterial pneumonias (VBP), and parenchymal lung diseases (PLD), all of which present significant challenges for differential diagnosis. The proposed technique is an ensemble learning technique consisting of two preprocessing methods, a long short-term memory network-based deep learning method, a convolutional neural network-based deep learning method, and an artificial neural network-based machine learning method. By jointly analyzing RT-PCR curves, thorax computed tomography images, and laboratory data, it benefits from the diversity in different data types that are critical for the reliable detection of COVID-19 and leverages their strengths. The multi-class classification performance results of the proposed convolutional neural network-based deep learning method on the AUFM-CoV database showed that it achieved highly reliable detection of COVID-19, with a sensitivity of 91.9%, specificity of 92.5%, precision of 80.4%, and F1-score of 86%. Therefore, it outperformed thorax computed tomography in terms of the specificity of COVID-19 diagnosis.
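One common way to combine heterogeneous base learners like those described is soft voting over per-class probabilities; the abstract does not specify the exact combination rule, so the averaging below, the probabilities, and the class list are purely illustrative:

```python
import numpy as np

# Per-class probabilities from three hypothetical base models
# (e.g., LSTM on RT-PCR curves, CNN on CT images, ANN on laboratory data)
# over the classes [CVP, VBP, PLD].
p_lstm = np.array([0.70, 0.20, 0.10])
p_cnn  = np.array([0.55, 0.35, 0.10])
p_ann  = np.array([0.40, 0.45, 0.15])

p_ens = (p_lstm + p_cnn + p_ann) / 3.0   # soft vote: average the probabilities
label = int(np.argmax(p_ens))

print(p_ens.round(3), label)
```

Weighted averages or a meta-classifier trained on the base models' outputs (stacking) are equally common ensembling choices.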
Moreover, the convolutional neural network-based deep learning method has been shown to very successfully distinguish COVID-19 pneumonia (CVP) from other viral and bacterial pneumonias (VBP) and parenchymal lung diseases (PLD), which exhibit very similar radiological findings. Therefore, it has great potential to be successfully used in the differential diagnosis of pulmonary diseases containing ground-glass opacities. The binary classification performance results of the proposed convolutional neural network-based deep learning method showed that it achieved a sensitivity of 91.5%, specificity of 94.8%, precision of 85.6%, and F1-score of 88.4% in diagnosing COVID-19. Hence, it has comparable sensitivity to thorax computed tomography in diagnosing COVID-19. Additionally, the binary classification performance results of the proposed long short-term memory network-based deep learning method on the AUFM-CoV database showed that it performed highly reliable detection of COVID-19, with a sensitivity of 96.6%, specificity of 99.2%, precision of 98.1%, and F1-score of 97.3%. Thus, it outperformed the gold-standard RT-PCR test in terms of the sensitivity of COVID-19 diagnosis. Furthermore, the multi-class classification performance results of the proposed automated artificial intelligence-based hybrid clinical decision support technique on the AUFM-CoV database showed that it diagnosed COVID-19 with a sensitivity of 66.3%, specificity of 94.9%, precision of 80%, and F1-score of 73%. Hence, it has been shown to very successfully perform the differential diagnosis of COVID-19 pneumonia (CVP) and other pneumonias. The binary classification performance results of the technique revealed that it diagnosed COVID-19 with a sensitivity of 90%, specificity of 92.8%, precision of 91.8%, and F1-score of 90.9%. Therefore, it exhibits superior sensitivity and specificity compared to laboratory data in COVID-19 diagnosis.
The performance results of the proposed automated artificial intelligence-based hybrid clinical decision support technique on the AUFM-CoV database demonstrate its ability to provide highly reliable diagnosis of COVID-19 by jointly analyzing RT-PCR data, thorax computed tomography images, and laboratory data. Consequently, it may significantly increase the success of physicians in diagnosing COVID-19, assist them in rapidly isolating and treating COVID-19 patients, and reduce their workload in daily clinical practice.

Item Open Access Assessment of Parkinson's disease severity from videos using deep architecture (IEEE, 2021-07-26) Yin, Z.; Geraedts, V. J.; Wang, Z.; Contarino, M. F.; Dibeklioğlu, Hamdi; Gemert, J. V.
Parkinson's disease (PD) diagnosis is based on clinical criteria, i.e., bradykinesia, rest tremor, rigidity, etc. Assessment of the severity of PD symptoms with clinical rating scales, however, is subject to inter-rater variability. In this paper, we propose a deep learning-based automatic PD diagnosis method using videos to assist diagnosis in clinical practice. We deploy a 3D Convolutional Neural Network (CNN) as the baseline approach for PD severity classification and show its effectiveness. Due to the lack of data in the clinical field, we explore the possibility of transfer learning from a non-medical dataset and show that PD severity classification can benefit from it. To bridge the domain discrepancy between medical and non-medical datasets, we let the network focus more on subtle temporal visual cues, i.e., the frequency of tremors, by designing a Temporal Self-Attention (TSA) mechanism. Seven tasks from the Movement Disorders Society - Unified PD Rating Scale (MDS-UPDRS) part III are investigated, which reveal the symptoms of bradykinesia and postural tremors. Furthermore, we propose a multi-domain learning method to predict patient-level PD severity through task-assembling.
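A temporal self-attention mechanism of the kind described, which re-weights frames so that subtle periodic cues such as tremor frequency dominate the pooled representation, can be sketched as attention pooling over per-frame embeddings; the dimensions and scoring vector are illustrative, not the paper's TSA design:

```python
import numpy as np

rng = np.random.default_rng(0)

def temporal_attention_pool(frames, w_score):
    """Softmax-weighted pooling over time; frames has shape (T, D)."""
    s = frames @ w_score               # one relevance score per frame
    a = np.exp(s - s.max())
    a /= a.sum()                       # attention weights sum to 1 over time
    return a @ frames, a

T, D = 60, 32                          # 60 video frames, 32-dim embeddings
frames = rng.standard_normal((T, D))
w_score = rng.standard_normal(D) * 0.1

pooled, weights = temporal_attention_pool(frames, w_score)
print(pooled.shape, round(float(weights.sum()), 6))
```

In a trained model the scoring vector is learned, so frames carrying diagnostic motion cues receive larger weights than uninformative ones.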
We empirically show the effectiveness of the TSA and task-assembling methods on our PD video dataset, achieving a best MCC of 0.55 for binary task-level classification and 0.39 for three-class patient-level classification.

Item Open Access AttentionBoost: learning what to attend for gland segmentation in histopathological images by boosting fully convolutional networks (IEEE, 2020) Güneşli, Gözde Nur; Sökmensüer, C.; Gündüz-Demir, Çiğdem
Fully convolutional networks (FCNs) are widely used for instance segmentation. One important challenge is to sufficiently train these networks to yield good generalizations for hard-to-learn pixels, whose correct prediction may greatly affect the success. A typical group of such hard-to-learn pixels comprises the boundaries between instances. Many studies have developed strategies to pay more attention to learning these boundary pixels. These include designing multi-task networks with an additional task of boundary prediction and increasing the weights of boundary pixels' predictions in the loss function. Such strategies require defining what to attend to beforehand and incorporating this predefined attention into the learning model. However, other groups of hard-to-learn pixels may exist, and manually defining and incorporating the appropriate attention for each group may not be feasible. In order to provide an adaptable solution to learn different groups of hard-to-learn pixels, this article proposes AttentionBoost, a new multi-attention learning model based on adaptive boosting, for the task of gland instance segmentation in histopathological images. AttentionBoost designs a multi-stage network and introduces a new loss adjustment mechanism for an FCN to adaptively learn what to attend to at each stage directly on image data, without necessitating any prior definition.
This mechanism modulates the attention of each stage to correct the mistakes of previous stages, by adjusting the loss weight of each pixel prediction separately with respect to how accurate the previous stages are on that pixel. Working on histopathological images of colon tissues, our experiments demonstrate that the proposed AttentionBoost model improves the results of gland segmentation compared to its counterparts.

Item Open Access An augmented crowd simulation system using automatic determination of navigable areas (Elsevier Ltd, 2021-04) Doğan, Yalım; Sonlu, Sinan; Güdükbay, Uğur
Crowd simulations imitate the group dynamics of individuals in different environments. Applications in entertainment, security, and education require augmenting simulated crowds into videos of real people. In such cases, virtual agents should realistically interact with the environment and the people in the video. One component of this augmentation task is determining the navigable regions in the video. In this work, we utilize semantic segmentation and pedestrian detection to automatically locate and reconstruct the navigable regions of surveillance-like videos. We place the resulting flat mesh into our 3D crowd simulation environment to integrate virtual agents that navigate inside the video, avoiding collision with real pedestrians and other virtual agents. We report the performance of our open-source system on real-life surveillance videos, based on the accuracy of the automatically determined navigable regions and camera configuration. We show that our system generates accurate navigable regions for realistic augmented crowd simulations.

Item Open Access Automated cancer stem cell recognition in H and E stained tissue using convolutional neural networks and color deconvolution (SPIE, 2017) Aichinger, W.; Krappe, S.; Çetin, A. Enis; Çetin-Atalay, R.; Üner, A.; Benz, M.; Wittenberg, T.; Stamminger, M.; Münzenmayer, C.
The analysis and interpretation of histopathological samples and images is an important discipline in the diagnosis of various diseases, especially cancer. An important factor in prognosis and treatment, with the aim of precision medicine, is the determination of so-called cancer stem cells (CSC), which are known for their resistance to chemotherapeutic treatment and involvement in tumor recurrence. Using immunohistochemistry with CSC markers like CD13, CD133, and others is one way to identify CSC. In our work, we aim at identifying CSC presence on ubiquitous Hematoxylin and Eosin (HE) staining as an inexpensive tool for routine histopathology, based on their distinct morphological features. We present initial results of a new method based on color deconvolution (CD) and convolutional neural networks (CNN). This method performs favorably (accuracy 0.936) in comparison with a state-of-the-art method based on 1DSIFT and eigen-analysis feature sets evaluated on the same image database. We also show that the accuracy of the CNN is improved by the CD pre-processing.

Item Open Access Automatic deceit detection through multimodal analysis of high-stake court-trials (Institute of Electrical and Electronics Engineers, 2023-10-05) Biçer, Berat; Dibeklioğlu, Hamdi
In this article, we propose the use of convolutional self-attention for attention-based representation learning, while replacing traditional vectorization methods with a transformer as the backbone of our speech model for transfer learning within our automatic deceit detection framework. This design performs a multimodal data analysis and applies fusion to merge visual, vocal, and speech (textual) channels, reporting deceit predictions. Our experimental results show that the proposed architecture improves the state-of-the-art on the popular Real-Life Trial (RLT) dataset in terms of correct classification rate.
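Fusion of the three channels described above can be done at the feature level by concatenating modality embeddings before a shared scoring head; the embedding sizes, random weights, and sigmoid scorer below are placeholders for illustration, not the published fusion design:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-clip embeddings from modality-specific encoders.
visual = rng.standard_normal(128)
vocal  = rng.standard_normal(64)
text   = rng.standard_normal(96)

fused = np.concatenate([visual, vocal, text])     # (288,) joint representation

w = rng.standard_normal(fused.size) * 0.05        # toy linear scorer
logit = float(fused @ w)
p_deceit = 1.0 / (1.0 + np.exp(-logit))           # sigmoid -> deceit probability
print(round(p_deceit, 4))
```

Decision-level fusion (combining per-modality predictions instead of embeddings) is the main alternative; feature-level fusion lets the classifier exploit cross-modal correlations directly.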
To further assess the generalizability of our design, we experiment on the low-stakes Box of Lies (BoL) dataset and achieve state-of-the-art performance as well, providing cross-corpus comparisons. Following our analysis, we report that (1) convolutional self-attention learns meaningful representations while performing joint attention computation for deception, (2) apparent deceptive intent is a continuous function of time, and subjects can display varying levels of apparent deceptive intent throughout recordings, and (3), in support of criminal psychology findings, studying abnormal behavior out of context can be an unreliable way to predict deceptive intent.

Item Open Access Automatic deceit detection through multimodal analysis of speech videos (2022-09) Biçer, Berat
In this study, we propose the use of self-attention for spatial representation learning and explore transformers as the backbone of our speech model for inferring apparent deceptive intent based on multimodal analysis of speech videos. The proposed model applies separate modality-specific representation learning to the visual, vocal, and speech modalities and applies fusion afterwards to merge the information channels. We test our method on the popular, high-stake Real-Life Trial (RLT) dataset. We also introduce a novel, low-stake, in-the-wild dataset named PoliDB for deceit detection, and report the first results on this dataset as well. Experiments suggest the proposed design surpasses previous studies performed on the RLT dataset, while it achieves significant classification performance on the proposed PoliDB dataset.
Following our analysis, we report that (1) convolutional self-attention successfully achieves joint representation learning and attention computation with up to three times fewer parameters than its competitors, (2) apparent deceptive intent is a continuous function of time that can fluctuate throughout the videos, and (3) studying particular abnormal behaviors out of context can be an unreliable way to predict deceptive intent.

Item Open Access BolT: Fused window transformers for fMRI time series analysis (Elsevier B.V., 2023-05-18) Bedel, Hasan Atakan; Şıvgın, Irmak; Dalmaz, Onat; Ul Hassan Dar, Salman; Çukur, Tolga
Deep-learning models have enabled performance leaps in the analysis of high-dimensional functional MRI (fMRI) data. Yet, many previous methods are suboptimally sensitive to contextual representations across diverse time scales. Here, we present BolT, a blood-oxygen-level-dependent transformer model for analyzing multi-variate fMRI time series. BolT leverages a cascade of transformer encoders equipped with a novel fused window attention mechanism. Encoding is performed on temporally overlapped windows within the time series to capture local representations. To integrate information temporally, cross-window attention is computed between base tokens in each window and fringe tokens from neighboring windows. To gradually transition from local to global representations, the extent of window overlap, and thereby the number of fringe tokens, is progressively increased across the cascade. Finally, a novel cross-window regularization is employed to align high-level classification features across the time series. Comprehensive experiments on large-scale public datasets demonstrate the superior performance of BolT against state-of-the-art methods.
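The core windowing idea, base tokens in each overlapped window attending to extra fringe tokens borrowed from neighboring windows, can be sketched in a single-layer form; the window size, stride, fringe width, and random projections are illustrative simplifications of the cascaded BolT design:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax_rows(s):
    e = np.exp(s - s.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def fused_window_attention(X, win=20, stride=10, fringe=5, d=16):
    """Each window's base tokens attend to themselves plus fringe tokens from
    the neighboring windows; each window is pooled to one output token."""
    T, D = X.shape
    Wq = rng.standard_normal((D, d)) * 0.1
    Wk = rng.standard_normal((D, d)) * 0.1
    Wv = rng.standard_normal((D, d)) * 0.1
    outs = []
    for start in range(0, T - win + 1, stride):
        lo, hi = max(0, start - fringe), min(T, start + win + fringe)
        base, ctx = X[start:start + win], X[lo:hi]      # queries vs keys/values
        w = softmax_rows((base @ Wq) @ (ctx @ Wk).T / np.sqrt(d))
        outs.append((w @ (ctx @ Wv)).mean(axis=0))      # pooled window token
    return np.stack(outs)

X = rng.standard_normal((100, 32))    # toy fMRI series: 100 time points, 32 ROIs
Z = fused_window_attention(X)
print(Z.shape)
```

In BolT proper the fringe width grows across encoder stages, gradually shifting from local to global context; the sketch fixes it for brevity.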
Furthermore, explanatory analyses to identify landmark time points and regions that contribute most significantly to model decisions corroborate prominent neuroscientific findings in the literature.Item Open Access Boosting fully convolutional networks for gland instance segmentation in histopathological images(2019-08) Güneşli, Gözde NurIn the current literature, fully convolutional neural networks (FCNs) are the most preferred architectures for dense prediction tasks, including gland segmentation. However, a significant challenge is to adequately train these networks to correctly predict pixels that are hard to learn. Without additional strategies developed for this purpose, networks tend to learn poor generalizations of the dataset, since the training loss may be dominated by the most common and easy-to-learn pixels in the dataset. A typical example of this is the border separation problem in the gland instance segmentation task. Glands can be very close to each other, and since border regions contain relatively few pixels, it is more difficult to learn these regions and separate gland instances. As this separation is essential for the gland instance segmentation task, this situation causes major drawbacks in the results. To address this border separation problem, it has been proposed to increase the attention given to border pixels during network training, either by increasing the relative loss contribution of these pixels or by adding border detection as an additional task to the architecture. Although these techniques may help better separate gland borders, there may exist other types of hard-to-learn pixels (and thus, other mistake types), mostly related to noise and artifacts in the images. Yet, explicitly adjusting the appropriate attention to train the networks against every type of mistake is not feasible.
Motivated by this, as a more e ective solution, this thesis proposes an iterative attention learning model based on adaptive boosting. The proposed AttentionBoost model is a multi-stage dense segmentation network trained directly on image data without making any prior assumption. During the end-to-end training of this network, each stage adjusts the importance of each pixel-wise prediction for each image depending on the errors of the previous stages. This way, each stage learns the task with di erent attention forcing the stage to learn the mistakes of the earlier stages. With experiments on the gland instance segmentation task, we demonstrate that our model achieves better segmentation results than the approaches in the literature.Item Open Access Çok görevli öğrenme ile eşzamanlı darbe tespiti ve kipleme sınıflandırma(IEEE, 2019-04) Akyön, Fatih Çağatay; Nuhoğlu, Mustafa Atahan; Alp, Yaşar Kemal; Arıkan, OrhanBu çalışmada, elektronik harp sistemlerindeki sayısal almaç yapıları tarafından toplanan, ortamdaki tehdit radarların gönderdigi darbesel sinyal örnekleri üzerinden otomatik olarak eşzamanlı SGO (Sinyal Gürültü Oranı) kestiren ve darbe tespiti yapan, tespit edilen darbesel bölge üzerindeki kiplemeyi sınıflandıran, çok görevli ögrenme ve yinelemeli sinir ağı tabanlı yeni bir yöntem önerilmiştir. Önerilen yöntem, girdi olarak almaç tarafından toplanan örneklerden herhangi bir öznitelik çıkarmaksızın, ham IQ (Inphase-Quadrature) verilerini kullanmaktadır. Sınıflandırma başarımını artırmak için, farklı SGO seviyelerine göre eğitilmiş modeller kullanılmıştır. Ham IQ veri üzerinden kestirilen SGO degerine göre, uygun model otomatik olarak seçilmektedir. Yapılan kapsamlı benzetimlerde, -30 dB SGO seviyesinde, 1.5 dB ortalama mutlak hata ile SGO kestirimi, %90 başarımla darbe tespiti ve %84 ihtimalle başarılı kipleme sınıflandırması yapılabildiği gözlemlenmiştir. 
Tipik bir elektronik ˘ harp almacının darbe tespiti yapabildigi en dü¸sük SGO seviyesinin 10 dB olduğu düşünüldüğünde, önerilen yöntemin geleceğin elektronik harp almaç yapıları için oldukça önemli bir teknolojik kazanım olduğu değerlendirilmektedir.Item Open Access Çoklu kontrast MRG’de çoklu görüntü geriçatımı(IEEE, 2021-07-19) Özbey, Muzaffer; Çukur, TolgaÇoklu kontrastlı manyetik rezonans görüntülerinin (MRG) edinimi, tanı bilgi birikimini artırarak klinik tanıda önemli bir role sahiptir. Hastanın hareketsiz kalması gereken uzun tetkik süreleri, çoklu kontrast MRG edinimini sınırlandırmaktadır. Görüntülerin alt örneklenerek toplanması ve geriçatımı ile tarama süreleri kısaltılabilmektedir. Yaygın yöntemler, tek kontrasta ait alt örneklenmiş MR görüntülerinden aynı kontrasta ait tam örneklenmiş MR görüntüsü üretmektedir. Ancak girdi verisindeki tek kontrastlı MR görüntüsüne ait sınırlı bilgiler, geriçatım performansını sınırlandırmaktadır. Bu yüzden, çoklu kontrast MRG girdi verilerinin kullanımı ile geriçatım performansı artırılabilir. Bu çalışma kapsamında, birden fazla kontrasta ait alt örneklenmiş görüntülerden, tam örneklenmiş görüntüleri eş zamanlı olarak üreten bir çoklu kontrast MRG geriçatım yöntemi önerilmiştir. Önerilen yöntem, yüksek frekans değerlerini daha iyi tahmin ederek oldukça gerçekçi görüntüler üreten çekişmeli üretici ağlar kullanılarak uygulanmıştır. Önerilen yöntem, çoklu kontrast beyin MR görüntüleri içeren verisetinde test edilmiş, sayısal ve görsel değerlendirmeler sonucunda alternatif tekli kontrast geriçatım yöntemine göre daha üstün performans sağladığı kanıtlanmıştır.