
Browsing by Subject "Deep neural networks"

Now showing 1 - 10 of 10
  • Item (Open Access)
    Channel estimation and symbol demodulation for OFDM systems over rapidly varying multipath channels with hybrid deep neural networks
    (Institute of Electrical and Electronics Engineers, 2023-05-01) Gümüş, Mücahit; Duman, Tolga Mete
    We consider orthogonal frequency division multiplexing over rapidly time-varying multipath channels, for which the performance of standard channel estimation and equalization techniques degrades dramatically due to inter-carrier interference (ICI). We focus on improving the overall system performance by designing deep neural network (DNN) architectures for both channel estimation and data demodulation. To accomplish this, we employ the basis expansion model to track the channel tap variations, and exploit convolutional neural networks’ ability to learn local correlations, together with a coarse least squares solution, for a robust and accurate channel estimation procedure. For data demodulation, we use a recurrent neural network for improved performance and robustness, as single-tap frequency-domain equalizers perform poorly, and more sophisticated equalization techniques such as band-limited linear minimum mean squared error equalizers are vulnerable to model mismatch and channel estimation errors. Numerical examples illustrate that the proposed DNN architectures outperform the traditional algorithms. Specifically, the bit error rate results for a wide range of Doppler values reveal that the proposed DNN-based equalizer is robust and mitigates the ICI effectively, offering excellent demodulation performance. We further note that the DNN-based channel estimator offers improved performance with reduced computational complexity.
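The basis expansion model (BEM) mentioned above approximates each channel tap's variation across an OFDM symbol by a small number of basis coefficients. A minimal sketch of the idea, assuming a simple polynomial basis for illustration (the paper's actual basis choice and its CNN/RNN stages are not reproduced here):

```python
import numpy as np

def bem_fit(tap, Q):
    # Least-squares fit of one channel tap's time variation onto a
    # polynomial basis with Q + 1 functions (one simple BEM basis choice).
    n = len(tap)
    t = np.linspace(-1.0, 1.0, n)
    B = np.vander(t, Q + 1, increasing=True).astype(complex)  # n x (Q+1) basis matrix
    coeffs, *_ = np.linalg.lstsq(B, tap, rcond=None)
    return B @ coeffs, coeffs

# A smooth Doppler-induced tap variation is captured by just Q + 1 numbers.
t = np.linspace(-1.0, 1.0, 64)
tap = np.exp(1j * 2 * np.pi * 0.1 * t)
fit, coeffs = bem_fit(tap, Q=4)
```

Tracking five coefficients per tap instead of 64 samples is what makes the subsequent estimation problem tractable.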
  • Item (Open Access)
    Comparing the performance of humans and 3D-convolutional neural networks in material perception using dynamic cues
    (2019-07) Mehrzadfar, Hossein
    There are numerous studies on material perception in humans. Similarly, there are various deep neural network models trained to perform different visual tasks such as object recognition. However, the intersection of material perception in humans and deep neural network models has not, to our knowledge, been investigated. In particular, the ability of deep neural networks to categorize materials, and the comparison of human performance with that of deep convolutional neural networks, has not received enough attention. Here we have built, trained, and tested a 3D-convolutional neural network model that can categorize animations of simulated materials. We compared the performance of the deep neural network with that of humans and concluded that conventional training of deep neural networks does not necessarily yield the network state best suited for comparison with human performance. In the material categorization task, the similarity between the performance of humans and that of the deep neural network increases, reaches a maximum, and then decreases as the network is trained further. Also, by training the 3D-CNN on regular, temporally consistent animations as well as on temporally inconsistent animations and comparing the results, we found that the 3D-CNN model can use spatial information to categorize the material animations. In other words, consistent temporal motion information is not necessary for deep neural networks to categorize the material animations.
  • Item (Open Access)
    Deep fractional Fourier networks
    (2024-08) Koç, Emirhan
    This thesis introduces the integration of the fractional Fourier Transform (FrFT) into the deep learning domain, with the aim of opening new avenues for incorporating signal processing into deep neural networks (DNNs). This work starts by introducing FrFT into recurrent neural networks (RNNs) for time series prediction, leveraging its ability and flexibility to perform infinitely many continuous transformations and offering an alternative to the traditional Fourier Transform (FT). Despite the initial success, a significant challenge identified is the manual tuning of the fraction order parameter a, which can be cumbersome and limits broader applicability. To overcome this limitation, we introduce a novel approach where the fraction order a is treated as a learnable parameter within deep learning models. First, a theoretical foundation is established to support the learnability of this parameter, followed by extensive experimentation in image classification and time series prediction tasks. The results demonstrate that incorporating a learnable fraction order significantly improves model performance, particularly when integrated with well-known architectures such as ResNet and VGG models. Furthermore, the thesis proposes fractional Fourier Pooling (FrFP), a pooling technique that replaces traditional Global Average Pooling (GAP) layers in Convolutional Neural Networks (CNNs). FrFP enhances feature representation by processing intermediate signal regions, leading to better model performance and offering a new perspective on integrating signal transformations within deep learning frameworks. Overall, this thesis contributes to the growing body of research exploring advanced signal processing techniques in deep learning, highlighting the potential of FrFT as a powerful tool for improving model accuracy and efficiency across various applications.
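The fraction order a interpolates continuously between the identity (a = 0) and the ordinary Fourier transform (a = 1). One simple discrete construction, a fractional matrix power of the unitary DFT matrix, makes the role of a concrete; this is a sketch of that general idea, not necessarily the discrete FrFT definition used in the thesis, where a is additionally treated as a learnable parameter:

```python
import numpy as np

def dft_matrix(n):
    # Unitary DFT matrix: F[j, k] = exp(-2*pi*i*j*k/n) / sqrt(n)
    j, k = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    return np.exp(-2j * np.pi * j * k / n) / np.sqrt(n)

def frft_matrix(n, a):
    # Fractional power of the DFT matrix via eigendecomposition:
    # F = V diag(w) V^-1  =>  F^a = V diag(w**a) V^-1.
    # a = 0 gives the identity, a = 1 the ordinary DFT, and
    # orders compose additively: F^a @ F^b = F^(a+b).
    F = dft_matrix(n)
    w, V = np.linalg.eig(F)
    return V @ np.diag(w ** a) @ np.linalg.inv(V)
```

In a learnable-order model as described above, a would be a trainable scalar updated by gradient descent along with the network weights.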
  • Item (Open Access)
    Deep learning for accelerated MR imaging
    (2021-02) Dar, Salman Ul Hassan
    Magnetic resonance imaging is a non-invasive imaging modality that enables multi-contrast acquisition of an underlying anatomy, thereby providing a multitude of information for diagnosis. However, prolonged scan duration may prohibit its practical use. Two mainstream frameworks for accelerating MR image acquisitions are reconstruction and synthesis. In reconstruction, acquisitions are accelerated by undersampling in k-space, followed by reconstruction algorithms. Lately, deep neural networks have offered significant improvements over traditional methods in MR image reconstruction. However, deep neural networks rely heavily on the availability of large datasets, which might not be readily available for some applications. Furthermore, a caveat of the reconstruction framework in general is that performance naturally starts degrading towards higher acceleration factors, where fewer data samples are acquired. In the alternative synthesis framework, acquisitions are accelerated by acquiring a subset of the desired contrasts and recovering the missing ones from the acquired ones. Current synthesis methods are primarily based on deep neural networks, which are trained to minimize mean square or absolute loss functions. This can bring about loss of intermediate-to-high spatial frequency content in the recovered images. Furthermore, the synthesis performance in general relies on similarity in relaxation parameters between source and target contrasts, and large dissimilarities can lead to artifactual synthesis or loss of features. Here, we tackle issues associated with both reconstruction and synthesis approaches. In reconstruction, the data scarcity issue is addressed by pre-training a network on large, readily available datasets and fine-tuning on just a few samples from target datasets. In synthesis, the loss of intermediate-to-high spatial frequency content is catered for by adding adversarial and high-level perceptual losses on top of the traditional mean absolute error. Finally, a joint reconstruction and synthesis approach is proposed to mitigate the issues associated with both frameworks in general. Demonstrations on brain MRI datasets of healthy subjects and patients indicate superior performance of the proposed techniques over the current state-of-the-art ones.
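In the reconstruction framework described above, acceleration comes from sampling only a fraction of k-space. A toy sketch of retrospective undersampling followed by the naive zero-filled inverse FFT, i.e., the baseline that learned reconstruction networks improve upon (the equispaced line mask is an assumption for illustration, not the thesis's sampling pattern):

```python
import numpy as np

def undersample_reconstruct(img, accel=2):
    # Retrospectively undersample k-space by keeping every `accel`-th
    # phase-encode line, then apply a zero-filled inverse FFT.
    # accel = 1 keeps all of k-space and recovers the image exactly;
    # accel > 1 shortens the scan but introduces aliasing artifacts.
    k = np.fft.fft2(img)
    mask = np.zeros(img.shape, dtype=bool)
    mask[::accel, :] = True
    return np.real(np.fft.ifft2(k * mask)) * accel
```

A reconstruction network would take the aliased zero-filled image (or the undersampled k-space itself) as input and learn to suppress the artifacts.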
  • Item (Open Access)
    Deep neural network based precoding for wiretap channels with finite alphabet inputs
    (IEEE, 2021-04-28) Gümüş, Mücahit; Duman, Tolga M.
    We consider secure transmission over multi-input multi-output multi-antenna eavesdropper (MIMOME) wiretap channels with finite alphabet inputs. We use a linear precoder to maximize the secrecy rate, which benefits from the generalized singular value decomposition to obtain independent streams and exploits the function approximation abilities of deep neural networks (DNNs) for solving the required power allocation problem. It is demonstrated that the DNN learns the optimal power allocation without any performance degradation compared to the conventional technique with a significant reduction in complexity.
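For context on the power allocation problem the DNN is trained to solve: after the generalized singular value decomposition splits the channel into parallel streams, a total power budget must be divided among them. A classical reference point for such parallel-stream problems is water-filling over the stream gains, sketched below purely as an illustrative baseline; the paper's objective is the secrecy rate, and its DNN replaces an optimization of this general kind rather than this exact formula:

```python
import numpy as np

def water_filling(gains, total_power):
    # Classical water-filling over parallel streams with gains g_i:
    # p_i = max(mu - 1/g_i, 0), with mu chosen so that sum(p_i) = total_power.
    g = np.asarray(gains, dtype=float)
    order = np.argsort(g)[::-1]                  # streams, strongest first
    for k in range(len(g), 0, -1):
        idx = order[:k]
        mu = (total_power + np.sum(1.0 / g[idx])) / k
        p = mu - 1.0 / g[idx]
        if np.all(p > 0):                        # feasible: all k streams active
            out = np.zeros_like(g)
            out[idx] = p
            return out
    return np.zeros_like(g)
```

Stronger streams receive more power, and sufficiently weak streams are switched off entirely; a trained DNN approximates a mapping of this shape at much lower runtime cost.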
  • Item (Open Access)
    Deep-learning for communication systems: new channel estimation, equalization, and secure transmission solutions
    (2023-08) Gümüş, Mücahit
    Traditional communication system design takes a model-based approach that aims to optimize relevant performance metrics using somewhat simple and tractable channel and signal models. For instance, channel codes are designed for simple additive white Gaussian or fading channel models, channel equalization algorithms are based on mathematical models for inter-symbol interference (ISI), and channel estimation techniques are developed with the underlying channel statistics and characterizations in mind. Through utilizing superior mathematical models and expert knowledge in signal processing and information theory, the model-based approach has been highly successful and has enabled the development of many communication systems to date. On the other hand, beyond-5G wireless communication systems will further exploit massive numbers of antennas, higher bandwidths, and more advanced multiple access technologies. As communication systems become more and more complicated, it is becoming increasingly important to go beyond the limits of the model-based approach. Noting that there have been tremendous advancements in learning from data over the past decades, a major research question is whether machine learning based approaches can be used to develop new communication technologies. With the above motivation, this thesis deals with the development of deep neural network (DNN) solutions to address various challenges in wireless communications. We first consider orthogonal frequency division multiplexing (OFDM) over rapidly time-varying multipath channels, for which the performance of standard channel estimation and equalization techniques degrades dramatically due to inter-carrier interference (ICI). We focus on improving the overall system performance by designing DNN architectures for both channel estimation and data demodulation. In addition, we study OFDM over frequency-selective channels without cyclic prefix insertion in an effort to improve the overall throughput.
    Specifically, we design a recurrent neural network to mitigate the effects of ISI and ICI for improved symbol detection. Furthermore, we explore secure transmission over multi-input multi-output multi-antenna eavesdropper wiretap channels with finite alphabet inputs. We use a linear precoder to maximize the secrecy rate, which benefits from the generalized singular value decomposition to obtain independent streams and exploits the function approximation abilities of DNNs for solving the required power allocation problem. We also propose a DNN technique to jointly optimize the data precoder and the power allocation for artificial noise. We use extensive numerical examples and computational complexity analyses to demonstrate the effectiveness of the proposed solutions.
  • Item (Open Access)
    Energy efficient boosting of GEMM accelerators for DNN via reuse
    (Association for Computing Machinery, Inc, 2022-06-06) Cicek, Nihat Mert; Shen, Xipeng; Özturk, Özcan
    Reuse-centric convolutional neural networks (CNN) acceleration speeds up CNN inference by reusing computations for similar neuron vectors in CNN’s input layer or activation maps. This new paradigm of optimizations is, however, largely limited by the overheads in neuron vector similarity detection, an important step in reuse-centric CNN. This article presents an in-depth exploration of architectural support for reuse-centric CNN. It addresses some major limitations of the state-of-the-art design and proposes a novel hardware accelerator that improves neuron vector similarity detection and reduces the energy consumption of reuse-centric CNN inference. The accelerator is implemented to support a wide variety of neural network settings with a banked memory subsystem. Design exploration is performed through RTL simulation and synthesis on an FPGA platform. When integrated into Eyeriss, the accelerator can potentially provide improvements up to 7.75× in performance. Furthermore, it can reduce the energy used for similarity detection up to 95.46%, and it can accelerate the convolutional layer up to 3.63× compared to the software-based implementation running on the CPU.
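The computation-reuse idea can be sketched in software: when activation rows (neuron vectors) are similar, quantize them so near-duplicates collide, run the GEMM once per unique row, and scatter the results back. This is only a toy model of the principle — the paper's contribution is hardware support for the similarity detection step, not this NumPy shortcut:

```python
import numpy as np

def reuse_matmul(X, W, decimals=2):
    # Quantize activation rows so near-identical neuron vectors collide,
    # compute each unique row's product with W only once, then scatter
    # the shared results back to every original row position.
    Xq = np.round(X, decimals=decimals)
    uniq, inverse = np.unique(Xq, axis=0, return_inverse=True)
    return (uniq @ W)[inverse], len(uniq)
```

The coarser the quantization, the more reuse (fewer unique rows) at the cost of accuracy; the accelerator's job is to make the detection of such collisions cheap enough that the savings are not eaten by overhead.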
  • Item (Open Access)
    An ensemble classification model for detecting voice phishing in telecommunication networks and its integration into a visual analysis tool
    (2022-09) Çalık, Hüseyin Eren
    Voice phishing, a method of social engineering fraud performed over phone calls, has been a major problem globally since the use of phones became widespread. Traditional and modern methods to detect these fraud schemes include visual analysis of customers’ behaviour, rule-based systems, and machine learning models such as clustering, decision trees, shallow classifiers, and deep learning models. Visual analysis depends only on human expertise and requires a very large labor force to be effective. Rule-based systems are useful for extreme cases but are vulnerable to concept drift. The state-of-the-art methods generally utilize machine learning approaches; however, they require one or more of the following: feature engineering by experts, high computational power, or privacy infringements. Therefore, in collaboration with Turkcell Technology, we aimed to develop a system that benefits from the advantages of the traditional methods while exploiting the effectiveness and efficiency of the state-of-the-art ones to tackle this issue. In doing so, we integrated an ensemble learning model into an existing visualization tool for detecting fraud users. This tool visualizes relational data as knowledge graphs, presents informational data as text, and displays statistical data with charts and text. Our ensemble learning model has two deep neural networks and one decision tree classifier. Multiple neural networks are used to reduce variance and make the model more stable. One network is composed of an input layer, two hidden layers of 200 nodes each using the Rectified Linear Unit (ReLU) activation function, each followed by a dropout layer, and an output layer of one node with a sigmoid activation function. We used dropout layers in this network to prevent over-fitting. The second neural network has 3 hidden layers with 64, 64, and 32 nodes, respectively, with ReLU as their activation function.
To feed these models, a total of 34 features, 20 of which are raw, have been engineered with Turkcell fraud experts. The aggregation of the outputs is done by taking their average. We measured the success of our model by calculating the F1 Score as the class imbalance is high. Our model’s F1 score is 0.82 with a precision of 0.82 and a recall of 0.83. Also, with the integration of our model into this visualization tool, a framework was formed allowing mobile network operators to examine and detect fraud cases more efficiently and act accordingly.
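The two network topologies are described precisely enough to sketch. Below is a minimal NumPy forward pass with the stated layer sizes — random untrained weights and no biases, dropout omitted (it acts only during training), and the decision-tree member of the ensemble left out — so this illustrates the architecture and the averaging step, not the trained detector:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_mlp(sizes, rng):
    # One weight matrix per dense layer (biases omitted for brevity).
    return [rng.standard_normal((m, n)) / np.sqrt(m)
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(x, weights):
    *hidden, out = weights
    for W in hidden:
        x = relu(x @ W)      # the dropout layers act only at training time
    return sigmoid(x @ out)  # single-node sigmoid output, as described

# Layer sizes from the abstract: 34 input features; hidden layers
# (200, 200) for the first network and (64, 64, 32) for the second.
net_a = init_mlp([34, 200, 200, 1], rng)
net_b = init_mlp([34, 64, 64, 32, 1], rng)

def ensemble_score(x):
    # Aggregation by averaging, per the abstract (decision tree omitted).
    return (forward(x, net_a) + forward(x, net_b)) / 2.0
```

Averaging the two sigmoid outputs yields a fraud score in (0, 1) that can then be thresholded or fed to the visualization tool.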
  • Item (Open Access)
    Exploring the role of loss functions in multiclass classification
    (IEEE, 2020-05) Demirkaya, Ahmet; Chen, J.; Oymak, Samet
    Cross-entropy is the de facto loss function in modern classification tasks that involve distinguishing hundreds or even thousands of classes. To design better loss functions for new machine learning tasks, it is critical to understand what makes a loss function suitable for a problem. For instance, what makes cross-entropy better than alternatives such as the quadratic loss? In this work, we discuss the role of loss functions in learning tasks with a large number of classes. We hypothesize that different loss functions can have large variability in the difficulty of optimization and that ease of training is a key catalyst for better test-time performance. Our intuition draws from the success of over-parameterization in deep learning: as a model has more parameters, it trains faster and achieves higher test accuracy. We argue that, effectively, the cross-entropy loss results in a much more over-parameterized problem than the quadratic loss, thanks to its emphasis on the correct class (associated with the label). Such over-parameterization drastically simplifies the training process and ends up boosting the test performance. For separable mixture models, we provide a separation result where the cross-entropy loss can always achieve small training loss, whereas the quadratic loss has diminishing benefit as the number of classes and class correlations increase. Numerical experiments with CIFAR-100 corroborate our results. We show that accuracy with the quadratic loss degrades disproportionately with a growing number of classes; however, encouraging the quadratic loss to focus on the correct class results in drastically improved performance.
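The paper's central contrast can be made concrete: cross-entropy depends only on the probability assigned to the correct class, while the quadratic loss measures distance to the one-hot label, so every class contributes to it. A small illustration (not the paper's experiments):

```python
import numpy as np

def cross_entropy(p, y):
    # Depends only on the correct-class probability p[y].
    return -np.log(p[y])

def quadratic(p, y):
    # Squared distance to the one-hot label: every class contributes.
    onehot = np.eye(len(p))[y]
    return np.sum((p - onehot) ** 2)

def spread(K, p_correct=0.5, y=0):
    # Put p_correct on the label and spread the rest uniformly
    # over the K - 1 wrong classes.
    p = np.full(K, (1 - p_correct) / (K - 1))
    p[y] = p_correct
    return p
```

With the correct-class confidence held fixed, cross-entropy is identical for 10 classes and for 1000, whereas the quadratic loss changes with K even though the prediction on the label has not: the loss is partly spent policing the many wrong classes, which is one way to see its weaker emphasis on the correct class.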
  • Item (Open Access)
    Understanding how orthogonality of parameters improves quantization of neural networks
    (IEEE, 2022-05-10) Eryılmaz, Şükrü Burç; Dündar, Ayşegül
    We analyze why the orthogonality penalty improves quantization in deep neural networks. Using results from perturbation theory as well as through extensive experiments with Resnet50, Resnet101, and VGG19 models, we mathematically and experimentally show that improved quantization accuracy resulting from orthogonality constraint stems primarily from reduced condition numbers, which is the ratio of largest to smallest singular values of weight matrices, more so than reduced spectral norms, in contrast to the explanations in previous literature. We also show that the orthogonality penalty improves quantization even in the presence of a state-of-the-art quantized retraining method. Our results show that, when the orthogonality penalty is used with quantized retraining, ImageNet Top5 accuracy loss from 4- to 8-bit quantization is reduced by up to 7% for Resnet50, and up to 10% for Resnet101, compared to quantized retraining with no orthogonality penalty.
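The quantities the paper distinguishes — the condition number versus the spectral norm of a weight matrix — and an orthogonality penalty are easy to state. A minimal sketch; the Frobenius-norm penalty ‖WᵀW − I‖ used below is one common formulation, assumed here rather than taken from the paper:

```python
import numpy as np

def condition_number(W):
    # Ratio of the largest to the smallest singular value of W.
    s = np.linalg.svd(W, compute_uv=False)
    return s[0] / s[-1]

def spectral_norm(W):
    # Largest singular value of W.
    return np.linalg.svd(W, compute_uv=False)[0]

def orthogonality_penalty(W):
    # Frobenius norm of W^T W - I: zero exactly when the
    # columns of W are orthonormal.
    return np.linalg.norm(W.T @ W - np.eye(W.shape[1]))
```

A matrix with zero penalty has all singular values equal to one, hence condition number one — the well-conditioned regime the paper identifies as the main source of the quantization improvement.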

Bilkent University Library © 2015-2025 BUIR