Browsing by Subject "GRU"
Item (Open Access): Online additive updates with FFT-IFFT operator on the GRU neural networks (IEEE, 2018). Mirza, Ali H.
In this paper, we derive the online additive updates of the gated recurrent unit (GRU) network using the fast Fourier transform-inverse fast Fourier transform (FFT-IFFT) operator. In the gating process of the GRU network, we work in the frequency domain and execute all the linear operations there. For the non-linear functions in the gating process, we first shift back to the time domain and then apply the non-linear GRU gating functions. Furthermore, to reduce the computational complexity and speed up the training process, we apply weight matrix factorization (WMF) to the FFT-IFFT variant of the GRU network. We then compute the online additive updates of the FFT-WMF based GRU network using the stochastic gradient descent (SGD) algorithm. We also use long short-term memory (LSTM) networks in place of the GRU networks. Through an extensive set of experiments, we illustrate that our proposed algorithm achieves a significant increase in performance with a decrease in computational complexity.

Item (Open Access): Variants of combinations of additive and multiplicative updates for GRU neural networks (IEEE, 2018). Mirza, Ali H.
In this paper, we formulate several variants of mixtures of additive and multiplicative updates, using the stochastic gradient descent (SGD) and exponentiated gradient (EG) algorithms, respectively. We employ these updates on gated recurrent unit (GRU) networks and derive the gradient-based updates for the parameters of the GRU network. We propose four different updates for the GRU network: mean, minimum, even-odd, and balanced. Through an extensive set of experiments, we demonstrate that these update variants perform better than the simple SGD and EG updates. Overall, we observe that the GRU-Mean update achieves the minimum cumulative and steady-state error. We also ran the same set of experiments on long short-term memory (LSTM) networks.
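
To make the distinction between additive and multiplicative updates concrete, the following is a minimal sketch and not the paper's method. It assumes an elementwise SGD step, an unnormalized exponentiated-gradient step, and a "mean" variant that simply averages the two candidate weights. The abstract does not define the four variants, so `mean_update` and the other function names here are hypothetical illustrations.

```python
import math


def sgd_update(w, grad, lr):
    """Additive (SGD) step: w - lr * grad, elementwise."""
    return [wi - lr * gi for wi, gi in zip(w, grad)]


def eg_update(w, grad, lr):
    """Multiplicative step (assumed unnormalized EG): w * exp(-lr * grad)."""
    return [wi * math.exp(-lr * gi) for wi, gi in zip(w, grad)]


def mean_update(w, grad, lr):
    """Hypothetical 'mean' variant: average of the SGD and EG candidates."""
    additive = sgd_update(w, grad, lr)
    multiplicative = eg_update(w, grad, lr)
    return [(a + m) / 2.0 for a, m in zip(additive, multiplicative)]


if __name__ == "__main__":
    weights = [0.5, -0.2, 1.0]      # illustrative parameter values
    gradient = [0.1, -0.3, 0.0]     # illustrative gradient values
    print(mean_update(weights, gradient, lr=0.1))
```

Note that the multiplicative step preserves the sign of each weight and leaves a zero weight at zero, whereas the additive step does not. This is one reason a combination of the two can behave differently from either alone.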