Browsing by Subject "Distributed machine learning"
Now showing 1 - 3 of 3
Item Open Access
Blind federated learning at the wireless edge with low-resolution ADC and DAC (IEEE, 2021-06-15) Teğin, Büşra
We study collaborative machine learning systems where a massive dataset is distributed across independent workers, each of which computes its local gradient estimate based on its own dataset. Workers send their estimates through a multipath fading multiple access channel with orthogonal frequency division multiplexing to mitigate the frequency selectivity of the channel. We assume that there is no channel state information (CSI) at the workers, and that the parameter server (PS) employs multiple antennas to align the received signals. To reduce power consumption and hardware costs, we employ complex-valued low-resolution digital-to-analog converters (DACs) and analog-to-digital converters (ADCs) at the transmitter and receiver sides, respectively, and study the effects of these practical low-cost DACs and ADCs on the learning performance. Our theoretical analysis shows that the impairments caused by low-resolution DACs and ADCs, including one-bit DACs and ADCs, do not prevent the convergence of the federated learning algorithms, and that the multipath channel effects vanish when a sufficient number of antennas is used at the PS. We validate our theoretical results via simulations and demonstrate that using low-resolution, even one-bit, DACs and ADCs causes only a slight decrease in learning accuracy.

Item Open Access
Distributed caching and learning over wireless channels (2020-01) Tegin, Büşra
Coded caching and coded computing have drawn significant attention in recent years due to their advantages in reducing the traffic load and in distributing the computational burden to edge devices. Many research results address different aspects of these problems; however, various challenges remain. In particular, their use over wireless channels is not fully understood.
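The first abstract above argues that one-bit quantization of the transmitted gradients does not prevent convergence of federated learning. A minimal numpy sketch of that intuition (illustrative only: the quadratic loss, noise model, worker count, and learning rate are hypothetical choices, not the paper's system model):

```python
import numpy as np

# Sketch: many workers compute noisy local gradients of a shared quadratic
# loss f(w) = ||w - w*||^2 / 2, quantize each entry to one bit (its sign,
# mimicking a one-bit DAC), and the server averages the quantized
# estimates. With enough workers, the averaged one-bit gradients still
# point toward the optimum, so gradient descent makes progress.
rng = np.random.default_rng(0)
dim, n_workers, steps, lr = 10, 50, 200, 0.05
w_star = rng.normal(size=dim)           # unknown optimum
w = np.zeros(dim)                       # global model at the server

def local_gradient(w, noise_scale=0.5):
    # Each worker sees a noisy version of the true gradient (w - w*).
    return (w - w_star) + noise_scale * rng.normal(size=dim)

initial_loss = 0.5 * np.sum((w - w_star) ** 2)
for _ in range(steps):
    # One-bit quantization: each worker transmits only gradient signs.
    quantized = [np.sign(local_gradient(w)) for _ in range(n_workers)]
    g_hat = np.mean(quantized, axis=0)  # server-side aggregation
    w -= lr * g_hat
final_loss = 0.5 * np.sum((w - w_star) ** 2)
```

Despite each worker sending only one bit per gradient entry, the averaged update retains enough directional information for the loss to drop substantially, which mirrors the paper's "slight decrease in learning accuracy" finding in toy form.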
With this motivation, this thesis considers these two distributed systems over wireless channels, taking into account realistic channel effects as well as practical implementation constraints. In the first part of the thesis, we study coded caching over a wireless packet erasure channel where each receiver encounters packet erasures independently with the same probability. We propose two different schemes for packet erasure channels: sending the same message (SSM) and a greedy approach. A simplified version of the greedy algorithm, called the grouped greedy algorithm, is also proposed to reduce the system complexity. For the grouped greedy algorithm, an upper bound on the transmission rate is derived, and this bound is shown to be very close to the simulation results for small packet erasure probabilities. We then study coded caching over non-ergodic fading channels. Since the multicast capacity of a broadcast channel is limited by the user experiencing the worst channel conditions, we formulate an optimization problem that minimizes the transmission time by grouping users based on their channel conditions and transmitting coded messages according to the worst channel in each group, as opposed to the worst among all users. We develop two algorithms to determine the user groups: a locally optimal iterative algorithm and a numerically more efficient solution based on a shortest path problem.

In the second part of the thesis, we study collaborative machine learning (ML) systems, also known as federated learning, where a massive dataset is distributed across independent workers that compute their local gradient estimates based on their own datasets. Workers send their estimates through a multipath fading multiple access channel (MAC) with orthogonal frequency division multiplexing (OFDM) to mitigate the frequency selectivity of the channel.
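The over-the-air superposition described above can be sketched numerically: all workers transmit simultaneously, the fading channel sums their signals at each PS antenna, and multi-antenna combining recovers the sum of the gradients. The matched-filter combiner and the receiver-side CSI assumption below are illustrative simplifications, not the thesis's blind scheme:

```python
import numpy as np

# Toy channel-hardening sketch for over-the-air (OTA) aggregation: K
# workers transmit gradient entries g_k over a flat Rayleigh-fading MAC
# to a PS with M antennas. Here we assume the PS knows the fading
# coefficients (receiver-side CSI, an illustrative assumption) and
# applies matched-filter combining with the summed channel; cross terms
# between different workers average out as M grows, so the PS recovers
# the *sum* of the gradients.
rng = np.random.default_rng(1)

def ota_sum_estimate(g, M):
    K = len(g)
    # i.i.d. CN(0, 1) fading coefficients, one per worker-antenna pair.
    h = (rng.normal(size=(K, M)) + 1j * rng.normal(size=(K, M))) / np.sqrt(2)
    y = h.T @ g                           # superimposed signal per antenna
    combiner = np.conj(h).sum(axis=0)     # summed conjugate channel
    return (np.sum(combiner * y) / M).real

g = rng.normal(size=20)                   # one gradient entry per worker
err_small_M = abs(ota_sum_estimate(g, M=4) - g.sum())
err_large_M = abs(ota_sum_estimate(g, M=4096) - g.sum())
```

As M grows, the diagonal terms concentrate around Σ_k g_k while the worker-to-worker cross terms shrink like 1/√M, which is the mechanism behind the claim that channel effects vanish with enough PS antennas.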
We assume that the parameter server (PS) employs multiple antennas to align the received signals, with no channel state information (CSI) at the workers. To reduce power consumption and hardware costs, we employ complex-valued low-resolution analog-to-digital converters (ADCs) at the receiver side and study the effects of these practical low-cost ADCs on the learning performance of the system. Our theoretical analysis shows that the impairments caused by a low-resolution ADC do not prevent the convergence of the learning algorithm, and that the fading effects vanish when a sufficient number of antennas is used at the PS. We validate our theoretical results via simulations and further show that using one-bit ADCs causes only a slight decrease in learning accuracy.

Item Open Access
On federated learning over wireless channels with over-the-air aggregation (2022-07) Aygün, Ozan
A decentralized machine learning (ML) approach called federated learning (FL) has recently been at the center of attention, since it protects edge users' data and decreases communication costs. In FL, a parameter server (PS), which keeps track of the global model, orchestrates local training and global model aggregation across a set of mobile users (MUs). While there exist studies on FL over wireless channels, its performance in practical wireless communication scenarios has not been thoroughly investigated. With this motivation, this thesis considers wireless FL schemes that use realistic channel models and analyzes the impact of different wireless channel effects. In the first part of the thesis, we study hierarchical federated learning (HFL), where intermediate servers (ISs) are utilized to bring the server side closer to the MUs. A clustering approach is used in which MUs are assigned to ISs to perform multiple cluster aggregations before the global aggregation.
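The hierarchical structure described above, with cluster-level aggregation rounds before each global round, can be sketched in a few lines. This is an illustrative toy (the quadratic loss, cluster sizes, round counts, and noise model are hypothetical), not the thesis's algorithm:

```python
import numpy as np

# Minimal hierarchical federated learning (HFL) sketch: mobile users
# (MUs) are grouped into clusters, each served by an intermediate server
# (IS). Every cluster runs several cluster-level aggregation rounds
# starting from the current global model, and the parameter server (PS)
# then averages the cluster models into the next global model.
rng = np.random.default_rng(2)
dim = 5
w_star = rng.normal(size=dim)             # unknown optimum
clusters = [4, 4, 3]                      # MUs per cluster (hypothetical)
w_global = np.zeros(dim)
lr, cluster_rounds, global_rounds = 0.1, 3, 30

def mu_gradient(w):
    # Noisy local gradient of the shared quadratic loss.
    return (w - w_star) + 0.3 * rng.normal(size=dim)

initial_loss = 0.5 * np.sum((w_global - w_star) ** 2)
for _ in range(global_rounds):
    cluster_models = []
    for n_mus in clusters:
        w_c = w_global.copy()
        for _ in range(cluster_rounds):   # IS-level aggregations
            grads = [mu_gradient(w_c) for _ in range(n_mus)]
            w_c -= lr * np.mean(grads, axis=0)
        cluster_models.append(w_c)
    w_global = np.mean(cluster_models, axis=0)  # PS-level aggregation
final_loss = 0.5 * np.sum((w_global - w_star) ** 2)
```

Running several cheap cluster rounds per global round is what gives HFL its communication advantage: most aggregation traffic stays between the MUs and their nearby IS.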
We first analyze the performance of a partially wireless approach where the MUs send their gradients through a channel with path loss and fading using over-the-air (OTA) aggregation. We assume that there is no inter-cluster interference and that the gradients from the ISs to the PS are sent error-free. We show through numerical and experimental analysis that our proposed algorithm offers faster convergence and lower power consumption compared to standard FL with OTA aggregation. As an extension, we also examine a fully wireless HFL setup where both the MUs and the ISs send their gradients through OTA aggregation, taking into account the effect of inter-cluster interference. Our numerical and experimental results reveal that utilizing ISs yields faster convergence and better performance than OTA FL without any IS, while using less transmit power. It is also shown that the best number of cluster aggregations depends on the data distribution among the MUs and the clusters.

In the second part of the thesis, we study FL with energy-harvesting MUs and stochastic energy arrivals. In every global iteration, the MUs with enough energy in their batteries perform local SGD iterations and transmit their gradients using OTA aggregation. Before the gradients are sent to the PS, they are scaled with respect to the idle time and data cardinality of each MU, through a cooldown multiplier, to amplify the importance of the MUs that send less frequent local updates. We provide a convergence analysis of the proposed setup and validate our results with numerical and neural network simulations under different energy arrival profiles. The results show that OTA FL with energy-harvesting devices performs only slightly worse than OTA FL without any energy restrictions, and that utilizing the excess energy for more local SGD iterations gives a better convergence rate than simply increasing the transmit power.
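The cooldown-multiplier idea in the second part can be sketched as follows. The multiplier form below (data share times one plus idle rounds) and the Bernoulli energy-arrival model are illustrative assumptions, not the thesis's exact definitions:

```python
import numpy as np

# Hypothetical sketch of gradient scaling for energy-harvesting MUs: a
# user whose battery was empty for several rounds gets its next update
# amplified in proportion to its idle time and data cardinality, so
# infrequent contributors are not underrepresented in the aggregate.
rng = np.random.default_rng(3)
n_mus, dim, rounds, lr = 10, 5, 100, 0.05
w_star = rng.normal(size=dim)             # unknown optimum
w = np.zeros(dim)
data_frac = np.full(n_mus, 1.0 / n_mus)   # equal data cardinality
idle = np.zeros(n_mus)                    # rounds since last participation

initial_loss = 0.5 * np.sum((w - w_star) ** 2)
for _ in range(rounds):
    has_energy = rng.random(n_mus) < 0.5  # stochastic energy arrivals
    updates = []
    for k in range(n_mus):
        if has_energy[k]:
            g = (w - w_star) + 0.3 * rng.normal(size=dim)
            # Cooldown multiplier: weight by data share and idle rounds.
            updates.append(data_frac[k] * (1 + idle[k]) * g)
            idle[k] = 0
        else:
            idle[k] += 1                  # battery empty, sit out a round
    if updates:
        w -= lr * np.sum(updates, axis=0)
final_loss = 0.5 * np.sum((w - w_star) ** 2)
```

On average the amplification compensates for the missed rounds, so the weighted sum of participating updates behaves like a full-participation aggregate, which is the intuition behind the "slightly worse" performance reported for the energy-harvesting setup.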