Online learning over distributed networks
Abstract
We study online learning strategies over distributed networks. Here, we have a distributed collection of agents with learning and cooperation capabilities. These agents observe a noisy version of a desired state of nature through a linear model. The agents seek to learn this state by also interacting with one another, yet the communication load plays a significant role. To this end, we propose compressive diffusion strategies that extract compressed information from the diffused data. Agents can compress the information into a scalar or even a single bit, i.e., a substantial reduction in the communication load. Importantly, we show that agents can achieve performance comparable to that of conventional diffusion strategies, which require the direct diffusion of information without compression and with infinite precision. We also examine which information to disclose and how to utilize it optimally in the mean-square-error (MSE) sense. Note that all the well-known distributed learning strategies achieve suboptimal learning performance in the MSE sense. Hence, we provide algorithms that achieve the distributed minimum-MSE (MMSE) performance over an arbitrary network topology based on the aggregation of information at each agent. This approach differs from the diffusion of information across the network, i.e., the exchange of local estimates. Notably, the exchange of local estimates is sufficient only over certain network topologies. For these networks, we also propose strategies that achieve the distributed MMSE performance through the diffusion of information. Hence, we can substantially reduce the communication load while achieving the best possible MSE performance. Finally, for practical implementations, we provide approaches that reduce the complexity of the algorithms through time-windowing of the observations.
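To make the single-bit exchange idea concrete, the sketch below simulates a small diffusion-LMS network in which each agent adapts on its own noisy linear observation and broadcasts only the sign of a scalar projection of its intermediate estimate; neighbors maintain a reconstruction that is updated from that bit before the usual combination step. The ring topology, step sizes, shared projection direction, and reconstruction rule are illustrative assumptions for this example, not the exact algorithms developed in the work.

```python
# Minimal sketch (illustrative, not the exact algorithm of this work):
# diffusion LMS where each agent exchanges a single bit per iteration.
import numpy as np

rng = np.random.default_rng(0)

M, N, T = 5, 10, 5000   # parameter dimension, number of agents, iterations
mu, eta = 0.01, 0.05    # adaptation step size, reconstruction step size

w_true = rng.standard_normal(M)          # unknown state of nature

# Ring network with self-loops and uniform combination weights (assumed).
A = np.zeros((N, N))
for i in range(N):
    for j in (i - 1, i, i + 1):
        A[i, j % N] = 1.0
A /= A.sum(axis=1, keepdims=True)

w = np.zeros((N, M))         # each agent's current estimate
hat_psi = np.zeros((N, M))   # common reconstruction of each agent's estimate

for t in range(T):
    # Shared pseudo-random projection direction (e.g., from a common seed).
    c = rng.standard_normal(M)
    c /= np.linalg.norm(c)

    # Adaptation: each agent runs an LMS update on its noisy observation.
    psi = np.empty_like(w)
    for i in range(N):
        u = rng.standard_normal(M)
        d = u @ w_true + 0.1 * rng.standard_normal()
        psi[i] = w[i] + mu * (d - u @ w[i]) * u

    # Compression: each agent broadcasts one bit, the sign of the projected
    # mismatch between its estimate and its publicly known reconstruction.
    bits = np.sign(np.array([c @ (psi[j] - hat_psi[j]) for j in range(N)]))

    # All nodes update the reconstructions identically from the received bits.
    hat_psi += eta * bits[:, None] * c

    # Combination: each agent mixes its own exact estimate with the
    # reconstructed estimates of its neighbors.
    for i in range(N):
        w[i] = A[i, i] * psi[i] + sum(A[i, j] * hat_psi[j]
                                      for j in range(N)
                                      if j != i and A[i, j] > 0)

print(f"final average MSE: {np.mean((w - w_true) ** 2):.4f}")
```

Under this kind of scheme, each agent transmits one bit per iteration instead of an M-dimensional, infinite-precision estimate, which is the source of the communication savings discussed above.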