Epidemic Learning: Boosting Decentralized Learning with Randomized Communication

Abstract

We present Epidemic Learning (EL), a simple yet powerful decentralized learning (DL) algorithm that leverages changing communication topologies to achieve faster model convergence compared to conventional DL approaches. At each round of EL, each node sends its model updates to a random sample of s other nodes (in a system of n nodes). We provide an extensive theoretical analysis of EL, demonstrating that its changing topology culminates in superior convergence properties compared to the state-of-the-art (static and dynamic) topologies. Considering smooth non-convex loss functions, the number of transient iterations for EL, i.e., the rounds required to achieve asymptotic linear speedup, is in O(n^3/s^2), which outperforms the best-known bound O(n^3) by a factor of s^2, indicating the benefit of randomized communication for DL. We empirically evaluate EL in a 96-node network and compare its performance with state-of-the-art DL approaches. Our results illustrate that EL converges up to 1.6× quicker than baseline DL algorithms and attains 1.8% higher accuracy for the same communication volume.
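The communication pattern described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' implementation: the names `el_round` and `spread` are hypothetical, models are plain lists of floats, the local training step is omitted, and the aggregation rule (each node averages its own model with whatever it received) is an assumption that the abstract does not spell out.

```python
import random


def el_round(models, s, rng):
    """Simulate one communication round: every node pushes its model
    to s peers sampled uniformly at random without replacement."""
    n = len(models)
    inbox = [[] for _ in range(n)]
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for j in rng.sample(others, s):
            inbox[j].append(models[i])

    new_models = []
    for i in range(n):
        # Assumption: a node averages its own model with the received ones.
        group = inbox[i] + [models[i]]
        dim = len(models[i])
        new_models.append(
            [sum(m[k] for m in group) / len(group) for k in range(dim)]
        )
    return new_models


def spread(models):
    """Largest per-coordinate gap between any two nodes' models."""
    dim = len(models[0])
    return max(
        max(m[k] for m in models) - min(m[k] for m in models)
        for k in range(dim)
    )


if __name__ == "__main__":
    rng = random.Random(0)
    models = [[float(i)] for i in range(8)]  # 8 nodes, 1-parameter models
    for _ in range(5):
        models = el_round(models, s=3, rng=rng)
    print(spread(models))
```

Because every new model is an average of existing models, the disagreement between nodes measured by `spread` cannot grow from one round to the next. Each round draws a fresh set of recipients, so the communication graph changes over time instead of staying fixed.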

Publication
In Advances in Neural Information Processing Systems 37 (2023)
Rishi Sharma
PhD Student at EPFL

Currently exploring research interests in Computer Science.
