Please use this identifier to cite or link to this item:
http://hdl.handle.net/11375/30218
Title: | Improving Communication Efficiency And Convergence In Federated Learning |
Authors: | Liu, Yangyi |
Advisor: | Chen, Jun |
Department: | Electrical and Computer Engineering |
Keywords: | Federated Learning;Information Theory;Model Compression;Communication;Machine Learning |
Publication Date: | 2024 |
Abstract: | Federated learning is an emerging field that has received tremendous attention as it enables training Deep Neural Networks in a distributed fashion. By keeping the data decentralized, Federated Learning enhances data privacy and security while maintaining the ability to train robust machine learning models. Unfortunately, despite these advantages, the communication overhead resulting from the demand for frequent communication between the central server and remote clients poses a serious challenge to the present-day communication infrastructure. As the size of the deep learning models and the number of devices participating in the training are ever increasing, the model gradient transmission between the remote clients and the central server orchestrating the training process becomes the critical performance bottleneck. In this thesis, we investigate and address the problems related to improving the communication efficiency while maintaining convergence speed and accuracy in Federated Learning. To characterize the trade-off between communication cost and convergence in Federated Learning, an innovative formulation utilizing the clients’ correlation is proposed, which considers gradient transmission and reconstruction problems as a multi-terminal source coding problem. Leveraging this formulation, the model update problem in Federated Learning is converted to a convex optimization problem from a rate-distortion perspective. Technical results, including an iterative algorithm to solve for the upper bound and lower bound of the sum-rate, as well as the rate allocation schemes, are provided. Additionally, a correlation-aware client selection strategy is proposed and evaluated against the state-of-the-art methods. Extensive simulations are conducted to validate our theoretical analysis and the effectiveness of the proposed approaches.
Furthermore, based on the statistical insights about the model gradient, we propose a gradient compression algorithm also inspired by rate-distortion theory. More specifically, the proposed algorithm adopts model-wise sparsification for preliminary gradient dimension reduction and then performs layer-wise gradient quantization for further compression. The experimental results show that our approach achieves compression as aggressive as 1-bit while maintaining proper model convergence speed and final accuracy. |
URI: | http://hdl.handle.net/11375/30218 |
Appears in Collections: | Open Access Dissertations and Theses |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
Liu_Yangyi_202408_PhD.pdf | | 4.64 MB | Adobe PDF | View/Open |
Items in MacSphere are protected by copyright, with all rights reserved, unless otherwise indicated.