PENS: Leveraging Data Heterogeneity in Federated Learning
Master's thesis
Federated learning (FL) is a decentralized machine learning technique in which training is performed cooperatively by exchanging model weights or gradients instead of sharing raw data between the cooperating devices (clients). Classical FL algorithms such as federated averaging work best in the special case where the data is IID across clients. In this work, we address the problem of data heterogeneity in federated learning. We propose a decentralized federated learning (DFL) algorithm, termed the Performance-based Neighbour Selection Federated Learning Algorithm (PENS), that effectively leverages the data heterogeneity across clients. PENS is a cooperative, communication-based algorithm in which clients communicate with other clients that have similar data distributions. Specifically, model performance is used as a proxy for data similarity, since no raw data may be shared among clients. Experiments on the CIFAR-10 dataset show that this communication scheme yields higher model accuracies than random client-to-client communication. The method is robust across different numbers of participating clients as long as the local datasets are sufficiently large.
decentralized federated learning, federated learning, data heterogeneity, personalization, distributed machine learning, gossip learning, privacy, image classification
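The core idea above, using a peer model's performance on a client's own validation data as a proxy for data similarity, can be sketched as follows. This is a minimal illustration under assumed simplifications (a linear model, mean-squared-error loss, and a better-than-average selection rule); the function names and setup are hypothetical and do not reproduce the thesis's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def local_loss(weights, X, y):
    """Mean squared error of a linear model on this client's validation data."""
    return float(np.mean((X @ weights - y) ** 2))

def select_neighbours(peer_weights, X_val, y_val):
    """Keep peers whose models score better than the average peer loss
    on the local validation set (performance as a similarity proxy)."""
    losses = [local_loss(w, X_val, y_val) for w in peer_weights]
    threshold = np.mean(losses)
    return [i for i, loss in enumerate(losses) if loss <= threshold]

def aggregate(own_weights, peer_weights, selected):
    """Average the client's own weights with the selected neighbours' weights."""
    stack = [own_weights] + [peer_weights[i] for i in selected]
    return np.mean(stack, axis=0)

# Toy run: this client's data follows w_true; peers 0 and 1 hold a similar
# distribution, while peer 2 trained on very different data.
w_true = np.array([1.0, -2.0])
X_val = rng.normal(size=(50, 2))
y_val = X_val @ w_true

peers = [
    w_true + rng.normal(scale=0.1, size=2),  # similar distribution
    w_true + rng.normal(scale=0.1, size=2),  # similar distribution
    np.array([5.0, 5.0]),                    # dissimilar peer
]

selected = select_neighbours(peers, X_val, y_val)
new_w = aggregate(w_true, peers, selected)
```

The dissimilar peer produces a large loss on the local validation data, so it falls above the average-loss threshold and is excluded from aggregation, while the two similar peers are kept.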