While federated learning has become the most widely used approach for distributed, privacy-preserving machine learning, its efficacy decreases when clients have heterogeneous resources or data distributions. To address these challenges, several state-of-the-art algorithms modify the baseline FedAvg algorithm to reduce clients’ network usage and computational demands. These modifications let clients train models within their computational “budgets”, but often at the cost of reduced model accuracy. Split learning offers an alternative, principled solution. By shifting most of the computational burden onto a central server, split learning reduces the memory and computational requirements on clients. While this enables clients to train larger models than they could manage independently, it also places heavy demands on the server, leading to scalability issues. In this talk, we will first provide an overview of algorithms designed to address client heterogeneity in distributed machine learning. Next, we will introduce split learning as a practical approach for training large models across clients with limited computational resources. Finally, we will explore ongoing research aimed at reducing server-side demands by minimizing the amount of data sent to the server through data filtering techniques, such as coreset selection and uncertainty-based sampling. Ultimately, this research seeks to answer the question: Can a large cohort of clients privately collaborate to train a model that exceeds their individual computational capabilities?
Boris Radovič earned his B.Sc. in Computer Science and M.Sc. in Data Science from the University of Ljubljana in 2020 and 2023, respectively. He is currently a Ph.D. candidate jointly affiliated with the Faculty of Computer and Information Science at the University of Ljubljana and King Abdullah University of Science and Technology (KAUST) in Saudi Arabia, advised by Professors Marco Canini and Veljko Pejović. His research focuses on federated learning and collaborative learning, with a particular emphasis on different aspects of client heterogeneity.