Breaking the Memory Wall for Heterogeneous Federated Learning via Progressive Training


August 3–7, 2025, Toronto, ON, Canada | Yebo Wu, Li Li, Cheng-zhong Xu
The paper "Breaking the Memory Wall for Heterogeneous Federated Learning via Progressive Training" by Yebo Wu, Li Li, and Cheng-zhong Xu addresses the challenge of memory constraints in Federated Learning (FL) on resource-constrained devices. The authors propose ProFL, a novel framework that divides the global model into blocks and trains them progressively, freezing converged blocks to reduce peak memory usage. ProFL features two stages: progressive model shrinking and progressive model growing. During model shrinking, basic layers are trained to preserve feature representation and initialize parameters for each block. In the growing stage, the trained blocks are progressively combined to form the full model. A novel metric, effective movement, is introduced to assess block convergence and control freezing timing. Theoretical analysis and extensive experiments on various datasets and models demonstrate that ProFL reduces peak memory usage by up to 57.4% and improves model accuracy by up to 82.4%. The framework is shown to be effective in both IID and Non-IID settings, and it is compatible with existing FL algorithms.The paper "Breaking the Memory Wall for Heterogeneous Federated Learning via Progressive Training" by Yebo Wu, Li Li, and Cheng-zhong Xu addresses the challenge of memory constraints in Federated Learning (FL) on resource-constrained devices. The authors propose ProFL, a novel framework that divides the global model into blocks and trains them progressively, freezing converged blocks to reduce peak memory usage. ProFL features two stages: progressive model shrinking and progressive model growing. During model shrinking, basic layers are trained to preserve feature representation and initialize parameters for each block. In the growing stage, the trained blocks are progressively combined to form the full model. A novel metric, effective movement, is introduced to assess block convergence and control freezing timing. Theoretical analysis and extensive experiments on various datasets and models demonstrate that ProFL reduces peak memory usage by up to 57.4% and improves model accuracy by up to 82.4%. The framework is shown to be effective in both IID and Non-IID settings, and it is compatible with existing FL algorithms.