This paper introduces Co-Supervised Learning (CSL), a method for improving weak-to-strong generalization by leveraging multiple specialized teachers instead of a single generalist one. The approach is inspired by the classical hierarchical mixture-of-experts model and comprises two key components: teacher assignment and noise reduction. Teacher assignment alternates between training the student and re-assigning supervisors, progressively identifying the most suitable teacher for each sample. Noise reduction enforces consistency between teachers and students, as well as between local and global models, to reject potential annotation noise. The method is validated on visual recognition tasks across multi-domain datasets, showing significant improvements over the vanilla single-teacher baseline. Experiments on the OpenAI weak-to-strong benchmark and DomainNet demonstrate that CSL outperforms existing methods, particularly when the capability gap between the student and the teachers is large. The results suggest that CSL provides a more effective way to align strong models with human preferences, contributing to the broader goal of superalignment. The method is implemented in Python and available at https://github.com/yuejiangliu/csl.
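To make the two components concrete, the sketch below illustrates one way the alternating loop described above could be organized: each round assigns every unlabeled sample to the teacher most consistent with the current student, pseudo-labels it with that teacher, filters out samples where teacher and student disagree, and retrains the student. The estimator interface (`predict_proba`, `fit`), the consistency score, and the filtering rule are illustrative assumptions for exposition, not the released implementation.

```python
import numpy as np


def co_supervised_learning(student, teachers, unlabeled_x, rounds=3):
    """Conceptual sketch of the alternating CSL scheme (not the authors' code).

    Assumes hypothetical estimator objects: each teacher and the student
    expose `predict_proba(x) -> (N, C)`, and the student exposes `fit(x, y)`.
    """
    n = len(unlabeled_x)
    for _ in range(rounds):
        # Teacher assignment: score how consistent each teacher is with the
        # current student and assign every sample to its best-matching teacher.
        student_probs = student.predict_proba(unlabeled_x)                 # (N, C)
        teacher_probs = np.stack(
            [t.predict_proba(unlabeled_x) for t in teachers]
        )                                                                   # (T, N, C)
        consistency = (teacher_probs * student_probs[None]).sum(axis=-1)    # (T, N)
        assigned = consistency.argmax(axis=0)                               # (N,)

        # Pseudo-labels come from each sample's assigned teacher.
        assigned_probs = teacher_probs[assigned, np.arange(n)]              # (N, C)
        pseudo_labels = assigned_probs.argmax(axis=-1)

        # Noise reduction: drop samples where the assigned teacher and the
        # student disagree, treating disagreement as likely annotation noise.
        keep = pseudo_labels == student_probs.argmax(axis=-1)
        student.fit(unlabeled_x[keep], pseudo_labels[keep])

    return student
```

In this sketch the per-sample consistency score is simply the inner product of teacher and student class probabilities; any calibrated agreement measure could play the same role, and the local/global consistency check mentioned in the abstract is omitted for brevity.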