Re-Simulation-based Self-Supervised Learning for Pre-Training Foundation Models

11 Mar 2024 | P. Harris, M. Kagan, J. Krupa, B. Maier, and N. Woodward
RS3L is a simulation-based self-supervised learning strategy that uses re-simulation to generate data augmentations for contrastive learning. The method intervenes in the simulation process: it fixes the upstream latent state of an event and re-simulates the downstream components, so that augmentations are sampled from the full range of variations the simulator can produce. This yields a domain-complete dataset, which is crucial for learning robust representations.

Training uses a contrastive loss that aligns positive pairs and pushes negative pairs apart, producing a latent space that captures key physical features.

RS3L is tested on high-energy physics data, where it performs strongly on downstream tasks such as jet discrimination and uncertainty mitigation. In jet tagging, it achieves performance competitive with state-of-the-art deep learning models and shows improved robustness against domain shifts. It is also effective on out-of-distribution tasks, such as distinguishing QCD jets from jets produced in hadronic W boson decays. Compared with fully supervised learning, RS3L shows significant improvements in performance and robustness.

The results indicate that RS3L provides a foundation model usable for a variety of downstream tasks in high-energy physics, with the potential to improve the efficiency of deep learning training in the field. The approach also extends to other domains where simulation is available. The RS3L dataset is publicly available for further research.
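The re-simulation idea described above (fix the upstream latent state, re-run only the downstream stochastic steps) can be illustrated with a toy sketch. Everything here is hypothetical: `simulate_upstream`, `simulate_downstream`, the `energy` and `n_prongs` fields, and the Gaussian smearing are stand-ins for a real event generator and detector simulation, not the authors' pipeline.

```python
import random


def simulate_upstream(event_seed):
    """Hypothetical upstream step: draw the latent state of one event."""
    rng = random.Random(event_seed)
    return {"energy": rng.uniform(50.0, 500.0), "n_prongs": rng.choice([1, 2])}


def simulate_downstream(latent, downstream_seed, smear=0.1):
    """Hypothetical downstream step: stochastic observation of a fixed latent state."""
    rng = random.Random(downstream_seed)
    share = latent["energy"] / latent["n_prongs"]
    # Each prong's measured energy fluctuates around its share of the total.
    return [share * rng.gauss(1.0, smear) for _ in range(latent["n_prongs"])]


def resimulate(event_seed, n_augmentations, smear=0.1):
    """Fix the latent state once, then re-run only the downstream step."""
    latent = simulate_upstream(event_seed)  # generated once per event
    views = [
        # A distinct integer seed per view gives a different downstream outcome.
        simulate_downstream(latent, 1_000_003 * event_seed + k, smear)
        for k in range(n_augmentations)
    ]
    return latent, views


latent, views = resimulate(event_seed=7, n_augmentations=4)
print(latent)
for view in views:
    print(view)
```

All views returned for one event share the same latent state and differ only through the downstream randomness, which is what makes them natural positive pairs for contrastive training.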
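The summary states only that the contrastive loss aligns positive pairs and pushes negative pairs apart. As a minimal sketch of one loss with that behavior, the block below implements an NT-Xent (SimCLR-style) loss in plain Python. The choice of NT-Xent, the cosine similarity, the `temperature` value, and the function names are assumptions made for illustration; embeddings are assumed to be non-zero vectors.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nt_xent_loss(views_a, views_b, temperature=0.1):
    """NT-Xent loss; views_a[i] and views_b[i] embed two augmentations of event i."""
    z = list(views_a) + list(views_b)
    n = len(views_a)
    total = 0.0
    for i in range(2 * n):
        positive = (i + n) % (2 * n)  # index of the other view of the same event
        pos_logit = cosine(z[i], z[positive]) / temperature
        # Denominator runs over every other embedding, positive included.
        others = [cosine(z[i], z[j]) / temperature for j in range(2 * n) if j != i]
        m = max(others)  # subtract the max for a numerically stable log-sum-exp
        log_denominator = m + math.log(sum(math.exp(x - m) for x in others))
        total += log_denominator - pos_logit
    return total / (2 * n)


a = [[1.0, 0.0], [0.0, 1.0]]
b_aligned = [[0.9, 0.1], [0.1, 0.9]]   # each view close to its partner
b_swapped = [[0.1, 0.9], [0.9, 0.1]]   # each view close to the wrong event
print(nt_xent_loss(a, b_aligned), nt_xent_loss(a, b_swapped))
```

The loss is lower when the two views of each event sit close together and far from the other events, which is the alignment and separation behavior described in the text.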