Finetuning Foundation Models for Joint Analysis Optimization

25 Jan 2024 | Matthias Vigl, Nicole Hartman, Lukas Heinrich
This paper explores the application of modern machine learning techniques, particularly foundation models, to optimize the data analysis process in High Energy Physics (HEP). The authors demonstrate that significant gains in performance and data efficiency can be achieved by moving beyond the traditional sequential optimization approach, which involves optimizing reconstruction and analysis components separately. Instead, they propose a more global gradient-based optimization strategy inspired by modern machine learning workflows, including pretraining, fine-tuning, domain adaptation, and high-dimensional embedding spaces. The study focuses on a specific use case: searching for heavy resonances decaying via an intermediate di-Higgs system to four $b$-jets. The authors compare different architectural and training strategies, including traditional HEP workflows (frozen backbone), fine-tuning, and from-scratch training. They find that fine-tuning, particularly when combined with pretraining, significantly improves performance and data efficiency. Fine-tuned models achieve up to 10-100 times better data efficiency and can reach higher background rejection rates compared to frozen backbone models. The paper also discusses the importance of domain adaptation, showing that pretraining on datasets other than the target dataset can enhance performance. Additionally, it highlights the potential for automated calibration techniques and the need to design better pretraining tasks to further improve performance. Overall, the study provides a conceptual framework for optimizing HEP data analysis pipelines and demonstrates the practical benefits of applying modern machine learning techniques to this field.
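As a rough illustration of the training strategies being compared, the sketch below shows, in PyTorch-style pseudocode, how a "frozen backbone", a "fine-tuning", and a "from-scratch" setup differ in which parameters receive gradients from the downstream analysis loss. This is not the authors' code: the `backbone` and `head` modules, their dimensions, and the loss are placeholder assumptions chosen only to make the distinction concrete.

```python
import torch
import torch.nn as nn

# Hypothetical modules: a backbone pretrained on a reconstruction task
# (e.g. jet tagging) and a small head for the downstream analysis task
# (e.g. signal vs. background discrimination). Sizes are illustrative.
backbone = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 128))
head = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 1))

STRATEGY = "fine-tune"  # one of: "frozen", "fine-tune", "from-scratch"

if STRATEGY == "frozen":
    # Traditional sequential HEP workflow: the pretrained backbone is fixed
    # and only the analysis head is optimized on the downstream objective.
    for p in backbone.parameters():
        p.requires_grad = False
    params = list(head.parameters())
elif STRATEGY == "fine-tune":
    # Joint optimization: backbone weights start from the pretrained values
    # but are updated together with the head by the downstream loss.
    params = list(backbone.parameters()) + list(head.parameters())
else:  # "from-scratch"
    # Same architecture, but backbone weights are randomly re-initialized
    # instead of being loaded from a pretraining checkpoint.
    for m in backbone.modules():
        if isinstance(m, nn.Linear):
            nn.init.xavier_uniform_(m.weight)
            nn.init.zeros_(m.bias)
    params = list(backbone.parameters()) + list(head.parameters())

optimizer = torch.optim.Adam(params, lr=1e-4)
loss_fn = nn.BCEWithLogitsLoss()

# Dummy batch standing in for per-event features and signal/background labels.
x = torch.randn(32, 64)
y = torch.randint(0, 2, (32, 1)).float()

optimizer.zero_grad()
loss = loss_fn(head(backbone(x)), y)
loss.backward()
optimizer.step()
```

In this picture, the paper's reported gains correspond to the "fine-tune" branch: starting from pretrained backbone weights and letting the downstream analysis loss update them yields better performance and data efficiency than either keeping the backbone frozen or training the same architecture from random initialization.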