Understanding The Landscape of Unfolding with Machine Learning

The paper "The Landscape of Unfolding with Machine Learning" by Nathan Huetsch et al. surveys machine learning (ML) techniques for unfolding particle physics data, covering both existing and new methods. The authors evaluate these methods on two datasets, Z+jets and top quark pair production, to assess how accurately they reproduce particle-level spectra across complex observables.

The paper begins with an introduction to the unfolding problem, highlighting the challenges of traditional forward inference methods and the advantages of ML-based approaches. It then details several ML-based unfolding methods, grouped into reweighting (OmniFold), distribution mapping (Schrödinger Bridge and Direct Diffusion), and generative unfolding (cINN, Transfermer, CFM, TraCFM, Latent Diffusion):

- **OmniFold**: A deep learning-based method that reweights simulated samples to match the data.
- **Schrödinger Bridge**: Uses a time-dependent stochastic differential equation to map particle-level events to reco-level events.
- **Direct Diffusion**: Similar to Schrödinger Bridge, but uses an ordinary differential equation to describe the time evolution of events.
- **Generative Unfolding**: Uses conditional generative networks to learn the inverse simulation from reco-level to particle-level events.

The authors benchmark these methods on the Z+jets dataset, demonstrating that all techniques can accurately reproduce particle-level spectra. They also explore the performance of these methods on top quark pair production, highlighting their potential for probing the Standard Model with high precision.

The paper concludes by summarizing the advantages of the different methods, giving experimental collaborations guidance on selecting the most suitable method for a specific task. The results are supported by detailed figures and tables, including metrics such as the Wasserstein 1-distance, triangular distance, and energy distance, which quantify the agreement between unfolded and true particle-level distributions.
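To make the three metrics concrete, the sketch below computes them for one-dimensional toy samples using only the Python standard library. It is an illustration under stated assumptions, not the paper's evaluation code: the function names, the Gaussian toy samples, and the binning are invented here, and the triangular distance is taken to be half the bin-wise sum of (p - q)^2 / (p + q) over normalized histograms, which may differ from the paper's exact normalization.

```python
import math
import random


def wasserstein_1(a, b):
    """Wasserstein 1-distance between two equal-size 1D samples.

    For equal sizes and uniform weights this is the mean absolute
    difference between the sorted samples.
    """
    if len(a) != len(b):
        raise ValueError("this sketch assumes equal sample sizes")
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)


def mean_abs_diff(a, b):
    """Average of |x - y| over all pairs (x in a, y in b)."""
    return sum(abs(x - y) for x in a for y in b) / (len(a) * len(b))


def energy_distance(a, b):
    """Energy distance: sqrt(2 E|X-Y| - E|X-X'| - E|Y-Y'|)."""
    squared = 2 * mean_abs_diff(a, b) - mean_abs_diff(a, a) - mean_abs_diff(b, b)
    return math.sqrt(max(squared, 0.0))  # guard against tiny negative rounding


def histogram(sample, lo, hi, n_bins):
    """Normalized histogram on [lo, hi); values outside the range are dropped."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for x in sample:
        if lo <= x < hi:
            counts[min(int((x - lo) / width), n_bins - 1)] += 1
    total = sum(counts)
    return [c / total for c in counts]


def triangular_distance(p, q):
    """Assumed form: 0.5 * sum over bins of (p - q)^2 / (p + q)."""
    return 0.5 * sum(
        (pi - qi) ** 2 / (pi + qi) for pi, qi in zip(p, q) if pi + qi > 0
    )


# Toy stand-ins (hypothetical): a "true" particle-level sample, a good
# unfolding result, and a result with a shifted mean.
rng = random.Random(0)
truth = [rng.gauss(0.0, 1.0) for _ in range(500)]
good = [rng.gauss(0.0, 1.0) for _ in range(500)]
biased = [rng.gauss(0.5, 1.0) for _ in range(500)]

for name, unfolded in [("good", good), ("biased", biased)]:
    w1 = wasserstein_1(truth, unfolded)
    ed = energy_distance(truth, unfolded)
    tri = triangular_distance(
        histogram(truth, -4.0, 4.0, 20), histogram(unfolded, -4.0, 4.0, 20)
    )
    print(f"{name}: W1={w1:.3f} energy={ed:.3f} triangular={tri:.4f}")
```

All three metrics are zero for identical distributions and grow as the unfolded sample drifts from the truth, so lower values indicate better agreement. For real analyses, SciPy's `scipy.stats.wasserstein_distance` and `scipy.stats.energy_distance` handle unequal sample sizes and weights.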

The Landscape of Unfolding with Machine Learning

May 20, 2024 | Nathan Huetsch, Javier Mariño Villadamigo, Alexander Shmakov, Sascha Diefenbacher, Vinicius Mikuni, Theo Heimel, Michael Fenton, Kevin Greif, Benjamin Nachman, Daniel Whiteson, Anja Butter, Tilman Plehn