Computational Optimal Transport is a mathematical framework that studies the problem of transporting mass from one distribution to another, with applications in fields such as machine learning, computer vision, and statistics. The theory, rooted in the work of Gaspard Monge and later formalized by Leonid Kantorovich, provides a way to compare probability distributions by considering the cost of moving mass between points. This cost is quantified using a ground cost function, and the goal is to find the transport plan that minimizes the total cost.
The paper reviews the theoretical foundations of optimal transport (OT), including the concepts of histograms, measures, and couplings. It discusses the Monge problem, which seeks a deterministic mapping between two distributions, and the Kantorovich relaxation, which generalizes this problem to allow for probabilistic transport. The Kantorovich formulation is a linear program that finds the optimal coupling between two distributions, minimizing the expected transportation cost.
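In the discrete case, the Kantorovich problem described above can be written down and solved directly as a linear program: minimize the total cost ⟨P, C⟩ over couplings P whose row sums equal the source histogram a and whose column sums equal the target histogram b. The sketch below illustrates this with a generic LP solver; the histograms and cost matrix are made-up toy data, not taken from the paper.

```python
import numpy as np
from scipy.optimize import linprog

# Toy discrete Kantorovich problem: min <P, C>  s.t.  P @ 1 = a, P.T @ 1 = b, P >= 0.
a = np.array([0.5, 0.5])            # source histogram (sums to 1)
b = np.array([0.25, 0.75])          # target histogram (sums to 1)
C = np.array([[0.0, 1.0],
              [1.0, 0.0]])          # ground cost between bin locations

n, m = C.shape
# Build the equality constraints on the flattened coupling vec(P).
A_eq = np.zeros((n + m, n * m))
for i in range(n):
    A_eq[i, i * m:(i + 1) * m] = 1.0    # row-sum constraints: P @ 1 = a
for j in range(m):
    A_eq[n + j, j::m] = 1.0             # column-sum constraints: P.T @ 1 = b
b_eq = np.concatenate([a, b])

res = linprog(C.ravel(), A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
P = res.x.reshape(n, m)                 # optimal coupling
```

Here the optimum keeps 0.25 units of mass in place and moves 0.25 units across at unit cost, so the optimal value is 0.25. For large histograms, dedicated network-flow or regularized solvers are preferred over a generic LP solver.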
The paper also covers the algorithmic foundations of OT, including the Kantorovich linear program, c-transforms, and dual ascent methods. It discusses the entropic regularization of OT, which adds an entropy penalty to the objective, making the problem strictly convex and more tractable. This regularization leads to the Sinkhorn algorithm, a simple matrix-scaling iteration that is efficient for computing OT in large-scale settings.
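The Sinkhorn iterations mentioned above can be sketched in a few lines: form the Gibbs kernel K = exp(−C/ε) and alternately rescale its rows and columns until the coupling's marginals match the two histograms. This is a minimal illustration with an assumed regularization strength and iteration count, not a production implementation (a robust version would work in the log domain to avoid underflow for small ε).

```python
import numpy as np

def sinkhorn(a, b, C, eps=0.05, n_iters=500):
    """Sinkhorn iterations for entropy-regularized OT.

    a, b : source/target histograms (nonnegative, each summing to 1)
    C    : ground-cost matrix of shape (len(a), len(b))
    eps  : entropic regularization strength (illustrative choice)
    Returns the regularized coupling P and its transport cost <P, C>.
    """
    K = np.exp(-C / eps)              # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)             # scale columns to match b
        u = a / (K @ v)               # scale rows to match a
    P = u[:, None] * K * v[None, :]   # coupling: diag(u) K diag(v)
    return P, np.sum(P * C)

# Two toy histograms on a 1-D grid with squared-distance ground cost.
x = np.linspace(0.0, 1.0, 5)
C = (x[:, None] - x[None, :]) ** 2
a = np.full(5, 0.2)
b = np.array([0.1, 0.1, 0.2, 0.3, 0.3])
P, cost = sinkhorn(a, b, C)
# At convergence, P.sum(axis=1) ≈ a and P.sum(axis=0) ≈ b.
```

Each iteration costs only two matrix-vector products, which is what makes the method attractive at scale; smaller ε gives a closer approximation to unregularized OT but slows convergence.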
The paper explores various extensions of OT, including dynamic formulations, semidiscrete optimal transport, and Wasserstein spaces. It also discusses the geometric properties of OT, such as the Wasserstein distance, which defines a metric on the space of probability distributions. The paper highlights the importance of OT in data science, particularly in imaging sciences, graphics, and machine learning, where it provides powerful tools for comparing and manipulating distributions.
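The metric property mentioned above is easiest to see in one dimension, where the Wasserstein-1 distance reduces to a closed form: the L1 distance between the two quantile functions. A small sketch using SciPy's implementation (the sample values below are made up for illustration):

```python
import numpy as np
from scipy.stats import wasserstein_distance

# Two empirical 1-D distributions; y is x translated by 0.5.
x = np.array([0.0, 1.0, 2.0])
y = np.array([0.5, 1.5, 2.5])

# In 1-D, W1 is the area between the two CDFs; translating a
# distribution by t moves it exactly W1-distance t.
d = wasserstein_distance(x, y)
```

Unlike pointwise divergences such as KL, this distance stays finite and meaningful even when the distributions have disjoint supports, which is one reason OT is attractive in machine learning.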
The paper concludes by emphasizing the practical effectiveness of OT and its ability to provide a rich geometric structure on the space of probability distributions. It also discusses the computational challenges of OT and the recent advances in efficient algorithms that have made OT applicable to large-scale problems. The paper serves as a comprehensive overview of the theoretical and computational aspects of optimal transport, providing a foundation for further research and applications in data science and related fields.