The Elements of Differentiable Programming

July 24, 2024 | Mathieu Blondel, Vincent Roulet (Google DeepMind)

This book provides a comprehensive introduction to differentiable programming, a new programming paradigm that enables end-to-end differentiation of complex computer programs, including those with control flows and data structures. Differentiable programming is not merely the differentiation of programs, but also the thoughtful design of programs intended for differentiation. By making programs differentiable, we inherently introduce probability distributions over their execution, providing a means to quantify the uncertainty associated with program outputs.

Differentiable programming builds upon several areas of computer science and applied mathematics, including automatic differentiation, graphical models, optimization, and statistics. This book presents a comprehensive review of the fundamental concepts useful for differentiable programming. We adopt two main perspectives, that of optimization and that of probability, with clear analogies between the two.

Differentiable programming is not just deep learning. While there is clear overlap between deep learning and differentiable programming, their focus differs. Deep learning studies artificial neural networks composed of multiple layers, able to learn intermediate representations of the data. Neural network architectures have been proposed with various inductive biases: for example, convolutional neural networks are designed for images and transformers for sequences. Differentiable programming, on the other hand, studies techniques for designing complex programs and differentiating through them. It is useful beyond deep learning, for instance in reinforcement learning, probabilistic programming, and scientific computing in general.

Differentiable programming is not just autodiff. While autodiff is a key ingredient of differentiable programming, it is not the only one. Differentiable programming is also concerned with the design of principled differentiable operations. In fact, much research on differentiable programming has been devoted to making classical computer programming operations compatible with autodiff. As we shall see, many differentiable relaxations can be interpreted in a probabilistic framework.

A core theme of this book is the interplay between optimization, probability, and differentiation. Differentiation is useful for optimization and, conversely, optimization can be used to design differentiable operators.

The book is intended as a graduate-level introduction to differentiable programming. Our pedagogical choices are made with the machine learning community in mind. Some familiarity with calculus, linear algebra, probability theory, and machine learning is beneficial. The book does not need to be read linearly, chapter by chapter; when needed, we indicate at the beginning of a chapter which chapters are recommended as prerequisites.

Differentiable programming builds upon a variety of connected topics, and we review here relevant textbooks, tutorials, and software. The present book was also influenced by the data science textbook of Peyré (2020). The history of reverse-mode autodiff is reviewed by Griewank (2012). A tutorial presenting different perspectives on backpropagation is "There and Back Again: A Tale of Slopes and Expectations" (link) by Deisenroth and Ong.
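As a small illustration of the idea of end-to-end differentiation through a program with control flow, here is a sketch using JAX (the function and its branches are illustrative, not from the book): the branch is expressed with `jnp.where` so that the program remains traceable, and `jax.grad` returns the derivative of whichever branch is selected.

```python
import jax
import jax.numpy as jnp

def piecewise(x):
    # A toy program with data-dependent control flow, written branchlessly
    # with jnp.where so that it stays differentiable end to end:
    # returns x**2 for x > 0 and -x otherwise.
    return jnp.where(x > 0.0, x ** 2, -x)

# jax.grad differentiates the whole program; the gradient follows
# the branch selected by the condition.
g = jax.grad(piecewise)
print(float(g(3.0)))   # derivative of x**2 at 3.0 -> 6.0
print(float(g(-2.0)))  # derivative of -x -> -1.0
```

This is the autodiff side of the story; the design question, which the book emphasizes, is how to build such differentiable versions of classical operations in the first place.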
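To make the optimization/probability analogy concrete, consider a standard example of a differentiable relaxation with a probabilistic reading (a minimal NumPy sketch, not code from the book): softmax relaxes the non-differentiable argmax, and its output is exactly the expectation of the one-hot argmax vector under a Gibbs distribution over the choices, with a temperature controlling the sharpness.

```python
import numpy as np

def softmax(scores, temperature=1.0):
    # Gibbs distribution over choices: p_i proportional to exp(s_i / T).
    # Subtracting the max is a standard trick for numerical stability.
    z = np.exp((scores - scores.max()) / temperature)
    return z / z.sum()

scores = np.array([1.0, 2.0, 3.0])

# High temperature: a smooth, fully differentiable distribution over choices.
p_soft = softmax(scores, temperature=1.0)

# Temperature -> 0: the distribution concentrates on the argmax,
# recovering the hard, non-differentiable one-hot selection.
p_hard = softmax(scores, temperature=0.01)
```

Here differentiation, optimization, and probability meet in one object: the relaxation is smooth in the scores, and its probabilistic interpretation quantifies the uncertainty of the selection.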