Neural Tangent Kernel: Convergence and Generalization in Neural Networks

10 Feb 2020 | Arthur Jacot, Franck Gabriel, Clément Hongler
This paper introduces the Neural Tangent Kernel (NTK), a kernel that describes the behavior of neural networks during training: it governs the evolution of the network function under gradient descent. The authors show that in the infinite-width limit the NTK converges to a deterministic limit and remains constant during training, which makes it possible to study neural network training in function space rather than parameter space. Convergence of the training process can then be related to the positive-definiteness of the limiting NTK. For a least-squares cost, the network function follows a linear differential equation during training, and convergence is fastest along the largest kernel principal components of the input data. Numerical experiments confirm that the behavior of wide neural networks is close to this theoretical limit. The NTK thus provides a theoretical foundation for understanding the generalization properties of neural networks and motivates the use of early stopping to reduce overfitting.
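To make the dynamics described above concrete, the following is a sketch of the standard formulation of the NTK and of the resulting function-space gradient flow; the notation (a network function $f_\theta$, a cost $C$, and a kernel $\Theta$) is introduced here for illustration and follows the paper's conventions only approximately. For a network with parameters $\theta = (\theta_1, \dots, \theta_P)$, the empirical NTK is

$$\Theta(x, x') \;=\; \sum_{p=1}^{P} \partial_{\theta_p} f_\theta(x)\, \partial_{\theta_p} f_\theta(x').$$

Under gradient descent on a cost $C(f)$, the network function evolves by kernel gradient descent with respect to $\Theta$, and in the infinite-width limit $\Theta$ converges to a fixed kernel $\Theta_\infty$. For a least-squares cost $C(f) = \tfrac{1}{2}\lVert f - f^* \rVert^2$ on the training inputs, this yields the linear differential equation

$$\partial_t f_t \;=\; \Theta_\infty \,(f^* - f_t),$$

so the component of the error $f^* - f_t$ along each eigenvector of $\Theta_\infty$ decays exponentially at a rate set by the corresponding eigenvalue; the largest kernel principal components of the data are therefore fitted fastest, which is what motivates early stopping as a form of regularization.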