16 Jun 2024 | Ziming Liu, Yixuan Wang, Sachin Vaidya, Fabian Ruehle, James Halverson, Marin Soljačić, Thomas Y. Hou, Max Tegmark
Kolmogorov-Arnold Networks (KANs) are introduced as a promising alternative to Multi-Layer Perceptrons (MLPs). Whereas MLPs place fixed activation functions on nodes, KANs place learnable activation functions on edges, parameterized as splines. This design lets KANs achieve better accuracy and interpretability than MLPs on small-scale AI + Science tasks. Both theoretically and empirically, KANs exhibit faster neural scaling laws than MLPs, particularly in high-dimensional tasks, and they are especially effective on problems with compositional structure, such as function fitting and solving partial differential equations (PDEs). KANs can learn both the compositional structure of a task and its constituent univariate functions, outperforming MLPs in accuracy while requiring fewer parameters and generalizing better.

KANs also offer greater interpretability: they can be visualized intuitively, simplified through sparsity regularization and pruning, and refined interactively by users, making them useful collaborators for scientific discovery. Through examples in mathematics and physics, the paper demonstrates how KANs can help scientists rediscover mathematical and physical laws, and shows that they can be used for symbolic regression and special-function fitting. KANs are also effective at continual learning without catastrophic forgetting.

Overall, KANs offer a promising alternative to MLPs, with potential for improving current deep learning models.
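As a rough illustration of the architecture, the sketch below implements a single KAN-style layer in PyTorch: each edge carries its own learnable univariate function, and each output node simply sums its incoming edge functions. This is not the authors' pykan implementation; to keep it short, each edge function is a linear combination of fixed Gaussian basis functions rather than the paper's B-splines, and the residual base activation is omitted. The layer sizes and the toy target function are illustrative.

```python
# Minimal sketch of a KAN-style layer (assumption: simplified stand-in for the
# paper's B-spline parameterization, not the official pykan code).
import torch
import torch.nn as nn


class KANLayer(nn.Module):
    def __init__(self, in_dim, out_dim, num_basis=8, grid_range=(-2.0, 2.0)):
        super().__init__()
        # Fixed basis centers shared by all edges; only the per-edge mixing
        # coefficients below are learned.
        self.register_buffer("centers", torch.linspace(*grid_range, num_basis))
        self.width = (grid_range[1] - grid_range[0]) / num_basis
        # One coefficient vector per edge: shape (out_dim, in_dim, num_basis).
        self.coeffs = nn.Parameter(torch.randn(out_dim, in_dim, num_basis) * 0.1)

    def forward(self, x):
        # x: (batch, in_dim) -> basis activations: (batch, in_dim, num_basis)
        phi = torch.exp(-((x.unsqueeze(-1) - self.centers) / self.width) ** 2)
        # Output node o sums its learned edge functions phi_{o,i}(x_i).
        return torch.einsum("bik,oik->bo", phi, self.coeffs)


# Usage: a two-layer KAN fit to f(x, y) = exp(sin(pi * x) + y ** 2), one of the
# compositional toy targets discussed in the paper.
if __name__ == "__main__":
    model = nn.Sequential(KANLayer(2, 5), KANLayer(5, 1))
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    for step in range(2000):
        xy = torch.rand(256, 2) * 2 - 1
        target = torch.exp(torch.sin(torch.pi * xy[:, :1]) + xy[:, 1:] ** 2)
        loss = ((model(xy) - target) ** 2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    print(f"final MSE: {loss.item():.4f}")
```

In the full method, sparsity regularization and pruning would then shrink such a network to the few edges that matter, which is what makes the learned univariate functions readable and, in favorable cases, convertible to symbolic expressions.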