SineKAN: Kolmogorov-Arnold Networks Using Sinusoidal Activation Functions
**Authors:** Eric A. F. Reinhardt, P. R. Dinesh, Sergei Gleyzer
**Date:** 23 Jul 2024
**Abstract:**
This paper introduces SineKAN, a variant of Kolmogorov-Arnold Networks (KANs) that uses sinusoidal activation functions in place of B-Spline functions. KANs are an alternative to traditional multi-layer perceptrons (MLPs) based on the Kolmogorov-Arnold Representation Theorem, which states that any continuous multivariate function can be written as a finite sum of compositions of continuous univariate functions. The original implementation of KANs uses learnable B-Spline activation functions, which are effective but slower to evaluate than MLPs. SineKAN aims to improve speed while maintaining or improving numerical accuracy.
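For reference, the representation theorem underlying KANs states that any continuous function on $[0,1]^n$ admits an exact decomposition into univariate pieces:

```latex
% Kolmogorov-Arnold representation theorem: the outer functions \Phi_q
% and the inner functions \phi_{q,p} are continuous and univariate.
f(x_1, \dots, x_n) = \sum_{q=0}^{2n} \Phi_q\!\left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right)
```

KANs make this constructive by parameterizing the univariate functions with learnable activations; SineKAN chooses sinusoids for that role.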
**Key Contributions:**
1. **Model Architecture:** SineKAN replaces B-Splines with sinusoidal activations, using a grid of learnable frequencies and amplitudes over fixed phase shifts, which reduces the number of learnable parameters (see the sketch after this list).
2. **Weight Initialization:** A novel initialization strategy is proposed to stabilize model performance across different grid sizes and depths.
3. **Universal Approximation:** The combination of learnable amplitudes and sinusoidal activation functions is shown to satisfy universal approximation properties.
4. **Performance and Speed:** SineKAN outperforms B-SplineKAN in both accuracy and inference speed, running up to eight times faster at large batch sizes.
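As a rough illustration of the layer described in contribution 1, here is a minimal PyTorch sketch. The exact tensor shapes, the phase grid, and the initialization scale are assumptions made for illustration; this is not the authors' reference implementation, and it only loosely mimics the initialization scheme of contribution 2.

```python
import math

import torch
import torch.nn as nn


class SineKANLayer(nn.Module):
    """Minimal sketch of a SineKAN-style layer: learnable amplitudes and
    frequencies over a grid of fixed phase shifts (illustrative only)."""

    def __init__(self, in_features: int, out_features: int, grid_size: int = 8):
        super().__init__()
        # Fixed, evenly spaced phase shifts (non-learnable buffer).
        self.register_buffer("phase", torch.linspace(0.0, 2.0 * math.pi, grid_size))
        # Learnable per-grid-point frequencies (assumed here to be shared
        # across input features; the paper may parameterize this differently).
        self.freq = nn.Parameter(torch.ones(grid_size))
        # Learnable amplitudes mixing each (input, grid) pair into each output.
        # The 1/sqrt(fan-in) scaling keeps activations O(1) with depth; the
        # paper proposes its own initialization scheme, which this only mimics.
        self.amplitude = nn.Parameter(
            torch.randn(out_features, in_features, grid_size)
            / math.sqrt(in_features * grid_size)
        )
        self.bias = nn.Parameter(torch.zeros(out_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, in) -> s: (batch, in, grid) via broadcasting.
        s = torch.sin(x.unsqueeze(-1) * self.freq + self.phase)
        # Contract the input and grid dimensions into output features.
        return torch.einsum("big,oig->bo", s, self.amplitude) + self.bias
```

A full network is then a stack of such layers, e.g. `nn.Sequential(SineKANLayer(784, 64), SineKANLayer(64, 10))` for an MNIST classifier.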
**Results:**
- **MNIST Benchmark:** SineKAN achieves better accuracy than B-SplineKAN across various hidden layer sizes and depths.
- **Inference Speed:** SineKAN is significantly faster than B-SplineKAN, with better scaling for deep models and large batch sizes.
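An inference-speed comparison of this kind can be reproduced in spirit with a simple timing harness like the sketch below (illustrative only; the batch sizes, repetition count, and CPU-only setup are arbitrary choices, and a B-SplineKAN baseline of matching width and depth would be timed the same way):

```python
import time

import torch


@torch.no_grad()
def throughput(model: torch.nn.Module, in_features: int, batch_sizes, reps: int = 50):
    """Crude inference-throughput measurement (illustrative harness only)."""
    model.eval()
    for bs in batch_sizes:
        x = torch.randn(bs, in_features)
        model(x)  # warm-up pass
        start = time.perf_counter()
        for _ in range(reps):
            model(x)
        elapsed = time.perf_counter() - start
        print(f"batch={bs:6d}  {reps * bs / elapsed:12.0f} samples/s")
```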
**Discussion:**
SineKAN shows promise as a scalable, efficient alternative to B-SplineKAN, particularly for deep models and large batch sizes. Further work is needed to compare performance across a broader range of tasks and against other choices of activation function.
**Conclusion:**
SineKAN is a promising model that combines the benefits of sinusoidal activation functions with the Kolmogorov-Arnold framework, offering improved numerical performance and speed.