This paper studies the expressive power of shallow and deep neural networks with piecewise linear activation functions, focusing on approximation in Sobolev spaces. It establishes new rigorous upper and lower bounds for network complexity, showing that deep ReLU networks are more efficient than shallow ones in approximating smooth functions. For 1D Lipschitz functions, it describes adaptive depth-6 network architectures that outperform standard shallow architectures.
The paper analyzes the expressiveness of neural networks from the perspective of approximation theory, considering classes of functions whose derivatives up to a given order are bounded (Sobolev-type spaces). It shows that deep networks can approximate smooth functions more efficiently than shallow ones, with explicit bounds on the number of weights and computation units required to reach a given approximation error.
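To give the flavor of these bounds (stated loosely here, with the precise constants and hypotheses deferred to the paper), the headline upper bound says that every function $ f $ in the unit ball $ F_{d,n} $ of the Sobolev space $ W^{n,\infty}([0,1]^d) $ admits, for every $ \varepsilon \in (0,1) $, a ReLU network approximant $ \tilde f $ with

$$ \| f - \tilde f \|_{\infty} \le \varepsilon, \qquad \text{depth} \le c\,\bigl(\ln(1/\varepsilon) + 1\bigr), \qquad \#\{\text{weights and units}\} \le c\,\varepsilon^{-d/n}\bigl(\ln(1/\varepsilon) + 1\bigr), $$

where $ c = c(d,n) $ does not depend on $ \varepsilon $.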
In Section 2, the paper describes the ReLU network model and shows that any continuous piecewise linear activation function (with finitely many breakpoints) can be replaced by ReLU at the cost of only a constant-factor increase in network complexity, so the subsequent results transfer between such activations. In Section 3, it establishes upper bounds on the complexity of approximating functions by ReLU networks, showing that deep networks are efficient for approximating smooth functions. It provides specific results for approximating the function $ f(x) = x^2 $ and general smooth functions.
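To illustrate the $ f(x) = x^2 $ construction, here is a minimal NumPy sketch of the underlying "sawtooth" composition (the helper names `tooth` and `approx_square` are made up for this sketch; this is an illustration of the idea, not the paper's own code). The key fact is that the piecewise linear interpolant $ f_m $ of $ x^2 $ on a grid of spacing $ 2^{-m} $ can be written as $ f_m(x) = x - \sum_{s=1}^{m} g_s(x)/2^{2s} $, where $ g_s $ is the $ s $-fold composition of a "tooth" function realizable with three ReLU units, and $ |x^2 - f_m(x)| \le 2^{-2m-2} $ on $ [0,1] $:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def tooth(x):
    # Hat ("tooth") function g on [0, 1]: g(x) = 2x for x <= 1/2, 2(1 - x) for x >= 1/2.
    # Realized exactly with three ReLU units.
    return 2 * relu(x) - 4 * relu(x - 0.5) + 2 * relu(x - 1.0)

def approx_square(x, m):
    # f_m(x) = x - sum_{s=1}^{m} g_s(x) / 2^(2s), where g_s is the s-fold
    # composition of the tooth function.  On [0, 1] this is the piecewise linear
    # interpolant of x^2 on a grid of spacing 2^(-m), so |x^2 - f_m(x)| <= 2^(-2m-2).
    x = np.asarray(x, dtype=float)
    out = x.copy()
    g = x
    for s in range(1, m + 1):
        g = tooth(g)            # g_s = g composed with g_{s-1}
        out = out - g / 4.0**s
    return out

if __name__ == "__main__":
    xs = np.linspace(0.0, 1.0, 1001)
    for m in (2, 4, 6, 8):
        err = np.max(np.abs(xs**2 - approx_square(xs, m)))
        print(f"m={m}: max error {err:.2e} (bound {2.0**(-2*m - 2):.2e})")
```

Because each additional tooth composition costs only a constant number of units while halving the grid spacing, accuracy $ \varepsilon $ is reached with depth and size only $ O(\ln(1/\varepsilon)) $; the paper then uses this squaring gadget to build approximate multiplication and, from there, approximants of general smooth functions.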
In Section 4, the paper obtains lower bounds on the complexity of approximation by deep and shallow ReLU networks, using several approaches under different assumptions on how the network weights may depend on the approximated function. It shows that for certain functions, the complexity of approximation by deep networks is significantly lower than that of shallow networks. The paper also compares the efficiency of fixed (non-adaptive) and adaptive architectures.
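For orientation, and again stated loosely with shorthand introduced here rather than the paper's notation, writing $ W_{\mathrm{cont}}(\varepsilon) $ for the minimal number of weights of an architecture whose weights are selected continuously in the approximated function, and $ W(\varepsilon) $ for the minimum without that continuity assumption, the lower bounds for the Sobolev unit ball $ F_{d,n} $ take roughly the form

$$ W_{\mathrm{cont}}(\varepsilon) \ge c\,\varepsilon^{-d/n}, \qquad W(\varepsilon) \ge c\,\varepsilon^{-d/(2n)}, $$

the first essentially matching the upper bound up to a logarithmic factor, the second obtained from VC-dimension-type arguments; the exact statements and constants are in the paper.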
The paper concludes by discussing the obtained bounds and their implications, particularly comparing deep vs. shallow networks and fixed vs. adaptive architectures. It also notes that the results are related to previous work, with some overlap in the analysis of smooth functions and approximation bounds.