On the hardness of learning under symmetries

3 Jan 2024 | Bobak T. Kiani, Thien Le, Hannah Lawrence, Stefanie Jegelka, Melanie Weber
This paper investigates the computational hardness of learning equivariant neural networks via gradient descent. While incorporating known symmetries into neural networks has empirically improved learning performance, theoretical work shows that learning even shallow, non-symmetric networks is exponentially hard in the correlational statistical query (CSQ) model, which includes gradient descent. The authors ask whether known symmetries can alleviate this hardness, and answer in the negative: learning several natural classes of equivariant networks remains hard even when the symmetry is built into the architecture.

The paper establishes lower bounds for a range of symmetric network classes, including graph neural networks, convolutional networks, invariant polynomials, and frame-averaged networks, showing that their learning complexity scales superpolynomially or exponentially with the input dimension. These results indicate that symmetry alone does not make learning via gradient descent efficient. The authors further show that learning certain invariant functions is NP-hard, and that even simple message passing layers in graph neural networks can generate exponentially hard function classes. Conversely, certain invariant polynomials can be learned efficiently by a suitably modified algorithm, though not via gradient descent. The paper also examines the relationship between the SQ and CSQ learning models, exhibiting invariant function classes that SQ algorithms learn efficiently while CSQ algorithms require exponentially many queries; this underscores the importance of the structure of the function class when analyzing learning complexity.

Finally, experiments confirm that the proposed hard classes of invariant functions are indeed difficult to learn in practice, even with overparameterized networks. Together, the results suggest that further assumptions or structural constraints, beyond symmetry, are needed for efficient learning of symmetric networks.
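For context on the two query models discussed above, the standard definitions (stated here in generic form, not in the paper's exact notation) are as follows. A statistical query (SQ) algorithm may submit any bounded query function φ(x, y) and receives an answer within tolerance τ of E_{x~D}[φ(x, f(x))], where f is the unknown target and D the input distribution. A correlational SQ (CSQ) algorithm is restricted to queries of the form E_{x~D}[φ(x) · f(x)], i.e. correlations of a chosen function with the target. Gradients of the squared loss decompose into such correlational terms plus target-independent terms, which is why noisy gradient descent fits within the CSQ framework and inherits CSQ lower bounds.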
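To make concrete what a "simple message passing layer" looks like, the following is a minimal sketch of one sum-aggregation layer with an invariant sum-pooling readout, written with NumPy. It is a generic illustration of the kind of permutation-equivariant architecture the hardness results concern, not the authors' specific hard construction; the weight shapes, ReLU nonlinearity, and sum pooling are assumptions made for the example.

    # Minimal, generic message-passing layer on a graph with adjacency matrix A
    # (n x n) and node features X (n x d). Illustrative sketch only.
    import numpy as np

    def message_passing_layer(A, X, W_self, W_neigh):
        """One round of sum-aggregation message passing.

        Each node combines its own features with the sum of its neighbors'
        features, followed by a pointwise nonlinearity. Permuting the node
        ordering (rows of X, rows/columns of A) permutes the output rows the
        same way, so the layer is permutation-equivariant.
        """
        messages = A @ X                       # sum of neighbor features per node
        H = X @ W_self + messages @ W_neigh    # linear mix of self and neighbor terms
        return np.maximum(H, 0.0)              # ReLU nonlinearity

    def invariant_readout(H):
        """Sum-pool over nodes: a permutation-invariant graph-level output."""
        return H.sum(axis=0)

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        n, d, d_out = 5, 3, 4
        A = rng.integers(0, 2, size=(n, n))
        A = np.triu(A, 1); A = A + A.T         # symmetric adjacency, no self-loops
        X = rng.normal(size=(n, d))
        W_self = rng.normal(size=(d, d_out))
        W_neigh = rng.normal(size=(d, d_out))

        out = invariant_readout(message_passing_layer(A, X, W_self, W_neigh))

        # Check invariance under a random relabeling of the nodes.
        perm = rng.permutation(n)
        A_p, X_p = A[perm][:, perm], X[perm]
        out_p = invariant_readout(message_passing_layer(A_p, X_p, W_self, W_neigh))
        assert np.allclose(out, out_p)
        print(out)

The assertion at the end checks the defining property: relabeling the nodes of the input graph leaves the pooled output unchanged. The hardness results say that even target functions computed by such elementary layers can require superpolynomially many correlational queries to learn.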