Long Beach, California, PMLR 97, 2019 | Simon Kornblith, Mohammad Norouzi, Honglak Lee, Geoffrey Hinton
This paper revisits the problem of measuring similarity between neural network representations. The authors examine methods based on canonical correlation analysis (CCA) and show that CCA, along with other statistics invariant to invertible linear transformations, cannot measure meaningful similarities between representations of higher dimension than the number of data points. They introduce a similarity index that measures the relationship between representational similarity matrices and does not suffer from this limitation; this index is equivalent to centered kernel alignment (CKA) and is closely connected to CCA. Unlike CCA, CKA can reliably identify correspondences between representations in networks trained from different initializations.
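To make the definition concrete, the following is a minimal sketch of linear CKA computed directly from two activation matrices, assuming activations are arranged as examples × features; the function name and shapes are illustrative rather than taken from the paper's code.

```python
import numpy as np

def linear_cka(X, Y):
    """Linear CKA between two representation matrices.

    X: (n_examples, p1) activations of one layer
    Y: (n_examples, p2) activations of another layer
    """
    # Center each feature (column) so the implied Gram matrices are centered.
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)

    # Numerator: ||Y^T X||_F^2, i.e. HSIC between the linear Gram matrices
    # XX^T and YY^T up to a constant scaling factor.
    dot_product_similarity = np.linalg.norm(Y.T @ X, 'fro') ** 2

    # Normalization terms: ||X^T X||_F and ||Y^T Y||_F.
    normalization_x = np.linalg.norm(X.T @ X, 'fro')
    normalization_y = np.linalg.norm(Y.T @ Y, 'fro')

    return dot_product_similarity / (normalization_x * normalization_y)
```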
The paper then examines the invariance properties of similarity indexes and their implications for measuring similarity of neural network representations. It argues that a similarity index should be invariant to orthogonal transformations and isotropic scaling, but not to arbitrary invertible linear transformations. The authors show that invariance to invertible linear transformations is too strong a requirement: any similarity index with this invariance gives the same result for every representation of width greater than or equal to the dataset size. They therefore propose CKA, which is invariant only to orthogonal transformations and isotropic scaling, and show that it reliably identifies correspondences between representations in networks trained from different initializations.
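These invariance properties can be checked numerically on toy data. The sketch below (random matrices, illustrative names, same linear CKA as the sketch above) verifies that linear CKA is unchanged by an orthogonal transformation or isotropic scaling of one representation, but generally changes under an arbitrary invertible linear transformation.

```python
import numpy as np

def linear_cka(X, Y):
    # Column-centered linear CKA: ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F).
    X = X - X.mean(0); Y = Y - Y.mean(0)
    return np.linalg.norm(Y.T @ X, 'fro') ** 2 / (
        np.linalg.norm(X.T @ X, 'fro') * np.linalg.norm(Y.T @ Y, 'fro'))

rng = np.random.default_rng(0)
n, p = 50, 10                       # toy sizes: more examples than features
X = rng.standard_normal((n, p))
Y = rng.standard_normal((n, p))
base = linear_cka(X, Y)

# Orthogonal transformation and isotropic scaling leave linear CKA unchanged.
Q, _ = np.linalg.qr(rng.standard_normal((p, p)))   # random orthogonal matrix
print(np.isclose(linear_cka(X @ Q, Y), base))      # True
print(np.isclose(linear_cka(3.0 * X, Y), base))    # True

# A generic invertible (non-orthogonal) transformation generally changes it.
A = rng.standard_normal((p, p)) + 2.0 * np.eye(p)  # almost surely invertible
print(np.isclose(linear_cka(X @ A, Y), base))      # typically False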
The paper also compares CKA with other similarity indexes, including linear regression, CCA, and singular vector CCA (SVCCA), and shows how CKA relates to each of them. Empirically, CKA identifies corresponding layers across different architectures more reliably than these alternatives. The authors further show that CKA can be used to compare networks trained on different datasets, and that it reveals consistent relationships between layers of CNNs trained with different random initializations.
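For comparison with these baselines, one common way to compute the mean squared CCA correlation is from orthonormal bases of the two representations' column spaces. The sketch below assumes centered activations with full column rank; the function name is illustrative.

```python
import numpy as np

def mean_squared_cca_corr(X, Y):
    """Mean squared canonical correlation between X and Y.

    The squared canonical correlations are the squared singular values of
    Qy^T Qx, where Qx and Qy are orthonormal bases of the centered column
    spaces. Assumes X and Y have full column rank.
    """
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)
    Qx, _ = np.linalg.qr(X)   # orthonormal basis for span of X's columns
    Qy, _ = np.linalg.qr(Y)
    # Sum of squared canonical correlations, divided by the smaller width.
    return np.linalg.norm(Qy.T @ Qx, 'fro') ** 2 / min(X.shape[1], Y.shape[1])
```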
The paper concludes that CKA is a useful method for comparing neural network representations and that it provides a more reliable measure of similarity than the alternatives considered. The authors also contribute a unified framework for understanding the space of similarity indexes and an empirical framework for evaluating them. They show that CKA captures the intuitive notion that networks trained from different initializations should learn similar representations. They leave open the question of whether kernels beyond the linear and RBF kernels would be better suited to analyzing neural network representations.
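Since the conclusion contrasts linear and RBF kernels, the following sketch shows CKA computed with an RBF kernel via centered Gram matrices and HSIC. The bandwidth heuristic (a fraction of the median pairwise distance between examples) and the default fraction of 0.5 are illustrative assumptions, not prescribed values.

```python
import numpy as np

def centered_gram(K):
    # Center a Gram matrix: H K H with H = I - (1/n) 11^T.
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def rbf_gram(X, sigma_fraction=0.5):
    # RBF Gram matrix with bandwidth set as a fraction of the median pairwise
    # distance between examples (one common heuristic; the fraction is a
    # tunable assumption, and the zero diagonal is included for simplicity).
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    sigma = sigma_fraction * np.sqrt(np.median(sq_dists))
    return np.exp(-sq_dists / (2 * sigma ** 2))

def kernel_cka(X, Y, sigma_fraction=0.5):
    # CKA = HSIC(K, L) / sqrt(HSIC(K, K) * HSIC(L, L)) on centered Gram matrices.
    K = centered_gram(rbf_gram(X, sigma_fraction))
    L = centered_gram(rbf_gram(Y, sigma_fraction))
    return np.sum(K * L) / np.sqrt(np.sum(K * K) * np.sum(L * L))
```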