Exploring Simple Siamese Representation Learning

20 Nov 2020 | Xinlei Chen, Kaiming He
SimSiam is a simple Siamese network that achieves competitive results in unsupervised visual representation learning without negative sample pairs, large batches, or momentum encoders. The method takes two augmented views of an image, processes both with a shared encoder network, and uses a prediction MLP on one branch to match the output of the other. A stop-gradient operation is critical: it prevents the outputs from collapsing to a constant, which would make the learned representations meaningless. The stop-gradient also allows the method to be interpreted as alternating optimization over two sets of variables, similar to Expectation-Maximization. SimSiam reaches 67.7% top-1 accuracy under linear evaluation on ImageNet, and it transfers well to downstream tasks, showing that simple Siamese architectures can be a core component of successful unsupervised representation learning. The study highlights the importance of the stop-gradient operation and the role of Siamese networks in learning augmentation-invariant representations.
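The core of the method is a symmetrized negative cosine similarity loss in which the target branch is detached from the gradient. The sketch below illustrates this objective in PyTorch-style code, following the paper's description; the small placeholder MLPs for the encoder `f` and predictor `h`, the feature dimensions, and the random stand-in inputs are illustrative assumptions, not the ResNet-50 backbone and MLP sizes used in the paper.

```python
# Minimal sketch of the SimSiam objective: stop-gradient plus symmetrized
# negative cosine similarity. Encoder/predictor sizes are placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

def D(p, z):
    """Negative cosine similarity; z is detached to apply stop-gradient."""
    z = z.detach()                        # stop-gradient on the target branch
    p = F.normalize(p, dim=1)
    z = F.normalize(z, dim=1)
    return -(p * z).sum(dim=1).mean()

class SimSiamSketch(nn.Module):
    def __init__(self, in_dim=512, feat_dim=128):
        super().__init__()
        # Encoder f = backbone + projection MLP (tiny placeholder MLP here).
        self.f = nn.Sequential(nn.Linear(in_dim, feat_dim), nn.ReLU(),
                               nn.Linear(feat_dim, feat_dim))
        # Prediction MLP h, applied to one branch at a time.
        self.h = nn.Sequential(nn.Linear(feat_dim, feat_dim // 2), nn.ReLU(),
                               nn.Linear(feat_dim // 2, feat_dim))

    def forward(self, x1, x2):
        z1, z2 = self.f(x1), self.f(x2)   # shared-encoder outputs of both views
        p1, p2 = self.h(z1), self.h(z2)   # predictor outputs
        # Symmetrized loss: each prediction matches the stop-gradient target
        # from the other branch.
        return D(p1, z2) / 2 + D(p2, z1) / 2

# Usage with random stand-in inputs representing two augmented views.
model = SimSiamSketch()
x1, x2 = torch.randn(8, 512), torch.randn(8, 512)
loss = model(x1, x2)
loss.backward()
```

Because `z.detach()` blocks gradients on the target branch, each step only updates the prediction side toward a fixed target, which is what gives rise to the alternating-optimization interpretation described above.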