Learning with 3D rotations, a hitchhiker’s guide to SO(3)


19 Jun 2024 | A. René Geist, Jonas Frey, Mikel Zhobro, Anna Levina, Georg Martius
This paper provides a comprehensive survey and guide to rotation representations in machine learning, focusing on their properties and impact on deep learning with gradient-based optimization. The authors discuss various rotation representations, including Euler angles, exponential coordinates, axis-angle, quaternions, and high-dimensional representations like $\mathbb{R}^6$+Gram-Schmidt orthonormalization (GSO) and $\mathbb{R}^9$+Singular Value Decomposition (SVD). They highlight the importance of considering the dimensionality and topology of these representations, as well as the challenges posed by discontinuities and double cover properties.

The paper emphasizes that rotation representations with four or fewer dimensions often suffer from discontinuities, which can lead to poor learning performance. High-dimensional representations, such as $\mathbb{R}^9$+SVD, are recommended for better continuity and stability. The authors also explore the impact of rotation representations on the learnability of functions, particularly in the context of rotation estimation and feature prediction tasks.

Experiments are conducted to evaluate the performance of different rotation representations in various regression tasks, including point cloud alignment, cube rotation estimation from images, and 6D object pose estimation from RGB-D images. The results show that high-dimensional representations, especially $\mathbb{R}^9$+SVD, outperform low-dimensional representations in most cases. The paper concludes by discussing the advantages of high-dimensional representations and providing practical recommendations for selecting appropriate rotation representations based on the specific application and data characteristics.
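The double-cover property mentioned above can be seen directly in code: a unit quaternion $q$ and its negation $-q$ encode the same rotation, because every entry of the corresponding rotation matrix is quadratic in the quaternion components. A minimal sketch (the function name `quat_to_matrix` and the sample quaternion are illustrative choices, not from the paper):

```python
import numpy as np

def quat_to_matrix(q):
    """Convert a unit quaternion (w, x, y, z) to a 3x3 rotation matrix."""
    w, x, y, z = q / np.linalg.norm(q)
    return np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
    ])

q = np.array([0.5, 0.5, 0.5, 0.5])  # an arbitrary unit quaternion
# q and -q map to the identical rotation matrix: the 2-to-1 "double cover"
assert np.allclose(quat_to_matrix(q), quat_to_matrix(-q))
```

This ambiguity is one reason a network regressing quaternions can receive conflicting targets for the same rotation unless the training data is lifted consistently to one hemisphere.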
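The two high-dimensional representations recommended above both work by letting the network output an unconstrained Euclidean vector and then projecting it onto SO(3). A hedged sketch of the standard constructions, assuming the common formulations from the literature (function names are illustrative):

```python
import numpy as np

def gso_from_6d(v6):
    """Map an unconstrained 6D vector to a rotation matrix via Gram-Schmidt."""
    a, b = v6[:3], v6[3:]
    e1 = a / np.linalg.norm(a)
    u2 = b - (e1 @ b) * e1          # remove the component of b along e1
    e2 = u2 / np.linalg.norm(u2)
    e3 = np.cross(e1, e2)           # completes a right-handed frame
    return np.stack([e1, e2, e3], axis=1)

def svd_from_9d(v9):
    """Map an unconstrained 9D vector to the nearest rotation via SVD."""
    M = v9.reshape(3, 3)
    U, _, Vt = np.linalg.svd(M)
    # flip the last singular direction if needed so that det(R) = +1
    d = np.sign(np.linalg.det(U @ Vt))
    return U @ np.diag([1.0, 1.0, d]) @ Vt

rng = np.random.default_rng(0)
for R in (gso_from_6d(rng.normal(size=6)), svd_from_9d(rng.normal(size=9))):
    assert np.allclose(R.T @ R, np.eye(3))        # orthonormal
    assert np.isclose(np.linalg.det(R), 1.0)      # proper rotation
```

Because both maps are smooth on (almost all of) their input space and surjective onto SO(3), gradients flow without the jumps that plague Euler angles or quaternions at representation boundaries.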