Understanding Fourier Transporter: Bi-Equivariant Robotic Manipulation in 3D

Fourier Transporter (FOURTRAN) is a method for robotic pick-and-place manipulation that exploits the SE(3) × SE(3) symmetry of the problem to achieve high sample efficiency. It is an open-loop behavior cloning method: it is trained on expert demonstrations and predicts pick-place actions for new configurations. The architecture is constrained by the symmetries of the pick and place actions independently, and it uses a fiber space Fourier transformation to keep computation memory-efficient. On the RLBench benchmark it achieves state-of-the-art results across various tasks.

The paper models SE(3) bi-equivariance using 3D convolutions and a Fourier representation of rotations. Unlike existing methods, FOURTRAN encodes SO(3) bi-equivariance inside a general-purpose policy learning model rather than relying on point descriptors. The key innovation is to parameterize action distributions over SO(3) in the Fourier domain, as coefficients of Wigner D-matrix entries. This representation is embedded within a 3D translational convolution, which makes it possible to convolve directly over SE(3) without excessive computational cost and with minimal memory requirements. The result is an imitation learning policy with high sample efficiency and high angular resolution that can outperform existing SE(3) methods by significant margins.

The authors list three contributions: (1) they analyze problems with bi-equivariant symmetry and provide a general theoretical solution that leverages the coupled symmetries; (2) they propose Fourier Transporter (FOURTRAN), which leverages bi-equivariant structure in pick-place manipulation problems in 2D and 3D; (3) they achieve state-of-the-art performance on several RLBench tasks. The method is evaluated on 3D and 2D pick-place tasks, where it achieves significant improvements in sample efficiency and success rate. The paper concludes that FOURTRAN is a powerful architecture for pick-place tasks, with potential applications beyond robotic manipulation.
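To make the Fourier-domain parameterization more concrete, the following is a minimal sketch of the planar analogue, not the paper's implementation. It assumes rotations in SO(2), where complex exponentials e^{ikθ} play the role that Wigner D-matrix entries play on SO(3). All function names (`make_real_coeffs`, `evaluate`, `rotate_coeffs`, `angle_distribution`) and the softmax normalization are illustrative assumptions. The sketch shows two properties the summary relies on: a small set of coefficients can be evaluated at any angular resolution, and rotating the underlying function only changes the phase of each coefficient.

```python
import numpy as np


def make_real_coeffs(max_freq, rng):
    """Random coefficients c_k for k = -K..K with c_{-k} = conj(c_k), so f is real."""
    pos = rng.normal(size=max_freq) + 1j * rng.normal(size=max_freq)
    c0 = np.array([rng.normal()], dtype=complex)
    return np.concatenate([np.conj(pos[::-1]), c0, pos])


def grid(n_angles):
    """Evenly spaced angles on [0, 2*pi)."""
    return np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)


def evaluate(coeffs, angles):
    """Inverse Fourier transform: f(theta) = sum_k c_k * exp(i * k * theta)."""
    max_freq = (len(coeffs) - 1) // 2
    ks = np.arange(-max_freq, max_freq + 1)
    basis = np.exp(1j * np.outer(angles, ks))  # shape (n_angles, 2K + 1)
    return np.real(basis @ coeffs)


def rotate_coeffs(coeffs, phi):
    """Coefficients of the rotated function theta -> f(theta - phi)."""
    max_freq = (len(coeffs) - 1) // 2
    ks = np.arange(-max_freq, max_freq + 1)
    return coeffs * np.exp(-1j * ks * phi)


def angle_distribution(coeffs, n_angles):
    """Softmax over sampled angles (an assumed way to normalize the scores)."""
    logits = evaluate(coeffs, grid(n_angles))
    weights = np.exp(logits - logits.max())
    return weights / weights.sum()


rng = np.random.default_rng(0)
coeffs = make_real_coeffs(4, rng)        # 9 complex numbers describe the function

coarse = evaluate(coeffs, grid(36))      # 10-degree resolution
fine = evaluate(coeffs, grid(360))       # 1-degree resolution, same coefficients

shift = 30                               # rotate by 30 grid steps (30 degrees)
rotated = evaluate(rotate_coeffs(coeffs, grid(360)[shift]), grid(360))
```

In this sketch the stored representation has 2K + 1 coefficients regardless of how finely the angle is later sampled, which is the sense in which a Fourier parameterization can give high angular resolution at low memory cost. Rotating by 30 grid steps is equivalent to circularly shifting the sampled values, so `rotated` matches `np.roll(fine, shift)`.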

FOURIER TRANSPORTER: BI-EQUIVARIANT ROBOTIC MANIPULATION IN 3D

2024 | Haojie Huang, Owen L. Howell, Dian Wang, Xupeng Zhu*, Robert Platt†, Robin Walters†
