Open-TeleVision: Teleoperation with Immersive Active Visual Feedback

8 Jul 2024 | Xuxin Cheng*1, Jialong Li*1, Shiqi Yang1, Ge Yang2, Xiaolong Wang1
The paper introduces Open-TeleVision, an immersive teleoperation system designed to collect high-quality, diverse, and scalable data for robot learning from demonstrations. The system lets operators perceive the robot's surroundings stereoscopically while the robot mirrors their arm and hand movements, creating an immersive experience. Its effectiveness is validated on four long-horizon, precise tasks (Can Sorting, Can Insertion, Folding, and Unloading) using two humanoid robots (Unitree H1 and Fourier GR-1). Key contributions include active visual feedback, which provides intuitive spatial perception and reduces occlusions, and the ability to control both multi-finger dexterous hands and grippers.
The paper also details the system's architecture, including a web server, VR devices, and motion retargeting algorithms. Experiments demonstrate that data collected with the system supports imitation learning: the trained policies successfully perform complex tasks in real-world settings. The system is open-sourced and enables remote control over long distances, as shown in a cross-country teleoperation experiment.
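The paper's own retargeting algorithm is not reproduced here, but the core idea behind hand retargeting in such pipelines can be sketched: express tracked human fingertip positions relative to the wrist and scale them to match the robot hand's proportions. This is a minimal illustrative sketch; the function name and scale factor are assumptions, not the paper's implementation.

```python
import numpy as np

def retarget_fingertips(human_tips, human_wrist, scale=1.2):
    """Map tracked human fingertip positions into the robot hand's frame.

    Expresses each fingertip relative to the wrist and applies a uniform
    scale to account for the size difference between the human and robot
    hand. The scale value here is purely illustrative.
    """
    human_tips = np.asarray(human_tips, dtype=float)
    human_wrist = np.asarray(human_wrist, dtype=float)
    # Wrist-relative vectors, scaled to the robot hand's proportions.
    return scale * (human_tips - human_wrist)

# Example: two tracked fingertips, wrist at the origin of the hand frame.
tips = [[0.10, 0.02, 0.00], [0.08, -0.02, 0.01]]
wrist = [0.0, 0.0, 0.0]
robot_tips = retarget_fingertips(tips, wrist)
```

The resulting robot-frame targets would then be fed to an inverse-kinematics or optimization step to produce joint commands for the dexterous hand.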