OmniH2O is a learning-based system for whole-body humanoid teleoperation and autonomy. Using kinematic pose as a universal control interface, it enables precise manipulation and locomotion from diverse input sources: real-time teleoperation through a VR headset or an RGB camera, verbal instructions, and autonomous agents such as GPT-4o or policies learned from teleoperated demonstrations. The system demonstrates its versatility in tasks such as sports, object manipulation, and human interaction.

A key contribution is a sim-to-real pipeline that combines large-scale motion retargeting, imitation of sparse sensor input, and reward design for robustness. The universal interface also makes demonstration collection scalable: the system releases the first humanoid whole-body control dataset, OmniH2O-6, containing six everyday tasks, and uses such data to learn autonomous policies through imitation.

The control policy is trained with a teacher-student framework: a teacher policy with access to privileged information is distilled into a student policy restricted to the limited state space observable on the real robot.
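The paper's exact networks and observations are not reproduced here; the following is a minimal PyTorch sketch of privileged teacher-to-student distillation, in which all dimensions, helper names, and the environment interface are illustrative assumptions. A DAgger-style loop, one common way to perform this distillation, rolls out the student and labels the visited states with the teacher's actions.

```python
# Minimal DAgger-style distillation sketch (PyTorch). All names, dimensions,
# and the environment interface are hypothetical, not OmniH2O's actual code.
import torch
import torch.nn as nn

PRIV_DIM, STUDENT_DIM, ACT_DIM = 256, 64, 19  # hypothetical sizes

def mlp(in_dim, out_dim):
    return nn.Sequential(nn.Linear(in_dim, 512), nn.ELU(),
                         nn.Linear(512, 256), nn.ELU(),
                         nn.Linear(256, out_dim))

teacher = mlp(PRIV_DIM, ACT_DIM)     # stands in for an RL-trained privileged teacher
student = mlp(STUDENT_DIM, ACT_DIM)  # sees only sensors available on the real robot
opt = torch.optim.Adam(student.parameters(), lr=1e-4)

def distill_step(env):
    """One DAgger step: roll out the student, label visited states with the teacher."""
    priv_obs, student_obs = env.observe()          # hypothetical env API
    with torch.no_grad():
        target_action = teacher(priv_obs)          # privileged action label
    pred_action = student(student_obs)
    loss = nn.functional.mse_loss(pred_action, target_action)
    opt.zero_grad(); loss.backward(); opt.step()
    env.step(pred_action.detach())                 # the student's action drives the rollout
    return loss.item()

class DummyEnv:
    """Stand-in environment so the sketch runs end to end."""
    def observe(self):
        return torch.randn(32, PRIV_DIM), torch.randn(32, STUDENT_DIM)
    def step(self, action):
        pass

if __name__ == "__main__":
    env = DummyEnv()
    for i in range(3):
        print(f"step {i}: loss={distill_step(env):.4f}")
```

Because the rollout is driven by the student rather than the teacher, the student is supervised on the states it will actually encounter, which mitigates the covariate shift that plain behavior cloning of the teacher would suffer.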
In both simulation and the real world, the system shows strong motion tracking and remains robust to external disturbances, such as human strikes, and to outdoor terrains. Human control is demonstrated through the universal interfaces, including language-based motion generation, and autonomy is demonstrated both through frontier models such as GPT-4o and through imitation policies trained on the teleoperated dataset.

In summary, the contributions are a universal control interface, scalable demonstration collection, and humanoid autonomy via frontier models or learning from demonstrations. Limitations include the need for robot root odometry and safety concerns under extreme disturbances; future work includes improving humanoid learning from demonstrations with additional sensors and better algorithms.
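To make the demonstration-learning path concrete, below is a minimal behavior-cloning sketch in PyTorch. The dataset layout, all dimensions, and the choice of predicting kinematic pose goals for the low-level tracking policy are illustrative assumptions, not the actual OmniH2O-6 schema or training code.

```python
# Minimal behavior-cloning sketch for learning an autonomous high-level policy
# from teleoperated demonstrations. The (observation, kinematic pose goal)
# layout and all dimensions are hypothetical, not the OmniH2O-6 schema.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

OBS_DIM, GOAL_DIM = 64, 9  # hypothetical: robot state -> head + two-hand targets

# Synthetic stand-in for logged demonstration pairs.
demos = TensorDataset(torch.randn(1024, OBS_DIM), torch.randn(1024, GOAL_DIM))
loader = DataLoader(demos, batch_size=64, shuffle=True)

policy = nn.Sequential(nn.Linear(OBS_DIM, 256), nn.ReLU(),
                       nn.Linear(256, 256), nn.ReLU(),
                       nn.Linear(256, GOAL_DIM))
opt = torch.optim.Adam(policy.parameters(), lr=3e-4)

for epoch in range(5):
    for obs, goal in loader:
        loss = nn.functional.mse_loss(policy(obs), goal)
        opt.zero_grad(); loss.backward(); opt.step()

# The trained policy emits kinematic pose goals that the low-level whole-body
# controller tracks, so autonomy reuses the same universal interface as
# teleoperation.
```

The design choice this illustrates is that autonomy and teleoperation share one interface: a learned (or frontier-model) planner only has to produce kinematic pose goals, and the distilled whole-body controller handles execution.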