30 Apr 2024 | Bart van Marum, Aayam Shrestha, Helei Duan, Pranay Dugar, Jeremy Dao, Alan Fern
This paper addresses the challenge of designing robust standing and walking (SaW) controllers for humanoid robots, focusing on the evaluation and comparison of different reward functions. The authors propose a low-cost, quantitative benchmarking method to assess real-world performance metrics such as command following, disturbance recovery, and energy efficiency. They also revisit reward function design, constructing a minimally constraining reward function to train SaW controllers. The benchmarking framework is validated through experiments on the Digit humanoid robot, revealing areas for improvement and providing clear trade-offs among different controllers. The results highlight the importance of systematic evaluation in advancing SaW control learning and suggest directions for future improvements, including the need for more flexible reward functions and expanded benchmarks.