Learning Whole-Body Loco-Manipulation for Omni-Directional Task Space Pose Tracking with Wheeled-Quadrupedal-Manipulator

¹Kaiwen Jiang*, ¹Zhen Fu*, ¹Junde Guo, ¹,³Wei Zhang, ²,³Hua Chen

¹Southern University of Science and Technology, ²Zhejiang University-University of Illinois Urbana-Champaign Institute, ³LimX Dynamics
*Indicates equal contribution

Abstract

In this paper, we study the whole-body loco-manipulation problem using a Wheeled Quadrupedal Manipulator (WQM) platform. We focus on coordinating the floating base and robotic arm to achieve direct six-dimensional (6D) end-effector pose tracking in task space. This problem requires inherent balance among redundant degrees of freedom in whole-body motion. To address this challenge, we develop a novel Reward Fusion Module (RFM) that systematically integrates various reward terms in a nonlinear manner, accommodating the multi-stage and hierarchical nature of loco-manipulation. By combining our RFM with a teacher-student reinforcement learning paradigm, we present a complete scheme for 6D end-effector pose tracking. Extensive experiments, both in simulation and on hardware, demonstrate smooth and precise tracking performance, achieving state-of-the-art position errors below 5 cm and rotation errors under 0.1 rad.
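The reported accuracy (position error below 5 cm, rotation error under 0.1 rad) can be made concrete with a small sketch of how 6D pose tracking error is commonly measured. This is an illustration only: it assumes the position error is the Euclidean distance between target and actual end-effector positions and the rotation error is the geodesic angle between the two rotation matrices. The page does not state the exact metric used, and the function names and sample values below are hypothetical.

```python
import math


def position_error(p_target, p_actual):
    """Euclidean distance between two 3D points (same unit as the inputs)."""
    return math.dist(p_target, p_actual)


def rotation_error(R_target, R_actual):
    """Geodesic angle in radians between two 3x3 rotation matrices (nested lists)."""
    # trace(R_target^T @ R_actual) equals the sum of elementwise products.
    trace = sum(R_target[i][j] * R_actual[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.acos(cos_angle)


def rot_z(theta):
    """Rotation about the z-axis by theta radians (helper for the example)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


# Hypothetical sample: target vs. measured end-effector pose (metres, radians).
pos_err = position_error([1.0, 0.5, 0.8], [1.03, 0.5, 0.84])
rot_err = rotation_error(rot_z(0.0), rot_z(0.08))
print(f"position error: {pos_err:.3f} m, rotation error: {rot_err:.3f} rad")
```

Under this assumed metric, a pose would meet the stated thresholds when `pos_err < 0.05` and `rot_err < 0.1`.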

Framework

The following figure illustrates the training paradigm for the whole-body policy using the Reward Fusion Module (RFM). The RFM integrates the various reward terms into a single scalar total reward \(R_t \in \mathbb{R}\). This reward is then used by the generalized advantage estimator (GAE) for actor training and to compute the TD error for critic training. The unified policy takes three inputs: the user-specified 6D target pose command \((R, p)\) in the world frame, the latent \(\hat{z}\) encoded by the privilege estimator, and the proprioceptive states. It outputs the whole-body action distribution used for PPO training.
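To show how a scalar total reward feeds both the actor and the critic, here is a minimal sketch of the standard GAE recursion. It is not the authors' implementation: the function name `compute_gae`, the default `gamma` and `lam` values, and the omission of episode-termination handling are all assumptions made for brevity.

```python
def compute_gae(rewards, values, last_value, gamma=0.99, lam=0.95):
    """Generalized advantage estimation over one rollout without terminations.

    rewards[t]  total reward R_t at step t (e.g. the fused reward)
    values[t]   critic estimate V(s_t)
    last_value  critic estimate for the state after the final step
    Returns (advantages, returns); returns serve as critic regression targets.
    """
    advantages = [0.0] * len(rewards)
    running = 0.0
    next_value = last_value
    for t in reversed(range(len(rewards))):
        # One-step TD error: delta_t = R_t + gamma * V(s_{t+1}) - V(s_t)
        delta = rewards[t] + gamma * next_value - values[t]
        # Exponentially weighted sum of future TD errors
        running = delta + gamma * lam * running
        advantages[t] = running
        next_value = values[t]
    returns = [a + v for a, v in zip(advantages, values)]
    return advantages, returns


# Toy rollout with hypothetical numbers
adv, ret = compute_gae([1.0, 1.0, 1.0], [0.0, 0.0, 0.0], last_value=0.0,
                       gamma=1.0, lam=1.0)
print(adv, ret)
```

In PPO, the advantages weight the clipped policy-gradient objective for the actor, while the returns are the regression targets for the critic.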

Whole-body 6D Tracking

Our WQM platform tracks a variety of 6D end-effector poses using coordinated whole-body motion and handles long-distance locomotion tasks.

Elevator Operation

Our WQM platform precisely pushes small buttons and enters elevators smoothly, demonstrating fine control.

Delicate Manipulation

Our WQM platform performs delicate operations, such as picking up cups and disposing of them together.

Whiteboard Writing

Our WQM platform writes and draws on whiteboards, producing text such as "SUSTech", "ZJUI", and "LimX" as well as geometric shapes.

Outdoor Loco-Manipulation

Our WQM platform navigates through complex outdoor environments, managing extended loco-manipulation tasks effectively.

Obstacle Avoidance

Our WQM platform demonstrates advanced obstacle avoidance while maintaining precise manipulation capabilities.

BibTeX

@article{jiang2024learning,
      title     = {Learning Whole-Body Loco-Manipulation for Omni-Directional Task Space Pose Tracking with Wheeled-Quadrupedal-Manipulator},
      author    = {Jiang, Kaiwen and Fu, Zhen and Guo, Junde and Zhang, Wei and Chen, Hua},
      journal   = {IEEE Robotics and Automation Letters},
      year      = {2024}
}