Robotics

EHC-MM: Embodied Holistic Control for Mobile Manipulation

By Yixiang Jin Samsung R&D Institute China-Beijing

By Dingzhe Li Samsung R&D Institute China-Beijing

1 Introduction

Mobile manipulation is expected to play a significant role in industrial and domestic applications. However, in many practical robot applications, the mobile base and the robot arm are planned separately. This separation leads to unnatural motion and can miss the optimal solution.

Figure 1. Overview of proposed method EHC-MM

To address these challenges, we propose Embodied Holistic Control for Mobile Manipulation (EHC-MM), as shown in Figure 1. We formulate the DMCG problem in mobile manipulation as a quadratic programming (QP) problem that incorporates both obstacle avoidance and joint angle constraints, which enables efficient simultaneous planning of all robot joints. We design the function sig(ω) to achieve embodied control. Specifically, sig(ω) accounts for both the reachability of the robot’s joints and the task objectives, allowing the robot to balance movement and manipulation. Additionally, we develop a Monitoring-Position-Based Servoing (MPBS) function to prevent the robot from losing track of the target during operation.
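For orientation, a QP with inequality and bound constraints has the generic form below. This is the textbook form, not the specific EHC-MM cost. The symbols x (taken here to be the stacked velocities of the base and arm joints), Q, c, A, b, and the bounds x⁻ and x⁺ are illustrative assumptions. In a formulation of this kind, obstacle avoidance would typically enter through the rows of A x ≤ b and joint limits through the bounds.

```latex
\begin{aligned}
\min_{x}\quad & \tfrac{1}{2}\, x^{\top} Q\, x + c^{\top} x \\
\text{subject to}\quad & A x \le b, \\
& x^{-} \le x \le x^{+}
\end{aligned}
```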

2 Method

We present an architectural overview in Figure 1. We divide the mobile manipulation system into four parts: Real Robot, Target Perception, Motion Planning, and Controller, which are common to most robot frameworks. Our enhancements lie primarily within the blue rectangular box in Figure 1.

2.1 Kinematic Model and Control of the Mobile Manipulation

According to the chain rule in robot kinematics, the pose of the robot's end-effector can be represented as:

(_e^o)T = (_v^o)T(x, y, α) · (_b^v)T · (_a^b)T · (_e^a)T

where {o} is the world reference frame. The matrix (_v^o)T denotes the transformation from the virtual base frame {v} to the world frame, through two perpendicular prismatic joints and a revolute joint. (x, y, α) represents the mobile base's position and orientation in the 2-D plane. (_b^v)T represents a constant transformation from the mobile base frame {b} to the virtual base frame on the robot. (_a^b)T is a constant relative transformation from the base of the manipulator arm {a} to the mobile base frame, and (_e^a)T represents the forward kinematics of the arm, where the end-effector frame is {e}.
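The chain of transformations described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the helper names, the numeric offsets, and the use of a fixed translation as a stand-in for the arm's forward kinematics are all assumptions made for the example.

```python
import math


def matmul(a, b):
    """Multiply two 4x4 homogeneous transforms stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def translation(x, y, z):
    """Pure translation as a 4x4 homogeneous transform."""
    return [[1.0, 0.0, 0.0, x],
            [0.0, 1.0, 0.0, y],
            [0.0, 0.0, 1.0, z],
            [0.0, 0.0, 0.0, 1.0]]


def rot_z(angle):
    """Rotation about the z axis as a 4x4 homogeneous transform."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0, 0.0],
            [s, c, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def virtual_base(x, y, alpha):
    """Virtual base in the world frame: two prismatic joints, then one revolute joint."""
    return matmul(translation(x, y, 0.0), rot_z(alpha))


def end_effector_pose(x, y, alpha, T_vb, T_ba, T_ae):
    """Compose the full chain: world <- virtual base <- base <- arm base <- end-effector."""
    T = virtual_base(x, y, alpha)
    for step in (T_vb, T_ba, T_ae):
        T = matmul(T, step)
    return T


# Hypothetical values, chosen only for illustration.
T_vb = translation(0.0, 0.0, 0.0)   # identity: base frame coincides with the virtual frame
T_ba = translation(0.2, 0.0, 0.5)   # assumed arm mounting offset on the base
T_ae = translation(0.4, 0.0, 0.3)   # stand-in for the arm's forward kinematics

pose = end_effector_pose(1.0, 2.0, math.pi / 2, T_vb, T_ba, T_ae)
position = [round(pose[i][3], 6) for i in range(3)]
print(position)
```

With these hypothetical values the base sits at (1, 2) and is rotated by 90 degrees, so the 0.6 m forward offset of the arm chain lands along the world y axis and the end-effector ends up near (1.0, 2.6, 0.8).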

2.2 EHC Function

For mobile manipulation, the accuracy and motion capabilities of the joints vary. Robot arms are generally more accurate but have a limited workspace and higher computational complexity. The mobile base extends the operational workspace significantly, but its motion is generally less accurate than that of the arm. Hence, we design the Embodied Holistic Control (EHC) function, as shown below:

where

sig(ω) dynamically balances the emphasis between movement and manipulation, allowing the robot to switch its focus between efficient movement and precise manipulation as required by the target pose. In particular, e(t) ∈ SE(3) represents the end-effector’s pose relative to the target obtained from perception. rtd represents the set of end-effector poses with reachability greater than 95%, while e∗(r, e(t)) denotes the threshold pose that is most closely aligned with the direction of the current target error e(t), serving as a threshold for motion focus under the current task objective. In addition, R(·) represents the reachability of the robot arm’s end-effector. The activation function sig(·) is incorporated into the robot control function to amplify the emphasis on the specific motion, which improves the overall efficiency of motion planning. Jm is the manipulability Jacobian.
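To make the balancing idea concrete, the sketch below shows how a sigmoid can turn the gap between the current error and a reachability threshold into a pair of weights. It is a simplified illustration under loud assumptions: the error and the threshold pose are reduced to scalar distances, ω is taken to be a gain times their difference, and the names `motion_emphasis` and `gain` are hypothetical. The actual definition of ω in EHC-MM is not reproduced here.

```python
import math


def sig(omega):
    """Logistic sigmoid, mapping any real omega into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-omega))


def motion_emphasis(error_norm, threshold_norm, gain=8.0):
    """Return (base_weight, arm_weight) for a scalar distance to the target.

    Assumption: omega = gain * (error_norm - threshold_norm), so the weight
    shifts toward the base when the target is beyond the threshold distance
    and toward the arm when it is inside.
    """
    omega = gain * (error_norm - threshold_norm)
    base_weight = sig(omega)
    return base_weight, 1.0 - base_weight


# Hypothetical distances in metres.
far = motion_emphasis(2.0, 0.6)           # target well beyond the threshold
near = motion_emphasis(0.1, 0.6)          # target well inside the threshold
at_threshold = motion_emphasis(0.6, 0.6)  # exactly on the threshold

print(far, near, at_threshold)
```

In this toy version the base dominates when the target is far, the arm dominates when it is near, and the two are weighted equally exactly at the threshold. The smooth transition avoids an abrupt switch between the two behaviours.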

2.3 Monitoring-Position-Based Servoing

Position-Based Servoing (PBS) methods face the challenge of target loss during the reaching process, which is particularly detrimental for reactive tasks. During mobile manipulation, when the grasping orientation is not aligned with the forward-facing direction (for example, downward or sideways), the end-effector of the robot arm changes its orientation prematurely during the reaching phase, and the robot loses visual contact with the target.

We use a controller that builds upon PBS and adds Monitoring-Position-Based Servoing (MPBS). It is designed to run in parallel with the EHC function and dynamically balances the emphasis between monitoring and manipulation, based on sig(ω).

3 Results

3.1 Randomly Reaching

Experimental Protocol. We conduct this performance test in simulation. We add different levels of noise to the various joints, as shown in Table 1. In this task, the robot is asked to reach 50 random points in sequence.

Table 1. The Magnitude of Noise Added to Each Joint

Results and Analysis. We compared NEO(c), NEO(e) (Haviland and Corke [1]), and EHC. In each set of experiments, the three approaches each reached for the same 50 random points. We conducted 10 sets of experiments, so each method attempted 500 random points in total. The results are displayed in Table 2. We find that our approach aligns with the DMCG principle, prioritizing the mobile base for moving at a distance and the robot arm for manipulation at close range. Additionally, because the points are randomly generated, some target poses may be challenging to reach. Hence, we set a maximum time limit of 30 seconds for each attempt; an attempt that exceeds it is counted as a failure. Our method's ability to coordinate movement and manipulation enables the robot to reach the desired target pose in a shorter time, resulting in fewer failures.

Table 2. Randomly Reaching in Simulation

Table 3. Sequentially Grasping Multiple Objects in Real-World Experiments

3.2 Sequentially Grasping Multiple Objects

Experimental Protocol. In the real-world experiment, the robot starts from a fixed location and sequentially grasps three objects at different distances on a table 1.5 meters away. The three objects are placed at the front, middle, and back of the table. The reported time is the average time from the start of the robot’s movement to the completion of the grasp, considering only successful trials.

Results and Analysis. We compare TSMM, NEO(c), NEO(e), and EHC. We conduct 15 sets of experiments, so each method attempts 45 grasps. The results are displayed in Table 3, where NAN denotes that the value was not measured: for TSMM, measuring joint changes is meaningless because it consists of two separate stages. For all methods except TSMM, we set a maximum grasping time of 30 seconds for each object. The real-world results align with those in simulation: EHC achieves shorter grasping times and higher success rates. We observe that NEO(c) and NEO(e) still frequently perform major base movements even when close to the target. Due to the limited precision of base movements, the robot has difficulty reaching the target smoothly, which sometimes leads to failure. As for TSMM, the robot frequently fails to grasp the most distant object because the table obstructs it. Thus, EHC performs well in real-world scenarios.
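The scoring rule used in these experiments (a 30-second limit per attempt, with times averaged over successful trials only) can be written down as a small helper. This is a sketch of the bookkeeping, not the authors' evaluation script. The function name `summarize`, the use of `None` for an attempt that never completed, and the sample timings are assumptions made for illustration.

```python
from statistics import mean

TIME_LIMIT_S = 30.0  # maximum time allowed per attempt, as stated in the protocol


def summarize(times_s):
    """Summarize one method's attempts.

    times_s: elapsed time in seconds for each attempt; None marks an attempt
    that never completed. Attempts over the time limit count as failures.
    """
    successes = [t for t in times_s if t is not None and t <= TIME_LIMIT_S]
    return {
        "attempts": len(times_s),
        "failures": len(times_s) - len(successes),
        "success_rate": len(successes) / len(times_s) if times_s else 0.0,
        # Average over successful trials only, matching the protocol.
        "mean_time_s": mean(successes) if successes else None,
    }


# Made-up timings, not measurements from the paper.
example = [12.0, 18.5, None, 31.0, 9.5]
report = summarize(example)
print(report)
```

Averaging only over successful trials keeps timeouts from inflating the reported time, so the time and the success rate should be read together rather than separately.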

3.3 Grasping the Object with Different Poses

Experimental Protocol. In the real-world experiment, the robot starts from a fixed location and grasps a single object positioned 1.5 meters away using different grasp poses: forward, downward, and sideways. The reported time is the average time for each grasping pose.

Table 4. Grasping the Object with Different Poses

Results and Analysis. As an ablation experiment, we compare NEO (without MPBS), EHC (without MPBS), and EHC (ours). We conduct 15 sets of experiments, each involving 5 instances of forward, downward, and sideways grasping. The results are displayed in Table 4. A higher TfM value indicates that the robot is less likely to lose track of the target. We observe that, for downward and sideways grasping orientations, the robot without MPBS is more likely to lose track of the target prematurely, because keeping the target in view is not treated as a constraint in the QP problem. In EHC (ours), MPBS ensures that the robot keeps monitoring the target while it is still at a distance.

4 Conclusion

In this paper, we propose EHC-MM, which is built around the function sig(ω). By formulating the DMCG principle as a quadratic programming (QP) problem, we enable simultaneous planning of the robot’s joints, rather than relying on traditional two-stage mobile manipulation. The function sig(ω) dynamically balances the robot’s emphasis between movement and manipulation, taking into account the robot’s state and environment, thereby improving both the success rate and the efficiency of manipulation tasks. Through simulation and real-world validation, we observe improved efficiency in mobile manipulation tasks using EHC. In real-world experiments, it achieves a grasp success rate of 95.6%. The results demonstrate that the proposed method is effective for real-world deployments.

Link to the paper

https://arxiv.org/abs/2409.08527

References

[1] Haviland, Jesse, and Peter Corke. "NEO: A novel expeditious optimisation algorithm for reactive motion control of manipulators." IEEE Robotics and Automation Letters 6.2 (2021): 1043-1050.
