Deep Reinforcement Learning (DRL) for Coordinated Payload Transport in Biped-Wheeled Robots

A PyTorch-based framework, built around a unified kinematic model, that trains a DRL agent to coordinate two biped-wheeled robots for cooperative payload transport.
A demonstration repository showing:
- Deep Reinforcement Learning-based payload transport in simulation
- Sim-to-Real deployment on the Diablo biped-wheeled robots
Isaac Lab & Isaac Sim
- NVIDIA Omniverse Isaac Sim (4.5.0) & Isaac Lab (2.1) installed
- Installation guide
Workstation Requirements
- GPU: NVIDIA RTX 30xx series or higher (≥ 16 GB VRAM)
- CPU: Intel Core i7 (9th Gen) or AMD Ryzen 7
- RAM: ≥ 32 GB
- Ubuntu 22.04 LTS
Diablo Robot Hardware
- Direct Drive Technology's (DDT) Diablo biped-wheeled robots (x2)
- ROS Noetic (Linux)
- Diablo URDF + control stack
OptiTrack Motion Capture
- Motive v3.0+ installed & calibrated
- OptiTrack (NatNet) streaming engine with mocap_optitrack ROS package
- Guide
- dual_diablo - Contains the environment file and RL agent files
- dual_diablo.py - Contains the actuator and additional configurations of the biped-wheeled robot in simulation
- USD_DualDiablo - Contains the USD files of the payload and biped-wheeled robot
Copy the folder dual_diablo (environment and RL agent files) and dual_diablo.py (robot config file) into the IsaacLab directory as shown:
```
IsaacLab
└── source
    ├── isaaclab_assets
    │   └── isaaclab_assets
    │       └── robots
    │           └── dual_diablo.py
    └── isaaclab_tasks
        └── isaaclab_tasks
            └── direct
                └── dual_diablo
```
Ensure the waypoint/payload-path file paths are updated in the dual_diablo_env.py file and the USD asset path in the dual_diablo.py file.
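For orientation, the USD path edit in dual_diablo.py typically lives inside the robot's articulation configuration. A minimal sketch below uses Isaac Lab's `ArticulationCfg`/`UsdFileCfg` API; the variable name and path are placeholders, not the repository's actual values:

```python
import isaaclab.sim as sim_utils
from isaaclab.assets import ArticulationCfg

# Placeholder path -- point this at your local copy of the USD_DualDiablo assets.
DUAL_DIABLO_CFG = ArticulationCfg(
    spawn=sim_utils.UsdFileCfg(
        usd_path="/home/Your_Directory/USD_DualDiablo/diablo.usd",  # edit me
    ),
    # actuator and initial-state settings from the repository's dual_diablo.py go here
)
```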
Train:
```shell
./isaaclab.sh -p scripts/reinforcement_learning/rsl_rl/train.py --task DualDiablo_Task_Simple --num_envs 4096 --headless
```
Play a trained checkpoint:
```shell
./isaaclab.sh -p scripts/reinforcement_learning/rsl_rl/play.py --task DualDiablo_Task_Simple --num_envs 4 --checkpoint /home/Your_Directory/IsaacLab/logs/rsl_rl/dualdiablo_rsl_rl/2025-05-13_20-18-28/model_500.pt
```
Demo videos: DualDiabloTraining.2.mp4, SimGradualSineS1.2.mp4
The deployment is organized into three phases:
- OptiTrack Setup
- Robot & Payload Setup
- DRL Interface Initialization
- Cameras: 12 OptiTrack units arranged around the workspace
- Reflective markers: ≥ 3 per rigid body (we use 4 for extra accuracy)

Figure 2: Markers on robot & payload (geometric center tracking)
Install ROS-OptiTrack packages
Follow the OptiTrack + ROS tutorial.
Launch motion capture
```shell
roslaunch mocap_optitrack mocap.launch
```
Verify topics
```shell
rostopic list
```
Note: Repeat for each robot, swapping in its specific IP, hostname, and topic names.
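Downstream, the DRL interface needs each body's planar heading, while mocap_optitrack publishes orientation as a quaternion (geometry_msgs/PoseStamped). A minimal yaw-extraction helper is sketched below; it is illustrative, and the repository's actual conversion code may differ:

```python
import math

def quat_to_yaw(x: float, y: float, z: float, w: float) -> float:
    """Extract planar heading (yaw, radians) from a unit quaternion (x, y, z, w)."""
    # Standard ZYX-Euler yaw formula; valid for normalized quaternions.
    return math.atan2(2.0 * (w * z + x * y), 1.0 - 2.0 * (y * y + z * z))

# Example: a 90-degree rotation about the z-axis yields a yaw of pi/2.
yaw = quat_to_yaw(0.0, 0.0, math.sin(math.pi / 4), math.cos(math.pi / 4))
```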
- ROS Noetic (ROS 1)
- Diablo ROS1 SDK
```shell
mkdir -p ~/catkin_ws/src
cd ~/catkin_ws/src
git clone https://github.com/DDTRobot/diablo-sdk-v1.git
cd ~/catkin_ws
catkin_make
```
Add to your ~/.bashrc (replace <robot_ip> and <network_ip>):
```shell
source /opt/ros/noetic/setup.bash
source ~/catkin_ws/devel/setup.bash
export ROS_HOSTNAME="<robot_ip>"
export ROS_MASTER_URI="http://<network_ip>:11311"
```
- C++ Controller
  - File: diablo-sdk-v1/example/movement_ctrl/main.cpp
  - Swap in your cmd_vel_ego / cmd_vel_follower topics (see Diablo_Robot_Code/main.cpp for reference).
- Python Teleop
  - Script: script/teleop.py
  - Publishers: DJ_teleop (ego) and DJ_teleop2 (follower)
- Launch
```shell
rosrun diablo_sdk movement_ctrl_example
python3 teleop.py
```
Press k for mid-height, j for full-height.
Run the ONNX-based ROS interface in the Real_World_Code folder on the robot or the workstation (requires ONNX Runtime):
```shell
python3 Diablo_ROS_interface_ONNX_RSLRL.py
```
This script subscribes to OptiTrack topics, feeds observations into your DRL model, and publishes the resulting body twists back to each robot.
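On the robot side, a published body twist (linear velocity v, angular velocity omega) maps to left/right wheel speeds via standard differential-drive kinematics. The helper below is an illustrative sketch only; the wheel radius and track width are placeholder values, not Diablo's actual dimensions:

```python
def twist_to_wheel_speeds(v: float, omega: float,
                          wheel_radius: float = 0.09,
                          track_width: float = 0.48) -> tuple[float, float]:
    """Map a body twist (v in m/s, omega in rad/s) to (left, right) wheel
    angular velocities in rad/s using differential-drive kinematics.

    wheel_radius and track_width are illustrative placeholders, not the
    Diablo's real dimensions.
    """
    v_left = v - 0.5 * track_width * omega    # left wheel surface speed
    v_right = v + 0.5 * track_width * omega   # right wheel surface speed
    return v_left / wheel_radius, v_right / wheel_radius

# Pure rotation in place: wheels spin in opposite directions at equal speed.
wl, wr = twist_to_wheel_speeds(0.0, 1.0)
```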

Real World Video of Biped-Wheeled Robot: Diablo
If you find this research useful, please consider citing the paper:

```bibtex
@misc{mehta2024drl,
  author = {Mehta, Dhruv and Joglekar, Ajinkya and Krovi, Venkat},
  year   = {2024},
  month  = {09},
  title  = {Deep Reinforcement Learning for Coordinated Payload Transport in Biped-Wheeled Robots},
  doi    = {10.13140/RG.2.2.10251.71207/1}
}
```
