strands-robots gives a Strands Agent
hands. One Robot() call returns a MuJoCo simulation (default - no GPU, no
hardware) or a real robot - same code, same natural-language control, both
auto-joined to a peer-to-peer mesh.
from strands import Agent
from strands_robots import Robot
robot = Robot("so100") # MuJoCo sim by default; mode="real" for hardware
Agent(tools=[robot])("pick up the red cube")
Teleoperate a real arm to collect demos, fine-tune a policy on them, run it in sim and on hardware, hand work to a fleet peer, and expose it all on ROS 2 - one library, one mental model. Every numbered step below is a distinct capability:
from strands import Agent
from strands_robots import Robot
from strands_robots.tools import train_policy
# 1. TELEOPERATE a real SO-101 with its leader arm and RECORD demos as a
# LeRobotDataset (one prompt drives cameras + teleop + recording).
follower = Robot("so101", mode="real", port="/dev/ttyACM0",
cameras={"front": {"type": "opencv", "index_or_path": "/dev/video0"}})
follower.attach_teleop("so101_leader", port="/dev/ttyACM1", id="leader")
Agent(tools=[follower])(
"start_recording(repo_id='me/pick', root='/tmp/pick', fps=30, "
"task='pick up the cube'); teleoperate for 60s; stop_recording"
)
# 2. POST-TUNE a policy on those demos (LoRA fine-tune; GPU box).
train_policy(action="train", provider="lerobot_local",
dataset_root="/tmp/pick", base_model="lerobot/smolvla_base",
output_dir="/tmp/pick_ckpt", method="lora", steps=20000)
# 3. RUN the tuned checkpoint - same policy on a MuJoCo twin AND the real arm.
twin = Robot("so101") # sim twin, no hardware
twin.run_policy(robot_name="so101", policy_provider="lerobot_local",
policy_config={"pretrained_name_or_path": "/tmp/pick_ckpt"}, duration=10.0)
follower.start_task("pick up the cube", policy_provider="lerobot_local",
policy_port=None, duration=10.0) # real arm, in-process
# 4. COORDINATE a fleet - tell a mesh peer to assist, in natural language.
follower.mesh.tell(follower.mesh.peers[0]["peer_id"], "hold the tray steady")
# 5. EXPOSE the running sim on ROS 2 - rviz / nav2 / any ros2 node can subscribe.
from strands_robots.simulation import Simulation
sim = Simulation(ros2_bridge=True); sim.create_world(); sim.add_robot("so101")
sim.step(100) # publishes /so101/joint_states + camera image_raw on the ROS 2 graph
| Step | Capability | Surface |
|---|---|---|
| 1 | Teleop + dataset recording | Robot(mode="real"), attach_teleop, start_recording |
| 2 | Policy post-tuning | train_policy (LeRobot / GR00T trainers) |
| 3 | Sim + hardware policy rollout | run_policy (sim), start_task (hardware) |
| 4 | Fleet coordination | robot.mesh.tell / robot_mesh tool |
| 5 | ROS 2 interop | Simulation(ros2_bridge=True), use_ros |
Steps 1 and 3-real need hardware; step 2 needs a GPU. Everything else runs in sim with no hardware (Robot("so101")), so you can exercise the rest of the loop today.
- Sim-first, safe by default. Robot("so100") spins up a MuJoCo world. You never accidentally drive real servos - mode="real" is an explicit opt-in.
- 50+ robots, 8 categories. Arms, humanoids, quadrupeds, hands, drones, bimanual rigs - resolved from a single registry with auto-download of assets.
- Any policy. VLA models (NVIDIA GR00T, LeRobot ACT/Pi0/SmolVLA/Diffusion), plus classical motion planners, MPC, and scripted controllers behind one ABC.
- Mesh networking built in. Every robot is a Zenoh peer. tell() another robot what to do; broadcast an E-STOP; bridge to AWS IoT Core for fleets.
- 64-action simulation tool. World building, physics, rendering, domain randomization, and LeRobotDataset recording - all agent-callable.
- ROS 2 interop. Observe + command any ROS 2 graph (use_ros), act as a robot with no rclpy (use_rtps), or expose a running sim as a ROS node.
- One mental model. Sim and hardware share the same policy interface, the same mesh, and the same natural-language control surface.
graph LR
A[Natural Language<br/>'Pick up the red block'] --> B[Strands Agent]
B --> C[Robot<br/>sim or real]
C --> D[Policy Provider<br/>GR00T / LeRobot / planner / mock]
D --> E[Action Chunk]
E --> F[MuJoCo Sim<br/>or Hardware]
F -->|observation| C
classDef input fill:#2ea44f,stroke:#1b7735,color:#fff
classDef agent fill:#0969da,stroke:#044289,color:#fff
classDef policy fill:#8250df,stroke:#5a32a3,color:#fff
classDef hardware fill:#bf8700,stroke:#875e00,color:#fff
class A input
class B,C agent
class D,E policy
class F hardware
Examples use uv (curl -LsSf https://astral.sh/uv/install.sh | sh); plain pip works too.
uv pip install strands-robots
The base install is light (numpy, opencv-headless, Pillow). Pull in only the extras you need:
| Extra | Installs | Use for |
|---|---|---|
| sim-mujoco | MuJoCo, robot_descriptions, imageio | Simulation (recommended starting point) |
| sim-newton | Newton, Warp, MuJoCo-Warp, trimesh | GPU-native simulation (NVIDIA GPU; batched envs, headless ray-traced render) |
| lerobot | LeRobot | Real hardware, local VLA inference, dataset recording |
| molmoact2 | LeRobot + transformers, peft, scipy | MolmoAct2 transformers-native VLA (needs lerobot from source until PyPI >= 0.5.2) |
| groot-service | pyzmq, msgpack | NVIDIA GR00T inference client |
| curobo | (empty; install cuRobo from source) | In-process collision-aware motion planning (CUDA GPU) |
| wbc | onnxruntime | GR00T Whole-Body-Control (SONIC) humanoid locomotion - in-process ONNX, no GPU |
| mesh | eclipse-zenoh, json5 | Peer-to-peer robot mesh |
| mesh-iot | awsiotsdk, awscrt, boto3 | AWS IoT Core mesh transport for fleets |
| device-connect | device-connect-edge, device-connect-agent-tools | Device-aware networking - discovery, RPC, events, safety (falls back to the built-in mesh if absent) |
| benchmark-libero | libero | LIBERO benchmark evaluation |
| all | everything above | Kitchen sink |
# Most users start here:
uv pip install "strands-robots[sim-mujoco]"
# Real hardware + local policies:
uv pip install "strands-robots[sim-mujoco,lerobot]"
# MolmoAct2 VLA (lerobot from source until a PyPI >= 0.5.2 ships PR #3604):
uv pip install "strands-robots[molmoact2]" \
"lerobot[feetech] @ git+https://github.com/huggingface/lerobot.git"
# Everything:
uv pip install "strands-robots[all]"
From source:
git clone https://github.com/strands-labs/robots
cd robots
uv pip install -e ".[all,dev]"
from strands import Agent
from strands_robots import Robot
robot = Robot("so100") # MuJoCo simulation
agent = Agent(tools=[robot])
agent("Wave the arm using the mock policy for 200 steps, then render a top-down view")
Robot("so100") returns a Simulation instance - the full 64-action
simulation AgentTool. Drive it in natural language through an Agent, call its
methods directly (robot.render(camera_name="topdown")), or dispatch an action
by calling it (robot(action="render", camera_name="topdown")). See
Simulation.
Note: Robot("so100") already creates the world and adds the robot for you. Do not call create_world() again on the returned instance - it will error with "World already exists." The create_world() / add_robot() sequence shown in Simulation (MuJoCo) is for the low-level Simulation(...) constructor, which starts empty.
from strands import Agent
from strands_robots import Robot, gr00t_inference
robot = Robot(
"so101",
mode="real",
cameras={
"front": {"type": "opencv", "index_or_path": "/dev/video0", "fps": 30},
"wrist": {"type": "opencv", "index_or_path": "/dev/video2", "fps": 30},
},
port="/dev/ttyACM0",
data_config="so100_dualcam",
)
agent = Agent(tools=[robot, gr00t_inference])
# Start the GR00T inference service (Docker, Jetson/x86 GPU)
agent.tool.gr00t_inference(
action="start",
checkpoint_path="/data/checkpoints/model",
port=8000,
data_config="so100_dualcam",
)
agent("Use so101 to pick up the red block with the GR00T policy on port 8000")
from strands_robots import create_policy
# Direct HuggingFace inference - ACT, Pi0, SmolVLA, Diffusion, ...
policy = create_policy("lerobot/act_aloha_sim_transfer_cube_human")
Drive any real robot - or a simulation - from one or more LeRobot
teleoperators. Teleoperator() mirrors the Robot() factory; attach_teleop() and
teleoperate() run the control loop.
from strands_robots import Robot, Teleoperator
# Leader arm -> follower arm (both speak {motor}.pos -> zero config)
follower = Robot("so101", mode="real", port="/dev/ttyACM0")
follower.attach_teleop("so101_leader", port="/dev/ttyACM1", id="leader")
follower.teleoperate() # Ctrl+C or stop_teleoperate()
# Earth Rover Mini+ with WASD keys (velocity keys -> zero config)
rover = Robot("earthrover_mini_plus", mode="real", robot_ip="192.168.1.151")
rover.attach_teleop("keyboard_rover") # W/A/S/D
rover.teleoperate(block=True, duration=30)
# Cross-vocabulary or sim teleop -> supply a map_fn(action) -> action
robot.attach_teleop("keyboard_ee", map_fn=my_ik) # EE deltas -> joint .pos
robot.teleoperate(publish=True) # also stream over the mesh
17 teleoperators (so100/so101/koch/omx/openarm leaders, bi_* leaders,
gamepad, keyboard, keyboard_ee, keyboard_rover, phone,
reachy2_teleoperator, unitree_g1, homunculus arm/glove) drive 14 robots.
Zero-config when action keys match; otherwise pass map_fn. Full matrix +
recipes: Teleoperation docs.
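When the teleoperator and the robot use different action keys, map_fn is the place to translate between them. The sketch below shows one simple shape such a function could take: a key rename plus a scale factor. The helper name make_rename_map and the motor names are hypothetical and not part of strands-robots; only the map_fn(action) -> action contract and the {motor}.pos key style come from the text above.

```python
from typing import Callable, Dict

Action = Dict[str, float]


def make_rename_map(key_map: Dict[str, str], scale: float = 1.0) -> Callable[[Action], Action]:
    """Build a map_fn that renames teleop keys to robot keys and scales values.

    Keys missing from key_map are dropped, so the robot never sees unknown motors.
    """
    def map_fn(action: Action) -> Action:
        return {key_map[k]: v * scale for k, v in action.items() if k in key_map}

    return map_fn


# Hypothetical vocabularies: the teleop emits bare names, the robot wants "{motor}.pos".
to_pos = make_rename_map({"shoulder": "shoulder_pan.pos", "grip": "gripper.pos"})
mapped = to_pos({"shoulder": 0.25, "grip": 1.0, "unused": 9.9})
print(mapped)
```

A function built this way would be passed as map_fn= to attach_teleop(); anything more involved, such as end-effector deltas to joint positions, needs an IK step inside the same callable.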
The physical-AI data loop, end to end: record a LeRobotDataset from sim or
hardware, stream it straight back for eval/training (no full download), and
optionally dump it to a mutable Hugging Face Storage Bucket. Needs the
lerobot extra (which bundles datasets + av + torchcodec).
from strands import Agent
from strands_robots import Robot
sim = Robot("so100", mesh=False)
agent = Agent(tools=[sim])
# 1. COLLECT — one natural-language prompt drives scene + cameras + policy + record.
agent(
"Create a world with the so100 robot, add a red cube and a front camera, "
"start recording (repo_id='local/demo', root='/tmp/demo', fps=30, "
"overwrite=True, task='pick up the red cube'), run the mock policy for "
"60 steps, then stop recording."
)
# 2. STREAM — read it back lazily; camera frames decode on the fly from the MP4
# shards, state/action from parquet. Nothing is re-materialized to disk.
reader = sim.stream_dataset("local/demo", root="/tmp/demo", shuffle=False)
for frame in reader:
    frame["observation.images.front"] # (3, H, W) tensor, decoded from video
    frame["observation.state"] # joint vector
    frame["action"]
    break
stream_dataset() is the in-process read counterpart to
start_recording/stop_recording. For full training, the upstream trainer uses
the same engine — python -m lerobot.scripts.train dataset.repo_id=... dataset.streaming=true.
Verify episode integrity. A recording's ground truth is the parquet under
meta/episodes/, not the count a model narrates while collecting. Collect
episodes with a deterministic Python loop (one run_policy(..., n_episodes=1)
plus save_episode() per episode) rather than trusting a model to count its own
tool calls, then confirm the dataset holds the episodes you intended - in-process
or from the shell:
sim.verify_dataset_episodes(expected=20) # reads parquet; status="error" on a mega-episode
# exit 0 = pass, 1 = fail, so it drops straight into CI as a dataset gate
strands-robots verify-dataset /tmp/demo --expected 20
This catches the "mega-episode" corruption class - a run that buffered every
frame into one episode_index=0 episode while reporting 20/20 - plus
meta/info.json vs parquet drift and zero-length episodes.
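To make the failure mode concrete, here is a rough sketch of the kind of check involved. It is not the library's verify_dataset_episodes implementation: check_episodes is a hypothetical helper, and it assumes the per-frame episode_index column has already been read from the parquet files into a plain list.

```python
from collections import Counter
from typing import Dict, List


def check_episodes(episode_index: List[int], expected: int) -> Dict[str, object]:
    """Compare the episodes actually present in the frames with the expected count."""
    frames_per_episode = Counter(episode_index)
    found = len(frames_per_episode)
    problems = []
    if found != expected:
        problems.append(f"expected {expected} episodes, found {found}")
    # Episode ids should cover 0..expected-1; a gap means an episode with no frames.
    missing = [e for e in range(expected) if e not in frames_per_episode]
    if missing:
        problems.append(f"episodes with no frames: {missing}")
    return {"status": "error" if problems else "success", "found": found, "problems": problems}


# A "mega-episode": 20 rollouts of 60 frames all buffered into episode 0.
mega = check_episodes([0] * 1200, expected=20)
# A healthy recording: 20 episodes of 60 frames each.
good = check_episodes([e for e in range(20) for _ in range(60)], expected=20)
print(mega["status"], good["status"])
```

The point of counting from the frame data is that the result does not depend on what the collecting agent reported.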
Dump to a Storage Bucket during collection (mutable, Xet-deduplicated - the Phase 1/2 collection target that avoids git-LFS history bloat) with one kwarg:
sim.stop_recording(bucket="your-org/robot-fave") # → hf://buckets/your-org/robot-fave/demo
Requires the hf CLI (pip install -U huggingface_hub + hf auth login).
Proprio-only / no video (e.g. edge devices without a torchcodec wheel):
sim.stream_dataset(repo_id, drop_videos=True) streams state/action only and
never touches the video decoder.
macOS note (zero-touch). torchcodec links ffmpeg via @rpath, and Homebrew's ffmpeg (/opt/homebrew/lib) is not on the default dyld search path - so video decode would normally fail with Library not loaded: @rpath/libavutil.NN.dylib. On import strands_robots we auto-detect this and put Homebrew's ffmpeg on DYLD_FALLBACK_LIBRARY_PATH (re-exec'ing the interpreter once for a plain script run; never inside Jupyter/REPL/pytest, where it just prints the one-line export to run). It's a no-op off macOS, without torchcodec, or when the var is already set. Disable with STRANDS_ROBOTS_NO_DYLD_SHIM=1. See examples/06_agent_collect_and_stream.py.
See also Recording & datasets for the DatasetRecorder
direct API and append/resume workflow.
Robot() is a factory, not a wrapper - you get the real backend instance back
with all its methods.
Robot("so100") # mode="sim" (default, safe)
Robot("so100", mode="real") # explicit hardware opt-in
Robot("so100", mode="auto") # probe USB for servos, fall back to sim
Robot("my_arm", urdf_path="arm.xml") # bring your own MJCF/URDF
| Parameter | Type | Default | Description |
|---|---|---|---|
| name | str | required | Robot name or alias (see Supported robots) |
| mode | str | "sim" | "sim", "real", or "auto" (case-insensitive) |
| backend | str | "mujoco" | Sim backend (Isaac/Newton on the roadmap) |
| urdf_path | str | None | Explicit MJCF/URDF path (skips registry lookup) |
| cameras | dict | None | Camera config (mode="real" only) |
| position | list[float] | [0,0,0] | Spawn position in the sim world |
| data_config | str | name | Observation/action schema name |
| mesh | bool | True | Auto-join the Zenoh mesh |
Safety/validation rules:
- Defaults to sim. Real hardware is always an explicit mode="real".
- cameras= is rejected in sim mode - add sim cameras via the add_camera action after creation.
- Unknown robot names raise ValueError unless you pass urdf_path=.
- STRANDS_ROBOT_MODE overrides detection; a typo'd value logs a warning and falls back to sim.
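The mode rules can be pictured as a small resolver. This is a sketch of the documented behavior, not the library's code: resolve_mode is a hypothetical helper, the hardware probe is reduced to a boolean argument, and two details are assumptions (that the environment variable takes precedence over the mode argument, and that an invalid mode argument raises ValueError).

```python
import logging
import os
from typing import Mapping, Optional

VALID_MODES = ("sim", "real", "auto")
log = logging.getLogger("mode_sketch")


def resolve_mode(mode: str = "sim", env: Optional[Mapping[str, str]] = None,
                 hardware_found: bool = False) -> str:
    """Return "sim" or "real" following the safety rules listed above."""
    env = os.environ if env is None else env
    override = env.get("STRANDS_ROBOT_MODE")
    if override is not None:
        candidate = override.strip().lower()
        if candidate not in VALID_MODES:
            # A typo must never reach hardware: warn and fall back to sim.
            log.warning("invalid STRANDS_ROBOT_MODE=%r, using sim", override)
            return "sim"
        mode = candidate  # ASSUMPTION: the env var wins over the argument
    mode = mode.lower()  # modes are case-insensitive
    if mode not in VALID_MODES:
        # ASSUMPTION: a bad argument is a programming error, so raise.
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    if mode == "auto":
        # "auto" probes for servos and falls back to sim when none are found.
        return "real" if hardware_found else "sim"
    return mode


print(resolve_mode("REAL", env={}))                             # real
print(resolve_mode("sim", env={"STRANDS_ROBOT_MODE": "reel"}))  # sim
print(resolve_mode("auto", env={}, hardware_found=False))       # sim
```

Every ambiguous path in the sketch ends in "sim", which is the property the rules are meant to guarantee.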
50+ robots across 8 categories, resolved from
registry/robots.json. Assets
(MJCF + meshes) auto-download from
robot_descriptions
/ MuJoCo Menagerie on
first use. List them at runtime with from strands_robots import list_robots; list_robots().
| Category | Count | Robots |
|---|---|---|
| Arm | 22 | so100, so101, koch, omx, panda, fr3, fr3_v2, ur5e, ur10e, xarm7, kinova_gen3, kuka_iiwa, sawyer, piper, yam, z1, vx300s, wx250s, arx_l5, openarm, hope_jr, dynamixel_2r |
| Humanoid | 18 | unitree_g1, unitree_h1, unitree_h1_2, apollo, talos, reachy2, rby1, fourier_n1, booster_t1, adam_lite, asimov_v0, cassie, elf2, jvrc, op3, open_duck_mini, toddlerbot_2xc, toddlerbot_2xm |
| Mobile | 13 | spot, go1, unitree_go2, unitree_a1, aliengo, anymal_b, anymal_c, stretch, stretch3, lekiwi, tiago_dual, earthrover, robot_soccer_kit |
| Hand | 8 | shadow_hand, shadow_dexee, allegro_hand, leap_hand, ability_hand, aero_hand, robotiq_2f85, robotiq_2f85_v4 |
| Bimanual | 3 | aloha, bi_openarm, trossen_wxai |
| Aerial | 2 | crazyflie, skydio_x2 |
| Expressive | 1 | reachy_mini |
| Mobile manip | 1 | google_robot |
Hardware-capable (drivable with mode="real" via LeRobot): so100,
so101, koch, omx, hope_jr, aloha, bi_openarm, reachy2,
unitree_g1, lekiwi, earthrover. All are simulatable.
There are two paths, depending on whether the robot needs project-specific metadata:
- Standard robot_descriptions robot (zero config). Any MJCF robot shipped by robot_descriptions resolves automatically without a robots.json entry - the asset is discovered and downloaded on first use:
  from strands_robots import Robot, list_discoverable
  sim = Robot("iiwa14") # discovered, not in robots.json
  print(list_discoverable()) # the MJCF long tail you can load directly
  A curated robots.json entry always wins over discovery, so overriding a discovered robot later is non-breaking.
- Custom or metadata-rich robot. If the robot needs a non-default joint count, hardware port, aliases, scene tweaks, or local mesh overrides, add a curated entry. For a robot that belongs in the shipped catalog, add it to registry/robots.json and open a PR. For a machine-local robot, register it at runtime instead of editing the package:
  from strands_robots.registry import register_robot
  register_robot(name="my_arm", model_xml="my_arm.xml", asset_dir="~/robots/my_arm", joints=7, category="arm")
Import any of these and pass to Agent(tools=[...]). Each is a Strands
AgentTool returning {"status", "content"}.
| Tool | Purpose |
|---|---|
| Robot(...) | Universal robot - sim or hardware, natural-language + async control |
| run_policy | Multi-episode policy rollout with per-episode eval + dataset recording |
| train_policy | Post-tune (fine-tune) a policy on a recorded dataset (LeRobot / GR00T trainers, full or LoRA) |
| use_lerobot | Universal LeRobot bridge - call ANY lerobot module/class/config directly (like use_aws wraps boto3) |
| lerobot_train | Thin local wrapper over the lerobot-train CLI (the engine behind train_policy) |
| robot_mesh | Coordinate robots over the Zenoh mesh (tell, broadcast, E-STOP) |
| use_ros | Bridge to any ROS 2 graph - list/echo/publish topics, call services (in-process rclpy) |
| use_rtps | Join a ROS 2 graph as a DDS participant - publish/echo topics, act as a robot (pure cyclonedds, no rclpy, all ROS 2 distros) |
| gr00t_inference | Manage NVIDIA GR00T inference services (Docker lifecycle) |
| lerobot_camera | OpenCV / RealSense camera discovery, capture, record |
| lerobot_calibrate | List, view, back up, restore LeRobot calibrations |
| lerobot_teleoperate | Record demonstrations, replay episodes |
| pose_tool | Store, recall, and execute named robot poses |
| serial_tool | Low-level Feetech servo / raw serial communication |
| download_assets | Pre-fetch robot MJCF + meshes into the asset cache |
Robot tool actions
| Action | Parameters | Description |
|---|---|---|
| execute | instruction, policy_port, duration | Blocking execution until complete |
| start | instruction, policy_port, duration | Non-blocking async start |
| status | - | Current task status |
| stop | - | Interrupt running task (emergency stop) |

In sim mode the same tool exposes the 64 Simulation actions - see Simulation (MuJoCo).
GR00T inference tool actions
| Action | Parameters | Description |
|---|---|---|
| start | checkpoint_path, port, data_config | Start inference service |
| stop | port | Stop service on port |
| status | port | Check service status |
| list | - | List running services |
| find_containers | - | Find GR00T Docker containers |
| build_image / download_checkpoint / start_container | - | Full container lifecycle orchestration |
TensorRT acceleration:
agent.tool.gr00t_inference(
action="start",
checkpoint_path="/data/checkpoints/model",
port=8000,
use_tensorrt=True,
vit_dtype="fp8", # ViT: fp16 | fp8
llm_dtype="nvfp4", # LLM: fp16 | nvfp4 | fp8
dit_dtype="fp8", # DiT: fp16 | fp8
)
Camera / serial / pose / teleop tool actions
Camera - discover, capture, capture_batch, record, preview, test
Serial - list_ports, feetech_position, feetech_ping, send, monitor
Pose - store_pose, load_pose, list_poses, move_motor, incremental_move, reset_to_home
Teleop - start, stop, list, replay
All policies implement one ABC - async get_actions(observation, instruction, **kwargs).
The interface is deliberately agnostic about how actions are produced, so it
fits both VLA models and classical controllers.
from strands_robots import create_policy
create_policy("mock") # sinusoidal test actions
create_policy("groot", port=5555) # NVIDIA GR00T via ZMQ
create_policy("zmq://localhost:5555") # same, by URL
create_policy("lerobot/act_aloha_sim_transfer_cube_human") # local HF inference
| Provider | Backend | Notes |
|---|---|---|
| mock | none | Sinusoidal trajectories; requires_images=False (~10x faster) |
| groot | NVIDIA GR00T N1.5/N1.6/N1.7 | Service mode (ZMQ to a Docker container) or local in-process (model_path=) |
| lerobot_local | HuggingFace | Direct ACT / Pi0 / SmolVLA / Diffusion inference, no server |
| vera | MIT VERA (DFoT/WAN planner + Jacobian IDM) | Two-stage video-to-action over a WebSocket GPU server (Docker); PushT + MimicGen, IK for eef-delta arms |
classDiagram
class Policy {
<<abstract>>
+get_actions(obs, instruction, **kwargs)
+set_robot_state_keys(keys)
+requires_images
+reset(seed)
+provider_name
}
class Gr00tPolicy
class LerobotLocalPolicy
class MockPolicy
class YourPolicy
Policy <|-- Gr00tPolicy
Policy <|-- LerobotLocalPolicy
Policy <|-- MockPolicy
Policy <|-- YourPolicy
GR00T data configs (embodiment schemas)
A data_config defines the video + state keys GR00T expects for an
embodiment. 27 ship in
policies/groot/data_configs.json;
the common ones:
| Config | Cameras | Description |
|---|---|---|
| so100 / so101 | 1 (video.webcam) | Single-arm, single camera |
| so100_dualcam / so101_dualcam | 2 (front + wrist) | Single-arm, dual camera |
| so100_4cam | 4 (front, wrist, top, side) | Single-arm, quad camera |
| so101_tricam | 3 (front, wrist, side) | Single-arm, tri camera |
| fourier_gr1_arms_only | 1 (ego) | Fourier GR-1 bimanual arms + hands |
| unitree_g1 | 1 (ego) | G1 upper body (arms + hands) |
| unitree_g1_full_body / _locomanip | - | G1 legs + waist + arms + hands |
| bimanual_panda_gripper | 3 | Dual Franka, EEF pose + gripper |
| libero_panda | 2 (image + wrist) | LIBERO benchmark Panda |
| oxe_droid / oxe_google / oxe_widowx | 1-2 | Open X-Embodiment schemas |
| agibot_* / galaxea_r1_pro | 3 | AgiBot / Galaxea humanoids |
Pick the config matching your robot's camera + state layout; pass it as
data_config= to Robot(...), gr00t_inference(...), or create_policy("groot", ...).
Security: lerobot_local loads HuggingFace models with trust_remote_code=True (arbitrary code execution). You must opt in with export STRANDS_TRUST_REMOTE_CODE=1. Only load models you trust.
nvidia/Cosmos3-Nano-Policy-DROID via a self-contained WebSocket client (cosmos3 / c3 / cosmos3://host:port); no openpi-client dep, no numpy<2 pin, so it composes with lerobot in one env.
Cosmos 3 server + client setup, embodiments, sim rollout
nvidia/Cosmos3-Nano-Policy-DROID
served by the Cosmos Framework RoboLab WebSocket policy server. The policy
client is self-contained - it speaks the server's msgpack+NumPy wire
protocol directly via websockets + a vendored numpy packer (no
openpi-client dependency, no numpy<2 pin), so it composes cleanly with
lerobot for dataset recording in the same env.
1. Start the server (holds the GPU), from a Cosmos Framework checkout:
uv sync --all-extras --group=cu130-train --group=policy-server
python -m cosmos_framework.scripts.action_policy_server_robolab \
--checkpoint-path nvidia/Cosmos3-Nano-Policy-DROID --port 8000
curl http://localhost:8000/healthz # -> 200 when ready (~4 min cold)
2. Install the client (the cosmos3-service extra ships only msgpack and
websockets - numpy-version agnostic):
uv pip install -e '.[sim-mujoco]'
uv pip install 'strands-robots[cosmos3-service]'
3. Use it (cosmos3, c3, cosmos3://host:port, or the HF model-id all
resolve to Cosmos3Policy):
from strands_robots.policies import create_policy
policy = create_policy("cosmos3", embodiment="droid", port=8000)
policy.set_robot_state_keys([f"joint_{i}" for i in range(7)] + ["gripper"])
chunk = policy.get_actions_sync(observation, "pick up the cube")
# chunk == [{"joint_0": .., ..., "gripper": ..}, ...] (one dict per timestep)
The droid embodiment (joint_pos/RoboArena) conditions on all three
camera views and the server rejects a partial observation. Your
observation_mapping must map a sim/robot camera onto each of
observation/wrist_image_left, observation/exterior_image_1_left, and
observation/exterior_image_2_left; an incomplete mapping raises an actionable
client-side ValueError naming the missing keys before any request is sent
(other embodiments such as umi/av/bridge need only observation/image):
policy = create_policy(
"cosmos3", embodiment="droid", port=8000,
observation_mapping={
"wrist": "observation/wrist_image_left",
"exterior": "observation/exterior_image_1_left",
"exterior2": "observation/exterior_image_2_left",
},
)
4. Roll out in MuJoCo - the droid embodiment drives a Franka/DROID-class
arm, so use the franka (or panda) sim asset:
MUJOCO_GL=egl python examples/cosmos3_sim_rollout.py --record /tmp/c3.mp4
Embodiments: droid (10D, chunk 32, 15 fps), umi, av, bridge. If the
server is not running, the policy raises a ConnectionError with the exact
command to start it.
The same interface fits cuRobo, MoveIt2, OMPL, MPC, and pure-IK / scripted
trajectories - anything mapping (observation, goal) to joint targets.
Non-VLA providers set requires_images = False (skip camera rendering) and
read their goal from well-known **kwargs keys instead of parsing the
instruction string:
| Key | Type | Meaning |
|---|---|---|
| target_pose | list[float] | Cartesian goal [x, y, z, qw, qx, qy, qz] in base frame |
| target_joints | dict[str, float] | Joint-space goal keyed by joint name (rad / m) |
| world_update | dict \| None | Per-call world refresh for collision-aware planners |
Providers MUST ignore unknown **kwargs rather than raising, so callers can
pass shared keys across providers without coupling to a backend.
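One way a provider might honour this contract is to pull out only the goal keys it understands and check their shape up front. The sketch below is illustrative: extract_goal is a hypothetical helper, and normalizing the quaternion is a design choice of this example rather than documented library behavior.

```python
import math
from typing import Any, Dict

KNOWN_GOAL_KEYS = ("target_pose", "target_joints", "world_update")


def extract_goal(**kwargs: Any) -> Dict[str, Any]:
    """Pick out the shared goal keys and silently ignore everything else."""
    goal = {k: kwargs[k] for k in KNOWN_GOAL_KEYS if kwargs.get(k) is not None}
    pose = goal.get("target_pose")
    if pose is not None:
        if len(pose) != 7:
            raise ValueError("target_pose must be [x, y, z, qw, qx, qy, qz]")
        # Normalize the quaternion part so a planner receives a unit rotation.
        norm = math.sqrt(sum(q * q for q in pose[3:]))
        if norm == 0.0:
            raise ValueError("target_pose quaternion must be non-zero")
        goal["target_pose"] = list(pose[:3]) + [q / norm for q in pose[3:]]
    return goal


goal = extract_goal(target_pose=[0.5, 0.0, 0.4, 2.0, 0.0, 0.0, 0.0],
                    temperature=0.7)  # unknown key: ignored, not an error
print(goal)
```

Because unknown keys are dropped rather than rejected, the same call site can pass a goal to a planner and sampling options to a VLA without branching on the provider.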
from typing import Any
from strands_robots.policies import Policy, register_policy, create_policy

class ReachPolicy(Policy):
    """Linear interpolation from current joint state to target_joints."""

    def __init__(self, steps: int = 32, **_: Any) -> None:
        self._keys: list[str] = []
        self._steps = steps

    @property
    def provider_name(self) -> str:
        return "reach"

    @property
    def requires_images(self) -> bool:
        return False # joint-state only -- skip camera rendering

    def set_robot_state_keys(self, robot_state_keys: list[str]) -> None:
        self._keys = list(robot_state_keys)

    async def get_actions(self, observation_dict, instruction, **kwargs):
        target = kwargs.get("target_joints")
        if target is None:
            raise ValueError("ReachPolicy requires target_joints kwarg")
        state = observation_dict.get("observation.state", [0.0] * len(self._keys))
        out = []
        for s in range(1, self._steps + 1):
            alpha = s / self._steps
            out.append({k: (1 - alpha) * state[i] + alpha * target[k]
                        for i, k in enumerate(self._keys)})
        return out

register_policy("reach", lambda: ReachPolicy, aliases=["lerp"])
policy = create_policy("reach")
Reference non-VLA providers: MoveIt2, cuRobo, WBC/SONIC
Three reference implementations of the goal-kwarg contract above. Each has a runnable example + full install/deploy notes in its linked doc:
| Provider | Alias | Runs | Goal kwarg | Needs | Docs |
|---|---|---|---|---|---|
| moveit2 | moveit | ZMQ sidecar (ROS 2 / moveit_py, out-of-process) | target_pose / target_joints | [moveit2] extra (pyzmq, msgpack); a running sidecar | MoveIt2 docs |
| curobo | cumotion | in-process CUDA | target_pose / target_joints (+ world_update) | NVIDIA GPU; cuRobo from source (not on PyPI) | cuRobo source |
| wbc | sonic | in-process ONNX (CPU) | target_velocity [vx, vy, omega] | [wbc] extra (onnxruntime); a SONIC checkpoint | WBC docs |
from strands_robots.policies import create_policy
# Collision-aware planning (GPU, in-process); plan is cached, streamed per tick.
policy = create_policy("curobo", robot_config="franka.yml", action_horizon=16)
actions = policy.get_actions_sync(
{"observation.state": [0.0, -0.79, 0.0, -2.36, 0.0, 1.57, 0.79]},
"reach for the red block", # ignored by planners
target_pose=[0.5, 0.0, 0.4, 1.0, 0.0, 0.0, 0.0],
)
Agents share one goal vocabulary across VLA and planner providers:
Robot.start_task(..., policy_provider="curobo", target_pose=[...]) and
mesh.tell(peer, "...", policy_provider="curobo", target_pose=[...]) flow the
same target_pose / target_joints / world_update kwargs through.
Robot("so100") (sim mode) returns a Simulation - a MuJoCo-backed AgentTool
exposing 64 actions for world composition, physics, rendering, policy
execution, and dataset recording. Build it directly when you want full control:
from strands_robots.simulation import Simulation
sim = Simulation(tool_name="sim", mesh=False)
sim.create_world()
sim.add_robot(name="arm", data_config="so100")
sim.add_object(name="cube", shape="box", position=[0.3, 0, 0.05])
sim.add_camera(name="topdown", position=[0, 0, 1.5], target=[0, 0, 0])
# Wrist camera: mount ON the gripper body so it tracks the arm like the real
# SO101/SO100 hardware cam. position/target are in the body's LOCAL frame.
# Body names are namespaced "<robot>/<body>" (e.g. "arm/gripper").
sim.add_camera(name="wrist", position=[0, -0.05, 0], target=[0, -0.15, 0],
parent_body="arm/gripper")
sim.run_policy(robot_name="arm", policy_provider="mock", n_steps=200,
control_frequency=50.0)
frame = sim.render(camera_name="topdown") # {status, content:[text, image]}
The actions, grouped
- World & scene: create_world, load_scene, replace_scene_mjcf, patch_scene_mjcf, reset, get_state, save_state, load_state, destroy, export_xml.
- Robots: add_robot, remove_robot, list_robots, get_robot_state, list_urdfs, register_urdf, get_features.
- Objects: add_object, remove_object, move_object, list_objects.
- Cameras & rendering: add_camera, remove_camera, render, render_depth, render_all, start_cameras_recording, stop_cameras_recording, get_cameras_recording_status.
- Physics: step, set_timestep, set_gravity, apply_force, raycast, multi_raycast, get_contacts, get_contact_forces, get_body_state, set_joint_positions, set_joint_velocities, forward_kinematics, get_jacobian, get_mass_matrix, inverse_dynamics, get_total_mass, get_energy, get_sensor_data, set_body_properties, set_geom_properties.
- Policy: run_policy, start_policy, stop_policy, list_policies_running, replay_episode, eval_policy.
- Randomization: randomize.
- Recording (LeRobotDataset): start_recording, stop_recording, get_recording_status.
- Benchmarks: list_benchmarks, register_benchmark_from_file, evaluate_benchmark.
- Viewer: open_viewer, close_viewer.
Common footguns
- Planes must be static. add_object(shape="plane") auto-sets is_static=True; passing is_static=False is a hard error.
- Aim cameras. Pass target=[x,y,z] to look at a point; target == position errors.
- Wrist cameras mount on a body. Pass parent_body="<robot>/gripper" to add_camera so the camera rides with the arm (realistic SO101/SO100 wrist cam). In that mode position/target are in the body's LOCAL frame, not world coordinates. Omit parent_body for a world-fixed camera.
- MP4 vs dataset recording. start_cameras_recording writes plain MP4 ([sim-mujoco] only). start_recording writes a LeRobotDataset (parquet + MP4 + schema) and needs the [lerobot] extra.
- Policy running → mutations blocked. While a policy runs, state-mutating actions error with "Cannot 'X' while a policy is running." Stop it first.
- Horizon parameters. run_policy takes either duration or n_steps (both with control_frequency). fast_mode=True skips the between-step sleep for batch eval / data collection.
- Name collisions. Objects, bodies, robots, and cameras share the MuJoCo name table. Multi-robot joints/actuators are namespaced {robot}/{joint}.
Self-healing: unknown parameters are rejected with "Unknown parameter X for action Y. Valid: [...]", missing required params produce "Action X requires parameter Y.", and vectors/dtypes are validated before MuJoCo sees them - so the agent learns the contract without crashing the process.
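The validate-then-call pattern behind those messages can be sketched with inspect.signature. This is an illustration of the idea, not the Simulation tool's dispatcher: dispatch and the toy set_gravity action are hypothetical, and the string-valued content field is a simplification.

```python
import inspect
from typing import Any, Callable, Dict


def dispatch(actions: Dict[str, Callable[..., Any]], action: str, **params: Any) -> Dict[str, Any]:
    """Validate params against the target function's signature before calling it."""
    fn = actions[action]
    sig = inspect.signature(fn)
    valid = list(sig.parameters)
    for name in params:
        if name not in sig.parameters:
            return {"status": "error",
                    "content": f"Unknown parameter {name} for action {action}. Valid: {valid}"}
    for name, p in sig.parameters.items():
        # A parameter without a default is required.
        if p.default is inspect.Parameter.empty and name not in params:
            return {"status": "error",
                    "content": f"Action {action} requires parameter {name}."}
    return {"status": "success", "content": fn(**params)}


def set_gravity(gravity: list) -> str:
    """Toy action standing in for a real simulation call."""
    return f"gravity set to {gravity}"


actions = {"set_gravity": set_gravity}
bad = dispatch(actions, "set_gravity", gravty=[0, 0, -9.81])   # typo in the name
missing = dispatch(actions, "set_gravity")                      # nothing passed
ok = dispatch(actions, "set_gravity", gravity=[0, 0, -9.81])
print(bad["content"])
print(missing["content"])
```

Returning an error dict instead of raising keeps the tool call inside the agent loop, so the model can read the list of valid names and retry.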
Third-party backends. create_simulation(name) discovers backends beyond
the built-in mujoco/newton registry via Python
entry points.
A sibling package - e.g. strands-robots-sim,
which ships the heavy Isaac Sim and Newton backends out-of-tree - registers its
SimEngine subclasses under the strands_robots.backends group in its
pyproject.toml, and they become available on pip install without patching
this package:
[project.entry-points."strands_robots.backends"]
isaac = "strands_robots_sim.isaac.simulation:IsaacSimulation"
newton = "strands_robots_sim.newton.simulation:NewtonSimulation"
warp = "strands_robots_sim.newton.simulation:NewtonSimulation"
Built-in backends always take precedence over plugins of the same name, plugin
discovery is lazy (it never slows cold import), and list_backends() returns
the merged builtin + plugin set.
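For readers unfamiliar with entry points, the sketch below shows the general mechanism using importlib.metadata from the standard library (the group= keyword needs Python 3.10 or newer). It is not the package's create_simulation or list_backends code: merge_backends, discover_plugins, and the placeholder built-in targets are hypothetical.

```python
from importlib.metadata import entry_points
from typing import Dict

# Placeholder targets; the real built-in registry lives inside strands-robots.
BUILTIN_BACKENDS = {"mujoco": "builtin:mujoco", "newton": "builtin:newton"}


def merge_backends(builtin: Dict[str, str], plugins: Dict[str, str]) -> Dict[str, str]:
    """Merge plugin backends under built-ins so built-in names always win."""
    merged = dict(plugins)
    merged.update(builtin)  # built-ins overwrite same-named plugins
    return merged


def discover_plugins(group: str = "strands_robots.backends") -> Dict[str, str]:
    """Read entry-point metadata only when called, without importing any plugin."""
    return {ep.name: ep.value for ep in entry_points(group=group)}


# Simulated plugin metadata, mirroring the pyproject.toml snippet above.
plugins = {"isaac": "strands_robots_sim.isaac.simulation:IsaacSimulation",
           "newton": "strands_robots_sim.newton.simulation:NewtonSimulation"}
merged = merge_backends(BUILTIN_BACKENDS, plugins)
print(sorted(merged))
print(merged["newton"])
```

In a real environment, discover_plugins() would replace the hard-coded dict; calling ep.load() on a chosen entry is what finally imports the backend class, which is why discovery itself stays cheap.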
Every Robot() and Simulation() is automatically a peer on a local Zenoh
mesh - no setup. Peers on the same LAN discover each other via multicast
scouting, sharing a single ref-counted zenoh.Session per process.
from strands_robots import Robot
a = Robot("so100") # auto-joins the mesh
b = Robot("so100") # second peer (another process)
print(a.mesh.peers) # list[dict] - discovers b
print(a.mesh.peers_by_id[b.peer_id]) # dict[peer_id -> info] for O(1) lookup
info = a.mesh.get_peer(b.peer_id) # None-safe single lookup
a.mesh.tell(b.peer_id, "pick up the cube")
a.mesh.emergency_stop() # broadcast E-STOP, audited to disk

tell() routes to hardware and sim peers. Per-call policy kwargs
(target_pose, target_joints, world_update) and constructor extras are
forwarded end-to-end via policy_config, so a planner-style policy on a sim
peer sees the goal payload it needs:
a.mesh.tell(
b.peer_id,
"reach for the red block",
policy_provider="curobo",
target_pose=[0.3, 0.0, 0.4, 1.0, 0.0, 0.0, 0.0],
robot_name="arm_left", # disambiguate in multi-robot sims
duration=10.0,
)

Expose the mesh to an agent with the robot_mesh tool (peers, status,
tell, send, broadcast, stop, emergency_stop, subscribe, watch,
inbox). Disable globally with STRANDS_MESH=false or per-robot with
Robot("so100", mesh=False). Install with uv pip install "strands-robots[mesh]".
For frictionless single-machine experiments, set STRANDS_MESH_LOCAL_DEV=1 -
one env var that runs the mesh without mTLS/ACL on localhost. It defaults the
auth mode to none and satisfies the insecure-acknowledgement second
factor by itself, so you don't also need STRANDS_MESH_I_KNOW_THIS_IS_INSECURE=1.
An explicit STRANDS_MESH_AUTH_MODE=mtls still wins. Never set
STRANDS_MESH_LOCAL_DEV on a shared or production network.
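The precedence between these variables can be sketched as a small resolver. resolve_auth_mode is a hypothetical function written from the rules stated above, not the library's implementation. It takes a plain mapping so it can be exercised without touching the real environment.

```python
def resolve_auth_mode(env):
    """Return "mtls" or "none" from a mapping of environment variables."""
    local_dev = env.get("STRANDS_MESH_LOCAL_DEV") == "1"
    explicit = env.get("STRANDS_MESH_AUTH_MODE")

    if explicit is not None:
        mode = explicit.strip().lower()  # an explicit setting always wins
    else:
        mode = "none" if local_dev else "mtls"  # the preset only changes the default

    if mode not in ("mtls", "none"):
        raise ValueError(f"unsupported auth mode: {mode}")

    if mode == "none":
        # The insecure mode needs a second factor; the local-dev preset counts as one.
        acknowledged = local_dev or env.get("STRANDS_MESH_I_KNOW_THIS_IS_INSECURE") == "1"
        if not acknowledged:
            raise PermissionError("AUTH_MODE=none requires a second factor")
    return mode
```

For example, resolve_auth_mode({"STRANDS_MESH_LOCAL_DEV": "1", "STRANDS_MESH_AUTH_MODE": "mtls"}) returns "mtls", which mirrors the statement that an explicit mtls setting overrides the preset.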
For robots across networks, bridge the mesh to AWS IoT Core over MQTT5/mTLS,
with Device Shadow mirroring, S3 camera offload, and account-wide Fleet
Provisioning. Hardened with CA pinning, strict thing-name validation,
deny-by-default IoT policy scoping, and a safety audit log.
Install with uv pip install "strands-robots[mesh-iot]". See the
Configuration matrix for the STRANDS_MESH_* knobs.
strands-robots speaks ROS 2 from four complementary angles - a Strands agent can
observe, command, be, and expose a ROS 2 system. Full guide:
ROS 2 Integration / docs/ros2-integration.md.
A Strands agent (Claude Opus via Amazon Bedrock) given the use_ros tool drives
a real ROS 2 turtlesim in a closed-loop square - reading pose, correcting
heading, re-driving - over 43 in-process tool calls. Runnable:
examples/ros2/use_ros/.
| Surface | What it does | Backend | Needs sourced ROS 2 |
|---|---|---|---|
| use_ros | List/echo/publish topics, call services on any ROS 2 graph | in-process rclpy | yes |
| use_rtps | Join a graph as a DDS peer and act as a robot (publish topics a real stack consumes) | pure cyclonedds (pip) | no - macOS/CI/Jetson, all distros |
| RosBridgedRobot | Drive a cmd_vel/odom ROS 2 base as a first-class strands Robot | use_ros | yes |
| SimEngine(ros2_bridge=True) | Publish a running MuJoCo sim's joint_states + camera image_raw so rviz/nav2/agents can subscribe | rclpy | yes |
# Observe + command a live ROS 2 graph, in plain English:
from strands import Agent
from strands_robots.tools import use_ros
Agent(tools=[use_ros])("list the topics, drive /turtle1 forward, confirm the pose changed")
# Or expose a simulation as a ROS 2 node any tool can subscribe to:
from strands_robots.simulation import Simulation
sim = Simulation(ros2_bridge=True)
sim.create_world(); sim.add_robot("so101")
sim.step(10) # publishes /so101/joint_states + camera image_raw on the ROS 2 domain

rclpy ships with a sourced ROS 2 distro (not on PyPI). The [ros2] extra adds
only the pip-installable cyclonedds binding that use_rtps uses - so the
pure-RTPS path needs no ROS install at all. Every surface degrades to a clear,
structured error when its backend is unavailable; the default install never
touches ROS 2.
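One way to get that graceful degradation is to probe for the backend module before importing it. The sketch below uses importlib.util.find_spec from the standard library. probe_backend and the fields of the returned dict are hypothetical, chosen only to illustrate a structured error, and do not reflect the tools' actual return shape.

```python
import importlib.util


def probe_backend(surface, module_name, hint):
    """Report whether a surface's backend module is importable, without importing it."""
    if importlib.util.find_spec(module_name) is None:
        # Structured error: the caller (or an agent) can read why and what to do next.
        return {
            "status": "error",
            "surface": surface,
            "missing": module_name,
            "hint": hint,
        }
    return {"status": "ok", "surface": surface}


# rclpy comes from a sourced ROS 2 distro; cyclonedds comes from the [ros2] extra.
ros_status = probe_backend("use_ros", "rclpy", "source a ROS 2 distro first")
rtps_status = probe_backend("use_rtps", "cyclonedds", 'install "strands-robots[ros2]"')
```

Because find_spec only locates the module, the probe itself never loads ROS 2, which fits the claim that the default install does not touch it.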
| Variable | Description | Default |
|---|---|---|
| STRANDS_ROBOT_MODE | Robot() factory mode: sim / real / auto | sim |
| STRANDS_ASSETS_DIR | Robot model asset cache directory | ~/.strands_robots/assets/ |
| STRANDS_TRUST_REMOTE_CODE | Set 1 to allow HF trust_remote_code for lerobot_local | unset |
| STRANDS_ROBOTS_NO_DYLD_SHIM | Set 1 to disable the macOS auto-fix that puts Homebrew ffmpeg on the dyld path for torchcodec video streaming (see Recording & streaming datasets) | unset |
| MUJOCO_GL | MuJoCo GL backend (egl, osmesa, glfw) | auto |
| GROOT_API_TOKEN | API token for the GR00T inference service | unset |
| STRANDS_MESH | Set false to disable Zenoh mesh globally | true |
| STRANDS_MESH_LOCAL_DEV | Set 1 for a one-var localhost preset (auth none, no second factor needed) | unset |
Mesh / IoT / GR00T-container env vars (advanced)
| Variable | Description | Default |
|---|---|---|
| STRANDS_MESH_AUTH_MODE | Wire auth: mtls or none (none needs a second factor) | mtls |
| STRANDS_MESH_I_KNOW_THIS_IS_INSECURE | Second factor required to bring up AUTH_MODE=none | unset |
| STRANDS_MESH_PORT | TCP port for the local Zenoh router | 7447 |
| ZENOH_CONNECT | Comma-separated remote Zenoh endpoints to connect to | unset |
| ZENOH_LISTEN | Comma-separated endpoints for the local Zenoh listener | unset |
| STRANDS_MESH_AUDIT_DIR | Directory for the safety audit log (mesh_audit.jsonl) | ~/.strands_robots/ |
| STRANDS_MESH_CA_PINS | Additional SHA-256 CA pins (comma-separated 64-char hex) | unset |
| STRANDS_MESH_DISABLE_CA_PIN | Skip CA pin check on download path (break-glass) | false |
| STRANDS_MESH_CAMERA_PRESIGN_TTL | TTL (s) for S3 presigned camera URLs; capped at 3600 | 60 |
| STRANDS_MESH_ACL_FILE | Path to a JSON5 Zenoh ACL file; unset = permissive default. See examples/mesh_acl_example.json5 (role-scoped) and examples/mesh_acl_strict_per_peer.json5 (per-peer). | unset |
| STRANDS_MESH_POLICY_HOST_ALLOW | Comma-separated allowlist of VLA policy-server hosts/CIDRs for inference | loopback only |
| STRANDS_MESH_HITL_ACTIONS | robot_mesh actions needing a human-in-the-loop interrupt: all / none / subset of emergency_stop,broadcast,tell,send,stop,subscribe,watch | actuation default |
| STRANDS_MESH_SUBSCRIBE_ALLOW | Extra Zenoh key-expr patterns the robot_mesh subscribe action may target, beyond the built-in low-impact set | shared classes only |
| STRANDS_MESH_OVERRIDE_CODE | Shared secret for e-stop resume HMAC proof; unset means no remote resume possible | unset |
| STRANDS_MESH_INPUT_VALUE_ABS | Absolute value clamp for teleop joint commands (radians) | 12.566 (4pi) |
| STRANDS_MESH_INPUT_MAX_HZ | Per-receiver teleop apply-rate ceiling (0 = unlimited) | 100 |
| STRANDS_MESH_MAX_PEERS | Peer registry cap; evicts oldest on overflow | 1024 |
| STRANDS_MESH_RESUME_MAX_FAILS | Failed resume attempts before cooldown engages | 5 |
| STRANDS_MESH_RESUME_BACKOFF_S | Cooldown (seconds) after exceeding resume fail threshold | 30 |
| STRANDS_MESH_INPUT_AUDIT_EVERY | Emit input_stream_applied audit event every N frames (0 = off) | 100 |
| STRANDS_ESTOP_DEDUP_TTL_S | E-stop fan-out Lambda dedup window (seconds) | 30 |
| STRANDS_MESH_BRIDGE_TOPICS | Comma-separated topic suffixes the Zenoh<->IoT bridge forwards (exact match). Unset = the safe default set (presence,health,safety/event,safety/estop,safety/resume,cmd,response,broadcast). High-volume topics (state,pose,imu,odom,lidar) and LAN-only topics (camera,input,hand) are deliberately NOT bridged | default set |
| STRANDS_MESH_BRIDGE_TOPICS_PREFIX | Comma-separated topic suffixes the bridge matches as a path prefix (so response matches response/<turn-id>). Extend this (not STRANDS_MESH_BRIDGE_TOPICS) when adding an RPC-shape topic with a per-turn tail | response |
| STRANDS_GR00T_IMAGE | Container image the gr00t_inference tool runs (must pass the image allowlist; agent cannot choose it) | gr00t:latest |
| STRANDS_GR00T_IMAGE_ALLOW | Extra image-name patterns (trailing * = tag wildcard) added to the built-in allowlist (gr00t:*, nvcr.io/nvidia/isaac-gr00t:*) | built-in only |
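The difference between STRANDS_MESH_BRIDGE_TOPICS (exact match) and STRANDS_MESH_BRIDGE_TOPICS_PREFIX (path-prefix match) can be sketched as below. should_bridge is a hypothetical helper. The default sets are copied from the table, while the parsing details (whitespace trimming, treating an unset variable as "use the default") are assumptions.

```python
DEFAULT_EXACT = ("presence", "health", "safety/event", "safety/estop",
                 "safety/resume", "cmd", "response", "broadcast")
DEFAULT_PREFIX = ("response",)


def _parse(value, default):
    """Split a comma-separated value; fall back to the default when unset."""
    if value is None:
        return tuple(default)
    return tuple(item.strip() for item in value.split(",") if item.strip())


def should_bridge(suffix, env):
    """Decide whether a topic suffix is forwarded by the Zenoh<->IoT bridge."""
    exact = _parse(env.get("STRANDS_MESH_BRIDGE_TOPICS"), DEFAULT_EXACT)
    prefixes = _parse(env.get("STRANDS_MESH_BRIDGE_TOPICS_PREFIX"), DEFAULT_PREFIX)
    if suffix in exact:
        return True
    # Prefix entries match only on a path boundary: "response/<turn-id>".
    return any(suffix.startswith(prefix + "/") for prefix in prefixes)
```

With the defaults, "cmd" and "response/turn-42" are forwarded while "camera", "state", and "cmd/extra" are not, which is why an RPC-shape topic with a per-turn tail belongs in the prefix list.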
Benchmark / diagnostic env vars (LIBERO, GR00T bisection)
| Variable | Description | Default |
|---|---|---|
| STRANDS_LIBERO_ACTION_LOG / _MAX | Per-step OSC controller diagnostics | unset / 50 |
| STRANDS_LIBERO_STATE_LOG / _MAX | Per-step state values fed to GR00T | unset / 50 |
| STRANDS_GROOT_WIRE_LOG / _MAX_CALLS | Dump pre/post inference payloads to verify LOCAL vs SERVICE parity | unset / 10 |
~/.strands_robots/
└── assets/ # auto-downloaded MJCF + meshes
├── trs_so_arm100/
├── franka_emika_panda/
└── ...
Clear with rm -rf ~/.strands_robots/assets/; relocate with
export STRANDS_ASSETS_DIR=/path/to/dir.
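To check from Python which cache directory is in effect, the lookup can be approximated as follows. assets_dir is a hypothetical helper that mirrors the documented behaviour (STRANDS_ASSETS_DIR overrides the ~/.strands_robots/assets/ default). The library's own resolution may differ in details such as how it treats an empty value.

```python
import os
from pathlib import Path


def assets_dir(env=None):
    """Return the asset cache directory implied by the environment."""
    env = os.environ if env is None else env
    override = env.get("STRANDS_ASSETS_DIR")
    if override:
        # expanduser lets the override use "~" just like the default does.
        return Path(override).expanduser()
    return Path.home() / ".strands_robots" / "assets"
```

Passing a plain dict, as in assets_dir({"STRANDS_ASSETS_DIR": "/path/to/dir"}), makes the override easy to check without exporting anything in the shell.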
strands-robots ships a LIBERO
benchmark integration on the MuJoCo backend - byte-equivalent to upstream
LIBERO at the model level, reaching success_rate >= 0.92 on libero-10/SCENE5.
Register declarative benchmarks from file and evaluate policies via the
list_benchmarks, register_benchmark_from_file, and evaluate_benchmark
simulation actions. Install with uv pip install "strands-robots[benchmark-libero]".
strands_robots/
├── __init__.py # Lazy-loaded public API (Robot, Simulation, policies)
├── robot.py # Robot() factory (sim/real/auto dispatch)
├── hardware_robot.py # HardwareRobot - async LeRobot control
├── policies/
│ ├── base.py # Policy ABC
│ ├── factory.py # create_policy() + runtime registration
│ ├── mock.py # MockPolicy (non-VLA reference)
│ ├── groot/ # NVIDIA GR00T (ZMQ/HTTP client + data configs)
│ └── lerobot_local/ # Direct HuggingFace inference (RTC, processors)
├── registry/ # robots.json (50+) + policies.json + loaders
├── simulation/
│ ├── base.py # SimEngine ABC
│ ├── factory.py # create_simulation() + backend registry
│ ├── models.py # SimWorld / SimRobot / SimObject / SimCamera
│ └── mujoco/ # MuJoCo backend (64-action AgentTool)
├── mesh/ # Zenoh mesh: core, sensors, input, audit, transport, iot
├── benchmarks/libero/ # LIBERO suite + BDDL parser + adapter
└── tools/ # gr00t_inference, lerobot_*, pose, serial, robot_mesh
uv pip install -e ".[all,dev]"
hatch run test # unit tests
hatch run test-integ # integration tests (GPU + model weights)
hatch run lint # ruff check + format --check + mypy
hatch run format # ruff check --fix + ruff format

Python 3.12+ required. See AGENTS.md for conventions and the accumulated code-review learnings.
Found a vulnerability? Do not open a public issue. Follow the disclosure process in SECURITY.md (AWS VDP / HackerOne).
Note the trust_remote_code gate on lerobot_local (see
Policy providers) and the mesh CA-pinning / thing-name
validation controls in the Configuration matrix.
Issues and PRs welcome. Track work on the Strands Labs - Robots project board; it is the source of truth for roadmap and follow-ups.
Apache-2.0 - see LICENSE.
