
Commit 8218d28

mbodied agents v2 (#23)
1 parent 7a37ce4 commit 8218d28

46 files changed: +906 -2423 lines

README.md

Lines changed: 25 additions & 10 deletions
````diff
@@ -5,10 +5,12 @@
 [![Ubuntu](https://github.com/MbodiAI/opensource/actions/workflows/ubuntu.yml/badge.svg)](https://github.com/MbodiAI/opensource/actions/workflows/ubuntu.yml)
 [![PyPI Version](https://img.shields.io/pypi/v/mbodied-agents.svg)](https://pypi.python.org/pypi/mbodied-agents)
 [![Documentation Status](https://readthedocs.com/projects/mbodi-ai-mbodied-agents/badge/?version=latest)](https://mbodi-ai-mbodied-agents.readthedocs-hosted.com/en/latest/?badge=latest)
-[![Example Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1DAQkuuEYj8demiuJS1_10FIyTI78Yzh4?usp=sharing)
+
 
 Documentation: [mbodied agents docs](https://mbodi-ai-mbodied-agents.readthedocs-hosted.com/en)
 
+Example colab: [![Example Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/16liQspSIzRazWb_qa_6Z0MRKmMTr2s1s?usp=sharing)
+
 # mbodied agents
 Welcome to **mbodied agents**, a toolkit for integrating state-of-the-art transformers into robotics systems. The goals for this repo are to minimize the ambiguouty, heterogeneity, and data scarcity currently holding generative AI back from wide-spread adoption in robotics. It provides strong type hints for the various types of robot actions and provides a unified interface for:
 
@@ -17,7 +19,7 @@ Welcome to **mbodied agents**, a toolkit for integrating state-of-the-art transf
 - Automatically recording observations and actions to hdf5
 - Exporting to the most popular ML formats such as [Gym Spaces](https://gymnasium.farama.org/index.html) and [Huggingface Datasets](https://huggingface.co/docs/datasets/en/index)
 
-And most importantly, the entire library is __100% configurable to any observation and action space__. That's right. With **mbodied agents**, the days of wasting precious engineering time on tedious formatting and post-processing are over. Jump to [Getting Started](#getting-started) to get up and running on [real hardware](https://colab.research.google.com/drive/1DAQkuuEYj8demiuJS1_10FIyTI78Yzh4?usp=sharing) or a [mujoco simulation](https://colab.research.google.com/drive/1sZtVLv17g9Lin1O2DyecBItWXwzUVUeH)
+And most importantly, the entire library is __100% configurable to any observation and action space__. That's right. With **mbodied agents**, the days of wasting precious engineering time on tedious formatting and post-processing are over. Jump to [Getting Started](#getting-started) to get up and running on [real hardware](https://colab.research.google.com/drive/16liQspSIzRazWb_qa_6Z0MRKmMTr2s1s?usp=sharing) or a [mujoco simulation](https://colab.research.google.com/drive/1sZtVLv17g9Lin1O2DyecBItWXwzUVUeH)
 
 
 ## Updates
@@ -38,18 +40,11 @@ Please join our [Discord](https://discord.gg/RNzf3RCxRJ) for interesting discuss
 
 - [Mbodied Agents](#mbodied-agents)
 - [Overview](#overview)
-- [Support Matrix](#support-matrix)
 - [Installation](#installation)
+- [Dev Environment Setup](#dev-environment-setup)
 - [Getting Started](#getting-started)
 - [Glossary](#glossary)
 - [Building Blocks](#building-blocks)
-- [The Sample class](#the-sample-class)
-- [Message](#message)
-- [Backend](#backend)
-- [Cognitive Agent](#cognitive-agent)
-- [Controls](#controls)
-- [Hardware Interface](#hardware-interface)
-- [Recorder](#recorder)
 - [Directory Structure](#directory-structure)
 - [Contributing](#contributing)
 
@@ -85,6 +80,26 @@ If you would like to integrate a new backend, sense, or motion control, it is ve
 
 `pip install mbodied-agents`
 
+## Dev Environment Setup
+
+1. Clone this repo:
+
+```console
+git clone https://github.com/MbodiAI/mbodied-agents.git
+```
+
+2. Install system dependencies:
+
+```console
+source install.bash
+```
+
+3. Then for each new terminal, run:
+
+```console
+hatch shell
+```
+
 ## Getting Started
 
 ### Real Robot Hardware
````
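Taken together, the dev-environment steps added above amount to the following sequence. This is only a sketch: it assumes a POSIX shell, and the `cd` step is implied rather than listed in the README.

```console
git clone https://github.com/MbodiAI/mbodied-agents.git
cd mbodied-agents        # assumed: the remaining steps run from the repo root
source install.bash      # install system dependencies
hatch shell              # run again in every new terminal
```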

assets/architecture.jpg

9.87 KB

examples/simple_robot_agent.ipynb

Lines changed: 0 additions & 700 deletions
This file was deleted.

examples/simple_robot_agent.py

Lines changed: 18 additions & 15 deletions
```diff
@@ -1,11 +1,11 @@
 # Copyright 2024 Mbodi AI
-#
+#
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
 # You may obtain a copy of the License at
-#
+#
 # https://www.apache.org/licenses/LICENSE-2.0
-#
+#
 # Unless required by applicable law or agreed to in writing, software
 # distributed under the License is distributed on an "AS IS" BASIS,
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
@@ -18,14 +18,14 @@
 import click
 from pydantic import BaseModel, Field
 from pydantic_core import from_json
-from gym import spaces
+from gymnasium import spaces
 
-from mbodied_agents.agents.language import CognitiveAgent
-from mbodied_agents.agents.sense.audio_handler import AudioHandler
+from mbodied_agents.agents.language import LanguageAgent
+from mbodied_agents.agents.sense.audio.audio_handler import AudioHandler
 from mbodied_agents.base.sample import Sample
 from mbodied_agents.hardware.sim_interface import SimInterface
 from mbodied_agents.types.controls import HandControl
-from mbodied_agents.types.vision import Image
+from mbodied_agents.types.sense.vision import Image
 from mbodied_agents.data.recording import Recorder
 
 
@@ -53,9 +53,9 @@ class AnswerAndActionsList(Sample):
     )
 
 
-# This prompt is used to provide context to the CognitiveAgent.
+# This prompt is used to provide context to the LanguageAgent.
 SYSTEM_PROMPT = f"""
-You are a robot with vision capabilities.
+You are a robot with vision capabilities.
 For each task given, you respond in JSON format. Here's the JSON schema:
 {AnswerAndActionsList.model_json_schema()}
 """
```
```diff
@@ -69,32 +69,34 @@ def main(backend: str, disable_audio: bool, record_dataset: bool) -> None:
     """Main function to initialize and run the robot interaction.
 
     Args:
-        backend: The backend to use for the CognitiveAgent (e.g., "openai").
+        backend: The backend to use for the LanguageAgent (e.g., "openai").
         disable_audio: If True, disables audio input/output.
         record_dataset: If True, enables recording of the interaction data for training.
 
     Example:
        To run the script with OpenAI backend and disable audio:
        python script.py --backend openai --disable_audio
     """
-    # Initialize the intelligent Robot Agent.
-    robot_agent = CognitiveAgent(context=SYSTEM_PROMPT, api_service=backend)
+    # Initialize the intelligent Robot Agent with language interface.
+    robot_agent = LanguageAgent(context=SYSTEM_PROMPT, api_service=backend)
 
     # Use a mock robot interface for movement visualization.
     robot_interface = SimInterface()
 
     # Enable or disable audio input/output capabilities.
     if disable_audio:
         os.environ["NO_AUDIO"] = "1"
-    audio = AudioHandler(use_pyaudio=False)  # Prefer to use use_pyaudio=False for MAC.
+    # Prefer to use use_pyaudio=False for MAC.
+    audio = AudioHandler(use_pyaudio=False)
 
     # Data recorder for every conversation and action.
     if record_dataset:
         observation_space = spaces.Dict({
             'image': Image(size=(224, 224)).space(),
             'instruction': spaces.Text(1000)
         })
-        action_space = AnswerAndActionsList(actions=[HandControl()] * 6).space()
+        action_space = AnswerAndActionsList(
+            actions=[HandControl()] * 6).space()
         recorder = Recorder(
             'example_recorder',
             out_dir='saved_datasets',
```
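One detail worth noting in the recorder setup above: both spaces handed to the `Recorder` are derived from typed objects rather than written out by hand. A self-contained sketch of the observation-space half, using only calls that appear in this diff (the action space is built the same way, from `AnswerAndActionsList(actions=[HandControl()] * 6).space()`):

```python
from gymnasium import spaces

from mbodied_agents.types.sense.vision import Image

# A 224x224 image plus a free-form text instruction, as in the example script.
observation_space = spaces.Dict({
    'image': Image(size=(224, 224)).space(),
    'instruction': spaces.Text(1000),
})
print(observation_space)
```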
```diff
@@ -116,7 +118,8 @@ def main(backend: str, disable_audio: bool, record_dataset: bool) -> None:
         print("Response:", response)
 
         # Validate the response to the pydantic object.
-        answer_actions = AnswerAndActionsList.model_validate(from_json(response))
+        answer_actions = AnswerAndActionsList.model_validate(
+            from_json(response))
 
         # Let the robot speak.
         if answer_actions.answer:
```
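For anyone upgrading an existing script, the renames in this file are the ones that matter. A minimal sketch of the new import paths and the renamed agent class, limited to what this diff actually shows (old paths noted in the comments; an API key for the chosen backend is assumed to be configured):

```python
# New import paths in mbodied agents v2 (old paths in the trailing comments).
from gymnasium import spaces                                               # was: from gym import spaces
from mbodied_agents.agents.language import LanguageAgent                   # was: CognitiveAgent (same module)
from mbodied_agents.agents.sense.audio.audio_handler import AudioHandler   # was: mbodied_agents.agents.sense.audio_handler
from mbodied_agents.types.sense.vision import Image                        # was: mbodied_agents.types.vision
from mbodied_agents.types.controls import HandControl                      # unchanged

# The constructor arguments are unchanged from CognitiveAgent.
agent = LanguageAgent(context="You are a robot with vision capabilities.", api_service="openai")
```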
