The design-doc-first half of #198, split out so it's independently
claimable. Specify how an OpenRange episode maps to an
open-trajectory-gym trajectory. No implementation — just the seam shape.
What OpenRange emits
From src/openrange/core/episode.py:
per step: Observation (visible_state, events) + the harness-supplied AgentTurn (message, tool_calls, tool_results)
terminal: EpisodeReport.episode_result: EpisodeResult, a structured {success, subgoals, reason} and never a scalar (openrange-pack-sdk_types.py)
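For orientation, the emitted shapes could be mirrored roughly as follows. This is only a sketch: the field names come from the list above, but the field types, the defaults, and the use of frozen dataclasses are assumptions, not the actual definitions in src/openrange/core/episode.py.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class Observation:
    # Field names from the issue text; the types are assumed.
    visible_state: dict[str, Any]
    events: list[dict[str, Any]] = field(default_factory=list)


@dataclass(frozen=True)
class AgentTurn:
    # Supplied by the harness, not by OpenRange itself.
    message: str
    tool_calls: list[dict[str, Any]] = field(default_factory=list)
    tool_results: list[dict[str, Any]] = field(default_factory=list)


@dataclass(frozen=True)
class EpisodeResult:
    # Structured outcome; deliberately has no scalar reward field.
    success: bool
    subgoals: dict[str, bool]  # assumed shape: subgoal name -> achieved
    reason: str


result = EpisodeResult(success=False, subgoals={"login": True, "exfil": False}, reason="timeout")
```

Keeping EpisodeResult free of any numeric field makes the "never a scalar" constraint visible in the type itself.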
To specify
1. Observation encoding: what an observation is when the surface is HTTP routes / MCP tools / files (map visible_state + events).
2. Action encoding: AgentTurn.tool_calls → trajectory action.
3. Reward: defer to the RewardAdapter from #199 (Add per-domain RewardAdapter), which maps EpisodeResult → scalar / vector; do not bake a scalar into the trajectory.
4. done / terminal: from the episode's terminal_reason.
5. Runtime shape: fresh-world-per-rollout vs. pooled; curriculum via auto_evolve(...).
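One possible shape for the seam, offered as a discussion aid rather than a proposal for the final format. The record keys ("observation", "action", "done", "terminal_reason", "episode_result") and the helper names step_to_record, episode_to_trajectory, and success_only are hypothetical; they are not open-trajectory-gym's actual trajectory format, and success_only merely stands in for whatever interface the RewardAdapter in #199 ends up with.

```python
from __future__ import annotations

from typing import Any, Optional


def step_to_record(step: dict[str, Any], terminal_reason: Optional[str]) -> dict[str, Any]:
    """Map one OpenRange step to a trajectory record (hypothetical keys)."""
    return {
        # Observation: visible_state and events carried over unchanged.
        "observation": {
            "visible_state": step["visible_state"],
            "events": list(step["events"]),
        },
        # Action: AgentTurn.tool_calls becomes the trajectory action.
        "action": list(step["tool_calls"]),
        # done is derived from terminal_reason, set only on the last step.
        "done": terminal_reason is not None,
        "terminal_reason": terminal_reason,
    }


def episode_to_trajectory(
    steps: list[dict[str, Any]],
    episode_result: dict[str, Any],
    terminal_reason: str,
) -> dict[str, Any]:
    """Build a trajectory that stores the structured result, not a scalar."""
    last = len(steps) - 1
    records = [
        step_to_record(s, terminal_reason if i == last else None)
        for i, s in enumerate(steps)
    ]
    return {"steps": records, "episode_result": dict(episode_result)}


def success_only(episode_result: dict[str, Any]) -> float:
    """Stand-in reward adapter, applied by the consumer after the fact."""
    return 1.0 if episode_result["success"] else 0.0


steps = [
    {"visible_state": {"route": "/"}, "events": [], "tool_calls": [{"name": "http_get"}]},
    {"visible_state": {"route": "/admin"}, "events": ["auth_ok"], "tool_calls": []},
]
outcome = {"success": True, "subgoals": {"reach_admin": True}, "reason": "goal reached"}
trajectory = episode_to_trajectory(steps, outcome, terminal_reason="success")
reward = success_only(trajectory["episode_result"])
```

The reward is computed outside the trajectory here, so swapping the adapter per domain would not require re-recording episodes.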
Acceptance
An ADR or DESIGN.md section answering (1)–(5), reviewed and merged.
Unblocks #198 (implementation); pairs with #199 (reward) and #200
(curriculum loop). Read open-trajectory-gym's trajectory format first.