Abstraction for browser automation tasks #232

Open
rcholic wants to merge 4 commits into main from planner_executor_agent2

rcholic commented Mar 15, 2026

Summary

Generalize WebBenchTask into a generic AutomationTask so that the SDK's PlannerExecutorAgent can handle broad web automation tasks such as "buy a laptop on xyz.com". The design removes hardcoded WebBench dependencies, lets the planner compose heuristics dynamically, and adds rollback/recovery to the last known good state.
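
To make the intended shape concrete, here is a minimal sketch of how a benchmark-specific task could map onto a generic one. Only the class names WebBenchTask and AutomationTask come from this PR; every field name, the `from_webbench_task` method name, and the `metadata` layout are assumptions for illustration and may not match the actual diff.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class WebBenchTask:
    """Stand-in for the benchmark-specific task. Field names are assumed."""
    id: str
    starting_url: str
    task: str
    category: Optional[str] = None


@dataclass
class AutomationTask:
    """Generic task description. Every field here is hypothetical."""
    task_id: str
    starting_url: str
    goal: str  # natural-language goal, e.g. "buy a laptop on xyz.com"
    metadata: dict = field(default_factory=dict)

    @classmethod
    def from_webbench_task(cls, wb: WebBenchTask) -> "AutomationTask":
        # Hypothetical factory: benchmark-only details move into metadata,
        # so the agent itself never needs to import WebBench types.
        return cls(
            task_id=wb.id,
            starting_url=wb.starting_url,
            goal=wb.task,
            metadata={"source": "webbench", "category": wb.category},
        )
```

With a factory of this kind, existing WebBench callers would keep constructing WebBenchTask objects and convert at the boundary, while new callers build an AutomationTask directly.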


Design Goals

  1. Generic Task Model: Replace the WebBench-specific WebBenchTask with an AutomationTask that works for any browser automation task
  2. Dynamic Heuristics: Let the planner compose element-selection heuristics on the fly via HeuristicHint
  3. Deterministic Verification: Use the existing PredicateSpec system (url_contains, exists, etc.) for post-step verification
  4. Recovery/Rollback: Implement URL-based recovery when verification fails repeatedly
  5. Backward Compatibility: WebBench can continue using WebBenchTask via a factory method
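
Goals 3 and 4 can be illustrated with a small sketch of post-step verification plus URL-based rollback. This is not the SDK's implementation: `PageState`, `Predicate`, `check`, and `RecoveryTracker` are hypothetical names, and the real PredicateSpec API may look different. Only the predicate kinds `url_contains` and `exists` are taken from the list above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class PageState:
    """Hypothetical snapshot of the page after a step."""
    url: str
    elements: frozenset  # selectors currently present on the page


@dataclass(frozen=True)
class Predicate:
    """Hypothetical stand-in for a PredicateSpec entry."""
    kind: str   # "url_contains" or "exists"
    value: str


def check(pred: Predicate, state: PageState) -> bool:
    # Deterministic checks: no model call, only string and set membership.
    if pred.kind == "url_contains":
        return pred.value in state.url
    if pred.kind == "exists":
        return pred.value in state.elements
    raise ValueError(f"unknown predicate kind: {pred.kind}")


class RecoveryTracker:
    """Remembers the last URL where verification passed (assumed design)."""

    def __init__(self, max_failures: int = 2) -> None:
        self.max_failures = max_failures
        self.failures = 0
        self.last_good_url: Optional[str] = None

    def record(self, state: PageState, preds: List[Predicate]) -> Optional[str]:
        """Return a URL to roll back to, or None if no rollback is needed."""
        if all(check(p, state) for p in preds):
            self.last_good_url = state.url
            self.failures = 0
            return None
        self.failures += 1
        if self.failures >= self.max_failures and self.last_good_url is not None:
            self.failures = 0
            return self.last_good_url
        return None
```

In this sketch the executor would call `record` after each step and navigate to the returned URL whenever it is not None. Rolling back by URL is simple and deterministic, but it cannot restore state that is not encoded in the URL, such as unsaved form input.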
