An LLM-powered autonomous browser agent that plans and executes structured web workflows using deterministic Playwright automation.
This project demonstrates a production-style execution loop combining:
- Structured LLM planning
- Deterministic browser automation
- Strict JSON plan validation
- Single-worker queue management
- HTTP API with synchronous wait support
- Interval-based scheduling
- Run lifecycle tracking with bounded memory
- Structured logging and artifact capture
- Node.js 20+ (required)
- Playwright-compatible system dependencies
- Gemini API key for LLM planning
The agent automates a multi-section healthcare form located at:
https://magical-medical-form.netlify.app/
It dynamically discovers the page structure, builds a catalog of available controls, generates a strictly validated JSON execution plan via an LLM, and executes that plan deterministically using Playwright.
The system is designed to behave predictably, avoid uncontrolled agent behavior, and maintain clear execution boundaries.
High-level execution flow:
- Snapshot page and build section catalog
- Generate strict JSON plan via LLM
- Validate plan against schema
- Execute actions deterministically
- Capture artifacts and emit structured logs
- Persist run state in bounded registry
- Return status via API or scheduler
Core design principles:
- Deterministic execution (no freeform browsing)
- Label-driven element resolution
- Strict JSON-only LLM output contract
- Bounded model calls
- Single browser worker to avoid race conditions
- Memory-bounded run storage
- Dynamically inspects the DOM
- Catalogs:
  - Input labels
  - Select labels and options
  - Section toggles
  - Submit button
- Prompts the LLM to return strictly formatted JSON
- Validates plan shape before execution
- Enforces action limits and call bounds
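The JSON-only contract and the bound on model calls could be sketched roughly as follows. All names here (`parseStrictJsonObject`, `planWithBoundedCalls`, `ModelCall`) are hypothetical and chosen for illustration; the project's own `src/agent/json.ts` and `src/agent/loop.ts` may work differently.

```typescript
// Hypothetical sketch: names and behavior are assumptions, not the project's actual code.
type ModelCall = (attempt: number) => Promise<string>;

/** Accept only a bare JSON object: no surrounding prose, no Markdown fences. */
function parseStrictJsonObject(raw: string): Record<string, unknown> {
  const text = raw.trim();
  if (!text.startsWith("{") || !text.endsWith("}")) {
    throw new Error("Model output must be a single JSON object");
  }
  const parsed: unknown = JSON.parse(text); // throws SyntaxError on malformed JSON
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new Error("Model output must be a JSON object");
  }
  return parsed as Record<string, unknown>;
}

/** Ask the model for a plan and give up after maxCalls attempts. */
async function planWithBoundedCalls(
  callModel: ModelCall,
  maxCalls: number,
): Promise<Record<string, unknown>> {
  let lastError: unknown = new Error("maxCalls must be at least 1");
  for (let attempt = 1; attempt <= maxCalls; attempt++) {
    try {
      return parseStrictJsonObject(await callModel(attempt));
    } catch (err) {
      lastError = err; // remember the failure and retry within the bound
    }
  }
  throw lastError;
}
```

Rejecting anything that is not a bare object keeps the parser simple; a more lenient variant would need its own tests for each tolerated wrapper.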
Supported actions:
- Expand section
- Fill input
- Select dropdown option
- Click submit
- Done
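One way to model these five actions and check a plan's shape before execution is a discriminated union plus a small validator. The action names, field names, and the `MAX_ACTIONS` value below are assumptions made for illustration; the actual definitions live in `src/agent/types.ts` and may differ.

```typescript
// Hypothetical action shapes; the real plan schema may use other names and fields.
type PlanAction =
  | { type: "expand_section"; section: string }
  | { type: "fill_input"; label: string; value: string }
  | { type: "select_option"; label: string; option: string }
  | { type: "click_submit" }
  | { type: "done" };

const MAX_ACTIONS = 50; // assumed limit, for illustration only

// String fields that each action type must carry.
const REQUIRED_FIELDS: Record<PlanAction["type"], string[]> = {
  expand_section: ["section"],
  fill_input: ["label", "value"],
  select_option: ["label", "option"],
  click_submit: [],
  done: [],
};

function validatePlan(candidate: unknown): PlanAction[] {
  if (typeof candidate !== "object" || candidate === null) {
    throw new Error("Plan must be an object");
  }
  const actions = (candidate as { actions?: unknown }).actions;
  if (!Array.isArray(actions)) throw new Error("Plan.actions must be an array");
  if (actions.length === 0 || actions.length > MAX_ACTIONS) {
    throw new Error(`Plan must contain between 1 and ${MAX_ACTIONS} actions`);
  }
  actions.forEach((action: unknown, index: number) => {
    if (typeof action !== "object" || action === null) {
      throw new Error(`Action ${index} must be an object`);
    }
    const record = action as Record<string, unknown>;
    const type = record.type;
    const known =
      typeof type === "string" &&
      Object.prototype.hasOwnProperty.call(REQUIRED_FIELDS, type);
    if (!known) throw new Error(`Action ${index} has an unsupported type`);
    for (const field of REQUIRED_FIELDS[type as PlanAction["type"]]) {
      if (typeof record[field] !== "string") {
        throw new Error(`Action ${index} is missing string field "${field}"`);
      }
    }
  });
  return actions as PlanAction[];
}
```

Validating before the first browser interaction means a malformed plan fails fast, with no partially filled form left behind.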
All actions are executed using hardened primitives:
- Scroll-into-view before interaction
- Strict selector resolution
- No guessing of fields or options
- Full-page screenshot capture on key stages
Artifacts are written to:
artifacts/
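The "no guessing" rule can be illustrated independently of Playwright: a label from the plan must match exactly one cataloged control, and a dropdown option must be one the catalog actually lists. The `CatalogControl` shape and both helper functions below are hypothetical; the project's `catalog.ts` and `resolve.ts` may represent this differently.

```typescript
// Hypothetical catalog shape and helpers, shown only to illustrate strict resolution.
interface CatalogControl {
  kind: "input" | "select";
  label: string;
  options?: string[]; // present only for selects
}

// Collapse whitespace only; no fuzzy or case-insensitive matching.
function normalizeLabel(text: string): string {
  return text.trim().replace(/\s+/g, " ");
}

function resolveControl(
  catalog: CatalogControl[],
  kind: CatalogControl["kind"],
  label: string,
): CatalogControl {
  const wanted = normalizeLabel(label);
  const matches = catalog.filter(
    (control) => control.kind === kind && normalizeLabel(control.label) === wanted,
  );
  if (matches.length !== 1) {
    // Zero matches or an ambiguous label both fail instead of picking a candidate.
    throw new Error(`Expected exactly one ${kind} labeled "${label}", found ${matches.length}`);
  }
  return matches[0];
}

function resolveOption(control: CatalogControl, option: string): string {
  const options = control.options ?? [];
  if (!options.includes(option)) {
    throw new Error(`"${option}" is not an option of "${control.label}"`);
  }
  return option;
}
```

Failing on ambiguity is the important design choice here: an error surfaces a catalog or plan problem, whereas a best-effort match could silently fill the wrong field.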
Start, monitor, and optionally wait for runs via API.
Endpoints:
GET /api/v1/health
POST /api/v1/run
GET /api/v1/runs/:id
Create asynchronous run:
curl -X POST "http://localhost:3000/api/v1/run" -H "Content-Type: application/json" -d '{"vars":{"firstName":"John","lastName":"Doe"}}'
Create run and wait synchronously:
curl -X POST "http://localhost:3000/api/v1/run?wait=true&timeoutMs=60000" -H "Content-Type: application/json" -d '{"vars":{"firstName":"John","lastName":"Doe"}}'
The scheduler supports interval-based execution with overlap prevention.
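Overlap prevention on a fixed interval can be sketched with a simple in-flight flag: a tick that fires while the previous run is still executing is skipped, not queued. The function name and return shape below are hypothetical; `src/scheduler.ts` may be organized differently.

```typescript
// Hypothetical sketch of an interval scheduler that never overlaps runs.
function createIntervalScheduler(
  runOnce: () => Promise<void>,
  intervalMs: number,
  runOnStart: boolean,
) {
  let running = false;
  let skipped = 0;
  let timer: ReturnType<typeof setInterval> | undefined;

  // Resolves to true if a run was executed, false if the tick was skipped.
  async function tick(): Promise<boolean> {
    if (running) {
      skipped++; // previous run still in flight: skip this tick
      return false;
    }
    running = true;
    try {
      await runOnce();
    } catch (err) {
      console.error("Scheduled run failed:", err); // keep the scheduler alive
    } finally {
      running = false;
    }
    return true;
  }

  return {
    tick,
    skippedCount: () => skipped,
    start(): void {
      if (runOnStart) void tick();
      timer = setInterval(() => void tick(), intervalMs);
    },
    stop(): void {
      if (timer !== undefined) clearInterval(timer);
      timer = undefined;
    },
  };
}
```

In this sketch, `intervalMs` and `runOnStart` would be fed from `SCHEDULE_INTERVAL_MS` and `SCHEDULE_RUN_ON_START`.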
SCHEDULE_INTERVAL_MS=300000
SCHEDULE_RUN_ON_START=true
- Single-worker queue
- In-memory run registry
- TTL-based pruning (RUN_TTL_MS)
- Max run count enforcement (MAX_RUNS)
- Waiter cleanup on timeout
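A memory-bounded registry along these lines might combine both pruning rules in one pass. The class below is a hypothetical sketch: the eviction policy (finished runs only, oldest first) is an assumption, and `src/api/registry.ts` may choose differently.

```typescript
// Hypothetical registry sketch; the eviction policy is an assumption for illustration.
type RunStatus = "queued" | "running" | "succeeded" | "failed";

interface RunRecord {
  id: string;
  status: RunStatus;
  updatedAt: number; // epoch milliseconds
}

class BoundedRunRegistry {
  // Map preserves insertion order, so the first entries are the least recently updated.
  private readonly runs = new Map<string, RunRecord>();
  private readonly ttlMs: number;
  private readonly maxRuns: number;

  constructor(ttlMs: number, maxRuns: number) {
    this.ttlMs = ttlMs;     // would come from RUN_TTL_MS
    this.maxRuns = maxRuns; // would come from MAX_RUNS
  }

  upsert(id: string, status: RunStatus, now: number = Date.now()): void {
    this.runs.delete(id); // re-insert so the entry moves to the end
    this.runs.set(id, { id, status, updatedAt: now });
    this.prune(now);
  }

  get(id: string): RunRecord | undefined {
    return this.runs.get(id);
  }

  get size(): number {
    return this.runs.size;
  }

  prune(now: number = Date.now()): void {
    const finished = (run: RunRecord) =>
      run.status === "succeeded" || run.status === "failed";
    // 1. Drop finished runs whose TTL has expired.
    for (const [id, run] of this.runs) {
      if (finished(run) && now - run.updatedAt > this.ttlMs) this.runs.delete(id);
    }
    // 2. Enforce the count limit, evicting the oldest finished runs first.
    for (const [id, run] of this.runs) {
      if (this.runs.size <= this.maxRuns) break;
      if (finished(run)) this.runs.delete(id);
    }
  }
}
```

Because this sketch never evicts queued or running entries, the map can briefly exceed `maxRuns` if that many runs are active at once; with a single worker, that can only happen when the queue itself is long.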
Run states:
- queued
- running
- succeeded
- failed
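These four states map naturally onto a single-worker queue, where each run waits for the previous one to finish before it starts. A minimal sketch using a promise chain follows; `SingleWorkerQueue` is a hypothetical name, and the queue in `src/api/server.ts` may be implemented another way.

```typescript
// Hypothetical single-worker queue: jobs run strictly one at a time, in order.
type RunState = "queued" | "running" | "succeeded" | "failed";

class SingleWorkerQueue {
  private tail: Promise<void> = Promise.resolve();
  readonly states = new Map<string, RunState>();

  /** Enqueue a job; the returned promise resolves with the run's final state. */
  enqueue(id: string, job: () => Promise<void>): Promise<RunState> {
    this.states.set(id, "queued");
    const result = this.tail.then(async (): Promise<RunState> => {
      this.states.set(id, "running");
      let finalState: RunState;
      try {
        await job();
        finalState = "succeeded";
      } catch {
        finalState = "failed"; // a failed job must not block later jobs
      }
      this.states.set(id, finalState);
      return finalState;
    });
    // The next job starts only after this one has settled.
    this.tail = result.then(() => undefined);
    return result;
  }
}
```

Chaining on a single promise avoids an explicit worker loop while still guaranteeing that only one job touches the browser at a time.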
The project uses environment variables for runtime configuration.
Copy the provided example file:
cp .env.example .env
Then open .env and provide your Gemini API key:
GOOGLE_GENERATIVE_AI_API_KEY=your_api_key_here
The .env.example file documents all available configuration variables, including:
- Target URL for the automation workflow
- LLM model configuration
- Model safety limits
- API server configuration
- Scheduler interval settings
- Run lifecycle management limits
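As a rough illustration of how such variables might be read and validated at startup, the sketch below parses the variable names mentioned in this README. The `RuntimeConfig` shape, the helper names, and every default value are assumptions; `.env.example` remains the authoritative list of variables and defaults.

```typescript
// Hypothetical loader; defaults below are illustrative assumptions, not the project's values.
interface RuntimeConfig {
  apiKey: string;
  scheduleIntervalMs: number;
  scheduleRunOnStart: boolean;
  runTtlMs: number;
  maxRuns: number;
}

type Env = Record<string, string | undefined>;

function readPositiveInt(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number(raw);
  if (!Number.isInteger(value) || value <= 0) {
    throw new Error(`${name} must be a positive integer, got "${raw}"`);
  }
  return value;
}

function loadConfig(env: Env): RuntimeConfig {
  const apiKey = env.GOOGLE_GENERATIVE_AI_API_KEY;
  if (!apiKey) throw new Error("GOOGLE_GENERATIVE_AI_API_KEY is required");
  return {
    apiKey,
    scheduleIntervalMs: readPositiveInt(env, "SCHEDULE_INTERVAL_MS", 300000),
    scheduleRunOnStart: env.SCHEDULE_RUN_ON_START === "true",
    runTtlMs: readPositiveInt(env, "RUN_TTL_MS", 3600000), // assumed default
    maxRuns: readPositiveInt(env, "MAX_RUNS", 100),        // assumed default
  };
}
```

In a Node.js process this would typically be called once as `loadConfig(process.env)`, so that a bad value fails at startup instead of in the middle of a run.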
npm install
npx playwright install
cp .env.example .env
After copying the environment file, open .env and add your Gemini API key.
CLI:
npm run cli
API server:
npm run api
Scheduler:
npm run schedule
The workflow supports runtime overrides for all form fields via API.
Supported variables:
- firstName
- lastName
- dateOfBirth
- medicalId
- gender
- bloodType
- allergies
- currentMedications
- emergencyContactName
- emergencyContactPhone
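Merging request-supplied `vars` over a set of defaults could look roughly like the sketch below. The field names are the ones listed above; the `applyOverrides` helper and its decision to reject unknown keys and non-string values are assumptions, and the real API may be more lenient.

```typescript
// Field names come from the list above; the helper and its strictness are hypothetical.
const FORM_FIELDS = [
  "firstName", "lastName", "dateOfBirth", "medicalId", "gender",
  "bloodType", "allergies", "currentMedications",
  "emergencyContactName", "emergencyContactPhone",
] as const;

type FormField = (typeof FORM_FIELDS)[number];
type FormVars = Record<FormField, string>;

function isFormField(key: string): key is FormField {
  return (FORM_FIELDS as readonly string[]).includes(key);
}

/** Return a copy of defaults with any supplied overrides applied. */
function applyOverrides(defaults: FormVars, vars: unknown): FormVars {
  const merged: FormVars = { ...defaults };
  if (vars === undefined) return merged;
  if (typeof vars !== "object" || vars === null || Array.isArray(vars)) {
    throw new Error("vars must be an object");
  }
  for (const [key, value] of Object.entries(vars)) {
    if (!isFormField(key)) throw new Error(`Unknown variable "${key}"`);
    if (typeof value !== "string") throw new Error(`Variable "${key}" must be a string`);
    merged[key] = value;
  }
  return merged;
}
```

Rejecting unknown keys catches typos such as `firstname` early, instead of silently running with the default value.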
Example:
curl -X POST "http://localhost:3000/api/v1/run?wait=true" -H "Content-Type: application/json" -d '{"vars":{"firstName":"Alice","lastName":"Ng"}}'
src/workflow.ts # Workflow orchestration
src/agent/loop.ts # LLM planning loop
src/agent/catalog.ts # Section catalog builder
src/agent/snapshot.ts # DOM snapshot extraction
src/agent/resolve.ts # Selector resolution
src/agent/tools.ts # Playwright primitives
src/agent/types.ts # Plan types and validation
src/agent/json.ts # Strict JSON extraction
src/api/server.ts # API + run queue
src/api/registry.ts # Run registry
src/scheduler.ts # Interval scheduler
src/logger.ts # Structured logging
__tests__/helpers/mocks.ts # Shared mock factories
__tests__/agent/json.test.ts # JSON extraction tests
__tests__/agent/types.test.ts # Plan validation tests
__tests__/agent/resolve.test.ts # Selector resolution tests
__tests__/agent/tools.test.ts # Playwright primitive tests
__tests__/agent/loop.test.ts # LLM planning loop tests
__tests__/api/registry.test.ts # Registry pruning/waiter tests
__tests__/api/server.test.ts # API integration tests
__tests__/scheduler/scheduler.test.ts # Scheduler overlap tests
__tests__/workflow/workflow.test.ts # Workflow helper tests
The suite contains 116 deterministic tests across 9 files, run with Vitest. All tests are fully mocked: no network calls, no real browser launches, no LLM API calls.
npm test # Run all tests
npm run test:watch # Watch mode
npm run test:coverage # Generate coverage report
Coverage reports are written to coverage/.
Current limitations:
- In-memory run storage
- Single worker execution
- No authentication layer
- Not horizontally scalable
Possible improvements:
- Durable job queue (Redis / BullMQ)
- Persistent run storage
- Browser pooling
- Multi-tenant isolation
- Metrics and tracing
- Stronger submission verification
License: ISC