-
Notifications
You must be signed in to change notification settings - Fork 861
feat: add browser-cdp skill for CDP-based browser automation #141
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Changes from all commits
Commits
Show all changes
2 commits
Select commit
Hold shift + click to select a range
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,200 @@ | ||
| --- | ||
| name: browser-cdp | ||
| description: "Control a browser via Chrome DevTools Protocol (CDP) proxy. Use when: (1) navigating to URLs and reading page content, (2) taking screenshots, (3) executing JavaScript in the browser, (4) clicking elements or filling forms, (5) searching and installing Chrome Web Store extensions, (6) interacting with web APIs that require a real browser. NOT for: simple HTTP requests (use curl), local file operations, or when no CDP proxy is available." | ||
| metadata: | ||
| { | ||
| "openclaw": { | ||
| "emoji": "🌐", | ||
| "requires": { "bins": ["curl"] } | ||
| } | ||
| } | ||
| --- | ||
|
|
||
| # Browser CDP | ||
|
|
||
| Control a real browser through a Chrome DevTools Protocol proxy. | ||
|
|
||
| ## Overview | ||
|
|
||
| This skill provides browser automation via a lightweight HTTP proxy that wraps CDP. The proxy exposes REST endpoints for navigation, screenshots, JS evaluation, clicking, and more — no Playwright/Puppeteer dependency needed. | ||
|
|
||
| ## Prerequisites | ||
|
|
||
| Install the required Python dependency: | ||
|
|
||
| ```bash | ||
| pip install psutil | ||
| ``` | ||
|
|
||
| A CDP proxy must be running on `http://localhost:3456`. Start it from the repository root with: | ||
|
|
||
| ```bash | ||
| python3 skills/browser-cdp/scripts/cdp_proxy.py | ||
| ``` | ||
|
|
||
| This launches Chrome/Edge with remote debugging enabled and proxies CDP commands over HTTP. | ||
coderabbitai[bot] marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
|
||
| ## When to Use | ||
|
|
||
| ✅ **USE this skill when:** | ||
|
|
||
| - "Open this URL and tell me what's on the page" | ||
| - "Take a screenshot of the current page" | ||
| - "Run this JavaScript on the page" | ||
| - "Click the button that says..." | ||
| - "Search for and install a Chrome extension" | ||
| - "Log into this site and do something" | ||
| - Any task requiring a real browser context | ||
|
|
||
| ❌ **DON'T use this skill when:** | ||
|
|
||
| - Simple HTTP API calls → use `curl` directly | ||
| - Downloading files → use `curl -O` | ||
| - Parsing HTML from a saved file → use `python3` with BeautifulSoup | ||
| - No CDP proxy running → ask the user to start it first | ||
|
|
||
| ## API Reference | ||
|
|
||
| All endpoints are relative to `http://localhost:3456`. | ||
|
|
||
| ### GET /targets | ||
|
|
||
| List all open browser tabs. | ||
|
|
||
| ```bash | ||
| curl -s http://localhost:3456/targets | python3 -m json.tool | ||
| ``` | ||
|
|
||
| Response: | ||
| ```json | ||
| [ | ||
| { "id": "ABC123", "title": "Google", "url": "https://google.com" } | ||
| ] | ||
| ``` | ||
|
|
||
| ### GET /navigate?url=<URL> | ||
|
|
||
| Navigate a tab to a URL. Uses the most recently created tab, or specify `?target=<targetId>`. | ||
|
|
||
| ```bash | ||
| curl -s "http://localhost:3456/navigate?url=https://example.com" | ||
| ``` | ||
|
|
||
| ### GET /screenshot | ||
|
|
||
| Take a PNG screenshot of the current page. | ||
|
|
||
| ```bash | ||
| # Save to file | ||
| curl -s -o screenshot.png http://localhost:3456/screenshot | ||
| ``` | ||
|
|
||
| ### POST /eval | ||
|
|
||
| Execute JavaScript in the page. The request body is **plain text** (not JSON), sent as `Content-Type: text/plain`. | ||
|
|
||
| ```bash | ||
| curl -s -X POST http://localhost:3456/eval \ | ||
| -H "Content-Type: text/plain" \ | ||
| -d "document.title" | ||
| ``` | ||
|
|
||
| For multi-line scripts, pipe from stdin or use a heredoc: | ||
|
|
||
| ```bash | ||
| curl -s -X POST http://localhost:3456/eval \ | ||
| -H "Content-Type: text/plain" \ | ||
| -d "JSON.stringify(Array.from(document.querySelectorAll('a')).map(a => ({text: a.innerText, href: a.href})))" | ||
| ``` | ||
|
|
||
| ### GET /click?selector=<CSS> | ||
|
|
||
| Click an element matching a CSS selector. | ||
|
|
||
| ```bash | ||
| curl -s "http://localhost:3456/click?selector=%23submit-btn" | ||
| ``` | ||
|
|
||
| ### GET /new | ||
|
|
||
| Open a new browser tab and return its target ID. | ||
|
|
||
| ```bash | ||
| curl -s http://localhost:3456/new | ||
| ``` | ||
|
|
||
| Response: | ||
| ```json | ||
| { "id": "NEW_TAB_ID", "title": "about:blank", "url": "about:blank" } | ||
| ``` | ||
|
|
||
| ## Common Workflows | ||
|
|
||
| ### Navigate and extract page content | ||
|
|
||
| ```bash | ||
| # Open a page | ||
| curl -s "http://localhost:3456/navigate?url=https://example.com" | ||
|
|
||
| # Extract all text content | ||
| curl -s -X POST http://localhost:3456/eval \ | ||
| -H "Content-Type: text/plain" \ | ||
| -d "document.body.innerText" | ||
|
|
||
| # Extract all links | ||
| curl -s -X POST http://localhost:3456/eval \ | ||
| -H "Content-Type: text/plain" \ | ||
| -d "JSON.stringify([...document.querySelectorAll('a')].map(a => ({text: a.textContent.trim(), href: a.href})))" | ||
| ``` | ||
|
|
||
| ### Take a screenshot | ||
|
|
||
| ```bash | ||
| curl -s "http://localhost:3456/navigate?url=https://example.com" | ||
| curl -s -o page.png http://localhost:3456/screenshot | ||
| ``` | ||
|
|
||
| ### Search and install a Chrome extension | ||
|
|
||
| ```bash | ||
| # Search the Chrome Web Store (no login required for search) | ||
| curl -s "http://localhost:3456/navigate?url=https://chromewebstore.google.com/search/example%20extension" | ||
|
|
||
| # Extract extension IDs from search results | ||
| curl -s -X POST http://localhost:3456/eval \ | ||
| -H "Content-Type: text/plain" \ | ||
| -d "JSON.stringify([...document.querySelectorAll('a[data-id]')].map(a => ({id: a.dataset.id, title: a.textContent.trim()})))" | ||
|
|
||
coderabbitai[bot] marked this conversation as resolved.
Show resolved
Hide resolved
|
||
| # Install an extension (requires the extension ID) | ||
| curl -s "http://localhost:3456/navigate?url=https://chromewebstore.google.com/detail/<extension-id>" | ||
| # Then click the "Add to Chrome" button | ||
| curl -s "http://localhost:3456/click?selector=%5Bdata-id%3Dinstall-button%5D" | ||
| ``` | ||
|
|
||
| ### Fill a form and submit | ||
|
|
||
| ```bash | ||
| # Navigate to the form | ||
| curl -s "http://localhost:3456/navigate?url=https://example.com/login" | ||
|
|
||
| # Fill in fields | ||
| curl -s -X POST http://localhost:3456/eval \ | ||
| -H "Content-Type: text/plain" \ | ||
| -d "document.querySelector('#username').value = 'myuser'" | ||
| curl -s -X POST http://localhost:3456/eval \ | ||
| -H "Content-Type: text/plain" \ | ||
| -d "document.querySelector('#password').value = 'mypass'" | ||
|
|
||
| # Submit | ||
| curl -s "http://localhost:3456/click?selector=%23login-form+%3E+button" | ||
| ``` | ||
|
|
||
| ## Notes | ||
|
|
||
| - The CDP proxy must be running before using any commands | ||
| - If the proxy is not running, ask the user to start it: `python3 skills/browser-cdp/scripts/cdp_proxy.py` | ||
| - Use URL encoding for query parameters with special characters | ||
| - The `/eval` endpoint returns the result of the last expression (like a REPL) | ||
| - Screenshots are returned as PNG binary data | ||
| - For complex multi-step interactions, chain `/eval` and `/click` calls | ||
| - The proxy supports a `?target=<targetId>` parameter on most endpoints to target a specific tab | ||
Oops, something went wrong.
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Uh oh!
There was an error while loading. Please reload this page.