
Commit 215a1c0

CI bot using minisweagent (#28)
* initial implementation
* added missing files
* include PR description in markdown context
* use `uv` for the whole python part
* removed masking code
* added better git handling
* added git client oops
* improved naming for git-related files
* added TODO for instance template
* removed darwin flag
* added pr number to the inference tags
* fixed uv dependencies on mini-swe-agent
* added local runner for CI Bot for testing and experimentation
* removed dist/ from repo
* added to ignores
* cleaned up types
* fixed dependency on my swe-agent fork
* bot does recognizably interesting stuff locally
* failure logs come in smoothly now
* added instructions for machine readable timeouts
* updated agent output
* added test-mode
* pr only
* removed any suggestion comments from CI bot
* fix type issues
* removed trajectory json
* added extra files
* added docker compose
* added fix details section to PR comment
* added build files
* fixed lint errors
* fixed linting errors
* added episode id write to CH
* Fix tests
* Add some additional security fixes to CI bot (#48)
* added episode level metrics
* added episode id handling
* refactored a little
* added feedback for episode IDs
* added tailscale login with ACL tag to workflow
* changed env var names for clarity
* skip configuration
* bundled and added an extra log line
* add another env var
* added tensorzero gateway url option
* added a two-job structure
* update the agent to write a better patch
* fixed bundle
* try episode feedback
* use tailscale for feedback
* removed default write permissions
* fixed PR comments and updated pin to main
* removed some unused code
* fix tests

---------

Co-authored-by: Shuyang Li <[email protected]>
1 parent 3d2447d commit 215a1c0


43 files changed: 84,784 additions and 16,527 deletions.

.github/workflows/ci-failure-diagnosis.yml

Lines changed: 1 addition & 7 deletions
@@ -6,11 +6,6 @@ on:
     types:
       - completed
 
-permissions:
-  contents: write
-  pull-requests: write
-  actions: read
-
 jobs:
   generate-patch:
     if: ${{ github.event.workflow_run.conclusion == 'failure' }}
@@ -48,14 +43,13 @@ jobs:
 
       - name: Generate patch
         id: generate
-        uses: tensorzero/experimental-ci-bot/generate-pr-patch@viraj/pr-only
+        uses: tensorzero/experimental-ci-bot/generate-pr-patch@main
         env:
           GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
         with:
           token: ${{ secrets.GITHUB_TOKEN }}
           mode: patch-only
           tensorzero-base-url: http://localhost:3000
-          tensorzero-diff-patched-successfully-metric-name: tensorzero_github_ci_bot_diff_patched_successfully
           output-artifacts-dir: debug-logs
           clickhouse-url: ${{ secrets.CI_BOT_CLICKHOUSE_URL }}
           clickhouse-table: GitHubBotPullRequestToInferenceMap

.github/workflows/provide-pull-request-feedback.yml

Lines changed: 3 additions & 3 deletions
@@ -23,10 +23,10 @@ jobs:
 
       - name: Send PR Feedback
         # TODO: currently pinned to miniswe-agent branch; switch back to main when ready.
-        uses: tensorzero/experimental-ci-bot/create-pr-feedback@viraj/pr-only
+        uses: tensorzero/experimental-ci-bot/create-pr-feedback@main
         with:
-          tensorzero-base-url: http://localhost:3000
+          tensorzero-base-url: http://ci-bot-gateway:3000
           # TODO: Switch to tensorzero_github_ci_bot_agent_pr_merged for episode-level feedback when agent creates PRs
-          tensorzero-pr-merged-metric-name: tensorzero_github_ci_bot_pr_merged
+          tensorzero-pr-merged-metric-name: ci_fix_pr_merged_agent
           clickhouse-url: ${{ secrets.CI_BOT_CLICKHOUSE_URL }}
           clickhouse-table: GitHubBotPullRequestToInferenceMap
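
The feedback workflow above now points at `http://ci-bot-gateway:3000` and reports under the metric name `ci_fix_pr_merged_agent`. As a rough illustration of what such a feedback call might look like, here is a minimal Python sketch that builds (but does not send) a request to a TensorZero gateway. It assumes the gateway accepts a JSON `POST` at `/feedback` with `metric_name`, `episode_id`, and `value` fields, and that the metric is a boolean, episode-level metric. The function name and the all-zero episode ID are hypothetical, and the action's real implementation is not shown in this diff.

```python
import json
import urllib.request


def build_feedback_request(
    base_url: str, metric_name: str, episode_id: str, merged: bool
) -> urllib.request.Request:
    """Build, without sending, a feedback POST for one episode.

    Assumption: the gateway exposes POST {base_url}/feedback and takes a
    JSON body with metric_name, episode_id, and value.
    """
    payload = {
        "metric_name": metric_name,
        "episode_id": episode_id,
        "value": merged,  # assumed boolean metric: was the fix PR merged?
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/feedback",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder episode ID for illustration only.
    req = build_feedback_request(
        "http://ci-bot-gateway:3000",
        "ci_fix_pr_merged_agent",
        "00000000-0000-0000-0000-000000000000",
        True,
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Building the request separately from sending it keeps the payload easy to inspect in a dry run; passing the result to `urllib.request.urlopen` would perform the actual call.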

.gitignore

Lines changed: 12 additions & 0 deletions
@@ -1,6 +1,7 @@
 # Dependency directory
 node_modules
 
+
 # Rest pulled from https://github.com/github/gitignore/blob/master/Node.gitignore
 # Logs
 logs
@@ -100,3 +101,14 @@ __tests__/runner/*
 # IDE files
 .idea
 *.code-workspace
+
+# Python
+.venv
+venv
+env
+__pycache__
+*.py[cod]
+*$py.class
+*.so
+.Python
+.envrc

.prettierignore

Lines changed: 7 additions & 0 deletions
@@ -3,3 +3,10 @@
 dist/
 node_modules/
 coverage/
+
+# Python
+.venv/
+venv/
+env/
+__pycache__/
+*.pyc

README.md

Lines changed: 105 additions & 0 deletions
@@ -6,6 +6,111 @@
 [![CodeQL](https://github.com/actions/typescript-action/actions/workflows/codeql-analysis.yml/badge.svg)](https://github.com/actions/typescript-action/actions/workflows/codeql-analysis.yml)
 [![Coverage](./badges/coverage.svg)](./badges/coverage.svg)
 
+## Running Locally
+
+You can run the mini-swe-agent locally to test PRs before deploying to GitHub
+Actions.
+
+### Prerequisites
+
+1. Install dependencies:
+
+   ```bash
+   npm install
+   npm run bundle  # Build the CLI
+   ```
+
+1. Set up required environment variables:
+
+   ```bash
+   # GitHub authentication (choose one):
+   export GITHUB_TOKEN=$(gh auth token)  # If using gh CLI
+   # OR
+   export GITHUB_TOKEN=ghp_your_token_here
+
+   # Model API keys (at least one required):
+   export ANTHROPIC_API_KEY=your_anthropic_key
+   # OR
+   export OPENAI_API_KEY=your_openai_key
+   ```
+
+### Usage
+
+#### Dry Run (Local Testing)
+
+Test the agent without creating PRs or comments on GitHub:
+
+```bash
+npm run cli -- --repo owner/repo --pr 123 --dry-run
+```
+
+This will:
+
+- Clone the PR repository
+- Run the mini-swe-agent to analyze and fix issues
+- Display the generated patch locally
+- Not make any changes to GitHub
+
+#### Live Mode (Create PRs/Comments)
+
+Run the agent and create actual PRs or inline comments on GitHub:
+
+```bash
+npm run cli -- --repo owner/repo --pr 456
+```
+
+This will:
+
+- Clone the PR repository
+- Run the mini-swe-agent
+- Create a follow-up PR or post inline comments based on the agent's decision
+
+#### With CI Failure Context
+
+If you have a specific workflow run that failed, you can provide its ID:
+
+```bash
+npm run cli -- --repo owner/repo --pr 789 --workflow-run-id 12345
+```
+
+### CLI Options
+
+```text
+-r, --repo <owner/repo>       Repository in "owner/repo" format
+-p, --pr <number>             Pull request number (required)
+-d, --dry-run                 Show patch locally without PRs/comments
+-t, --token <token>           GitHub token (default: GITHUB_TOKEN or gh)
+-w, --workflow-run-id <id>    Workflow run ID for failure logs
+-o, --output-dir <path>       Directory for debug artifacts
+    --clickhouse-url <url>    ClickHouse URL for tracking
+    --clickhouse-table <name> ClickHouse table name
+-c, --cost-limit <dollars>    Cost limit (default: 3.0)
+    --timeout <minutes>       Timeout in minutes (default: 30)
+-h, --help                    Show help message
+```
+
+### Examples
+
+```bash
+# Dry run on a public repository
+npm run cli -- --repo tensorzero/tensorzero --pr 100 --dry-run
+
+# Run on your own repository with custom settings
+export GITHUB_TOKEN=$(gh auth token)
+npm run cli -- \
+  --repo myorg/myrepo \
+  --pr 42 \
+  --cost-limit 5.0 \
+  --timeout 45 \
+  --output-dir ./debug-output
+
+# Analyze a specific failed workflow run
+npm run cli -- \
+  --repo owner/repo \
+  --pr 123 \
+  --workflow-run-id 9876543210
+```
+
 ## Developing
 
 - `npm install`
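
The README section added above lists the environment variables and CLI flags for local runs. A small pre-flight wrapper can catch a missing key before the agent starts. The Python sketch below is not part of this commit: the helper names `missing_env` and `build_cli_command` are hypothetical, and it uses only the flags and variables documented in the README. It is stricter than the CLI about `GITHUB_TOKEN`, since the README says the CLI can also fall back to `gh`.

```python
import os
import shlex
from typing import List, Mapping, Optional


def missing_env(env: Mapping[str, str]) -> List[str]:
    """Return the documented variables that are unset or empty."""
    problems = []
    if not env.get("GITHUB_TOKEN"):
        problems.append("GITHUB_TOKEN")
    # The README asks for at least one model API key.
    if not (env.get("ANTHROPIC_API_KEY") or env.get("OPENAI_API_KEY")):
        problems.append("ANTHROPIC_API_KEY or OPENAI_API_KEY")
    return problems


def build_cli_command(
    repo: str,
    pr: int,
    *,
    dry_run: bool = False,
    workflow_run_id: Optional[int] = None,
    cost_limit: Optional[float] = None,
    timeout_minutes: Optional[int] = None,
    output_dir: Optional[str] = None,
) -> List[str]:
    """Assemble the `npm run cli` argument list from the documented flags."""
    cmd = ["npm", "run", "cli", "--", "--repo", repo, "--pr", str(pr)]
    if dry_run:
        cmd.append("--dry-run")
    if workflow_run_id is not None:
        cmd += ["--workflow-run-id", str(workflow_run_id)]
    if cost_limit is not None:
        cmd += ["--cost-limit", str(cost_limit)]
    if timeout_minutes is not None:
        cmd += ["--timeout", str(timeout_minutes)]
    if output_dir is not None:
        cmd += ["--output-dir", output_dir]
    return cmd


if __name__ == "__main__":
    problems = missing_env(os.environ)
    if problems:
        print("Missing environment variables:", ", ".join(problems))
    # Print the command rather than running it, so the sketch has no side effects.
    print(shlex.join(build_cli_command("owner/repo", 123, dry_run=True)))
```

Returning the command as a list instead of a single string means it can be passed to `subprocess.run` without shell quoting issues.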
