README.md (2 additions, 0 deletions)
@@ -84,6 +84,7 @@ One of the core designs of `debug-gym` is the notion of tools. Users can dynamic
|`view`| It is used to change an agent's focus to a particular source code file. This is particularly useful when dealing with a repository with multiple files. |
|`eval`| It runs the current code repository using the provided entrypoint (e.g., pytest), and returns the terminal's output (e.g., error message). |
|`pdb`| Interactive debugger wrapping the [Python pdb tool](https://docs.python.org/3/library/pdb.html). In addition, users can choose to maintain a set of persistent breakpoints (as in some programming IDEs), which are not reset after every eval. With this feature, a new pdb debugging session is activated automatically, with all the breakpoints restored. Note that such breakpoints can be cleared by pdb commands such as `cl`. |
+|`grep`| Search for patterns in files within the repository. Supports both literal string matching and regular expressions. Can search in specific files, directories, or the entire repository. Useful for finding code patterns, function definitions, variable usage, or identifying files containing specific text. |
|`rewrite`| It can be used to rewrite a certain piece of code to fix the bug. The inputs of this tool call include the file path, the start and end line numbers, and the new code. |
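
The `rewrite` row above lists three inputs besides the file path: a start line, an end line, and the new code. A minimal sketch of that kind of line-range replacement follows. It is not `debug-gym`'s implementation: the function name `rewrite_lines` is hypothetical, it works on a string rather than a file path, and the 1-indexed, inclusive line range is an assumption.

```python
def rewrite_lines(source: str, start: int, end: int, new_code: str) -> str:
    """Replace lines start..end of `source` with `new_code`.

    Assumption: line numbers are 1-indexed and the range is inclusive.
    """
    lines = source.splitlines()
    if not (1 <= start <= end <= len(lines)):
        raise ValueError(f"invalid line range {start}-{end} for {len(lines)} lines")
    # Slice assignment swaps the old span for the new lines, even if the
    # replacement has a different number of lines than the original span.
    lines[start - 1:end] = new_code.splitlines()
    return "\n".join(lines) + "\n"


buggy = "def add(a, b):\n    return a - b\n"
fixed = rewrite_lines(buggy, 2, 2, "    return a + b")
print(fixed)
```

Because the replacement may be longer or shorter than the span it replaces, line numbers observed before a rewrite may no longer be valid afterwards.
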

Upon importing a tool, its action space and observation space will be automatically merged into `debug-gym`'s action space and observation space; its instruction will also be merged into the overall instruction provided to the agent (e.g., as the system prompt).
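
To make the `grep` row added in this diff more concrete, here is a rough sketch of a search that supports both literal and regular-expression matching over a single file or a directory tree. It is an illustration only, not the tool's actual code: the function name `grep`, its parameters (`regex`, `include`), and the `(path, line number, line)` result shape are all assumptions.

```python
import re
from pathlib import Path


def grep(root, pattern, *, regex=False, include="**/*"):
    """Return (path, line_number, line) tuples for lines matching `pattern`.

    Hypothetical helper: `root` may be a single file or a directory;
    `include` is a glob used only when `root` is a directory.
    """
    root = Path(root)
    # Literal searches are escaped so that characters like "(" match themselves.
    matcher = re.compile(pattern if regex else re.escape(pattern))
    if root.is_file():
        candidates = [root]
    else:
        candidates = sorted(p for p in root.glob(include) if p.is_file())

    hits = []
    for path in candidates:
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for lineno, line in enumerate(text.splitlines(), start=1):
            if matcher.search(line):
                hits.append((str(path), lineno, line.strip()))
    return hits
```

A call such as `grep("path/to/repo", r"def\s+\w+", regex=True, include="**/*.py")` would list function definitions in Python files, while `grep("path/to/repo", "foo(")` treats the pattern as plain text.
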
@@ -101,6 +102,7 @@ We provide the below LLM-based agents, they all have minimal design and serve th
|`debug_agent`|`pdb`, `rewrite`, `view`, `eval`| A minimal agent that dumps all available information into its prompt and queries the LLM to generate a command. |
|`rewrite_agent`|`rewrite`, `view`, `eval`| A `debug_agent` with the `pdb` tool disabled (the agent keeps rewriting). |
|`debug_5_agent`|`pdb`, `rewrite`, `view`, `eval`| A `debug_agent`, but the `pdb` tool is enabled only after a certain number of rewrites. |
+|`grep_agent`|`grep`, `rewrite`, `view`, `eval`| A variant of `rewrite_agent` that includes the `grep` tool for searching patterns in the codebase before making changes. |
|`solution_agent`|`pdb`, `eval`| An oracle agent that applies a gold patch (only works with `swebench` and `swesmith` benchmarks for now). The agent checks that tests are failing before applying the patch, and passing after. It also checks that the `pdb` tool can be used as expected. |
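
The agent rows visible in this diff can be restated as plain data to see how `grep_agent` relates to the others. The dictionary below only copies the tool lists from the table; the names `AGENT_TOOLS` and `agents_with` are hypothetical and are not part of `debug-gym`'s API.

```python
# Tool sets copied from the agent table rows shown in this diff.
AGENT_TOOLS = {
    "debug_agent": {"pdb", "rewrite", "view", "eval"},
    "rewrite_agent": {"rewrite", "view", "eval"},
    "debug_5_agent": {"pdb", "rewrite", "view", "eval"},
    "grep_agent": {"grep", "rewrite", "view", "eval"},
    "solution_agent": {"pdb", "eval"},
}


def agents_with(tool):
    """Return the names of agents whose tool set includes `tool`, sorted."""
    return sorted(name for name, tools in AGENT_TOOLS.items() if tool in tools)


# What grep_agent adds on top of rewrite_agent.
extra = AGENT_TOOLS["grep_agent"] - AGENT_TOOLS["rewrite_agent"]
print(extra)                # {'grep'}
print(agents_with("grep"))  # ['grep_agent']
```
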