1 parent d35fd70 commit a9cc5b1
README.md
@@ -320,7 +320,11 @@ Here are some tips to speed up the evaluation:
You can inspect the failed samples by using the following command:

```bash
-bigcodebench.inspect --eval-results sample-sanitized-calibrated_eval_results.json --in-place
+# Inspect the failed samples and save the results to `inspect/`
+bigcodebench.inspect --eval_results sample-sanitized-calibrated_eval_results.json --split complete --subset hard
+
+# Re-run the inspection in place
+bigcodebench.inspect --eval_results sample-sanitized-calibrated_eval_results.json --split complete --subset hard --in_place
```

## 🚀 Full Script