#!/usr/bin/env python
"""
Entry point for running evaluations on all candidate implementations for each task.

This script uses the evaluator defined in ``src/evaluator.py`` to iterate over task
candidates, run their test suites, and print a concise summary. The functionality
is deliberately minimal: the goal is to demonstrate how a small evaluation harness
works, not to build a full-featured tool.
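
Typically run from the repository root so that the ``src`` package is
importable::

    python run_eval.py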
"""
from src.evaluator import evaluate_all_candidates
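

# Each result produced by evaluate_all_candidates() is assumed to be a dict
# carrying the keys read below ('task_name', 'candidate_name', 'passed',
# 'total', 'status'); this is inferred from the formatting code here, not
# from a documented contract of src/evaluator.py.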
def main() -> None:
    results = evaluate_all_candidates()
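    # One line per (task, candidate) pair: passed/total test counts and status.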
print("\n=== Evaluation summary ===")
for r in results:
print(
f"[{r['task_name']}] {r['candidate_name']}: "
f"{r['passed']} / {r['total']} tests passed "
f"({r['status']})"
)


if __name__ == "__main__":
    main()