Step1 parser: include stdout/stderr/cmd from failed tasks #15
Description
Problem
The step1 job parser (scripts/job_parser.py) only captures res.msg from failed task events, which is often just a generic message like "non-zero return code". The actual diagnostic details — stdout, stderr, cmd, and rc — are available in res but are not extracted.
This forces Claude to go back to the raw job log to find the real error, adding latency and extra steps to the analysis.
Example: Job 2035762 failed with "non-zero return code" in step1, but the actual root cause (error: pathspec 'cert-manager-fallback' did not match any file(s) known to git) was only available by manually parsing the raw event data.
Proposed Change
In _extract_failed_tasks(), after building the base task info dict:
- Extract res.stdout, res.stderr, res.cmd, and res.rc when present
- Fallback: if res fields are empty/suppressed, capture event.stdout (the rendered Ansible output, which often contains the full error even when res fields are suppressed)
No impact on step4: it only reads task_path, task, play, role, task_action, error_message, duration, and timestamp. The new fields are purely additive and only benefit step5 (Claude's analysis).
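A minimal sketch of the extraction and fallback described above, not the actual patch. The helper name extract_diagnostics, the event layout (res nested under event_data, rendered output in a top-level stdout key), and the output key event_stdout are all assumptions for illustration; the real _extract_failed_tasks() may structure events differently. "Empty/suppressed" is interpreted here as "neither res.stdout nor res.stderr has content".

```python
from typing import Any, Dict

# Fields the issue proposes to copy out of `res` when present.
DIAGNOSTIC_FIELDS = ("stdout", "stderr", "cmd", "rc")


def extract_diagnostics(event: Dict[str, Any]) -> Dict[str, Any]:
    """Return extra diagnostic fields for one failed task event.

    Hypothetical helper: assumes `res` lives at event["event_data"]["res"]
    and the rendered Ansible output at event["stdout"].
    """
    res = (event.get("event_data") or {}).get("res") or {}

    details: Dict[str, Any] = {}
    for field in DIAGNOSTIC_FIELDS:
        value = res.get(field)
        # Skip missing or empty values, but keep numeric codes such as rc == 0.
        if value not in (None, "", []):
            details[field] = value

    # Fallback: if res carried no stdout/stderr, keep the rendered event output.
    if not (details.get("stdout") or details.get("stderr")):
        rendered = event.get("stdout")
        if rendered:
            details["event_stdout"] = rendered

    return details
```

Because the result is a separate dict, it could be merged into the base task info with something like task_info.update(extract_diagnostics(event)), leaving the existing keys untouched.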
File
skills/root-cause-analysis/scripts/job_parser.py — _extract_failed_tasks() (line ~159)