Add a Claude Code session-transcript adapter by ejentum · Pull Request #8 · lamenting-hawthorn/SkillLoop

ejentum · 2026-06-13T14:44:23Z

Claude Code session-transcript adapter

Adds an adapter so SkillLoop can ingest Claude Code sessions, alongside the existing generic and hermes adapters.

Claude Code persists sessions as JSONL under ~/.claude/projects/<slug>/*.jsonl with a nested envelope ({message: {role, content: [blocks]}}), block-array content (text / tool_use / tool_result), and interleaved non-message meta lines, so the generic adapter cannot read it. This normalizes that into AgentTrace:

- Two-pass tool matching: indexes tool_result blocks and matches them back to the originating tool_use across turns, populating ToolCall name / arguments / result / status / started_at / ended_at / duration_ms.
- Flattens text blocks into content; tolerates a partial trailing JSON line on a live session; drops meta lines and pure-tool_result turns.
- --no-sidechains flag to exclude subagent sidechain turns.
- Forward-compatible extended-thinking capture. Note: Claude Code stores thinking blocks with an empty thinking field plus a signature, so the reasoning text is stripped from the transcript and there is nothing to recover from a Claude Code session (documented in code).
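
For readers who have not opened the diff, the two-pass matching described in the list above can be sketched roughly as follows. This is an illustration only, not the code in skillloop/adapters/claude_code.py: the function name `match_tool_calls`, the plain-dict output, the `"unknown"` status for unmatched calls, and the assumption that each envelope carries an ISO 8601 `timestamp` are all hypothetical.

```python
import json
from datetime import datetime, timedelta


def _parse_ts(value):
    # Assumption: envelope timestamps are ISO 8601 strings ending in "Z".
    return datetime.fromisoformat(value.replace("Z", "+00:00")) if value else None


def _blocks(event, block_type):
    # Return the content blocks of one type; meta lines without a message yield [].
    message = event.get("message") if isinstance(event, dict) else None
    content = message.get("content") if isinstance(message, dict) else None
    if not isinstance(content, list):
        return []
    return [b for b in content if isinstance(b, dict) and b.get("type") == block_type]


def match_tool_calls(lines):
    events = [json.loads(line) for line in lines if line.strip()]

    # Pass 1: index every tool_result by the id of the tool_use it answers.
    results = {}
    for event in events:
        for block in _blocks(event, "tool_result"):
            results[block.get("tool_use_id")] = (block, event.get("timestamp"))

    # Pass 2: walk the tool_use blocks and attach the matching result, if any.
    calls = []
    for event in events:
        for block in _blocks(event, "tool_use"):
            result, ended = results.get(block.get("id"), (None, None))
            started_at = _parse_ts(event.get("timestamp"))
            ended_at = _parse_ts(ended)
            duration_ms = None
            if started_at and ended_at:
                duration_ms = (ended_at - started_at) // timedelta(milliseconds=1)
            if result is None:
                status = "unknown"  # hypothetical label for a call with no result
            else:
                status = "error" if result.get("is_error") else "ok"
            calls.append({
                "name": block.get("name"),
                "arguments": block.get("input"),
                "result": result.get("content") if result else None,
                "status": status,
                "started_at": started_at,
                "ended_at": ended_at,
                "duration_ms": duration_ms,
            })
    return calls
```

Indexing results first means a tool_result that arrives several turns after its tool_use is still matched, and the pure-tool_result turns can then be dropped from the message list without losing their payload.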

Usage

skillloop ingest claude-code --latest
skillloop ingest claude-code <path-to.jsonl> [--project <slug>] [--no-sidechains]

Tests

6 new tests in tests/test_claude_code.py, verified end to end against a real ~860-message session.

Heads-up (separate from this PR): on Windows, test_store.py::test_store_preserves_raw_trace_and_hashes fails because raw_artifact_ref is stored with os.sep (backslashes) while the test asserts forward slashes. Happy to send a one-line fix as a follow-up.
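
The follow-up fix is not part of this PR, so the following is only a guess at its shape: one common way to keep a stored reference stable across platforms is to serialize it with `as_posix()`. The helper name `portable_ref` and the example path are hypothetical.

```python
from pathlib import PurePosixPath, PureWindowsPath


def portable_ref(path):
    """Return a forward-slash reference regardless of the host separator."""
    return path.as_posix()


# The same logical reference, written with each platform's separator.
print(portable_ref(PureWindowsPath("raw\\claude-code\\session.jsonl")))
print(portable_ref(PurePosixPath("raw/claude-code/session.jsonl")))
# Both lines print: raw/claude-code/session.jsonl
```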

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- Added Claude Code adapter support for importing and normalizing session transcripts into structured messages and tool calls.
- Extended the ingest command to load either the latest session (scoped by project) or a provided input file, with options to control sidechain inclusion.
Bug Fixes
- Improved resilience when processing JSONL transcripts by tolerating malformed trailing lines and rejecting sessions with no usable content.
Tests
- Added/expanded tests covering tool-result parsing, thinking metadata handling, sidechain filtering, malformed JSON handling, and newest-session selection.

Adds skillloop/adapters/claude_code.py and wires `ingest claude-code`, so SkillLoop can ingest Claude Code sessions alongside the generic and hermes adapters.

Claude Code stores sessions as JSONL under ~/.claude/projects/<slug>/*.jsonl with a nested envelope and block-array content, so the generic adapter cannot read them. The adapter normalizes that into AgentTrace:

- two-pass parse matching tool_use blocks to their tool_result across turns, populating ToolCall name/arguments/result/status/started_at/ended_at/duration_ms
- flattens text blocks into content; tolerates a partial trailing line on a live session; drops meta lines and pure tool_result turns
- --no-sidechains to exclude subagent sidechain turns
- forward-compatible extended-thinking capture (Claude Code persists thinking blocks with empty text + a signature, so the reasoning text is stripped)

6 tests added; full suite green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-06-13T14:44:35Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: b547c9df-e08f-4cb8-95f2-a19817a25605

📥 Commits

Reviewing files that changed from the base of the PR and between c7ac64c and 597a33a.

📒 Files selected for processing (2)

skillloop/adapters/claude_code.py
tests/test_claude_code.py

🚧 Files skipped from review as they are similar to previous changes (1)

skillloop/adapters/claude_code.py

📝 Walkthrough


This PR introduces a Claude Code session adapter for SkillLoop that parses JSONL session transcripts into structured AgentMessage objects, with file loading/discovery helpers, CLI integration, and comprehensive test coverage.

Changes

Claude Code Session Adapter

Layer / File(s)	Summary
Adapter parsing foundations and normalization logic `skillloop/adapters/claude_code.py`	Module constants and role filtering; internal helpers parse JSONL lines, extract text from content blocks, normalize tool results, and parse timestamps. `normalize_claude_code_session` performs two-pass tool-result indexing by `tool_use_id`, builds `AgentMessage` entries with `ToolCall` objects (name, arguments, result, status, duration), extracts thinking/redacted-thinking into metadata, and skips turns without usable content.
File loading and transcript discovery `skillloop/adapters/claude_code.py`	`load_claude_code_session` reads and normalizes a transcript file, raises `ValueError` when no usable messages exist, and returns an `AgentTrace` with adapter metadata and SHA-256 checksum. `latest_claude_code_session` validates the base projects directory, searches for JSONL transcripts (optionally scoped to a project), selects the newest by modification time, and returns its `Path`.
CLI command integration for Claude Code adapter `skillloop/cli.py`	Imports adapter loading functions; extends `cmd_ingest` with a `claude-code` branch that sources data from `--latest` (scoped by `--projects-dir` and `--project`) or `--input`; updates argparse to include `claude-code` as an allowed adapter, adds `--projects-dir`, `--project`, and `--no-sidechains` options, and clarifies help text for both Hermes DB and Claude Code behavior.
Comprehensive test coverage for Claude Code adapter `tests/test_claude_code.py`	Test helper writes JSONL session traces. Tests cover: parsing and extracting tool calls with status/duration, rejecting empty sessions via `ValueError`, rejecting mid-file malformed JSON, tolerating malformed trailing lines by loading valid preceding content, selecting newest file by timestamp, preserving thinking text in metadata while excluding it from message body, filtering sidechains based on `include_sidechains` flag, and handling empty thinking blocks with only signatures.
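
As a rough illustration of the discovery and checksum steps summarized in the table above, the sketch below picks the newest transcript by modification time and hashes its bytes. It is not the PR's `latest_claude_code_session`: the names `newest_transcript` and `sha256_of`, the error type, and the one-directory-per-project layout are assumptions made for the example.

```python
import hashlib
from pathlib import Path


def newest_transcript(projects_dir, project=None):
    """Return the most recently modified .jsonl transcript."""
    base = Path(projects_dir)
    if not base.is_dir():
        raise FileNotFoundError(f"projects directory not found: {base}")
    if project:
        # Scope the search to a single project slug.
        candidates = list((base / project).glob("*.jsonl"))
    else:
        # Assumed layout: one sub-directory per project slug.
        candidates = list(base.glob("*/*.jsonl"))
    if not candidates:
        raise FileNotFoundError(f"no .jsonl transcripts under {base}")
    return max(candidates, key=lambda p: p.stat().st_mtime)


def sha256_of(path):
    """Hex SHA-256 of the raw transcript bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()
```

Hashing the raw bytes rather than the normalized trace keeps the checksum tied to the source file, so a later re-ingest can detect that a live session has grown.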

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Poem

🐰 A new Claude transcript flows,

Tool calls and thinking it knows,

Files sorted by time,

CLI keeps rhythm,

Robust tests ensure nothing slips through!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 11.11% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title directly and clearly describes the main change: adding a new Claude Code adapter for ingesting session transcripts into the SkillLoop system.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

tests/test_claude_code.py (1)

55-63: ⚡ Quick win

Add a regression test for malformed non-trailing JSON lines.

Line 55 onward covers empty sessions, but there’s no assertion that a malformed middle JSONL line raises an error instead of being silently dropped. This protects the parsing contract and prevents future silent data loss regressions.

🧪 Suggested test

+import pytest
+
 def test_claude_code_adapter_rejects_empty_session(tmp_path):
@@
     raise AssertionError("expected ValueError for a session with no usable messages")
+
+
+def test_claude_code_adapter_rejects_malformed_non_trailing_line(tmp_path):
+    session = tmp_path / "malformed_mid.jsonl"
+    session.write_text(
+        "\n".join(
+            [
+                json.dumps({"message": {"role": "user", "content": [{"type": "text", "text": "hi"}]}}),
+                '{"message": ',  # malformed non-trailing line
+                json.dumps({"message": {"role": "assistant", "content": [{"type": "text", "text": "bye"}]}}),
+            ]
+        )
+        + "\n",
+        encoding="utf-8",
+    )
+    with pytest.raises(ValueError):
+        load_claude_code_session(session)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_claude_code.py` around lines 55 - 63, Add a regression test that
ensures load_claude_code_session raises on a malformed non-trailing JSONL line:
create a temp session file via _write_session containing a valid message, then a
malformed JSON line (e.g., truncated or invalid JSON), then another valid
message, and assert that calling load_claude_code_session(session) raises
ValueError (similar to test_claude_code_adapter_rejects_empty_session but
specifically exercising a malformed middle line); reference the test name
test_claude_code_adapter_rejects_empty_session and the loader function
load_claude_code_session when locating where to add the new test.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@skillloop/adapters/claude_code.py`:
- Around line 69-76: The loop currently swallows every JSONDecodeError for any
line (iterating raw_text -> raw), but we should only tolerate a malformed JSON
on the final non-empty line; change the loop to detect whether the current `raw`
is the last non-empty line (e.g., pre-split `raw_text.splitlines()` into a list
and use index/enumeration or compute remaining non-empty lines) and in the
except json.JSONDecodeError block only `continue` when there are no further
non-empty lines; otherwise re-raise (or propagate) the error so corrupted middle
lines are not silently dropped. Reference the variables and loop using
`raw_text`, `raw`, `parsed` and the JSON parse try/except surrounding
`json.loads(raw)`.
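
A minimal sketch of the behavior this comment asks for, reusing the `raw_text` and `parsed` names from the review. The function name `parse_jsonl` and the choice of `ValueError` are assumptions; the adapter's actual loop may be structured differently.

```python
import json


def parse_jsonl(raw_text):
    # Keep original line numbers so the error message points at the real line.
    lines = [
        (lineno, line)
        for lineno, line in enumerate(raw_text.splitlines(), start=1)
        if line.strip()
    ]
    parsed = []
    for position, (lineno, raw) in enumerate(lines):
        try:
            parsed.append(json.loads(raw))
        except json.JSONDecodeError as exc:
            if position == len(lines) - 1:
                # Last non-empty line: likely a partial write from a live session.
                break
            # Anything earlier is corruption and must not be silently dropped.
            raise ValueError(f"malformed JSON on line {lineno}") from exc
    return parsed
```

Filtering out blank lines before the loop is what makes "last non-empty line" cheap to detect: the tolerated case is simply the final index of the filtered list.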



ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 774ee182-7695-4351-af9b-b5977d7a6f3f

📥 Commits

Reviewing files that changed from the base of the PR and between f7ba1b2 and c7ac64c.

📒 Files selected for processing (3)

skillloop/adapters/claude_code.py
skillloop/cli.py
tests/test_claude_code.py

Necmttn · 2026-06-21T09:22:54Z

Nice adapter. One preservation detail I would keep even if AgentTrace stays normalized: retain raw envelope refs for every source line.

The most useful fields are uuid, parentUuid, timestamp, session_id, and byte or line offset. Dropping pure tool_result turns is fine for the main view, but a raw_event_ref on the normalized tool result makes replay/debugging possible when matching fails or a future Claude field changes.

It also helps render subagent sidechains later without losing the original transcript topology.
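
To make the suggestion concrete, here is one possible shape for such a reference. Everything in it is hypothetical: `RawEventRef` and `index_raw_events` do not exist in SkillLoop, and the envelope key used for the session id (`sessionId`, with `session_id` as a fallback) is an assumption about the transcript format.

```python
import json
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class RawEventRef:
    line_no: int       # 1-based line number in the source .jsonl
    byte_offset: int   # offset of the line's first byte in the file
    uuid: Optional[str]
    parent_uuid: Optional[str]
    timestamp: Optional[str]
    session_id: Optional[str]


def index_raw_events(data: bytes) -> List[RawEventRef]:
    refs = []
    offset = 0
    for line_no, raw in enumerate(data.split(b"\n"), start=1):
        start = offset
        offset += len(raw) + 1  # +1 for the newline consumed by split
        if not raw.strip():
            continue
        try:
            event = json.loads(raw)
        except json.JSONDecodeError:
            continue  # sketch only: error handling is out of scope here
        if not isinstance(event, dict):
            continue
        refs.append(RawEventRef(
            line_no=line_no,
            byte_offset=start,
            uuid=event.get("uuid"),
            parent_uuid=event.get("parentUuid"),
            timestamp=event.get("timestamp"),
            # Assumed key names for the session id.
            session_id=event.get("sessionId") or event.get("session_id"),
        ))
    return refs
```

With a byte offset stored per line, a debugger can seek straight to the original envelope, and the `uuid` / `parent_uuid` pairs are enough to rebuild the sidechain tree later.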

Generated with ax.

coderabbitai Bot requested changes Jun 13, 2026


Comment thread skillloop/adapters/claude_code.py Outdated

fix: reject malformed Claude Code JSONL lines

597a33a

coderabbitai Bot approved these changes Jun 15, 2026

ejentum wants to merge 2 commits into lamenting-hawthorn:main from ejentum:feat/claude-code-adapter


3 participants
