Merged
20 changes: 10 additions & 10 deletions CLAUDE.md
@@ -4,15 +4,15 @@

A Claude Code skillset (slash commands + subagent definitions) that helps founders evaluate startup problems and discover product opportunities. Eight commands: `/tweak:browse-hn` searches HN via Algolia and ranks candidate threads to feed into `/tweak:analyze-hn-post`; `/tweak:analyze-hn-post` analyzes a single Hacker News discussion to identify technology shifts and surface product ideas; `/tweak:evaluate` runs 14 independent subagents to produce a weighted scorecard with assumption tracking; `/tweak:improve` reads a completed run and generates three concrete idea tweaks (small reframe, medium reshape, big reimagine) grounded in scoring rubrics; `/tweak:diff` compares two evaluation runs side-by-side showing score deltas, potential shifts, and per-dimension movers; `/tweak:list` and `/tweak:show` browse and open artifacts accumulated under `~/.tweakidea/`; `/tweak:share` uploads a run's HTML report to a secret GitHub gist and returns a shareable rendered preview URL.

**Core Value:** Help founders make better decisions — evaluate whether a problem is worth solving, or discover what problems are emerging from technology shifts.

### Constraints

- **Platform**: Claude Code CLI — slash commands + agent definitions only. Exceptions: `analyze-hn-post` and `browse-hn` both shell out to `uv` + Python (`hnparse.py` and `hnsearch.py`) for HN fetching
- **Evaluation model**: All 14 dimensions from EVALUATION.md must be covered, no skipping
- **Independence**: Each subagent evaluates its dimension without seeing other dimensions' results
- **Clean context**: Each evaluation run should be independent — no cross-contamination between runs
- **JSON contracts**: All pipeline artifacts are typed JSON validated against schemas in `schemas/`. The orchestrator does zero text processing on agent output — all aggregation is in `scripts/compute.py`
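The JSON-contract constraint can be exercised directly. A minimal sketch using the `jsonschema` package the Python tests already run with (`uv run --with jsonschema`); the schema below is an illustrative stand-in, not the real contract from `schemas/`:

```python
import jsonschema  # third-party; the test suite runs with `uv run --with jsonschema`

# Illustrative stand-in for a schema in schemas/ -- NOT the real
# dimension-evaluation.json contract.
DIMENSION_SCHEMA = {
    "type": "object",
    "required": ["dimension", "score", "evidence"],
    "properties": {
        "dimension": {"type": "string"},
        "score": {"type": "integer", "minimum": 0, "maximum": 5},
        "evidence": {"type": "array", "items": {"type": "string"}},
    },
}


def validate_artifact(artifact: dict) -> None:
    """Raise jsonschema.ValidationError if the artifact breaks the contract."""
    jsonschema.validate(instance=artifact, schema=DIMENSION_SCHEMA)


# A well-formed artifact passes silently.
validate_artifact({"dimension": "market-size", "score": 4, "evidence": ["thread quote"]})
```

This is the shape of the guarantee: agents emit JSON, the schema gate rejects malformed output before `scripts/compute.py` ever sees it.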

## Key Decisions

@@ -22,7 +22,7 @@ A Claude Code skillset (slash commands + subagent definitions) that helps founde
- **Researcher model:** Sonnet (web research in Prepare stage)
- **Context isolation:** FOUNDER.md passed ONLY to founder-market-fit evaluator; other 13 evaluators get EVALUATION_CONTEXT only
- **HN data location:** `~/.tweakidea/hn/hn-{id}/`
- **HN analysis model:** Inline (no subagent — runs in command context)

## Development

@@ -37,13 +37,13 @@ Source content lives in root-level directories (`agents/`, `commands/`, `skills/
**Do:**

- Re-run `node bin/install.js --local` after editing any file in `agents/`, `commands/`, or `skills/`
- Run `npm test` before committing — exercises both Node and Python test suites
- Validate JSON changes against the matching schema in `schemas/` when modifying agent output formats
- Keep all 14 dimensions — never skip, merge, or remove a dimension

**Don't:**

- Edit files inside `.claude/` — they are generated by the installer and will be overwritten
- Parse evaluator output in the orchestrator — `scripts/compute.py` handles all parsing
- Add npm dependencies — the project uses only Node built-ins
- Inject founder profile data into any evaluator except Founder-Market Fit
205 changes: 75 additions & 130 deletions CONTRIBUTING.md
@@ -1,110 +1,70 @@
# Contributing to TweakIdea

TweakIdea is a Claude Code skillset — markdown files in `agents/`, `commands/`, and `skills/`, plus small Python helpers in `scripts/` and `skills/ti-hnparse/`. There's no compiled code; editing it means editing markdown and the occasional Python script.

## Quick start

```bash
git clone https://github.com/eph5xx/tweakidea.git
cd tweakidea
node bin/install.js --local # populates ./.claude/ from source dirs
# edit a file in agents/, commands/, or skills/
node bin/install.js --local # refresh ./.claude/
npm test # smoke tests
```

The installer is the only build step. Always re-run it after changing source files — Claude Code reads from `.claude/`, not from the root-level source directories.

## Project structure

```
tweakidea/
  README.md                 # User-facing intro and command catalog
  CLAUDE.md                 # Project context loaded by Claude in this repo
  CONTRIBUTING.md           # This file
  LICENSE                   # MIT
  bin/
    install.js              # Installer (also handles --uninstall)
    manifest.js             # Registry of agents/commands/skills the installer copies
  commands/tweak/           # 8 slash commands
    analyze-hn-post.md
    browse-hn.md
    diff.md
    evaluate.md
    improve.md
    list.md
    share.md
    show.md
  agents/                   # Subagents spawned by /tweak:evaluate
    ti-evaluator.md         # Spawned 14× on Sonnet (one per dimension)
    ti-extractor.md         # Hypothesis extractor (Sonnet)
    ti-narrative.md         # Cross-dimensional narrative author (Opus)
    ti-researcher.md        # Web research (Sonnet)
  skills/
    ti-scoring/
      SKILL.md
      EVALUATION.md         # Scoring algorithm, weights, evidence tiers
      dimensions/           # 14 dimension definition files
    ti-founder/             # Founder profile template + fit question guidance
    ti-report/              # Jinja2 report templates + CSS
    ti-hnparse/             # HN fetch + Algolia search Python scripts
  schemas/                  # 10 JSON schemas validating pipeline artifacts
  scripts/
    compute.py              # Deterministic scorecard computation
    render_report.py        # Jinja2 report renderer (markdown + HTML)
    run-tests.cjs           # Node test runner (entry point for `npm test`)
    lib/                    # Shared modules: registry, schema, errors
    tests/                  # Python test suite (run via uv)
  package.json              # npm package metadata (name, version, bin entry)
  tests/                    # Node smoke tests (frontmatter, install, pack)
  assets/
    report.png              # README screenshot
  .github/workflows/        # CI: test on PRs, publish on release
```


**Edit the root-level files, not the `.claude/` copies** — `.claude/` is generated by the installer and overwritten on every run.
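The real installer is `bin/install.js`; as a rough Python sketch of what `--local` does (the directory names come from the structure above, the overwrite-on-every-run semantics from the note just given, and the exact copy behavior is assumed):

```python
import shutil
from pathlib import Path

# Source dirs the installer mirrors into .claude/ (per bin/manifest.js).
SOURCE_DIRS = ["agents", "commands", "skills"]


def install_local(repo_root: Path) -> None:
    """Mirror source directories into .claude/, replacing stale copies."""
    target = repo_root / ".claude"
    for name in SOURCE_DIRS:
        src = repo_root / name
        dst = target / name
        if dst.exists():
            shutil.rmtree(dst)       # .claude/ is disposable -- regenerate it
        if src.exists():
            shutil.copytree(src, dst)
```

The sketch makes the workflow rule concrete: any hand edit under `.claude/` is destroyed on the next install run, which is why only root-level files are worth editing.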

## Dimension Files

Each dimension in `skills/ti-scoring/dimensions/` defines:
- **Signals table**: What the evaluator looks for (strong/moderate/weak indicators)
- **Rubric criteria**: 5-level binary scoring (each level has PASS/FAIL criteria)
- **Weight**: How much this dimension contributes to the final score

To modify a dimension's scoring behavior, edit its dimension file. To add a new dimension, create a new file following the existing pattern and update EVALUATION.md to include it in the framework.
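`scripts/compute.py` owns the real math; purely as an illustration of how rubric levels and weights could combine, here is a sketch assuming a dimension's raw score is its count of passed levels (0 to 5). The slugs and weights below are hypothetical; EVALUATION.md defines the actual ones:

```python
def dimension_score(levels_passed: int) -> int:
    """Raw score: number of rubric levels passed (each level is PASS/FAIL)."""
    if not 0 <= levels_passed <= 5:
        raise ValueError("a dimension has exactly 5 rubric levels")
    return levels_passed


def weighted_total(results: dict[str, tuple[int, float]]) -> float:
    """results maps dimension slug -> (levels passed, weight); weights sum to 1."""
    return sum(dimension_score(passed) * weight for passed, weight in results.values())


# Hypothetical slugs and weights, for illustration only.
total = weighted_total({
    "market-size": (4, 0.6),
    "founder-market-fit": (3, 0.4),
})
```

The point of the shape: changing a dimension's weight changes only its multiplier, while changing rubric criteria changes how many levels a given idea passes.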

## Where to make X kind of change

| Change | Files to edit |
|--------|--------------|
@@ -116,64 +76,49 @@ To modify a dimension's scoring behavior, edit its dimension file. To add a new
| Cross-dimensional narrative | `agents/ti-narrative.md` |
| Research strategy | `agents/ti-researcher.md` |
| Evaluator behavior | `agents/ti-evaluator.md` |
| Founder profile template | `skills/ti-founder/SKILL.md` |
| Report templates | `skills/ti-report/` |
| Hypothesis extraction | `agents/ti-extractor.md` |
| Analyze-HN pipeline behavior | `commands/tweak/analyze-hn-post.md` |
| Browse-HN pipeline behavior | `commands/tweak/browse-hn.md` |
| HN fetch / parse logic | `skills/ti-hnparse/hnparse.py` |
| HN search logic | `skills/ti-hnparse/hnsearch.py` |
| Diff comparison behavior | `commands/tweak/diff.md` |
| Improve tweak generation | `commands/tweak/improve.md` |

| List / Show browser behavior | `commands/tweak/list.md`, `commands/tweak/show.md` |
| Share / upload behavior | `commands/tweak/share.md` |
| Installer behavior | `bin/install.js`, `bin/manifest.js` |

## Testing

Two test suites. CI runs both on every PR via `.github/workflows/test.yml`.

**Node suite** — smoke tests for packaging, frontmatter, install:

```bash
npm test
```

Runs `node scripts/run-tests.cjs`, which executes `tests/*.test.cjs` (`frontmatter.test.cjs`, `install-smoke.test.cjs`, `pack-smoke.test.cjs`) via Node's built-in `--test` runner.

**Python suite** — compute pipeline, schemas, agent contracts, render fixtures:

```bash
uv run scripts/tests/run_tests.py # all tests
uv run scripts/tests/run_tests.py -v # verbose
```

Uses `unittest` (not `pytest`). To run a single file:

```bash
uv run --with jsonschema python -m unittest scripts.tests.test_compute -v
```

Run from the repo root — tests import from the `scripts.tests` package.
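New Python tests follow the same `unittest` shape. A minimal hypothetical module for `scripts/tests/` (the class name and assertion are placeholders, not a real project file):

```python
import unittest


# Hypothetical example of the suite's unittest style -- not an actual test file.
class TestWeights(unittest.TestCase):
    def test_weights_sum_to_one(self):
        weights = [0.5, 0.3, 0.2]  # stand-in for the 14 dimension weights
        self.assertAlmostEqual(sum(weights), 1.0)
```

Such a file is picked up by the `uv run scripts/tests/run_tests.py` entry point, or run alone with the `python -m unittest` invocation shown above.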

## PR conventions

1. **One concern per PR** — don't mix dimension changes with pipeline changes.
2. **Describe the "why"** — explain what behavior you're changing and why the current behavior is wrong or insufficient.
3. **Test with a real idea** — run `/tweak:evaluate` end-to-end and confirm the output makes sense.
4. **Keep dimension files consistent** — if you change the structure of one dimension file, update all 14 to match.