Merged
20 changes: 10 additions & 10 deletions CLAUDE.md
@@ -4,15 +4,15 @@

A Claude Code skillset (slash commands + subagent definitions) that helps founders evaluate startup problems and discover product opportunities. Eight commands: `/tweak:browse-hn` searches HN via Algolia and ranks candidate threads to feed into `/tweak:analyze-hn-post`; `/tweak:analyze-hn-post` analyzes a single Hacker News discussion to identify technology shifts and surface product ideas; `/tweak:evaluate` runs 14 independent subagents to produce a weighted scorecard with assumption tracking; `/tweak:improve` reads a completed run and generates three concrete idea tweaks (small reframe, medium reshape, big reimagine) grounded in scoring rubrics; `/tweak:diff` compares two evaluation runs side-by-side showing score deltas, potential shifts, and per-dimension movers; `/tweak:list` and `/tweak:show` browse and open artifacts accumulated under `~/.tweakidea/`; `/tweak:share` uploads a run's HTML report to a secret GitHub gist and returns a shareable rendered preview URL.

**Core Value:** Help founders make better decisions — evaluate whether a problem is worth solving, or discover what problems are emerging from technology shifts.

### Constraints

- **Platform**: Claude Code CLI — slash commands + agent definitions only. Exceptions: `analyze-hn-post` and `browse-hn` both shell out to `uv` + Python (`hnparse.py` and `hnsearch.py`) for HN fetching
- **Evaluation model**: All 14 dimensions from EVALUATION.md must be covered, no skipping
- **Independence**: Each subagent evaluates its dimension without seeing other dimensions' results
- **Clean context**: Each evaluation run should be independent — no cross-contamination between runs
- **JSON contracts**: All pipeline artifacts are typed JSON validated against schemas in `schemas/`. The orchestrator does zero text processing on agent output — all aggregation is in `scripts/compute.py`
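The JSON-contract constraint can be exercised directly. A minimal sketch using the `jsonschema` package the Python tests already run with (`uv run --with jsonschema`); the schema below is an illustrative stand-in, not the real contract from `schemas/`:

```python
import jsonschema  # third-party; the test suite runs with `uv run --with jsonschema`

# Illustrative stand-in for a schema in schemas/ -- NOT the real
# dimension-evaluation.json contract.
DIMENSION_SCHEMA = {
    "type": "object",
    "required": ["dimension", "score", "evidence"],
    "properties": {
        "dimension": {"type": "string"},
        "score": {"type": "integer", "minimum": 0, "maximum": 5},
        "evidence": {"type": "array", "items": {"type": "string"}},
    },
}


def validate_artifact(artifact: dict) -> None:
    """Raise jsonschema.ValidationError if the artifact breaks the contract."""
    jsonschema.validate(instance=artifact, schema=DIMENSION_SCHEMA)


# A well-formed artifact passes silently.
validate_artifact({"dimension": "market-size", "score": 4, "evidence": ["thread quote"]})
```

This is the shape of the guarantee: agents emit JSON, the schema gate rejects malformed output before `scripts/compute.py` ever sees it.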

## Key Decisions

@@ -22,7 +22,7 @@ A Claude Code skillset (slash commands + subagent definitions) that helps founde
- **Researcher model:** Sonnet (web research in Prepare stage)
- **Context isolation:** FOUNDER.md passed ONLY to founder-market-fit evaluator; other 13 evaluators get EVALUATION_CONTEXT only
- **HN data location:** `~/.tweakidea/hn/hn-{id}/`
- **HN analysis model:** Inline (no subagent — runs in command context)

## Development

@@ -37,13 +37,13 @@ Source content lives in root-level directories (`agents/`, `commands/`, `skills/
**Do:**

- Re-run `node bin/install.js --local` after editing any file in `agents/`, `commands/`, or `skills/`
- Run `npm test` before committing — exercises both Node and Python test suites
- Validate JSON changes against the matching schema in `schemas/` when modifying agent output formats
- Keep all 14 dimensions — never skip, merge, or remove a dimension

**Don't:**

- Edit files inside `.claude/` — they are generated by the installer and will be overwritten
- Parse evaluator output in the orchestrator — `scripts/compute.py` handles all parsing
- Add npm dependencies — the project uses only Node built-ins
- Inject founder profile data into any evaluator except Founder-Market Fit
205 changes: 75 additions & 130 deletions CONTRIBUTING.md
@@ -1,110 +1,70 @@
# Contributing to TweakIdea

TweakIdea is a Claude Code skillset — markdown files in `agents/`, `commands/`, and `skills/`, plus small Python helpers in `scripts/` and `skills/ti-hnparse/`. There's no compiled code; editing it means editing markdown and the occasional Python script.

## Quick start

```bash
git clone https://github.com/eph5xx/tweakidea.git
cd tweakidea
node bin/install.js --local # populates ./.claude/ from source dirs
# edit a file in agents/, commands/, or skills/
node bin/install.js --local # refresh ./.claude/
npm test # smoke tests
```

The installer is the only build step. Always re-run it after changing source files — Claude Code reads from `.claude/`, not from the root-level source directories.

## Project structure

```
tweakidea/
  README.md                 # User-facing intro and command catalog
  CLAUDE.md                 # Project context loaded by Claude in this repo
  CONTRIBUTING.md           # This file
  LICENSE                   # MIT
  bin/
    install.js              # Installer (also handles --uninstall)
    manifest.js             # Registry of agents/commands/skills the installer copies
  commands/tweak/           # 8 slash commands
    analyze-hn-post.md
    browse-hn.md
    diff.md
    evaluate.md
    improve.md
    list.md
    share.md
    show.md
  agents/                   # Subagents spawned by /tweak:evaluate
    ti-evaluator.md         # Spawned 14× on Sonnet (one per dimension)
    ti-extractor.md         # Hypothesis extractor (Sonnet)
    ti-narrative.md         # Cross-dimensional narrative author (Opus)
    ti-researcher.md        # Web research (Sonnet)
  skills/
    ti-scoring/
      SKILL.md
      EVALUATION.md         # Scoring algorithm, weights, evidence tiers
      dimensions/           # 14 dimension definition files
    ti-founder/             # Founder profile template + fit question guidance
    ti-report/              # Jinja2 report templates + CSS
    ti-hnparse/             # HN fetch + Algolia search Python scripts
  schemas/                  # 10 JSON schemas validating pipeline artifacts
  scripts/
    compute.py              # Deterministic scorecard computation
    render_report.py        # Jinja2 report renderer (markdown + HTML)
    run-tests.cjs           # Node test runner (entry point for `npm test`)
    lib/                    # Shared modules: registry, schema, errors
    tests/                  # Python test suite (run via uv)
  package.json              # npm package metadata (name, version, bin entry)
  tests/                    # Node smoke tests (frontmatter, install, pack)
  assets/
    report.png              # README screenshot
  .github/workflows/        # CI: test on PRs, publish on release
```


**Edit the root-level files, not the `.claude/` copies** — `.claude/` is generated by the installer and overwritten on every run.
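The real installer is `bin/install.js`; as a rough Python sketch of what `--local` does (the directory names come from the structure above, the overwrite-on-every-run semantics from the note just given, and the exact copy behavior is assumed):

```python
import shutil
from pathlib import Path

# Source dirs the installer mirrors into .claude/ (per bin/manifest.js).
SOURCE_DIRS = ["agents", "commands", "skills"]


def install_local(repo_root: Path) -> None:
    """Mirror source directories into .claude/, replacing stale copies."""
    target = repo_root / ".claude"
    for name in SOURCE_DIRS:
        src = repo_root / name
        dst = target / name
        if dst.exists():
            shutil.rmtree(dst)       # .claude/ is disposable -- regenerate it
        if src.exists():
            shutil.copytree(src, dst)
```

The sketch makes the workflow rule concrete: any hand edit under `.claude/` is destroyed on the next install run, which is why only root-level files are worth editing.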

## Dimension Files

Each dimension in `skills/ti-scoring/dimensions/` defines:
- **Signals table**: What the evaluator looks for (strong/moderate/weak indicators)
- **Rubric criteria**: 5-level binary scoring (each level has PASS/FAIL criteria)
- **Weight**: How much this dimension contributes to the final score

To modify a dimension's scoring behavior, edit its dimension file. To add a new dimension, create a new file following the existing pattern and update EVALUATION.md to include it in the framework.
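`scripts/compute.py` owns the real math; purely as an illustration of how rubric levels and weights could combine, here is a sketch assuming a dimension's raw score is its count of passed levels (0 to 5). The slugs and weights below are hypothetical; EVALUATION.md defines the actual ones:

```python
def dimension_score(levels_passed: int) -> int:
    """Raw score: number of rubric levels passed (each level is PASS/FAIL)."""
    if not 0 <= levels_passed <= 5:
        raise ValueError("a dimension has exactly 5 rubric levels")
    return levels_passed


def weighted_total(results: dict[str, tuple[int, float]]) -> float:
    """results maps dimension slug -> (levels passed, weight); weights sum to 1."""
    return sum(dimension_score(passed) * weight for passed, weight in results.values())


# Hypothetical slugs and weights, for illustration only.
total = weighted_total({
    "market-size": (4, 0.6),
    "founder-market-fit": (3, 0.4),
})
```

The point of the shape: changing a dimension's weight changes only its multiplier, while changing rubric criteria changes how many levels a given idea passes.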

## Where to make X kind of change

| Change | Files to edit |
|--------|--------------|
@@ -116,64 +76,49 @@ To modify a dimension's scoring behavior, edit its dimension file. To add a new
| Cross-dimensional narrative | `agents/ti-narrative.md` |
| Research strategy | `agents/ti-researcher.md` |
| Evaluator behavior | `agents/ti-evaluator.md` |
| Founder profile template | `skills/ti-founder/SKILL.md` |
| Report templates | `skills/ti-report/` |
| Hypothesis extraction | `agents/ti-extractor.md` |
| Analyze-HN pipeline behavior | `commands/tweak/analyze-hn-post.md` |
| Browse-HN pipeline behavior | `commands/tweak/browse-hn.md` |
| HN fetch / parse logic | `skills/ti-hnparse/hnparse.py` |
| HN search logic | `skills/ti-hnparse/hnsearch.py` |
| Diff comparison behavior | `commands/tweak/diff.md` |
| Improve tweak generation | `commands/tweak/improve.md` |

| List / Show browser behavior | `commands/tweak/list.md`, `commands/tweak/show.md` |
| Share / upload behavior | `commands/tweak/share.md` |
| Installer behavior | `bin/install.js`, `bin/manifest.js` |

## Testing

Two test suites. CI runs both on every PR via `.github/workflows/test.yml`.

**Node suite** — smoke tests for packaging, frontmatter, install:

```bash
npm test
```

Runs `node scripts/run-tests.cjs`, which executes `tests/*.test.cjs` (`frontmatter.test.cjs`, `install-smoke.test.cjs`, `pack-smoke.test.cjs`) via Node's built-in `--test` runner.

**Python suite** — compute pipeline, schemas, agent contracts, render fixtures:

```bash
uv run scripts/tests/run_tests.py # all tests
uv run scripts/tests/run_tests.py -v # verbose
```

Uses `unittest` (not `pytest`). To run a single file:

```bash
uv run --with jsonschema python -m unittest scripts.tests.test_compute -v
```

Run from the repo root — tests import from the `scripts.tests` package.
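New Python tests follow the same `unittest` shape. A minimal hypothetical module for `scripts/tests/` (the class name and assertion are placeholders, not a real project file):

```python
import unittest


# Hypothetical example of the suite's unittest style -- not an actual test file.
class TestWeights(unittest.TestCase):
    def test_weights_sum_to_one(self):
        weights = [0.5, 0.3, 0.2]  # stand-in for the 14 dimension weights
        self.assertAlmostEqual(sum(weights), 1.0)
```

Such a file is picked up by the `uv run scripts/tests/run_tests.py` entry point, or run alone with the `python -m unittest` invocation shown above.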

## PR conventions

1. **One concern per PR** — don't mix dimension changes with pipeline changes.
2. **Describe the "why"** — explain what behavior you're changing and why the current behavior is wrong or insufficient.
3. **Test with a real idea** — run `/tweak:evaluate` end-to-end and confirm the output makes sense.
4. **Keep dimension files consistent** — if you change the structure of one dimension file, update all 14 to match.