Ship: "compare" command (signal-level repository comparison)

Add a "compare" command that compares two supported targets using the existing ExplainThisRepo pipeline.

No new pipeline. No duplicated logic.

Reuse:

- target normalization
- local vs GitHub resolution
- file and directory support
- signal extraction
- output system

Add only:

- structured signal comparator
- comparison report renderer

## Why this matters

Right now the tool explains one codebase well.

The next real step is helping users answer:

- how are these two systems different?
- which one is simpler?
- which one is backend vs frontend?
- what actually changed between them?

This cannot be solved by diffing generated explanations. That approach is noisy and unreliable.

The comparison must be built on extracted signals.

## CLI

Primary:

```bash
explainthisrepo compare <target-a> <target-b>
```

Examples:

```bash
explainthisrepo compare facebook/react vercel/next.js
explainthisrepo compare . ../another-project
explainthisrepo compare owner/repo/path/to/file.py ./file.py
```

Aliases already supported:

- `etr compare <target-a> <target-b>`
- `explain-this-repo compare <target-a> <target-b>`
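
A minimal sketch of the command routing, assuming the CLI is built on Python's `argparse`. The parser layout and the `target_a` / `target_b` attribute names are hypothetical; the existing CLI wiring may look different.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser layout; only the "compare" subcommand is shown.
    parser = argparse.ArgumentParser(prog="explainthisrepo")
    subcommands = parser.add_subparsers(dest="command", required=True)

    compare = subcommands.add_parser("compare", help="compare two targets")
    # Both positionals accept any target form the tool already supports.
    compare.add_argument("target_a", metavar="target-a")
    compare.add_argument("target_b", metavar="target-b")
    return parser


args = build_parser().parse_args(["compare", ".", "../another-project"])
print(args.command, args.target_a, args.target_b)
```

Both targets are kept as raw strings here so that normalization and resolution stay in the existing pipeline rather than in the argument parser.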

## Supported inputs

Both sides accept anything currently supported:

- GitHub repo
- local repo
- GitHub directory
- local directory
- GitHub file
- local file

Examples:

```bash
explainthisrepo compare owner/repo owner/repo
explainthisrepo compare ./dir ./other-dir
explainthisrepo compare owner/repo/path/file.py ./file.py
explainthisrepo compare owner/repo ./file.py
```

## Supported pair types

- repo vs repo
- local repo vs local repo
- repo vs local repo
- repo vs file
- file vs file
- directory vs directory
- local dir vs GitHub dir
- GitHub file vs local file
- private repo vs public repo
- monorepo vs single-package repo

Output must adapt to what is actually comparable. No forced symmetry.

## Core flow

input A + input B
→ normalize
→ resolve
→ extract signals (existing pipeline)
→ compare signals (new layer)
→ render report
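
The flow above can be sketched as a single function that runs the same stages on both inputs and adds only the compare and render steps. All stage names here (`normalize`, `resolve`, `extract`, `compare`, `render`) are placeholders passed in as callables, not the project's actual function names.

```python
def run_compare(target_a, target_b, *, normalize, resolve, extract, compare, render):
    # Existing pipeline, applied identically to each side.
    def signals_for(target):
        return extract(resolve(normalize(target)))

    # New layer: structured diff, then report rendering.
    return render(compare(signals_for(target_a), signals_for(target_b)))


# Usage with trivial stand-ins for each stage (illustrative only).
report = run_compare(
    "Owner/Repo",
    "./local",
    normalize=str.lower,
    resolve=lambda target: {"path": target},
    extract=lambda resolved: {"stack": [resolved["path"]]},
    compare=lambda a, b: {"a": a["stack"], "b": b["stack"]},
    render=lambda diff: f"{diff['a']} vs {diff['b']}",
)
print(report)
```

Nothing in `run_compare` branches on the input type, which is the property the single-pipeline constraint asks for.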

## Hard rule

Do not compare generated explanations.

Comparison must follow these steps, in order:

1. extract structured signals
2. compare structured signals
3. generate explanation from diff

Anything else introduces noise and inconsistency.

## Canonical analysis model

## Target layer

- repo
- local_repo
- github_repo
- directory
- file

## Signal layer

Repos / directories:

- structure
- entrypoints
- manifests
- configs
- dependencies
- tech stack
- high-signal files
- architecture hints

Files:

- file type
- size
- imports
- exports
- symbols
- purpose
- logic shape

## Interpretation layer

- framework
- runtime
- package manager
- app shape
- frontend vs backend
- monorepo vs single package
- CLI vs library vs app
- entry path

## Diff layer

- same
- added
- removed
- changed
- stronger signal
- conflicting signal
- not comparable
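
As an illustration, the diff states could be modeled as an enum with a small classifier for a single-valued signal. This is a sketch under assumptions: "added" is read as "present only in B" and "removed" as "present only in A", and the `classify` helper is hypothetical. The "stronger signal" and "conflicting signal" states would need confidence metadata that is not modeled here.

```python
from enum import Enum
from typing import Optional


class DiffState(Enum):
    SAME = "same"
    ADDED = "added"
    REMOVED = "removed"
    CHANGED = "changed"
    STRONGER_SIGNAL = "stronger signal"
    CONFLICTING_SIGNAL = "conflicting signal"
    NOT_COMPARABLE = "not comparable"


def classify(value_a: Optional[str], value_b: Optional[str], comparable: bool = True) -> DiffState:
    # A domain that does not apply to one side is never forced into a verdict.
    if not comparable:
        return DiffState.NOT_COMPARABLE
    # Equal values (including both missing) count as "same".
    if value_a == value_b:
        return DiffState.SAME
    if value_a is None:
        return DiffState.ADDED  # assumption: present only in B
    if value_b is None:
        return DiffState.REMOVED  # assumption: present only in A
    return DiffState.CHANGED


print(classify("fastapi", "express").value)
print(classify(None, "docker").value)
```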

## Internal design

TargetResolver (existing)

Used twice. No change.

SignalExtractor (existing)

Must return structured data. Not just prose.
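
One possible shape for that structured output, sketched as a frozen dataclass. The class name `TargetSignals` and its fields are assumptions chosen to mirror the signal layer above; the extractor's real return type may differ.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class TargetSignals:
    # Hypothetical container; field names follow the signal layer list.
    target: str
    kind: str  # e.g. "repo", "directory", or "file"
    stack: tuple[str, ...] = ()
    entrypoints: tuple[str, ...] = ()
    manifests: tuple[str, ...] = ()
    dependencies: tuple[str, ...] = ()


signals = TargetSignals(
    target="owner/repo",
    kind="repo",
    stack=("fastapi", "docker"),
    entrypoints=("app/main.py",),
)
# A plain dict view is what a comparator would consume.
print(asdict(signals))
```

Immutable fields keep the comparator a pure function of its two inputs, which supports the determinism requirement.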

Comparator (new)

Takes two signal objects and returns a structured diff.

Example:

```json
{
  "stack": {
    "only_in_a": ["fastapi"],
    "only_in_b": ["express"],
    "shared": ["docker"]
  },
  "entrypoints": {
    "a": ["app/main.py"],
    "b": ["src/index.ts"]
  }
}
```
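
A minimal comparator sketch that produces the shape shown above from two signal dicts. The function names and the input dict keys are illustrative assumptions; only the output shape comes from the example.

```python
import json


def compare_stack(stack_a, stack_b) -> dict:
    set_a, set_b = set(stack_a), set(stack_b)
    # Sorting keeps the output stable across runs.
    return {
        "only_in_a": sorted(set_a - set_b),
        "only_in_b": sorted(set_b - set_a),
        "shared": sorted(set_a & set_b),
    }


def compare_signals(a: dict, b: dict) -> dict:
    return {
        "stack": compare_stack(a.get("stack", []), b.get("stack", [])),
        "entrypoints": {
            "a": list(a.get("entrypoints", [])),
            "b": list(b.get("entrypoints", [])),
        },
    }


diff = compare_signals(
    {"stack": ["fastapi", "docker"], "entrypoints": ["app/main.py"]},
    {"stack": ["express", "docker"], "entrypoints": ["src/index.ts"]},
)
print(json.dumps(diff, indent=2))
```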

ReportRenderer (existing + extension)

Renders comparison output.

## Output

Default file: `COMPARE.md`

Respect `--output`, with the same behavior as existing commands.

## Modes

Reuse:

- `--quick` → one-line verdict
- `--simple` → short comparison
- `--detailed` → structured diff
- default → full report

Do not reuse:

- `--stack`
- `--map`
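
A sketch of how the reused mode flags might be parsed, assuming `argparse` and assuming the three flags are mutually exclusive (the spec does not say so explicitly). The `mode` attribute name is hypothetical. `--stack` and `--map` are deliberately not registered.

```python
import argparse

parser = argparse.ArgumentParser(prog="explainthisrepo compare")
modes = parser.add_mutually_exclusive_group()
for name in ("quick", "simple", "detailed"):
    # Each flag writes its own name into args.mode; no flag means the full report.
    modes.add_argument(
        f"--{name}", dest="mode", action="store_const", const=name, default="default"
    )

print(parser.parse_args(["--simple"]).mode)
print(parser.parse_args([]).mode)
```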

## Report structure

```markdown
# Compare Report

## Summary

## What Target A looks like

## What Target B looks like

## Similarities

## Differences
- Stack
- Entry points
- Structure
- Architecture
- File types
- Dependency shape
- App type

## What matters most

## Confidence and limits
```

## Behavior rules

- Do not hallucinate missing signals
- Mark partial comparisons (e.g. file vs repo)
- Only compare overlapping domains
- Keep system deterministic
- LLM is optional and only for phrasing, not discovery
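
The overlap and partial-comparison rules could be enforced before any diffing happens. The sketch below assumes signals arrive as dicts keyed by domain name and treats an empty value as "signal not present"; the `comparable_domains` helper is hypothetical.

```python
def comparable_domains(signals_a: dict, signals_b: dict) -> dict:
    # Only domains with actual data on a side count as present.
    present_a = {name for name, value in signals_a.items() if value}
    present_b = {name for name, value in signals_b.items() if value}
    skipped = sorted(present_a ^ present_b)
    return {
        "compare": sorted(present_a & present_b),
        "skipped": skipped,
        # Any one-sided domain marks the comparison as partial.
        "partial": bool(skipped),
    }


repo_signals = {"stack": ["python", "fastapi"], "entrypoints": ["app/main.py"], "imports": []}
file_signals = {"stack": ["python"], "imports": ["os", "json"]}
plan = comparable_domains(repo_signals, file_signals)
print(plan)
```

Domains listed under `skipped` are reported as not comparable rather than filled in, so no missing signal is invented for either side.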

## Optional scoring

```json
{
  "stack_similarity": 0.72,
  "structure_similarity": 0.41,
  "architecture_distance": 0.83,
  "confidence": "high"
}
```
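
The spec does not say how these scores are computed. One deterministic option for a set-valued signal such as the stack is Jaccard similarity, sketched here as an assumption rather than the chosen metric.

```python
def jaccard(items_a, items_b) -> float:
    set_a, set_b = set(items_a), set(items_b)
    # Two empty sets are treated as identical to avoid dividing by zero.
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


score = round(jaccard(["fastapi", "docker"], ["express", "docker"]), 2)
print({"stack_similarity": score})
```

Here one shared item out of three distinct items gives a similarity of 0.33.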

## Implementation plan

1. Add "compare" command routing
2. Reuse resolver for both inputs
3. Ensure structured signal output
4. Build comparator (pure logic)
5. Extend renderer
6. Add tests
7. Update docs

## Test matrix

- repo vs repo
- local vs local
- repo vs local
- repo vs file
- file vs file
- dir vs dir
- local dir vs GitHub dir
- GitHub file vs local file
- private vs public repo
- monorepo vs single-package repo

Also:

- invalid input
- same target comparison
- bad paths
- output writing
- stdout mode
- no-LLM mode

## Failure to avoid

Do not implement:

EXPLAIN(A) + EXPLAIN(B) → text diff

This is unstable and untrustworthy.

Correct approach:

signals → diff → explanation

## Future

Design comparator to support multiple inputs:

`explainthisrepo compare A B C`

Not required now. Must not require rewrite later.
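
One way to keep that door open is to make the pairwise comparator the only primitive and build N-way comparison on top of it. The `compare_many` helper below is a hypothetical sketch; `compare_pair` stands in for whatever pairwise comparator gets built.

```python
from itertools import combinations


def compare_many(signals_by_label: dict, compare_pair) -> dict:
    # Every unordered pair is compared once, in insertion order of the labels.
    return {
        (left, right): compare_pair(signals_by_label[left], signals_by_label[right])
        for left, right in combinations(signals_by_label, 2)
    }


def shared_stack(a, b):
    # Stand-in pairwise comparator for the example.
    return sorted(set(a) & set(b))


result = compare_many(
    {"A": ["docker", "fastapi"], "B": ["docker", "express"], "C": ["express"]},
    shared_stack,
)
print(result)
```

With two inputs this reduces to a single pair, so the two-target command is just the smallest case.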

## Constraint

This must remain:

- signal-first
- deterministic
- single pipeline
- minimal new surface area

If new logic starts duplicating extraction or branching per input type, the design has failed.
