nightshift: tech-debt-classify — test gaps, god modules, missing docs #27

# nightshift: tech-debt-classify — Microck/traccia

> Automated tech debt classification by [nightshift](https://github.com/Microck/hermes-nightshift-glm).

## Summary

Analyzed 19 source modules (6,710 lines) and 8 test files. The codebase is well-structured with good type annotation coverage on return types, but has significant gaps in **test coverage** (10 of 17 modules lack tests), **function documentation** (nearly zero docstrings outside cli.py), and several **oversized functions** that should be decomposed.

---

## 🔴 High Severity

### 1. Critical test coverage gaps — 10 modules untested

**Effort:** High | **Severity:** High

The following source modules have **no corresponding test file**:

| Module | Lines | Functions |
|--------|-------|-----------|
| `rendering.py` | 930 | 26 |
| `storage.py` | 517 | 28 |
| `llm.py` | 506 | 33 |
| `family_normalizer.py` | 449 | 18 |
| `cli.py` | 432 | 31 |
| `bootstrap.py` | 390 | 4 |
| `document_normalizer.py` | 287 | 11 |
| `extraction.py` | 192 | 8 |
| `pipeline_support.py` | 163 | 8 |
| `taxonomy.py` | 123 | 1 |
| `config.py` | 96 | 4 |
| `utils.py` | 45 | 8 |

**Recommendation:** Prioritize `storage.py`, `llm.py`, and `rendering.py`. They are the three largest untested modules and contain critical I/O and data transformation logic.
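The "no corresponding test file" check above can be reproduced with a short script. This is a minimal sketch, not the tool nightshift actually ran: it assumes sources live in `src/traccia/` and that tests follow a `tests/test_<module>.py` naming convention, and the function name `modules_without_tests` is hypothetical.

```python
from pathlib import Path


def modules_without_tests(src_dir: Path, tests_dir: Path) -> list[str]:
    """Return source module names that have no test_<module>.py file."""
    # Assumption: one test file per module, named test_<module>.py.
    tested = {p.stem.removeprefix("test_") for p in tests_dir.glob("test_*.py")}
    return sorted(
        p.stem
        for p in src_dir.glob("*.py")
        if p.stem != "__init__" and p.stem not in tested
    )


if __name__ == "__main__":
    # Assumed repository layout; adjust the paths if the project differs.
    for name in modules_without_tests(Path("src/traccia"), Path("tests")):
        print(f"{name}.py has no matching test file")
```

A name-based check like this only detects missing test files. It says nothing about how much of a module the existing tests exercise, so a coverage tool is still needed for that.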

### 2. Oversized functions needing decomposition

**Effort:** Medium | **Severity:** High

Twelve functions exceed 50 lines, some by a wide margin, which makes them hard to test and maintain:

| File | Function | Lines |
|------|----------|-------|
| `rendering.py` | `_write_viewer` | **222** |
| `pipeline.py` | `recompute_graph` | **179** |
| `pipeline.py` | `ingest_directory` | **135** |
| `storage.py` | `replace_graph` | **98** |
| `rendering.py` | `_write_obsidian_skill_notes` | **87** |
| `rendering.py` | `_write_node_pages` | **86** |
| `pipeline.py` | `_build_person_skill_state` | **80** |
| `parsers.py` | `_parse_reddit_export` | **72** |
| `parsers.py` | `_parse_source_content` | **70** |
| `rendering.py` | `_write_obsidian_evidence_notes` | **54** |
| `cli.py` | `doctor` | **52** |
| `rendering.py` | `_write_profile` | **51** |

**Recommendation:** Break `recompute_graph` (179 lines) and `_write_viewer` (222 lines) into smaller composable functions. Each sub-function should be independently testable.
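To track progress on decomposition, function lengths like those in the table can be measured with the standard `ast` module. The sketch below is an assumption about how such a measurement might be done, not the method used for this report. The helper name `long_functions` is hypothetical, and the 50-line threshold mirrors the one used above.

```python
import ast


def long_functions(source: str, threshold: int = 50) -> list[tuple[str, int]]:
    """Return (name, line_count) for functions longer than `threshold` lines."""
    results = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Count from the `def` line to the last line of the body.
            length = node.end_lineno - node.lineno + 1
            if length > threshold:
                results.append((node.name, length))
    # Longest functions first, matching the ordering of a priority list.
    return sorted(results, key=lambda item: item[1], reverse=True)
```

Calling `long_functions(Path("src/traccia/rendering.py").read_text())` would list that module's candidates. Counts may differ slightly from the table depending on whether decorators and blank lines are included.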

---

## 🟡 Medium Severity

### 3. Near-zero docstring coverage

**Effort:** Low-Medium | **Severity:** Medium

| File | Functions | Docstrings |
|------|-----------|------------|
| `pipeline.py` | 59 | **0** |
| `llm.py` | 33 | **0** |
| `rendering.py` | 26 | **0** |
| `storage.py` | 28 | **0** |
| `parsers.py` | 27 | **0** |
| `family_normalizer.py` | 18 | **0** |
| `cli.py` | 31 | **3** |

Only `cli.py` has any docstrings (3/31). This makes it hard for new contributors (or the CLAUDE.md agent) to understand intent.

**Recommendation:** Add docstrings to all public functions, starting with `storage.py` and `pipeline.py` which define the core data flow.
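Docstring coverage per file can be recomputed as docstrings are added. The following is a minimal sketch using the standard `ast` module. The function name `docstring_coverage` is hypothetical, and the counting rule (every `def` and `async def`, including nested ones) is an assumption that may not match how the table above was produced.

```python
import ast


def docstring_coverage(source: str) -> tuple[int, int]:
    """Return (documented, total) function counts for a module's source."""
    functions = [
        node
        for node in ast.walk(ast.parse(source))
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    ]
    # ast.get_docstring returns None when the function has no docstring.
    documented = sum(1 for f in functions if ast.get_docstring(f) is not None)
    return documented, len(functions)
```

Under the same counting rule as the table, `cli.py` would report `(3, 31)` and the other listed modules would report zero documented functions.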

### 4. Broad exception handling in pipeline.py

**Effort:** Low | **Severity:** Medium

```python
# src/traccia/pipeline.py:225
except Exception as exc:
```

A single broad `except Exception` in the pipeline module could silently swallow errors from parsing, graph manipulation, or I/O operations.

**Recommendation:** Replace with specific exception types (e.g., `ParserError`, `StorageError`, `GraphError`). At minimum, log the full traceback before re-raising.
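The recommendation could look roughly like the sketch below. Everything here is illustrative: `ParserError` and `StorageError` are the example names from the recommendation and are not known to exist in traccia, and `process_source`, `parse`, and `store` are hypothetical stand-ins for whatever runs inside the `try` block at `pipeline.py:225`.

```python
import logging

logger = logging.getLogger(__name__)


class ParserError(Exception):
    """Hypothetical error raised when a source file cannot be parsed."""


class StorageError(Exception):
    """Hypothetical error raised when persisting results fails."""


def process_source(path: str, parse, store) -> bool:
    """Parse and store one source; return False if it was skipped."""
    try:
        store(parse(path))
    except ParserError as exc:
        # Expected, recoverable failure: record it and continue.
        logger.warning("Skipping %s: %s", path, exc)
        return False
    except StorageError:
        # Not recoverable here: log the full traceback, then re-raise.
        logger.exception("Storage failed while processing %s", path)
        raise
    return True
```

Anything that is neither a `ParserError` nor a `StorageError` propagates unchanged, so programming errors such as a `TypeError` are no longer hidden by the handler.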

### 5. rendering.py is a god module (930 lines, 26 functions)

**Effort:** Medium | **Severity:** Medium

`rendering.py` handles viewer generation, node pages, profile writing, and Obsidian export all in one file. These are distinct concerns.

**Recommendation:** Split into `rendering/viewer.py`, `rendering/nodes.py`, `rendering/profile.py`, and `rendering/obsidian.py`.

---

## 🟢 Low Severity

### 6. Return type annotations are good, but parameter types could improve

**Effort:** Low | **Severity:** Low

Return type annotations are present on nearly every function (231/232), which is excellent. The one missing annotation is in `llm.py` (32/33). Parameter type annotations were not measured and should be audited for completeness, starting with `llm.py`.
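A parameter audit can be approximated with the standard `ast` module. This is a sketch under stated assumptions: the helper name `unannotated_params` is hypothetical, and skipping `self` and `cls` is a common convention, not something the report specifies.

```python
import ast


def unannotated_params(source: str) -> list[tuple[str, str]]:
    """Return (function, parameter) pairs that lack a type annotation."""
    missing = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        args = node.args
        # Collect positional, keyword-only, *args and **kwargs parameters.
        params = args.posonlyargs + args.args + args.kwonlyargs
        params += [a for a in (args.vararg, args.kwarg) if a is not None]
        for param in params:
            # self/cls are conventionally left unannotated.
            if param.annotation is None and param.arg not in ("self", "cls"):
                missing.append((node.name, param.arg))
    return missing
```

An empty result for a module means every parameter is annotated. It does not say whether the annotations are accurate, which is a job for a type checker.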

### 7. No linting beyond F + I rules

**Effort:** Low | **Severity:** Low

The ruff config only enables `F` (pyflakes) and `I` (isort) rules. Consider enabling additional rule sets:

```toml
[tool.ruff.lint]
select = ["F", "I", "UP", "B", "SIM", "TCH", "RUF"]
```

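This would flag likely bugs (`B`, flake8-bugbear), suggest simpler equivalents for verbose patterns (`SIM`), identify imports that are only needed for type checking (`TCH`), modernize outdated syntax (`UP`), and apply Ruff-specific checks (`RUF`).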

### 8. `pipeline.py` complexity

**Effort:** Medium | **Severity:** Low

At 1,251 lines with 59 functions, `pipeline.py` is the largest module. Most of its functions are reasonably sized, but the orchestration functions (`ingest_directory` at 135 lines, `recompute_graph` at 179 lines) are long and create deep call stacks. Consider extracting a `pipeline/` package with separate orchestrators.

---

## Classification Summary

| Category | Count | Key Examples |
|----------|-------|-------------|
| **Test gaps** | 10 modules | rendering, storage, llm, family_normalizer |
| **Code smells** | 12 functions >50 lines | _write_viewer (222), recompute_graph (179) |
| **Documentation gaps** | 225 undocumented functions | pipeline, llm, rendering, storage |
| **Architecture** | 2 god modules | rendering.py (930), pipeline.py (1251) |
| **Error handling** | 1 broad except | pipeline.py:225 |
| **Linting** | Minimal rules | F+I only |

## Priority Recommendations

1. **Add tests for `storage.py` and `llm.py`** — core I/O and backend modules with zero coverage
2. **Decompose `_write_viewer` (222 lines)** — largest single function in the codebase
3. **Decompose `recompute_graph` (179 lines)** — critical graph computation logic
4. **Add docstrings to public functions in `pipeline.py` and `storage.py`**
5. **Split `rendering.py` into focused submodules**
6. **Replace broad `except Exception` with specific exception types**
7. **Expand ruff linting rules** to catch more code quality issues
