nightshift: tech-debt-classify — Microck/traccia
Automated tech debt classification by nightshift.
Summary
Analyzed 19 source modules (6,710 lines) and 8 test files. The codebase is well-structured with good type annotation coverage on return types, but has significant gaps in test coverage (10 of 17 modules lack tests), function documentation (nearly zero docstrings outside cli.py), and several oversized functions that should be decomposed.
🔴 High Severity
1. Critical test coverage gaps — 10 modules untested
Effort: High | Severity: High
The following source modules have no corresponding test file:
| Module | Lines | Functions |
| --- | --- | --- |
| rendering.py | 930 | 26 |
| storage.py | 517 | 28 |
| llm.py | 506 | 33 |
| family_normalizer.py | 449 | 18 |
| cli.py | 432 | 31 |
| bootstrap.py | 390 | 4 |
| document_normalizer.py | 287 | 11 |
| extraction.py | 192 | 8 |
| pipeline_support.py | 163 | 8 |
| taxonomy.py | 123 | 1 |
| config.py | 96 | 4 |
| utils.py | 45 | 8 |
Recommendation: Prioritize storage.py, llm.py, and rendering.py — they are the largest untested modules with critical I/O and data transformation logic.
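The check for "no corresponding test file" can be approximated with a short script. This is a minimal sketch, assuming tests follow a `test_<module>.py` naming convention; the function name and the directory arguments are illustrative and not taken from the repository.

```python
from pathlib import Path


def modules_without_tests(src_dir: Path, tests_dir: Path) -> list[str]:
    """Return source module filenames that have no test_<module>.py file."""
    # Assumption: a module foo.py is covered by tests_dir/test_foo.py.
    tested = {p.name.removeprefix("test_") for p in tests_dir.glob("test_*.py")}
    return sorted(
        p.name
        for p in src_dir.glob("*.py")
        if p.name != "__init__.py" and p.name not in tested
    )
```

A file-name match only shows that a test file exists, not that it exercises the module, so the result is a lower bound on the real gap.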
2. Oversized functions needing decomposition
Effort: Medium | Severity: High
Twelve functions exceed 50 lines, several by a wide margin, which makes them hard to test and maintain:
| File | Function | Lines |
| --- | --- | --- |
| rendering.py | _write_viewer | 222 |
| pipeline.py | recompute_graph | 179 |
| pipeline.py | ingest_directory | 135 |
| storage.py | replace_graph | 98 |
| rendering.py | _write_node_pages | 86 |
| rendering.py | _write_obsidian_skill_notes | 87 |
| rendering.py | _write_obsidian_evidence_notes | 54 |
| rendering.py | _write_profile | 51 |
| pipeline.py | _build_person_skill_state | 80 |
| parsers.py | _parse_reddit_export | 72 |
| parsers.py | _parse_source_content | 70 |
| cli.py | doctor | 52 |
Recommendation: Break recompute_graph (179 lines) and _write_viewer (222 lines) into smaller composable functions. Each sub-function should be independently testable.
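Function lengths like those in the table can be measured with the standard `ast` module. The sketch below is one possible approach, not necessarily how nightshift computed its figures; `long_functions` and its `threshold` parameter are hypothetical names.

```python
import ast


def long_functions(source: str, threshold: int = 50) -> list[tuple[str, int]]:
    """Return (name, line_count) for every function longer than `threshold`."""
    results = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Count from the `def` line through the last line of the body.
            length = node.end_lineno - node.lineno + 1
            if length > threshold:
                results.append((node.name, length))
    # Longest first, so the best decomposition candidates come first.
    return sorted(results, key=lambda item: item[1], reverse=True)
```

Running this in CI against a fixed threshold would keep newly decomposed functions from growing back.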
🟡 Medium Severity
3. Near-zero docstring coverage
Effort: Low-Medium | Severity: Medium
| File | Functions | Docstrings |
| --- | --- | --- |
| pipeline.py | 59 | 0 |
| llm.py | 33 | 0 |
| rendering.py | 26 | 0 |
| storage.py | 28 | 0 |
| parsers.py | 27 | 0 |
| family_normalizer.py | 18 | 0 |
| cli.py | 31 | 3 |
Only cli.py has any docstrings (3/31). This makes it hard for new contributors (or the CLAUDE.md agent) to understand intent.
Recommendation: Add docstrings to all public functions, starting with storage.py and pipeline.py which define the core data flow.
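Docstring coverage per module can be tracked with a few lines of `ast` code. This is a minimal sketch under the assumption that every `def` and `async def` counts as a function, including nested ones; `docstring_coverage` is a hypothetical helper name.

```python
import ast


def docstring_coverage(source: str) -> tuple[int, int]:
    """Return (documented, total) function counts for one module's source."""
    functions = [
        node
        for node in ast.walk(ast.parse(source))
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    ]
    # ast.get_docstring returns None when the body has no leading string.
    documented = sum(1 for fn in functions if ast.get_docstring(fn) is not None)
    return documented, len(functions)
```

To limit the count to public functions, as the recommendation suggests, filter out names that start with an underscore before counting.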
4. Broad exception handling in pipeline.py
Effort: Low | Severity: Medium
# src/traccia/pipeline.py:225
except Exception as exc:
A single broad except Exception in the pipeline module could silently swallow errors from parsing, graph manipulation, or I/O operations.
Recommendation: Replace with specific exception types (e.g., ParserError, StorageError, GraphError). At minimum, log the full traceback before re-raising.
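The recommended pattern might look like the following sketch. `ParserError` and `StorageError` are hypothetical exception classes defined here for illustration, and `ingest_one` with its `parse` and `store` callables is an invented stand-in for the code around pipeline.py:225, which is not shown in this report.

```python
import logging

logger = logging.getLogger("traccia.pipeline")


class ParserError(Exception):
    """Hypothetical: raised when a source file cannot be parsed."""


class StorageError(Exception):
    """Hypothetical: raised when a read or write to the store fails."""


def ingest_one(path: str, parse, store) -> bool:
    """Ingest one file; return True on success, False on an expected failure."""
    try:
        store(parse(path))
    except (ParserError, StorageError):
        # Expected, recoverable failures: record the traceback and move on.
        logger.exception("Skipping %s", path)
        return False
    except Exception:
        # Anything else is likely a bug: log the traceback, then re-raise.
        logger.exception("Unexpected error while ingesting %s", path)
        raise
    return True
```

`logger.exception` logs at ERROR level and appends the active traceback, so nothing is silently swallowed in either branch.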
5. rendering.py is a god module (930 lines, 26 functions)
Effort: Medium | Severity: Medium
rendering.py handles viewer generation, node pages, profile writing, and Obsidian export all in one file. These are distinct concerns.
Recommendation: Split into rendering/viewer.py, rendering/nodes.py, rendering/profile.py, and rendering/obsidian.py.
🟢 Low Severity
6. Return type annotations are good, but parameter types could improve
Effort: Low | Severity: Low
Return type annotations are nearly complete across all modules (231 of 232 functions), which is excellent. Parameter type annotations were not measured and should be audited for completeness, starting with llm.py, the one module with a missing return annotation (32 of 33).
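A parameter-annotation audit can be scripted in the same style. The sketch below assumes `self` and `cls` are exempt, as is conventional; `unannotated_parameters` is a hypothetical helper, and functions that share a name within one module would overwrite each other in its result.

```python
import ast


def unannotated_parameters(source: str) -> dict[str, list[str]]:
    """Map each function name to its parameters that lack a type annotation."""
    missing: dict[str, list[str]] = {}
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        args = node.args
        # Collect positional-only, regular, and keyword-only parameters,
        # then *args and **kwargs if present.
        params = [*args.posonlyargs, *args.args, *args.kwonlyargs]
        if args.vararg:
            params.append(args.vararg)
        if args.kwarg:
            params.append(args.kwarg)
        names = [
            p.arg
            for p in params
            if p.annotation is None and p.arg not in ("self", "cls")
        ]
        if names:
            missing[node.name] = names
    return missing
```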
7. No linting beyond F + I rules
Effort: Low | Severity: Low
The ruff config only enables F (pyflakes) and I (isort) rules. Consider enabling additional rule sets:
[tool.ruff.lint]
select = ["F", "I", "UP", "B", "SIM", "TCH", "RUF"]
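This would catch likely bugs (B, flake8-bugbear), suggest simpler equivalents for verbose patterns (SIM), and flag imports that are only needed for type checking and could move into a TYPE_CHECKING block (TCH). It would also enable pyupgrade modernizations (UP) and Ruff-specific checks (RUF).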
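As an illustration of what the B rules add, flake8-bugbear's B006 flags mutable default arguments. The functions below are invented for this example and do not come from the traccia codebase.

```python
from __future__ import annotations


def add_tag_buggy(tag: str, tags: list[str] = []) -> list[str]:
    # The default list is created once and shared across calls (B006).
    tags.append(tag)
    return tags


def add_tag(tag: str, tags: list[str] | None = None) -> list[str]:
    # A None default gives every call its own fresh list.
    tags = [] if tags is None else list(tags)
    tags.append(tag)
    return tags
```

Pyflakes (F) and isort (I) accept the first version without complaint, which is why the wider rule set is worth the small configuration change.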
8. pipeline.py complexity
Effort: Medium | Severity: Low
At 1,251 lines with 59 functions, pipeline.py is the largest module. Most of its functions are reasonably sized, but the orchestration functions (ingest_directory at 135 lines, recompute_graph at 179) are long and create deep call stacks. Consider extracting a pipeline/ package with separate orchestrators.
Classification Summary
| Category | Count | Key Examples |
| --- | --- | --- |
| Test gaps | 10 modules | rendering, storage, llm, family_normalizer |
| Code smells | 12 functions >50 lines | _write_viewer (222), recompute_graph (179) |
| Documentation gaps | 225 undocumented functions | pipeline, llm, rendering, storage |
| Architecture | 2 god modules | rendering.py (930), pipeline.py (1251) |
| Error handling | 1 broad except | pipeline.py:225 |
| Linting | Minimal rules | F+I only |
Priority Recommendations
- Add tests for storage.py and llm.py: core I/O and backend modules with zero coverage
- Decompose _write_viewer (222 lines): largest single function in the codebase
- Decompose recompute_graph (179 lines): critical graph computation logic
- Add docstrings to public functions in pipeline.py and storage.py
- Split rendering.py into focused submodules
- Replace broad except Exception with specific exception types
- Expand ruff linting rules to catch more code quality issues