
feat: implement issues #48 #50 #78 #79 and ATS enhancements #80

Merged
athola merged 13 commits into main from ats-updates-0.2.3
Feb 16, 2026

Conversation

athola (Owner) commented Feb 1, 2026

Summary

Implements multiple backlog issues for v0.2.3 release, plus ATS scoring enhancements.

Fixes #48, fixes #50, fixes #78, fixes #79

Changes

Issue Fixes

ATS Enhancements (additional scope)

Architecture

  • Moved TaxonomyCache file I/O to shell layer (functional core / imperative shell)
  • TaxonomyCache Protocol in core, TaxonomyLocalCache in shell

Known Issues (from review)

Test Plan

  • All tests pass
  • 93.39% coverage
  • Architecture tests pass (layer separation)
  • Pre-commit hooks pass

athola added 10 commits January 25, 2026 18:21
Implements GitHub issues #78 and #79:

- Add frozen=True to Degree dataclass for immutability
- Use object.__setattr__ in __post_init__ for type transformation
- Add type_value property with cast() for proper type narrowing
- Add test for frozen dataclass immutability

The frozen design ensures thread-safety and hashability while
maintaining backwards compatibility through object.__setattr__
for type conversion during initialization.
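
The frozen-dataclass pattern described in this commit can be sketched roughly as follows. This is a minimal illustration, not the project's actual `Degree` class: the `DegreeType` enum, its members, and the `name` field are hypothetical stand-ins.

```python
from dataclasses import FrozenInstanceError, dataclass
from enum import Enum
from typing import Union, cast


class DegreeType(Enum):  # hypothetical enum, for illustration only
    BACHELOR = "bachelor"
    MASTER = "master"


@dataclass(frozen=True)
class Degree:
    name: str  # hypothetical field
    type: Union[str, DegreeType]  # raw strings accepted for backwards compatibility

    def __post_init__(self) -> None:
        if not self.type:
            raise ValueError("Degree type must not be empty")
        # A frozen dataclass blocks normal attribute assignment, so the
        # one-time conversion has to go through object.__setattr__.
        if isinstance(self.type, str):
            object.__setattr__(self, "type", DegreeType(self.type))

    @property
    def type_value(self) -> DegreeType:
        # After __post_init__ the field always holds a DegreeType;
        # cast() only informs the type checker and does nothing at runtime.
        return cast(DegreeType, self.type)


degree = Degree(name="BSc Computer Science", type="bachelor")
print(degree.type_value)  # DegreeType.BACHELOR
```

Because the instance is frozen and all fields are hashable, it can be used as a dictionary key or set member, and any later assignment such as `degree.name = "x"` raises `FrozenInstanceError`.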
Implements GitHub issue #48:

- File path errors now include actionable suggestions
- Format validation errors show clear expected values
- Empty file path error provides example
- File not found error includes helpful guidance

These improvements make CLI errors more user-friendly and
actionable, helping users understand what went wrong
and how to fix it.
Research document evaluating alternative similarity algorithms
for resume-job matching beyond Jaccard N-gram:

- Comprehensive comparison of 6 algorithms
- Analysis of strengths and limitations
- Recommendation: current tournament approach is optimal
- Suggests Levenshtein as low-effort enhancement for short fields

Issue #51 (PDF rendering) already documented in
wiki/PDF-Renderer-Evaluation.md

Resolves: #61 (research task)
Documents: #51 (already exists)
Implements GitHub issue #53:

- Add "cover" to TemplateType enum for cover letter templates
- Update TemplateType.values() to include cover template
- Add test coverage for cover template validation

The cover.html template already existed. This change makes it
a selectable option for users via --template cover when
generating documents from Markdown.

Users can now generate cover letters using:
  simple-resume generate --render-file cover.md --template cover
Implements GitHub issue #50 - add at least three new resume templates

Added three new HTML templates with distinct visual styles:

1. resume_modern.html - Clean minimalist design with Helvetica font
   - Horizontal contact bar
   - Blue accent color
   - Tag-based skills display

2. resume_professional.html - Corporate serif design
   - Georgia font for traditional look
   - Centered header with double border
   - Classic two-column skills layout

3. resume_creative.html - Bold gradient sidebar design
   - Purple gradient sidebar
   - Card-based entries
   - Modern sans-serif typography

All templates extend resume_base.html and are fully compatible
with the existing generation pipeline.
- Add TaxonomyConfig for opt-in API integration (disabled by default)
- Add TaxonomyCache with file system caching and TTL
- Add SkillsTaxonomyFetcher with graceful fallback to hardcoded list
- Add get_enhanced_skills() main entry point
- Implement TDD approach with 14 comprehensive tests
- Offline-first design: API disabled by default, graceful degradation

Closes #59
- Extract TaxonomyCache Protocol in core for interface definition
- Add NullTaxonomyCache in core for no-op implementation
- Move TaxonomyLocalCache to shell/taxonomy_cache.py for file I/O
- Update test regex for improved error message format (issue #48)
- Update test imports to use shell layer cache

Fixes architecture test violation. All 1728 tests pass, 93% coverage.
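
The core/shell split described in this commit might look roughly like the sketch below. The method names `get` and `set`, the `load_skills` helper, and the `"onet"` key are assumptions made for illustration; only the class names `TaxonomyCache` and `NullTaxonomyCache` come from the commit message.

```python
from __future__ import annotations

from typing import Optional, Protocol


class TaxonomyCache(Protocol):
    """Interface the core depends on; it performs no I/O itself."""

    def get(self, taxonomy_name: str) -> Optional[list[str]]: ...

    def set(self, taxonomy_name: str, skills: list[str]) -> None: ...


class NullTaxonomyCache:
    """No-op cache: every read is a miss and every write is discarded."""

    def get(self, taxonomy_name: str) -> Optional[list[str]]:
        return None

    def set(self, taxonomy_name: str, skills: list[str]) -> None:
        return None


def load_skills(cache: TaxonomyCache, fallback: list[str]) -> list[str]:
    """Hypothetical core function: sees only the Protocol, never the file system."""
    cached = cache.get("onet")
    return cached if cached is not None else list(fallback)


print(load_skills(NullTaxonomyCache(), ["python", "sql"]))  # ['python', 'sql']
```

A file-backed implementation in the shell layer would satisfy the same Protocol structurally, without the core importing it.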
Enhance keyword extraction with capitalized word patterns and fallback
extraction for documents with minimal technical markers. Add
MIN_FALLBACK_WORD_LENGTH constant and percentage flag to score_resume().

Update all sample YAML files with ATS-optimized content featuring
realistic tech skills (Python, JavaScript, React, AWS, Docker,
Kubernetes, SQL, REST APIs, CI/CD) to serve as better testing fixtures.

Add comprehensive taxonomy cache tests covering:
- NullTaxonomyCache no-op implementation
- Corrupted JSON handling
- Write failure resilience
- LinkedIn taxonomy stub behavior
- Successful API fetch caching
Add TestKeywordExtraction class with 8 tests covering:
- Acronym extraction (AWS, API)
- CamelCase term extraction
- Quoted phrase extraction
- Skill pattern extraction
- Fallback extraction for plain text
- max_keywords limit enforcement
- Duplicate removal
- Short keyword filtering

Improves keyword.py coverage from 68% to 98%.
Bump version to 0.2.3 and update documentation for new features:

- Offline-first skills taxonomy API with local caching (7-day TTL)
- Percentage parameter for score_resume() producing 0-100 scale
- Improved keyword extraction with capitalized word patterns
- TaxonomyCache refactored to functional core / imperative shell

Update README with new templates (resume_modern, resume_professional,
resume_creative) and skills taxonomy documentation. Sync wiki API
reference and Getting Started guide with current template list.

Fix duplicate [Unreleased] heading in CHANGELOG.
athola (Owner, Author) commented Feb 11, 2026

PR Review: feat: implement issues #48 #50 #52 #53 #78 #79

Verdict: Changes Requested

Scope Analysis

| Issue | Status | Notes |
|---|---|---|
| #48 Error messages | ✅ In-scope | Validation messages improved with context/suggestions |
| #50 Three templates | ✅ In-scope | creative, modern, professional HTML templates added |
| #52 JSON Resume | ⚠️ Questionable | No new JSON Resume code in diff. PR says "importer already existed"; if pre-existing, it shouldn't be listed as fixed here |
| #53 Cover letter | ⚠️ Incomplete | TemplateType.COVER enum added but no cover letter HTML template found in changed files |
| #78 Frozen Degree | ✅ In-scope | frozen=True with object.__setattr__ in __post_init__; clean approach |
| #79 Degree.type annotation | ✅ In-scope | Runtime narrowing with cast() import, simplified type_value property |

Out-of-scope additions (not in any listed issue):

These out-of-scope additions represent ~1000 lines of new code + ~500 lines of tests. Consider splitting into a separate PR or updating the PR title/description to accurately reflect scope.


Blocking Issues

1. Duplicate entry in creative_terms.py
_GENERAL_TERMS contains "ninja" → "expert" twice (lines 99-104 and 115-119). Remove the duplicate.

2. Missing cover letter template file
Issue #53 claims cover letter support and TemplateType.COVER = "cover" is added to the enum, but no cover.html template exists in the diff. Either add the template or remove #53 from the fixes list.

3. Issue #52 (JSON Resume) claim
No JSON Resume code changes in this PR. If the importer already existed, this issue should not be listed in the PR as "fixes #52" — it will auto-close the issue without new work.


Non-Blocking Improvements

4. confidence field should be an Enum
In CreativeTerm, confidence: str accepts arbitrary strings. A Confidence(Enum) with LOW/MEDIUM/HIGH would prevent typos and improve type safety.

5. f-strings in logging calls
taxonomy.py lines 212, 222, 225 use logger.debug(f"...") / logger.warning(f"..."). Use %s-style formatting for lazy evaluation (string isn't built if log level is disabled).

6. TAXONOMY_CACHE_TTL defined twice
Defined in both core/ats/taxonomy.py:29 and shell/taxonomy_cache.py:19. Single source of truth — shell should import from core.

7. TaxonomyCache alias shadows Protocol
taxonomy_cache.py:76 defines TaxonomyCache = TaxonomyLocalCache which shadows the TaxonomyCache Protocol from core. This creates import confusion. Use only the concrete name TaxonomyLocalCache.

8. Sample YAMLs lost diversity
All sample resumes now have near-identical skills (Python, JavaScript, React, AWS, Docker, Kubernetes, SQL, REST APIs, CI/CD). They no longer demonstrate different resume styles/industries — reducing their value as test fixtures and demos.


Code Quality

  • Architecture: Core/shell separation properly maintained (TaxonomyCache Protocol in core, file I/O in shell) ✅
  • Tests: Comprehensive coverage for new modules (taxonomy: 16 tests, creative terms: 17 tests, keyword scorer: 30 tests) ✅
  • Version: pyproject.toml bumped to 0.2.3, CHANGELOG updated ✅
  • No security concerns identified ✅
  • No AI slop detected in documentation ✅

Recommendation

Fix the 3 blocking issues (duplicate entry, missing cover template, #52 claim), then this is ready to merge. The out-of-scope work is quality code — just needs the PR description updated to accurately reflect what's included.

athola (Owner, Author) commented Feb 11, 2026

Test Plan

Automated Verification

  • make test — All existing + new tests pass
  • make lint — No ruff/mypy violations
  • make format — Code formatting clean

Manual Verification

Blocking Issue Verification

  • Confirm creative_terms.py _GENERAL_TERMS has no duplicate "ninja" entries
  • Confirm cover.html template file exists in shell/assets/templates/html/
  • Confirm fixes #52 is either backed by new code or removed from PR description

Out-of-Scope Sanity Checks

  • Taxonomy: get_enhanced_skills() returns hardcoded list by default (offline-first)
  • Taxonomy: get_enhanced_skills(use_taxonomy=True) falls back gracefully
  • Creative terms: expand_term("rockstar developer") returns "senior developer"
  • Sample YAMLs: Spot-check that samples still generate valid PDFs/HTML

athola changed the title from "feat: implement issues #48 #50 #52 #53 #78 #79" to "feat: implement issues #48 #50 #78 #79 and ATS enhancements" on Feb 11, 2026

athola (Owner, Author) commented Feb 11, 2026

Comprehensive PR Review (5-Agent Analysis)

Five specialized analysis agents reviewed this PR independently. Findings have been deduplicated and aggregated below.


Critical Issues (must fix before merge)

1. Creative term expansion inflates keyword count, distorting ATS scores
src/simple_resume/core/ats/keyword.py:334-343 -- When creative expansion is enabled, expanded terms are appended to the keywords list. This inflates total_keywords (diluting the score denominator), double-counts when both original and expanded terms are present, and penalizes the score if the expanded professional term is not found in the resume. The expanded term should replace the original creative keyword, not be added alongside it. Additionally, keywords.append(expanded) mutates the caller's original list, causing unbounded growth on repeated calls with the same list.
(Code Review, Test Coverage)
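
One way to get the "replace, don't append, don't mutate" behaviour described above is sketched here. The function name, its signature, and the plain-dict expansion table are hypothetical and not taken from keyword.py.

```python
from __future__ import annotations

from typing import Mapping, Sequence


def expand_creative_terms(
    keywords: Sequence[str], expansions: Mapping[str, str]
) -> list[str]:
    """Return a NEW list in which creative terms are replaced, not duplicated."""
    seen: set[str] = set()
    result: list[str] = []
    for keyword in keywords:
        # Replace the creative term with its professional form if one exists.
        replacement = expansions.get(keyword.lower(), keyword)
        key = replacement.lower()
        if key not in seen:  # avoid double-counting "ninja" and "expert"
            seen.add(key)
            result.append(replacement)
    return result


original = ["ninja", "expert", "Python"]
expanded = expand_creative_terms(original, {"ninja": "expert"})
print(expanded)  # ['expert', 'Python']
print(original)  # ['ninja', 'expert', 'Python'] (caller's list is untouched)
```

Since the result is built from scratch, the keyword count used as the score denominator cannot grow across repeated calls with the same input.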

2. Broad except Exception swallows programming bugs in taxonomy fetcher
src/simple_resume/core/ats/taxonomy.py:221-222 -- Catches all exceptions including TypeError, AttributeError, NotImplementedError, RecursionError, and MemoryError, downgrading them to a warning log. When a user passes an unsupported taxonomy name, the NotImplementedError from _fetch_from_api is silently caught and the user gets hardcoded data with no indication their configuration was ignored. Fix: catch only (OSError, ConnectionError, TimeoutError, ValueError) and re-raise NotImplementedError.
(Silent Failure Hunter, Code Review)
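
A narrowed handler along the lines suggested above could look like this sketch. `get_skills`, its `fetch` callable parameter, and `FALLBACK_SKILLS` are illustrative stand-ins, not the real taxonomy.py API.

```python
from __future__ import annotations

import logging
from typing import Callable

logger = logging.getLogger(__name__)

FALLBACK_SKILLS = ("python", "sql")  # stand-in for the hardcoded skills list


def get_skills(fetch: Callable[[str], list[str]], taxonomy: str) -> list[str]:
    try:
        skills = fetch(taxonomy)
    except NotImplementedError:
        # An unsupported taxonomy is a configuration error: let it surface.
        raise
    except (OSError, ValueError) as exc:
        # ConnectionError and TimeoutError are subclasses of OSError, so
        # network failures land here. TypeError, AttributeError and other
        # programming bugs are not caught and propagate normally.
        logger.warning(
            "Taxonomy fetch for %s failed (%s); using fallback", taxonomy, exc
        )
        return list(FALLBACK_SKILLS)
    return skills or list(FALLBACK_SKILLS)
```

The explicit `except NotImplementedError: raise` clause is redundant at runtime, because that exception is not an `OSError` or `ValueError`, but it documents the intent for future readers.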

3. API stubs are production mocks that silently fail
src/simple_resume/core/ats/taxonomy.py:248-286 -- _fetch_onet() and _fetch_linkedin() return empty lists (falsy in Python), causing the caller's if skills: check to silently fall through to hardcoded fallback. Users who set enabled=True get the exact same results as enabled=False with no error or warning. Stubs should raise NotImplementedError, or TaxonomyConfig should validate that the requested taxonomy is actually implemented.
(Silent Failure Hunter)

4. XSS vulnerability in new HTML templates via autoescape false
shell/assets/templates/html/resume_creative.html:261, resume_modern.html:228, resume_professional.html:199 -- All three new templates use {% autoescape false %} to render entry.description. If resume YAML contains malicious HTML/JS, it renders verbatim. Either sanitize description content before templating or document this as a known limitation.
(Code Review)
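
If descriptions are meant to be plain text, one sanitizing option is to escape them before they reach the template. The sketch below uses only the standard library; `sanitize_description` is a hypothetical helper, and it assumes no intentional markup needs to survive (if descriptions carry deliberate HTML, an allow-list sanitizer would be needed instead).

```python
import html


def sanitize_description(raw: str) -> str:
    """Escape HTML metacharacters so the text renders literally."""
    return html.escape(raw, quote=True)


print(sanitize_description("<script>alert(1)</script>"))
# &lt;script&gt;alert(1)&lt;/script&gt;
```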

5. _extract_keywords capitalized/acronym regex patterns are dead code
src/simple_resume/core/ats/keyword.py:108-160 -- original_text = text is assigned after _preprocess_text() has already lowercased text. Regex patterns like r"\b[A-Z]{2,}\b" and r"\b[A-Z][a-z]{2,}\b" will never match against lowercased input. The primary extraction paths (technical terms, capitalized words) are effectively dead code in the default case_sensitive=False mode. Fix: preserve the original text before preprocessing.
(Code Review)
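
The ordering bug is easy to reproduce in isolation. In this sketch `text.lower()` stands in for `_preprocess_text()`, and both function names are hypothetical; only the acronym pattern is taken from the review.

```python
import re

ACRONYM = re.compile(r"\b[A-Z]{2,}\b")


def extract_acronyms_buggy(text: str) -> list:
    text = text.lower()      # stand-in for _preprocess_text()
    original_text = text     # captured too late: already lowercased
    return ACRONYM.findall(original_text)


def extract_acronyms_fixed(text: str) -> list:
    original_text = text     # captured before any case folding
    text = text.lower()      # lowercased copy would feed the case-insensitive paths
    return ACRONYM.findall(original_text)


sample = "Built REST API on AWS"
print(extract_acronyms_buggy(sample))  # []
print(extract_acronyms_fixed(sample))  # ['REST', 'API', 'AWS']
```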


Important Issues (should fix)

6. Duplicate "ninja" entry in _GENERAL_TERMS
src/simple_resume/core/ats/creative_terms.py:103-107 and 115-119 -- Identical CreativeTerm(creative="ninja", ...) appears twice. Propagates into all industry dictionaries via spread. Remove the duplicate at lines 115-119.
(Code Review, Comment Analyzer, Type Design)

7. TAXONOMY_CACHE_TTL constant duplicated across core and shell layers
src/simple_resume/core/ats/taxonomy.py:29 and shell/taxonomy_cache.py:20 -- Same value defined independently. If one is updated without the other, cache TTL silently diverges. Shell should import from core or accept TTL from TaxonomyConfig.
(Code Review, Comment Analyzer)

8. Cache write failure logged at wrong severity
src/simple_resume/shell/taxonomy_cache.py:68-72 -- Write failures to cache are logged at warning but represent data loss (fetched skills won't be cached, causing repeated API calls). Should be error level.
(Silent Failure Hunter)

9. Cache corruption indistinguishable from cache miss
src/simple_resume/shell/taxonomy_cache.py:45-57 -- Both corrupted JSON and missing files return None. Users cannot distinguish "cache not populated yet" from "cache is corrupted and being silently ignored." Log corrupted cache reads at warning level with the file path.
(Silent Failure Hunter)
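
A read path that separates the two cases might look like the following sketch. It assumes, for simplicity, that the cache file holds a bare JSON list; the real cache format (for example TTL metadata) is not shown in this PR, and `read_cache` is a hypothetical name.

```python
from __future__ import annotations

import json
import logging
from pathlib import Path
from typing import Optional

logger = logging.getLogger(__name__)


def read_cache(path: Path) -> Optional[list]:
    if not path.exists():
        return None  # ordinary miss: nothing cached yet, nothing to report
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError) as exc:
        # Corruption is reported with the file path so users can delete it.
        logger.warning("Ignoring corrupted taxonomy cache at %s: %s", path, exc)
        return None
    if not isinstance(data, list):
        logger.warning(
            "Ignoring taxonomy cache at %s: expected a list, got %s",
            path,
            type(data).__name__,
        )
        return None
    return data
```

Both failure modes still return `None`, so callers are unchanged; the difference is visible only in the log.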

10. Hidden side effect in _get_cache_path getter
src/simple_resume/shell/taxonomy_cache.py:33-36 -- _get_cache_path() calls self.cache_dir.mkdir(parents=True, exist_ok=True) on every invocation. A getter should not create directories. Move directory creation to __post_init__ or the constructor.
(Silent Failure Hunter, Test Coverage)
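
Moving the directory creation out of the getter could be as small as the sketch below. The class here is a stripped-down stand-in for `TaxonomyLocalCache`, and the `<name>.json` file naming is an assumption.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class TaxonomyLocalCache:
    cache_dir: Path

    def __post_init__(self) -> None:
        # Side effect happens once, at construction time; an OSError here
        # surfaces immediately instead of on the first cache read.
        self.cache_dir.mkdir(parents=True, exist_ok=True)

    def _get_cache_path(self, taxonomy_name: str) -> Path:
        # Pure path computation: no file-system access.
        return self.cache_dir / f"{taxonomy_name}.json"
```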

11. Fallback logged at INFO when user action was ignored
src/simple_resume/core/ats/taxonomy.py:225-226 -- When a user explicitly enables taxonomy API and the system falls back to hardcoded data, logging at INFO is too quiet. Should be WARNING with context about what was configured vs. what happened.
(Silent Failure Hunter)

12. TaxonomyConfig is mutable with no validation
src/simple_resume/core/ats/taxonomy.py -- TaxonomyConfig dataclass allows mutation after creation and has no field validation. Adding frozen=True is a one-keyword fix for significant safety improvement.
(Type Design)
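
A frozen, validated config might be sketched as below. The field names, the enum members, and the `IMPLEMENTED_SOURCES` set are assumptions; the 7-day TTL matches the value mentioned in the 0.2.3 release notes. Validating implemented sources at construction time is also one way to address the silent-stub problem from Critical #3.

```python
from dataclasses import FrozenInstanceError, dataclass
from enum import Enum


class TaxonomySource(Enum):  # member names are illustrative
    ONET = "onet"
    LINKEDIN = "linkedin"


# Assumption: both fetchers are still stubs, so no source is implemented yet.
IMPLEMENTED_SOURCES: frozenset = frozenset()


@dataclass(frozen=True)
class TaxonomyConfig:
    enabled: bool = False
    source: TaxonomySource = TaxonomySource.ONET
    cache_ttl_seconds: int = 7 * 24 * 60 * 60  # 7 days

    def __post_init__(self) -> None:
        if self.cache_ttl_seconds <= 0:
            raise ValueError("cache_ttl_seconds must be positive")
        if self.enabled and self.source not in IMPLEMENTED_SOURCES:
            # Fail loudly instead of silently serving hardcoded data.
            raise NotImplementedError(
                f"Taxonomy source {self.source.value!r} is not implemented"
            )


config = TaxonomyConfig()  # offline-first default: API disabled
```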

13. CreativeTerm.confidence uses free-form str instead of constrained type
src/simple_resume/core/ats/creative_terms.py:42 -- Typed as str with documented values "low"/"medium"/"high" but no runtime validation. The field is never used in any logic. Use Literal["low", "medium", "high"] or a Confidence enum, or remove the field until needed (YAGNI).
(Code Review, Type Design, Comment Analyzer)
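
If the field is kept, the enum variant could look like this sketch. The `professional` field name and the default of `MEDIUM` are assumptions; `creative` and the three levels come from the review text.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Union


class Confidence(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


@dataclass(frozen=True)
class CreativeTerm:
    creative: str
    professional: str  # hypothetical field name
    confidence: Union[Confidence, str] = Confidence.MEDIUM

    def __post_init__(self) -> None:
        # Confidence(...) accepts either a member or its string value and
        # raises ValueError on anything else, which catches typos early.
        object.__setattr__(self, "confidence", Confidence(self.confidence))


term = CreativeTerm(creative="ninja", professional="expert", confidence="high")
print(term.confidence)  # Confidence.HIGH
```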

14. TemplateType.values() is manually maintained
src/simple_resume/core/constants/__init__.py:152-161 -- Every enum member must be explicitly listed. If a member is added without updating values(), the set becomes inconsistent. Replace with {member.value for member in cls}. Same issue exists in OutputFormat.values().
(Type Design)
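
The dynamic form suggested above, shown on a cut-down enum. The member names are illustrative; the values are the template names mentioned in this PR.

```python
from __future__ import annotations

from enum import Enum


class TemplateType(Enum):
    MODERN = "resume_modern"
    PROFESSIONAL = "resume_professional"
    CREATIVE = "resume_creative"
    COVER = "cover"

    @classmethod
    def values(cls) -> set[str]:
        # Derived from the members, so it can never drift out of sync.
        return {member.value for member in cls}


print("cover" in TemplateType.values())  # True
```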

15. Mutable global lists can be corrupted
src/simple_resume/core/ats/taxonomy.py:33-93 (HARDCODED_SKILLS), creative_terms.py (_GENERAL_TERMS, _TECH_TERMS, etc.) -- Plain list objects at module level. Any consumer can mutate shared state. Use tuple for immutable module-level sequences.
(Type Design)


Suggestions (nice to have)

  • f-strings in logging: Multiple locations in taxonomy.py (lines 211, 222, 225) use f-string interpolation inside logger.debug()/logger.warning(). Use lazy formatting: logger.debug("Using cached skills from %s", taxonomy).
  • SkillsTaxonomyFetcher exposes mutable internals: self.config and self.cache are public attributes. Prefix with _ or use __slots__.
  • normalize_term() docstring understates behavior: Says "lowercase, trimmed" but also collapses whitespace via re.sub(r"\s+", " ", ...). Update to "lowercase, whitespace-collapsed, trimmed."
  • get_enhanced_skills() doctest is fragile: Claims len(skills) > 50 but HARDCODED_SKILLS has exactly 59 entries. Use len(skills) > 0 or assert isinstance(skills, list).
  • taxonomy.py references non-existent "ADR002": Lines 15-17 and 121. Only ADR003 and ADR008 files exist under wiki/architecture/. Verify ADR002 is accessible or fix the reference.
  • Redundant inline comments in taxonomy.py: Lines 203, 208, 214, 218, 224 narrate obvious code steps. The 22-line method is self-explanatory.
  • COVER breaks TemplateType naming convention: All other values are resume_* prefixed. Consider whether cover letter templates belong in a separate type.
  • Degree type annotation (str) contradicts runtime guarantee: The __post_init__ narrows the type, but the declared type still says str.
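
The difference behind the f-string logging suggestion can be observed directly. In this sketch the `CountingName` class and the logger name are made up; the object simply counts how often it is converted to a string.

```python
import logging

logger = logging.getLogger("taxonomy_example")  # hypothetical logger name
logger.setLevel(logging.WARNING)  # DEBUG is disabled for this logger


class CountingName:
    """Counts how many times it is rendered as a string."""

    def __init__(self) -> None:
        self.calls = 0

    def __str__(self) -> str:
        self.calls += 1
        return "onet"


taxonomy = CountingName()

# Lazy: the message is never formatted because DEBUG is disabled.
logger.debug("Using cached skills from %s", taxonomy)
print(taxonomy.calls)  # 0

# Eager: the f-string is built before logger.debug() is even called.
logger.debug(f"Using cached skills from {taxonomy}")
print(taxonomy.calls)  # 1
```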

Test Coverage Gaps

| Gap | File | Rating | Description |
|---|---|---|---|
| _get_cache_path mkdir failure | shell/taxonomy_cache.py:33-36 | 8/10 | No test for OSError when cache dir cannot be created. Could crash in read-only containers. |
| score() mutates caller's keyword list | core/ats/keyword.py:334-343 | 8/10 | No test verifying the original list is unchanged after calling score() with creative expansion. |
| _fetch_from_api with unknown taxonomy | core/ats/taxonomy.py:241-246 | 8/10 | NotImplementedError only tested indirectly via get_skills, which catches all exceptions. Need direct test. |
| Path traversal in taxonomy_name | shell/taxonomy_cache.py:36 | 7/10 | taxonomy_name="../escape" could write outside cache directory. No sanitization or test. |
| _normalize_score edge cases | core/ats/base.py:149-152 | 5/10 | Division-by-zero guard (min_val == max_val) exists but is untested. |
| Overly loose assertions | tests/unit/test_ats_keyword_scorer.py:477-500 | Minor | Tests assert score >= 0.0, which is tautologically true for any valid result. Strengthen to verify fuzzy_matches > 0 / == 0. |
| Confidence tests test dataclass, not behavior | tests/unit/test_creative_terms.py:142-170 | Minor | TestConfidenceScoring only verifies field assignment works, not any scoring logic. |
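
For the path-traversal gap listed above, a guard plus matching test could be sketched like this. `cache_path`, the allowed character set, and the `.json` suffix are assumptions rather than the project's actual code.

```python
import re
from pathlib import Path

# Assumption: taxonomy names only need letters, digits, underscore and hyphen.
_SAFE_NAME = re.compile(r"[A-Za-z0-9_-]+")


def cache_path(cache_dir: Path, taxonomy_name: str) -> Path:
    """Build the cache file path, rejecting names that could escape cache_dir."""
    if not _SAFE_NAME.fullmatch(taxonomy_name):
        raise ValueError(f"Unsafe taxonomy name: {taxonomy_name!r}")
    return cache_dir / f"{taxonomy_name}.json"


def test_rejects_path_traversal() -> None:
    try:
        cache_path(Path("cache"), "../escape")
    except ValueError:
        return
    raise AssertionError("expected ValueError for '../escape'")


test_rejects_path_traversal()
print(cache_path(Path("cache"), "onet"))
```

An allow-list is used here rather than stripping `..` segments, since it also rules out absolute paths and path separators without special cases.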

Type Design Summary

| Type | Encap. | Invariant Expr. | Usefulness | Enforcement | Key Issue |
|---|---|---|---|---|---|
| Degree | 7 | 5 | 8 | 7 | Type annotation contradicts runtime guarantee |
| CreativeTerm | 6 | 3 | 7 | 2 | confidence: str should be enum |
| Industry | 9 | 9 | 8 | 9 | Well-designed, no issues |
| TaxonomyConfig | 5 | 4 | 7 | 2 | Mutable, no validation |
| TaxonomyCache (Protocol) | 8 | 7 | 9 | 6 | Name collision with shell alias |
| NullTaxonomyCache | 9 | 9 | 8 | 9 | Textbook Null Object -- excellent |
| TemplateType | 8 | 8 | 8 | 7 | Manual values() sync risk |
Strengths

  • Functional core / imperative shell architecture is well-followed. TaxonomyCache Protocol in core with TaxonomyLocalCache implementation in shell is a clean separation. NullTaxonomyCache is a textbook Null Object pattern.
  • Validation error messages are excellent. context dictionaries with suggestion keys and errors lists in ValidationError give users actionable feedback. This is the standard the rest of the codebase should follow.
  • Module-level docstrings are thorough. taxonomy.py and creative_terms.py follow an excellent pattern: purpose statement, feature list, and design cross-references.
  • KeywordScorer handles edge cases without try/except. Returns explicit zero-score results with descriptive error fields. Clean, testable, no hidden failures.
  • Degree frozen dataclass raises ValueError for empty types rather than silently defaulting.
  • Test suite is strong overall (209 tests passing). Tests follow good practices: behavior over implementation, good fixture usage, parametrize for explicit input/output pairs, BDD Scenario pattern in validation tests.
  • Taxonomy tests precisely test the offline-first contract. Validates disabled API returns hardcoded skills, cache miss falls back, cached data used before API calls.
  • Error handling in TaxonomyLocalCache is well-tested. Corrupted JSON, write failures, and invalid data types all have dedicated tests.

Recommended Action (Priority Order)

  1. Fix creative term expansion logic (Critical #1) -- Replace original keyword instead of appending; prevent caller list mutation
  2. Narrow except Exception to specific types (Critical #2) -- Let NotImplementedError and programming bugs propagate
  3. Make API stubs raise NotImplementedError (Critical #3) -- Or validate taxonomy support in TaxonomyConfig
  4. Fix dead regex patterns in _extract_keywords (Critical #5) -- Preserve original text before preprocessing
  5. Remove duplicate "ninja" entry (Important #6) -- One-line fix
  6. Deduplicate TAXONOMY_CACHE_TTL (Important #7) -- Single source of truth in core
  7. Add frozen=True to TaxonomyConfig (Important #12) -- One keyword, significant safety gain
  8. Make TemplateType.values() dynamic (Important #14) -- {m.value for m in cls} eliminates maintenance trap
  9. Address XSS in templates (Critical #4) -- Sanitize or document the autoescape false usage
  10. Add missing tests for cache mkdir failure, keyword list mutation, and path traversal

- Extract KeywordScorerConfig dataclass to fix PLR0913 (too many args)
- Extract _expand_creative_terms helper to fix PLR0912 (too many branches)
- Add TaxonomySource enum replacing string literals
- Update all tests for new config pattern
- Create issues #81 #82 #83 for deferred work
Fix YAML parsing in pre-commit config where unquoted `: ` broke the
no-ai-attribution hook and column-1 Python code terminated the
wheel-install-test block scalar early.

Add __all__ to cli/main.py for mypy export check, use explicit kwargs
for ty type safety in generate.py, add missing data_dir/output_dir
fields to runtime GenerateOptions, and auto-register shell services
in core generate() for standalone Python API usage.
…ignature

The _generate_format() function now explicitly passes parallel and browser
kwargs to session.generate_all() in batch mode. The two batch-mode test
assertions were missing these parameters, causing CI failures.
athola merged commit 512b9e3 into main on Feb 16, 2026
11 checks passed