Releases: rolfedh/doc-utils

v0.1.42

03 Feb 20:26

Added

  • convert-id-attributes-to-ids - New tool to convert :id: attributes to AsciiDoc anchors
    • Converts :id: value to [id="value_{context}"] format
    • Optional --clean-up flag removes boilerplate comments and include directives
    • Removes "// define ID as an attribute" comments
    • Removes "// assign ID conditionally" comments
    • Removes "include::{modules}/common/id.adoc[]" directives
    • Dry-run mode for safe preview
    • CLI command: convert-id-attributes-to-ids [--clean-up] [--dry-run] [path]
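
The :id: conversion described above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual code: the ID_ATTRIBUTE pattern and the convert_id_attributes function name are assumptions made for this example.

```python
import re

# Assumed pattern: an attribute entry such as ":id: my-module" on its own line.
ID_ATTRIBUTE = re.compile(r"^:id:\s*(\S+)\s*$")


def convert_id_attributes(lines):
    """Replace each ':id: value' line with '[id="value_{context}"]'."""
    converted = []
    for line in lines:
        match = ID_ATTRIBUTE.match(line)
        if match:
            # Doubled braces keep the literal {context} attribute reference.
            converted.append(f'[id="{match.group(1)}_{{context}}"]')
        else:
            converted.append(line)
    return converted
```

Returning a new list instead of modifying the input makes a dry-run preview straightforward: compare the result with the original lines and write nothing to disk.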

Release v0.1.41

27 Jan 22:36

New Tools

insert-abstract-role

Adds [role="_abstract"] attribute above the first paragraph for DITA short description conversion.

insert-procedure-title

Adds .Procedure block title for DITA task compliance. Resolves AsciiDocDITA.TaskContents Vale warnings.

Bug Fixes

  • Fixed callout conversion to preserve {nbsp} and + markers between code blocks and explanations
  • Fixed false "missing explanations" warnings when {nbsp} spacers follow code blocks
  • Improved detection to stop at admonition blocks ([NOTE], [IMPORTANT], etc.)

See CHANGELOG.md for full details.

v0.1.40

23 Jan 14:03

Added

  • inventory-conditionals - New tool to create timestamped inventory of AsciiDoc conditionals

    • Scans for ifdef, ifndef, ifeval, and endif directives
    • Groups results by conditional name for easy review
    • Auto-generates reports to ./reports/ directory
    • Supports txt, csv, json, md output formats
    • CLI command: inventory-conditionals [directory] [options]
  • find-duplicate-content - New tool to find duplicate/similar content blocks across AsciiDoc files

    • Detects duplicate notes, tips, warnings, tables, step sequences, and code blocks
    • Uses Jaccard similarity for fuzzy matching (configurable threshold)
    • Helps identify copy-pasted content that could be refactored into shared modules
    • Auto-generates reports to ./reports/ directory
    • CLI command: find-duplicate-content [directory] [options]
  • find-duplicate-includes - New tool to find files included from multiple locations

    • Scans for include:: macros and identifies multiply-included files
    • Excludes common files (attributes.adoc, etc.) by default
    • Reports source file and line number for each inclusion
    • Helps audit content reuse patterns in modular documentation
    • CLI command: find-duplicate-includes [directory] [options]
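
For readers unfamiliar with the Jaccard similarity that find-duplicate-content uses for fuzzy matching, the following sketch shows the general idea. It assumes word-level tokens and a threshold of 0.8; both are illustrative choices, and the tool's own tokenization and default threshold may differ.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def jaccard_similarity(a, b):
    """Return |A & B| / |A | B| for the word sets of two text blocks."""
    set_a, set_b = tokenize(a), tokenize(b)
    if not set_a and not set_b:
        return 1.0  # two empty blocks are treated as identical
    return len(set_a & set_b) / len(set_a | set_b)


def is_near_duplicate(a, b, threshold=0.8):
    """Assumed threshold; the real tool makes this configurable."""
    return jaccard_similarity(a, b) >= threshold
```

A score of 1.0 means the two blocks use exactly the same words, and 0.0 means they share none. Lowering the threshold reports more candidate pairs at the cost of more false matches.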

Enhanced

  • find-unused-attributes - Added --remove option to automatically remove unused attribute definitions
    • Modifies attribute files in place after confirmation
    • Provides preview output before removal

Documentation

  • Added Content Reuse Assessment guide (docs/content-reuse-assessment.md)
    • Documents how to use doc-utils tools for Pre-Migration Reuse Readiness Tasks
    • Maps tools to Content Reuse Assessment Worksheet tasks
  • Added tool documentation pages:
    • docs/tools/inventory-conditionals.md
    • docs/tools/find-duplicate-content.md
    • docs/tools/find-duplicate-includes.md
  • Updated CLAUDE.md with new tools and modules

v0.1.38 - check-published-links

16 Dec 17:14

New Tool: check-published-links

A wrapper around linkchecker that validates links in published HTML documentation and handles documentation-specific issues such as misresolved URL paths and known false positives.

Features

  • URL rewriting - Corrects misresolved paths for platforms with URL routing issues
  • Smart false positive handling - Automatically filters host:port placeholders, Maven Central 403s, localhost URLs
  • Configuration file support - INI-style config with [settings], [ignore-patterns], and [rewrite-rules] sections
  • Single and bulk modes - Check one URL or a list of documentation pages
  • Detailed reports - Timestamped reports with error categorization
  • Configurable options - Timeout, reports directory, custom ignore patterns
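
As an illustration of the INI-style layout, the sketch below parses a config with the three sections named above using Python's standard configparser module. The key names (timeout, reports_dir, localhost, pattern, replacement) and the load_config function are hypothetical and chosen only for this example; consult the check-published-links documentation for the keys the tool actually reads.

```python
import configparser

# Section names come from the release notes; every key below is hypothetical.
SAMPLE_CONFIG = """
[settings]
timeout = 15
reports_dir = ./reports

[ignore-patterns]
localhost = ^https?://localhost

[rewrite-rules]
pattern = /docs/en/product/
replacement = /docs/en/PRODUCT_V1.0/
"""


def load_config(text):
    """Parse the sample layout into a plain dictionary."""
    # interpolation=None keeps any '%' characters in URL patterns literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return {
        "timeout": parser.getint("settings", "timeout"),
        "ignore": list(parser["ignore-patterns"].values()),
        "rewrite": (
            parser.get("rewrite-rules", "pattern"),
            parser.get("rewrite-rules", "replacement"),
        ),
    }
```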

Usage

# Single URL
check-published-links https://docs.example.com/guide/

# Bulk validation
check-published-links --file urls-to-check.txt

# With URL rewriting
check-published-links https://docs.example.com/guide/ \
    --rewrite-pattern "/docs/en/product/" \
    --rewrite-replacement "/docs/en/PRODUCT_V1.0/"

# Using config file
check-published-links https://docs.example.com/guide/ --config linkcheck.conf

Prerequisites

Requires linkchecker to be installed:

pipx install linkchecker

Documentation

See check-published-links documentation for full details.

v0.1.37

08 Dec 13:10

Added

  • convert-tables-to-deflists - New tool to convert AsciiDoc tables to definition lists
    • Converts 2-column tables by default (column 1 → term, column 2 → definition)
    • Multi-column support with --columns TERM,DEF option (e.g., --columns 1,3)
    • Automatically skips callout tables (use convert-callouts-to-deflist for those)
    • Detects and handles header rows automatically
    • Preserves conditional directives (ifdef::/ifndef::/endif::)
    • Dry-run mode by default for safe preview, use --apply to modify files
    • Full exclusion support (--exclude-dir, --exclude-file, --exclude-list)
    • CLI command: convert-tables-to-deflists [--apply] [--columns TERM,DEF] [path]
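
The column mapping can be sketched as follows for the simplest case, a table with one row per line. This is an assumption-laden illustration: the table_to_deflist function is hypothetical, and header rows, multi-line cells, callout tables, and conditional directives, which the real tool handles, are out of scope here.

```python
def table_to_deflist(lines, term_col=1, def_col=2):
    """Convert a simple AsciiDoc table (one row per line) to a definition list."""
    output = []
    for line in lines:
        stripped = line.strip()
        # Skip blank lines, the |=== delimiters, and anything that is not a row.
        if not stripped.startswith("|") or stripped == "|===":
            continue
        # "|term |definition" -> ["term", "definition"]
        cells = [cell.strip() for cell in stripped.split("|")[1:]]
        output.append(f"{cells[term_col - 1]}::")
        output.append(cells[def_col - 1])
        output.append("")
    return output
```

Passing term_col=1 and def_col=3 mirrors the --columns 1,3 example: column numbers are 1-based, as on the command line.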

Documentation

  • Added docs/tools/convert-tables-to-deflists.md with full tool documentation
  • Updated docs/tools/index.md with new tool entry
  • Updated CLAUDE.md with new tool in CLI Tools list and Project Structure

v0.1.35

11 Nov 12:53

Added

  • Commented references tracking - Enhanced archive-unused-files and archive-unused-images with intelligent handling of commented references
    • New --commented flag to include files/images referenced only in commented lines in archive operations
    • Default behavior: Files/images referenced only in comments are considered "used" and will NOT be archived
    • Automatic generation of detailed reports showing commented-only references with exact locations (file paths, line numbers, and text)
    • Report paths: ./archive/commented-references-report.txt (files) and ./archive/commented-image-references-report.txt (images)
    • Dual tracking system separates uncommented references from commented-only references
    • State management automatically moves items from "commented-only" to "used" when an uncommented reference is found

Enhanced

  • Detection patterns - Added robust regex-based commented line detection
    • Files: ^\s*//.*include::(.+?)\[ detects commented includes with whitespace variations
    • Images: ^\s*// checks if entire line is commented before checking for image references
    • Handles AsciiDoc comment syntax variations correctly
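
The following sketch shows how the commented-include pattern above could drive the dual tracking of "used" and "commented-only" files. COMMENTED_INCLUDE is the pattern quoted in these notes; UNCOMMENTED_INCLUDE and the classify_includes function are assumptions added for illustration and are not taken from the project.

```python
import re

# Pattern quoted in the release notes.
COMMENTED_INCLUDE = re.compile(r"^\s*//.*include::(.+?)\[")
# Assumed counterpart for ordinary (uncommented) includes.
UNCOMMENTED_INCLUDE = re.compile(r"^\s*include::(.+?)\[")


def classify_includes(lines):
    """Return (used, commented_only) sets of include targets."""
    used, commented_only = set(), set()
    for line in lines:
        match = COMMENTED_INCLUDE.match(line)
        if match:
            commented_only.add(match.group(1))
            continue
        match = UNCOMMENTED_INCLUDE.match(line)
        if match:
            used.add(match.group(1))
    # An uncommented reference anywhere promotes a file to "used".
    commented_only -= used
    return used, commented_only
```

With the default behavior, files in either set are kept. With --commented, the files in the commented-only set also become candidates for archiving.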

Documentation

  • GitHub Pages - Updated archive-unused-files.md with commented references behavior section
    • Added explanation of default behavior vs --commented flag
    • Added "Working with Commented References" examples section with practical workflows
  • GitHub Pages - Updated archive-unused-images.md with commented references behavior section
    • Added detailed commented references behavior documentation
    • Added workflow examples for reviewing and archiving commented-only content
  • GitHub Pages - Updated tools/index.md with "NEW" badges for commented references features
    • Updated both archive-unused-files and archive-unused-images feature lists
    • Updated quick usage examples to demonstrate new functionality
  • CLAUDE.md - Added comprehensive "Commented References Tracking" section
    • Documented implementation details, detection patterns, and test coverage
    • Added to "Recent Improvements" section for future development reference

Tests

  • Test coverage - Added comprehensive tests for commented references functionality
    • test_archive_unused_files_commented_references(): Verifies default behavior treats commented-only as "used"
    • test_archive_unused_files_with_commented_flag(): Verifies --commented flag includes commented-only files
    • test_archive_unused_images_commented_references(): Verifies image detection for commented-only references
    • test_archive_unused_images_with_commented_flag(): Verifies --commented flag includes commented-only images
    • All tests use line-by-line exact matching to avoid substring false positives
    • Test suite status: 10/10 tests passing (4 new tests added)

v0.1.34

11 Nov 12:53

Release 0.1.34

See CHANGELOG.md for details.

v0.1.33

28 Oct 17:12

Fixed

  • Callout detection - Fixed false positives with Java generics and angle-bracket syntax
    • Removed user-replaceable value extraction from both detector.py and converter_deflist.py
    • Java generics like CrudRepository<MyEntity, Integer> no longer incorrectly extracted
    • Full code lines now preserved in definition list terms
    • Fixes issue where <MyEntity, Integer> was extracted instead of the complete code line
  • Callout detection - Fixed semicolon removal bug in Java/C/C++/JavaScript code
    • Removed semicolon (;) from CALLOUT_WITH_COMMENT regex pattern
    • Semicolons now correctly preserved as statement terminators
    • Example: Optional<String> name; <1> now correctly becomes Optional<String> name;
    • The semicolon was being treated as a Lisp-style comment marker, which caused it to be removed along with the callout
  • Interactive tool - Fixed quit/exit behavior and added explicit quit option
    • Ctrl+C now exits the script immediately (previously it continued to the summary)
    • Added 'Q' (capital) option to quit script entirely at any prompt
    • Clarified 'q' (lowercase) as "Skip current file" instead of ambiguous "Quit"
    • All other lowercase options are now case-insensitive for a better user experience
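
To illustrate the semicolon fix, here is a sketch of callout stripping that accepts // and # as comment markers before a callout but never ;. The TRAILING_CALLOUTS pattern is an assumption written for this example. It is not the project's CALLOUT_WITH_COMMENT pattern, which is not reproduced in these notes.

```python
import re

# Assumed pattern: one or more trailing callouts such as "<1>", optionally
# preceded by a // or # comment marker. The semicolon is deliberately absent.
TRAILING_CALLOUTS = re.compile(r"\s*(?://|#)?\s*(?:<\d+>\s*)+$")


def strip_callouts(line):
    """Remove trailing callout markers and leave the rest of the line intact."""
    return TRAILING_CALLOUTS.sub("", line)
```

Because the pattern only matches digits inside the angle brackets, Java generics such as <MyEntity, Integer> are left alone.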

Documentation

  • GitHub Pages - Updated convert-callouts-to-deflist.md to reflect removal of angle-bracket extraction
    • Changed "Intelligent Value Extraction" to "Code Line Extraction" section
    • Updated examples to show full code lines in definition list terms
  • GitHub Pages - Updated convert-callouts-interactive.md keyboard shortcuts
    • Documented distinction between 'q' (skip file) and 'Q' (quit script)
    • Added Ctrl+C immediate exit behavior
  • CLAUDE.md - Added pipx installation and upgrade procedures
    • Documented full upgrade procedure with build artifact cleaning
    • Added key phrase "upgrade doc-utils" for Claude automation
    • Explained why cleaning build artifacts is critical before reinstalling

v0.1.32

22 Oct 20:10

Fixed

  • Version reporting - Fixed version.py being out of sync with pyproject.toml
    • Updated doc_utils/version.py from 0.1.30 to match current release
    • Now properly reports v0.1.32 instead of outdated v0.1.30

Added

  • CLI tools - Added --version option to callout conversion tools
    • convert-callouts-to-deflist --version now displays version information
    • convert-callouts-interactive --version now displays version information
    • Both tools import version from doc_utils.version for consistency
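
A common way to wire up such an option is argparse's built-in "version" action, sketched below. The VERSION constant stands in for whatever doc_utils.version exports (its exact name is not stated in these notes), and build_parser is a hypothetical helper.

```python
import argparse

# Stand-in for the value imported from doc_utils.version (name assumed).
VERSION = "0.1.32"


def build_parser(prog):
    """Create a parser whose --version flag prints '<prog> <version>' and exits."""
    parser = argparse.ArgumentParser(prog=prog)
    parser.add_argument(
        "--version",
        action="version",
        version=f"%(prog)s {VERSION}",
    )
    return parser
```

Keeping the version in one module and importing it in every tool avoids the drift between version.py and pyproject.toml that this release fixes.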

v0.1.31

22 Oct 19:53

Added

  • convert-callouts-to-deflist - Added --force option to strip callouts despite warnings

    • Allows conversion to proceed when callout warnings are present (missing explanations or mismatches)
    • Strips callouts from blocks with missing explanations without creating explanation lists
    • Converts blocks with callout mismatches using available explanations
    • Requires confirmation prompt before proceeding (skipped in dry-run mode)
    • Useful for intentionally shared explanations between conditional blocks
    • Documented in warnings report and GitHub Pages with comprehensive examples
  • Warnings Report - Automatic generation of structured AsciiDoc warnings report

    • Enabled by default, generates callout-warnings-report.adoc in current directory
    • Reduces console spam by showing minimal summary: "⚠️ 4 Warning(s) - See callout-warnings-report.adoc for details"
    • Report includes summary by warning type, recommended actions, and force mode documentation
    • Callout mismatch analysis detects duplicates, missing callouts, extra callouts, and off-by-one errors
    • Missing explanations section lists possible causes (shared explanations, unexpected location, missing)
    • Command-line options: --warnings-report (default), --no-warnings-report, --warnings-file=<path>
    • Can be committed to git to track warning resolution progress
  • convert-callouts-to-deflist - Warning for code blocks with callouts but no explanations

    • Detects when a code block has callouts but no explanation table or list is found
    • Provides helpful diagnostic message with possible causes
    • Suggests manual review for shared explanations or documentation errors

Enhanced

  • Table parser - Improved detection of callout explanations in tables

    • Now handles cell type specifiers without leading pipe (e.g., a| at start of line)
    • Accepts both |cell and a|cell formats for AsciiDoc cell type specifiers
    • Recognizes all cell type specifiers: a, s, h, d, m, e, v
    • Fixed issue where code block closing delimiter (----) was incorrectly treated as new code block start
    • Added logic to skip closing delimiter before searching for callout table
  • Table parser - Support for plain number callouts in tables (in addition to angle-bracket format)

    • Tables can now use plain numbers (1, 2, 3) instead of angle-bracket format (<1>, <2>, <3>)
    • Unified detection via _is_callout_or_number() method accepting both formats
    • Increased detection rate: 58% more files detected, 130% more code blocks found
    • Example: First column can be 1 or <1>, both are recognized as callout references
  • Validation warnings - Show duplicate callout numbers in explanations

    • Warnings now preserve duplicates: [1, 2, 3, 4, 5, 7, 8, 8, 9] instead of deduplicated [1, 2, 3, 4, 5, 7, 8, 9]
    • Added get_table_callout_numbers() method to extract raw callout numbers from tables
    • Updated validate_callouts() to return lists instead of sets to preserve duplicates
    • Helps identify table rows with incorrect callout numbering
  • User guidance - Added suggestion messages when warnings occur

    • Console shows: "Suggestion: Review and fix the callout issues listed in [report], then rerun this command."
    • Warnings report includes "Recommended Actions" section with 4-step workflow
    • Clear guidance on when to use force mode and how to review changes
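
The two table-parser changes above, accepting both <1> and 1 and keeping duplicates visible, can be sketched as follows. The function names parse_callout and find_duplicates are hypothetical and do not correspond to the project's _is_callout_or_number() or validate_callouts() implementations.

```python
import re
from collections import Counter

# Accepts "<1>" or "1", but not a half-open form such as "<1".
CALLOUT_CELL = re.compile(r"^(?:<(\d+)>|(\d+))$")


def parse_callout(cell):
    """Return the callout number for '<1>' or '1', or None for anything else."""
    match = CALLOUT_CELL.match(cell.strip())
    if not match:
        return None
    return int(match.group(1) or match.group(2))


def find_duplicates(numbers):
    """Return the callout numbers that appear more than once, in ascending order."""
    return sorted(n for n, count in Counter(numbers).items() if count > 1)
```

Working with lists instead of sets is what makes the duplicate check possible: converting [1, 2, 8, 8, 9] to a set would silently drop the second 8.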

Fixed

  • CRITICAL - Preserve content between code block and explanations
    • Fixed bug where converter deleted endif:: directives, continuation markers (+), and paragraph text
    • Now uses detector.last_table.start_line to accurately find where explanations begin
    • Preserves the slice new_lines[content_end + 1:explanation_start_line], which contains critical AsciiDoc directives
    • Applies to both comments format and definition list/bullets formats
    • Prevents corruption of conditional compilation blocks in documentation
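
The slice arithmetic behind this fix can be sketched as follows. The replace_explanations function and its parameters are hypothetical; only the idea of preserving lines[content_end + 1:explanation_start] comes from the notes above.

```python
def replace_explanations(lines, content_end, explanation_start,
                         explanation_end, new_explanations):
    """Swap the old explanations for new ones without touching what precedes them.

    content_end:        index of the code block's closing delimiter
    explanation_start:  index of the first explanation line
    explanation_end:    index of the last explanation line
    """
    head = lines[: content_end + 1]
    # Directives such as endif::[], continuation markers (+), and paragraph
    # text between the code block and the explanations live in this slice.
    preserved = lines[content_end + 1 : explanation_start]
    tail = lines[explanation_end + 1 :]
    return head + preserved + new_explanations + tail
```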

Documentation

  • convert-callouts-to-deflist.md - Documented force mode option

    • Added --force option to Options section with "USE WITH CAUTION" warning
    • Added "Force Mode" subsection with confirmation prompt example
    • Documented what force mode does for missing explanations and callout mismatches
    • Included 6-step recommended workflow
    • Provided real-world example of appropriate force mode usage (shared explanations in conditionals)
  • Warnings Report - Documented warnings report feature

    • Added "Warnings Report File" section explaining enabled-by-default behavior
    • Documented command-line options for controlling report generation
    • Listed benefits: clean console output, structured format, git tracking, AsciiDoc rendering