Releases: rolfedh/doc-utils
v0.1.42
Added
- convert-id-attributes-to-ids - New tool to convert :id: attributes to AsciiDoc anchors
- Converts :id: value to [id="value_{context}"] format
- Optional --clean-up flag removes boilerplate comments and include directives
- Removes // define ID as an attribute comments
- Removes // assign ID conditionally comments
- Removes include::{modules}/common/id.adoc[] directives
- Dry-run mode for safe preview
- CLI command: convert-id-attributes-to-ids [--clean-up] [--dry-run] [path]
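The :id: conversion described for v0.1.42 can be pictured with a small sketch. This is not the tool's code: the function name convert_id_attributes and the regular expression are assumptions made for illustration, and the sketch ignores the --clean-up and --dry-run behavior.

```python
import re

# Assumed shape of the attribute entry: ":id: some-value" alone on a line.
ID_ATTRIBUTE = re.compile(r"^:id:\s*(\S+)\s*$")


def convert_id_attributes(lines):
    """Return a copy of lines with ':id: value' rewritten as an anchor."""
    converted = []
    for line in lines:
        match = ID_ATTRIBUTE.match(line)
        if match:
            # {{context}} yields a literal {context} attribute reference.
            converted.append(f'[id="{match.group(1)}_{{context}}"]')
        else:
            converted.append(line)
    return converted


sample = [":id: installing-the-agent", "= Installing the agent"]
print("\n".join(convert_id_attributes(sample)))
```

Lines that do not match the attribute pattern pass through unchanged, which is what a dry-run preview would compare against.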
Release v0.1.41
New Tools
insert-abstract-role
Adds [role="_abstract"] attribute above the first paragraph for DITA short description conversion.
insert-procedure-title
Adds .Procedure block title for DITA task compliance. Resolves AsciiDocDITA.TaskContents Vale warnings.
Bug Fixes
- Fixed callout conversion to preserve {nbsp} and + markers between code blocks and explanations
- Fixed false "missing explanations" warnings when {nbsp} spacers follow code blocks
- Improved detection to stop at admonition blocks ([NOTE], [IMPORTANT], etc.)
See CHANGELOG.md for full details.
v0.1.40
Added
- inventory-conditionals - New tool to create timestamped inventory of AsciiDoc conditionals
- Scans for ifdef, ifndef, ifeval, and endif directives
- Groups results by conditional name for easy review
- Auto-generates reports to ./reports/ directory
- Supports txt, csv, json, md output formats
- CLI command: inventory-conditionals [directory] [options]
- find-duplicate-content - New tool to find duplicate/similar content blocks across AsciiDoc files
- Detects duplicate notes, tips, warnings, tables, step sequences, and code blocks
- Uses Jaccard similarity for fuzzy matching (configurable threshold)
- Helps identify copy-pasted content that could be refactored into shared modules
- Auto-generates reports to ./reports/ directory
- CLI command: find-duplicate-content [directory] [options]
- find-duplicate-includes - New tool to find files included from multiple locations
- Scans for include:: macros and identifies multiply-included files
- Excludes common files (attributes.adoc, etc.) by default
- Reports source file and line number for each inclusion
- Helps audit content reuse patterns in modular documentation
- CLI command: find-duplicate-includes [directory] [options]
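For readers unfamiliar with the Jaccard similarity that find-duplicate-content uses for fuzzy matching, here is a minimal sketch. The word-level tokenization, the function names, and the 0.7 threshold are assumptions for illustration; the release notes do not say how the tool tokenizes blocks or what its default threshold is.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def jaccard_similarity(a, b):
    """Return |A & B| / |A | B| for the token sets of two text blocks."""
    tokens_a, tokens_b = tokenize(a), tokenize(b)
    if not tokens_a and not tokens_b:
        return 1.0  # two empty blocks are treated as identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


note_a = "Back up your data before you upgrade the cluster."
note_b = "Back up your data before upgrading the cluster."
THRESHOLD = 0.7  # hypothetical value, not the tool's documented default

score = jaccard_similarity(note_a, note_b)
print(f"similarity={score:.2f} flagged={score >= THRESHOLD}")
```

A lower threshold flags more loosely related blocks; a threshold of 1.0 flags only blocks with identical token sets.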
Enhanced
- find-unused-attributes - Added --remove option to automatically remove unused attribute definitions
- Modifies attribute files in place after confirmation
- Provides preview output before removal
Documentation
- Added Content Reuse Assessment guide (docs/content-reuse-assessment.md)
- Documents how to use doc-utils tools for Pre-Migration Reuse Readiness Tasks
- Maps tools to Content Reuse Assessment Worksheet tasks
- Added tool documentation pages: docs/tools/inventory-conditionals.md, docs/tools/find-duplicate-content.md, docs/tools/find-duplicate-includes.md
- Updated CLAUDE.md with new tools and modules
v0.1.38 - check-published-links
New Tool: check-published-links
A wrapper around linkchecker to validate links on published HTML documentation with intelligent handling of documentation-specific issues.
Features
- URL rewriting - Corrects misresolved paths for platforms with URL routing issues
- Smart false positive handling - Automatically filters host:port placeholders, Maven Central 403s, localhost URLs
- Configuration file support - INI-style config with [settings], [ignore-patterns], and [rewrite-rules] sections
- Single and bulk modes - Check one URL or a list of documentation pages
- Detailed reports - Timestamped reports with error categorization
- Configurable options - Timeout, reports directory, custom ignore patterns
Usage
# Single URL
check-published-links https://docs.example.com/guide/
# Bulk validation
check-published-links --file urls-to-check.txt
# With URL rewriting
check-published-links https://docs.example.com/guide/ \
--rewrite-pattern "/docs/en/product/" \
--rewrite-replacement "/docs/en/PRODUCT_V1.0/"
# Using config file
check-published-links https://docs.example.com/guide/ --config linkcheck.conf
Prerequisites
Requires linkchecker to be installed:
pipx install linkchecker
Documentation
See check-published-links documentation for full details.
v0.1.37
Added
- convert-tables-to-deflists - New tool to convert AsciiDoc tables to definition lists
- Converts 2-column tables by default (column 1 → term, column 2 → definition)
- Multi-column support with --columns TERM,DEF option (e.g., --columns 1,3)
- Automatically skips callout tables (use convert-callouts-to-deflist for those)
- Detects and handles header rows automatically
- Preserves conditional directives (ifdef::/ifndef::/endif::)
- Dry-run mode by default for safe preview, use --apply to modify files
- Full exclusion support (--exclude-dir, --exclude-file, --exclude-list)
- CLI command: convert-tables-to-deflists [--apply] [--columns TERM,DEF] [path]
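The column-to-term mapping can be illustrated with a rough sketch. It assumes the simplest table layout (one row per line, every row starting with a pipe) and does none of the header detection, callout-table skipping, or conditional handling listed above. The function name table_to_deflist and its parameters are hypothetical; only the 1-based column numbering mirrors the --columns TERM,DEF option.

```python
def table_to_deflist(lines, term_col=1, def_col=2):
    """Convert a simple AsciiDoc table to definition-list lines.

    Columns are 1-based, mirroring the --columns TERM,DEF option.
    """
    output = []
    for line in lines:
        stripped = line.strip()
        # Skip table delimiters, blank lines, and anything that is not a row.
        if stripped == "|===" or not stripped.startswith("|"):
            continue
        cells = [cell.strip() for cell in stripped[1:].split("|")]
        output.append(f"{cells[term_col - 1]}:: {cells[def_col - 1]}")
    return output


table = [
    "|===",
    "|--apply |Modify files instead of previewing",
    "|--columns |Choose the term and definition columns",
    "|===",
]
print("\n".join(table_to_deflist(table)))
```

Passing term_col=1 and def_col=3 would correspond to --columns 1,3 on a table with three or more columns.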
Documentation
- Added docs/tools/convert-tables-to-deflists.md with full tool documentation
- Updated docs/tools/index.md with new tool entry
- Updated CLAUDE.md with new tool in CLI Tools list and Project Structure
v0.1.35
Added
- Commented references tracking - Enhanced archive-unused-files and archive-unused-images with intelligent handling of commented references
- New --commented flag to include files/images referenced only in commented lines in archive operations
- Default behavior: Files/images referenced only in comments are considered "used" and will NOT be archived
- Automatic generation of detailed reports showing commented-only references with exact locations (file paths, line numbers, and text)
- Report paths: ./archive/commented-references-report.txt (files) and ./archive/commented-image-references-report.txt (images)
- Dual tracking system separates uncommented references from commented-only references
- State management automatically moves items from "commented-only" to "used" when uncommented reference found
Enhanced
- Detection patterns - Added robust regex-based commented line detection
- Files: ^\s*//.*include::(.+?)\[ detects commented includes with whitespace variations
- Images: ^\s*// checks if entire line is commented before checking for image references
- Handles AsciiDoc comment syntax variations correctly
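The dual tracking described for v0.1.35 can be sketched with the two patterns quoted above. Everything else here is an assumption for illustration: the function name classify_includes, the pattern used for uncommented includes, and the use of plain sets for state.

```python
import re

# The two patterns quoted in the release notes.
COMMENTED_INCLUDE = re.compile(r"^\s*//.*include::(.+?)\[")
COMMENTED_LINE = re.compile(r"^\s*//")
# Assumed pattern for an ordinary (uncommented) include macro.
PLAIN_INCLUDE = re.compile(r"include::(.+?)\[")


def classify_includes(lines):
    """Return (used, commented_only) sets of include targets."""
    used, commented = set(), set()
    for line in lines:
        match = COMMENTED_INCLUDE.search(line)
        if match:
            commented.add(match.group(1))
            continue
        if COMMENTED_LINE.match(line):
            continue  # some other comment line; nothing to track
        match = PLAIN_INCLUDE.search(line)
        if match:
            used.add(match.group(1))
    # An uncommented reference anywhere moves a file out of "commented-only".
    return used, commented - used


doc = [
    "include::modules/install.adoc[]",
    "// include::modules/legacy.adoc[]",
    "  //include::modules/install.adoc[]",
]
used, commented_only = classify_includes(doc)
print(sorted(used), sorted(commented_only))
```

Under the default behavior, both sets would count as "used"; with --commented, the commented-only set would become eligible for archiving.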
Documentation
- GitHub Pages - Updated archive-unused-files.md with commented references behavior section
- Added explanation of default behavior vs --commented flag
- Added "Working with Commented References" examples section with practical workflows
- GitHub Pages - Updated archive-unused-images.md with commented references behavior section
- Added detailed commented references behavior documentation
- Added workflow examples for reviewing and archiving commented-only content
- GitHub Pages - Updated tools/index.md with "NEW" badges for commented references features
- Updated both archive-unused-files and archive-unused-images feature lists
- Updated quick usage examples to demonstrate new functionality
- CLAUDE.md - Added comprehensive "Commented References Tracking" section
- Documented implementation details, detection patterns, and test coverage
- Added to "Recent Improvements" section for future development reference
Tests
- Test coverage - Added comprehensive tests for commented references functionality
- test_archive_unused_files_commented_references(): Verifies default behavior treats commented-only as "used"
- test_archive_unused_files_with_commented_flag(): Verifies --commented flag includes commented-only files
- test_archive_unused_images_commented_references(): Verifies image detection for commented-only references
- test_archive_unused_images_with_commented_flag(): Verifies --commented flag includes commented-only images
- All tests use line-by-line exact matching to avoid substring false positives
- Total test coverage: 10/10 tests passing (4 new tests added)
v0.1.34
Release 0.1.34
See CHANGELOG.md for details.
v0.1.33
Fixed
- Callout detection - Fixed false positives with Java generics and angle-bracket syntax
- Removed user-replaceable value extraction from both detector.py and converter_deflist.py
- Java generics like CrudRepository<MyEntity, Integer> no longer incorrectly extracted
- Full code lines now preserved in definition list terms
- Fixes issue where <MyEntity, Integer> was extracted instead of the complete code line
- Callout detection - Fixed semicolon removal bug in Java/C/C++/JavaScript code
- Removed semicolon (;) from CALLOUT_WITH_COMMENT regex pattern
- Semicolons now correctly preserved as statement terminators
- Example: Optional<String> name; <1> now correctly becomes Optional<String> name;
- Semicolon was being treated as comment marker (Lisp-style) causing false removal
- Interactive tool - Fixed quit/exit behavior and added explicit quit option
- Ctrl+C now immediately exits script (was continuing to summary)
- Added 'Q' (capital) option to quit script entirely at any prompt
- Clarified 'q' (lowercase) as "Skip current file" instead of ambiguous "Quit"
- All lowercase options now case-insensitive for better user experience
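The semicolon fix in the callout detection can be illustrated with a small sketch. The pattern below is a hypothetical stand-in that reuses the name CALLOUT_WITH_COMMENT; the project's actual regex is not shown in these notes. The point is only that ; is absent from the list of comment markers, so statement terminators survive when the trailing callout is stripped.

```python
import re

# Hypothetical pattern: an optional comment marker (//, #, or --) followed by
# one or more callouts at the end of the line. ';' is deliberately not listed.
CALLOUT_WITH_COMMENT = re.compile(r"\s*(?://|#|--)?\s*(?:<\d+>\s*)+$")


def strip_callouts(line):
    """Remove trailing callouts and the comment marker that introduces them."""
    return CALLOUT_WITH_COMMENT.sub("", line)


for line in [
    "Optional<String> name; <1>",
    "int count = 0; // <2>",
    "timeout: 30 # <3>",
]:
    print(repr(strip_callouts(line)))
```

Because the pattern requires digits between the angle brackets, generics such as Optional<String> are left untouched.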
Documentation
- GitHub Pages - Updated convert-callouts-to-deflist.md to reflect removal of angle-bracket extraction
- Changed "Intelligent Value Extraction" to "Code Line Extraction" section
- Updated examples to show full code lines in definition list terms
- GitHub Pages - Updated convert-callouts-interactive.md keyboard shortcuts
- Documented distinction between 'q' (skip file) and 'Q' (quit script)
- Added Ctrl+C immediate exit behavior
- CLAUDE.md - Added pipx installation and upgrade procedures
- Documented full upgrade procedure with build artifact cleaning
- Added key phrase "upgrade doc-utils" for Claude automation
- Explained why cleaning build artifacts is critical before reinstalling
v0.1.32
Fixed
- Version reporting - Fixed version.py not synced with pyproject.toml
- Updated doc_utils/version.py from 0.1.30 to match current release
- Now properly reports v0.1.32 instead of outdated v0.1.30
Added
- CLI tools - Added --version option to callout conversion tools
- convert-callouts-to-deflist --version now displays version information
- convert-callouts-interactive --version now displays version information
- Both tools import version from doc_utils.version for consistency
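One common way to wire a shared version string into several CLI tools is argparse's built-in version action. The sketch below assumes that approach; the notes do not say how the tools implement --version, and the __version__ name stands in for whatever doc_utils.version actually exports.

```python
import argparse

# Stand-in for an import from doc_utils.version; the attribute name
# __version__ is an assumption, not confirmed by the release notes.
__version__ = "0.1.32"


def build_parser(prog):
    """Build a parser whose --version flag reports the shared version."""
    parser = argparse.ArgumentParser(prog=prog)
    parser.add_argument(
        "--version",
        action="version",
        version=f"%(prog)s {__version__}",
    )
    return parser


parser = build_parser("convert-callouts-to-deflist")
print(parser.format_help())
```

Keeping the version in a single module means every tool reports the same value, which is the kind of drift the version.py fix in this release addresses.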
v0.1.31
Added
- convert-callouts-to-deflist - Added --force option to strip callouts despite warnings
- Allows conversion to proceed when callout warnings are present (missing explanations or mismatches)
- Strips callouts from blocks with missing explanations without creating explanation lists
- Converts blocks with callout mismatches using available explanations
- Requires confirmation prompt before proceeding (skipped in dry-run mode)
- Useful for intentionally shared explanations between conditional blocks
- Documented in warnings report and GitHub Pages with comprehensive examples
- Warnings Report - Automatic generation of structured AsciiDoc warnings report
- Enabled by default, generates callout-warnings-report.adoc in current directory
- Reduces console spam by showing minimal summary: "⚠️ 4 Warning(s) - See callout-warnings-report.adoc for details"
- Report includes summary by warning type, recommended actions, and force mode documentation
- Callout mismatch analysis detects duplicates, missing callouts, extra callouts, and off-by-one errors
- Missing explanations section lists possible causes (shared explanations, unexpected location, missing)
- Command-line options: --warnings-report (default), --no-warnings-report, --warnings-file=<path>
- Can be committed to git to track warning resolution progress
- convert-callouts-to-deflist - Warning for code blocks with callouts but no explanations
- Detects when code block has callouts but no explanation table or list found
- Provides helpful diagnostic message with possible causes
- Suggests manual review for shared explanations or documentation errors
Enhanced
- Table parser - Improved detection of callout explanations in tables
- Now handles cell type specifiers without leading pipe (e.g., a| at start of line)
- Accepts both |cell and a|cell formats for AsciiDoc cell type specifiers
- Recognizes all cell type specifiers: a, s, h, d, m, e, v
- Fixed issue where code block closing delimiter (----) was incorrectly treated as new code block start
- Added logic to skip closing delimiter before searching for callout table
- Table parser - Support for plain number callouts in tables (in addition to angle-bracket format)
- Tables can now use plain numbers (1, 2, 3) instead of angle-bracket format (<1>, <2>, <3>)
- Unified detection via _is_callout_or_number() method accepting both formats
- Increased detection rate: 58% more files detected, 130% more code blocks found
- Example: First column can be 1 or <1>, both are recognized as callout references
- Validation warnings - Show duplicate callout numbers in explanations
- Warnings now preserve duplicates: [1, 2, 3, 4, 5, 7, 8, 8, 9] instead of deduplicated [1, 2, 3, 4, 5, 7, 8, 9]
- Added get_table_callout_numbers() method to extract raw callout numbers from tables
- Updated validate_callouts() to return lists instead of sets to preserve duplicates
- Helps identify table rows with incorrect callout numbering
- User guidance - Added suggestion messages when warnings occur
- Console shows: "Suggestion: Review and fix the callout issues listed in [report], then rerun this command."
- Warnings report includes "Recommended Actions" section with 4-step workflow
- Clear guidance on when to use force mode and how to review changes
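Two of the table parser changes above (accepting both 1 and <1>, and keeping duplicates in validation output) can be pictured with a short sketch. The names parse_callout, find_duplicates, and compare_callouts are hypothetical and are not the project's _is_callout_or_number() or validate_callouts() methods, whose signatures are not shown in these notes.

```python
import re
from collections import Counter

# Accept either the angle-bracket form "<1>" or the plain number "1".
CALLOUT_CELL = re.compile(r"^(?:<(\d+)>|(\d+))$")


def parse_callout(cell):
    """Return the callout number for '1' or '<1>', or None for other text."""
    match = CALLOUT_CELL.match(cell.strip())
    if not match:
        return None
    return int(match.group(1) or match.group(2))


def find_duplicates(numbers):
    """Return the sorted callout numbers that appear more than once."""
    counts = Counter(numbers)
    return sorted(n for n, count in counts.items() if count > 1)


def compare_callouts(code_callouts, table_cells):
    """Compare callouts in code with the first-column cells of a table.

    Lists are returned instead of sets so duplicated rows stay visible.
    """
    table_numbers = [n for n in map(parse_callout, table_cells) if n is not None]
    is_valid = sorted(code_callouts) == sorted(table_numbers)
    return is_valid, list(code_callouts), table_numbers


cells = ["1", "2", "3", "4", "5", "7", "<8>", "8", "9"]
ok, in_code, in_table = compare_callouts(range(1, 10), cells)
print(ok, in_table, "duplicates:", find_duplicates(in_table))
```

With sets, the repeated 8 would collapse and the warning would hide the fact that one table row is misnumbered.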
Fixed
- CRITICAL - Preserve content between code block and explanations
- Fixed bug where converter deleted endif:: directives, continuation markers (+), and paragraph text
- Now uses detector.last_table.start_line to accurately find where explanations begin
- Preserves slice new_lines[content_end + 1:explanation_start_line] containing critical AsciiDoc directives
- Applies to both comments format and definition list/bullets formats
- Prevents corruption of conditional compilation blocks in documentation
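The effect of preserving the slice new_lines[content_end + 1:explanation_start_line] can be shown with a minimal sketch. The function name replace_explanations and the explanation_end_line parameter are hypothetical; only the slice expression comes from the notes above.

```python
def replace_explanations(new_lines, content_end, explanation_start_line,
                         explanation_end_line, replacement):
    """Swap the old explanation block for replacement lines.

    Everything between the end of the code block and the start of the
    explanations is carried over unchanged.
    """
    preserved = new_lines[content_end + 1:explanation_start_line]
    return (
        new_lines[:content_end + 1]
        + preserved
        + replacement
        + new_lines[explanation_end_line + 1:]
    )


lines = [
    "----",                     # 0
    "mvn install",              # 1
    "----",                     # 2: content_end (closing delimiter)
    "endif::[]",                # 3: must survive the conversion
    "+",                        # 4: continuation marker, must survive
    "<1> Builds the project.",  # 5: old explanation (start and end)
    "Next paragraph.",          # 6
]
result = replace_explanations(
    lines, 2, 5, 5, ["mvn install:: Builds the project."]
)
print("\n".join(result))
```

Dropping the preserved slice would delete the endif:: directive and the continuation marker, which is the corruption this fix prevents.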
Documentation
- convert-callouts-to-deflist.md - Documented force mode option
- Added --force option to Options section with "USE WITH CAUTION" warning
- Added "Force Mode" subsection with confirmation prompt example
- Documented what force mode does for missing explanations and callout mismatches
- Included 6-step recommended workflow
- Provided real-world example of appropriate force mode usage (shared explanations in conditionals)
- Warnings Report - Documented warnings report feature
- Added "Warnings Report File" section explaining enabled-by-default behavior
- Documented command-line options for controlling report generation
- Listed benefits: clean console output, structured format, git tracking, AsciiDoc rendering