Commit 8a136ce

Prepare release v0.1.35
1 parent 4f85a7a commit 8a136ce

13 files changed, with 453 additions and 44 deletions

CHANGELOG.md

Lines changed: 40 additions & 0 deletions
@@ -7,6 +7,46 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

 ## [Unreleased]

+## [0.1.35] - 2025-11-11
+
+### Added
+- **Commented references tracking** - Enhanced archive-unused-files and archive-unused-images with intelligent handling of commented references
+  - New `--commented` flag to include files/images referenced only in commented lines in archive operations
+  - Default behavior: Files/images referenced only in comments are considered "used" and will NOT be archived
+  - Automatic generation of detailed reports showing commented-only references with exact locations (file paths, line numbers, and text)
+  - Report paths: `./archive/commented-references-report.txt` (files) and `./archive/commented-image-references-report.txt` (images)
+  - Dual tracking system separates uncommented references from commented-only references
+  - State management automatically moves items from "commented-only" to "used" when uncommented reference found
+
+### Enhanced
+- **Detection patterns** - Added robust regex-based commented line detection
+  - Files: `^\s*//.*include::(.+?)\[` detects commented includes with whitespace variations
+  - Images: `^\s*//` checks if entire line is commented before checking for image references
+  - Handles AsciiDoc comment syntax variations correctly
+
+### Documentation
+- **GitHub Pages** - Updated archive-unused-files.md with commented references behavior section
+  - Added explanation of default behavior vs --commented flag
+  - Added "Working with Commented References" examples section with practical workflows
+- **GitHub Pages** - Updated archive-unused-images.md with commented references behavior section
+  - Added detailed commented references behavior documentation
+  - Added workflow examples for reviewing and archiving commented-only content
+- **GitHub Pages** - Updated tools/index.md with "NEW" badges for commented references features
+  - Updated both archive-unused-files and archive-unused-images feature lists
+  - Updated quick usage examples to demonstrate new functionality
+- **CLAUDE.md** - Added comprehensive "Commented References Tracking" section
+  - Documented implementation details, detection patterns, and test coverage
+  - Added to "Recent Improvements" section for future development reference
+
+### Tests
+- **Test coverage** - Added comprehensive tests for commented references functionality
+  - `test_archive_unused_files_commented_references()`: Verifies default behavior treats commented-only as "used"
+  - `test_archive_unused_files_with_commented_flag()`: Verifies --commented flag includes commented-only files
+  - `test_archive_unused_images_commented_references()`: Verifies image detection for commented-only references
+  - `test_archive_unused_images_with_commented_flag()`: Verifies --commented flag includes commented-only images
+  - All tests use line-by-line exact matching to avoid substring false positives
+  - Total test coverage: 10/10 tests passing (4 new tests added)
+
 ## [0.1.33] - 2025-10-28

 ### Fixed
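
The two include patterns quoted under "Enhanced" can be exercised in isolation. The following is a minimal sketch for illustration only: the `classify_include` helper and the sample lines are hypothetical and are not part of this commit.

```python
import re

# Patterns as quoted in the changelog entry.
include_pattern = re.compile(r'include::(.+?)\[')
commented_include_pattern = re.compile(r'^\s*//.*include::(.+?)\[')


def classify_include(line):
    """Hypothetical helper: return (kind, target) for an include line, else None."""
    match = commented_include_pattern.search(line)
    if match:
        return ('commented', match.group(1))
    match = include_pattern.search(line)
    if match:
        return ('uncommented', match.group(1))
    return None


samples = [
    'include::modules/a.adoc[leveloffset=+1]',
    '// include::modules/b.adoc[]',
    '   //include::modules/c.adoc[]',
    'Plain paragraph text.',
]
results = [classify_include(line) for line in samples]
```

Checking the commented pattern first matters, because the plain `include::` pattern also matches a commented line.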

CLAUDE.md

Lines changed: 30 additions & 0 deletions
@@ -600,6 +600,36 @@ When contributing to this project:

 ## Recent Improvements (Latest Refactoring)

+### Commented References Tracking (v0.1.35)
+1. **Enhanced Archive Tools**: Added intelligent handling of commented references in both archive-unused-files and archive-unused-images
+   - **Default Behavior**: Files/images referenced only in commented lines are considered "used" and will NOT be archived
+   - **New `--commented` Flag**: Include commented-only references in archive operations
+   - **Detailed Reports**: Automatically generates reports showing files/images referenced only in comments with exact locations
+2. **Implementation**: Dual tracking system for referenced content
+   - `referenced_files` set: Tracks uncommented includes/images
+   - `commented_only_files` dict: Tracks commented-only references with file paths, line numbers, and text
+   - State management: Files move from "commented-only" to "used" when uncommented reference found
+   - Report generation in `./archive/commented-references-report.txt` and `./archive/commented-image-references-report.txt`
+3. **Detection Patterns**: Robust regex-based commented line detection
+   - For files: `^\s*//.*include::(.+?)\[` detects commented includes
+   - For images: `^\s*//` checks if entire line is commented
+   - Handles whitespace variations in AsciiDoc comment syntax
+4. **CLI Changes**: New flag added to both tools
+   - `archive-unused-files --commented`: Include files with commented-only references
+   - `archive-unused-images --commented`: Include images with commented-only references
+   - Default behavior prioritizes safety (preserving potentially useful content)
+5. **Documentation**: Comprehensive updates across all documentation
+   - Added "Commented References Behavior" sections to tool documentation
+   - Added "Working with Commented References" examples showing practical workflows
+   - Updated GitHub Pages tool index with new feature descriptions
+   - Created detailed release notes in `/tmp/release-notes-commented-references.txt`
+6. **Test Coverage**: Added comprehensive tests for both tools
+   - `test_archive_unused_files_commented_references()`: Verifies default behavior
+   - `test_archive_unused_files_with_commented_flag()`: Verifies --commented flag
+   - `test_archive_unused_images_commented_references()`: Verifies image detection
+   - `test_archive_unused_images_with_commented_flag()`: Verifies image --commented flag
+   - All tests use line-by-line exact matching to avoid substring false positives
+
 ### Definition Prefix Options (v0.1.34)
 1. **New CLI Options for convert-callouts-to-deflist**: Added prefix support for definition list format
    - `-s, --specifies`: Adds "Specifies " prefix before each definition
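
The difference between the default behavior and `--commented` comes down to which set of basenames counts as "used". A minimal sketch of that selection, assuming the two tracking structures described in item 2; `select_unused` is a hypothetical name, not a function in this repository:

```python
import os


def select_unused(all_files, referenced, commented_only, include_commented=False):
    """Hypothetical helper: return files whose basename is not considered used.

    referenced      -- set of basenames with at least one uncommented reference
    commented_only  -- dict keyed by basenames referenced only in comments
    """
    used = set(referenced)
    if not include_commented:
        # Default: commented-only references still protect a file from archiving.
        used |= set(commented_only)
    return [f for f in all_files if os.path.basename(f) not in used]
```

With `include_commented=True` only uncommented references protect a file, so commented-only files join the archive candidates.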

archive_unused_files.py

Lines changed: 3 additions & 2 deletions
@@ -22,6 +22,7 @@ def main():
         epilog='By default, automatically discovers all modules and assemblies directories in the repository.'
     )
     parser.add_argument('--archive', action='store_true', help='Move the files to a dated zip in the archive directory.')
+    parser.add_argument('--commented', action='store_true', help='Include files that are referenced only in commented lines in the archive operation.')
     parser.add_argument('--scan-dir', action='append', default=[], help='Specific directory to scan (can be used multiple times). If not specified, auto-discovers directories.')
     parser.add_argument('--exclude-dir', action='append', default=[], help='Directory to exclude (can be used multiple times).')
     parser.add_argument('--exclude-file', action='append', default=[], help='File to exclude (can be used multiple times).')
@@ -35,13 +36,13 @@ def main():

     exclude_dirs = list(args.exclude_dir)
     exclude_files = list(args.exclude_file)
-
+
     if args.exclude_list:
         list_dirs, list_files = parse_exclude_list_file(args.exclude_list)
         exclude_dirs.extend(list_dirs)
         exclude_files.extend(list_files)

-    find_unused_adoc(scan_dirs, archive_dir, args.archive, exclude_dirs, exclude_files)
+    find_unused_adoc(scan_dirs, archive_dir, args.archive, exclude_dirs, exclude_files, args.commented)

 if __name__ == '__main__':
     main()
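
The new option is a plain `store_true` flag, so it defaults to `False` and existing invocations keep their behavior. A small sketch of that wiring, assuming a stripped-down parser; `build_parser` is a hypothetical helper and the real `main()` defines more options than shown here:

```python
import argparse


def build_parser():
    """Hypothetical, reduced parser with only the two flags relevant here."""
    parser = argparse.ArgumentParser(description='Archive unused files (sketch).')
    parser.add_argument('--archive', action='store_true',
                        help='Move the files to a dated zip in the archive directory.')
    parser.add_argument('--commented', action='store_true',
                        help='Include files that are referenced only in commented lines.')
    return parser


# No flags: commented-only files stay protected.
default_args = build_parser().parse_args([])
# Both flags: commented-only files become archive candidates.
flagged_args = build_parser().parse_args(['--archive', '--commented'])
```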

archive_unused_images.py

Lines changed: 3 additions & 2 deletions
@@ -18,6 +18,7 @@ def main():
     check_version_on_startup()
     parser = argparse.ArgumentParser(description='Archive unused image files.')
     parser.add_argument('--archive', action='store_true', help='Move the files to a dated zip in the archive directory.')
+    parser.add_argument('--commented', action='store_true', help='Include images that are referenced only in commented lines in the archive operation.')
     parser.add_argument('--exclude-dir', action='append', default=[], help='Directory to exclude (can be used multiple times).')
     parser.add_argument('--exclude-file', action='append', default=[], help='File to exclude (can be used multiple times).')
     parser.add_argument('--exclude-list', type=str, help='Path to a file containing directories or files to exclude, one per line.')
@@ -29,13 +30,13 @@ def main():

     exclude_dirs = list(args.exclude_dir)
     exclude_files = list(args.exclude_file)
-
+
     if args.exclude_list:
         list_dirs, list_files = parse_exclude_list_file(args.exclude_list)
         exclude_dirs.extend(list_dirs)
         exclude_files.extend(list_files)

-    find_unused_images(scan_dirs, archive_dir, args.archive, exclude_dirs, exclude_files)
+    find_unused_images(scan_dirs, archive_dir, args.archive, exclude_dirs, exclude_files, args.commented)

 if __name__ == '__main__':
     main()
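
The image-side implementation is not part of the files shown here. According to the changelog, it first checks whether the whole line is commented (`^\s*//`) and only then looks for image references. A rough sketch of that idea follows. The `image_ref` pattern and the `scan_image_lines` helper are assumptions made for illustration and may differ from what `find_unused_images` actually does.

```python
import os
import re

comment_line = re.compile(r'^\s*//')              # pattern quoted in the changelog
image_ref = re.compile(r'image::?([^\[\s]+)\[')   # assumed pattern, not from the commit


def scan_image_lines(lines):
    """Hypothetical helper: split image basenames into used vs. commented-only."""
    used = set()
    commented_only = {}
    for line_num, line in enumerate(lines, 1):
        targets = image_ref.findall(line)
        if not targets:
            continue
        is_commented = comment_line.match(line) is not None
        for target in targets:
            name = os.path.basename(target)
            if is_commented:
                commented_only.setdefault(name, []).append((line_num, line.strip()))
            else:
                used.add(name)
    # An image with any uncommented reference is not "commented-only".
    for name in used:
        commented_only.pop(name, None)
    return used, commented_only
```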

doc_utils/unused_adoc.py

Lines changed: 84 additions & 23 deletions
@@ -60,10 +60,10 @@ def find_scan_directories(base_path='.', exclude_dirs=None):

     return scan_dirs

-def find_unused_adoc(scan_dirs=None, archive_dir='./archive', archive=False, exclude_dirs=None, exclude_files=None):
+def find_unused_adoc(scan_dirs=None, archive_dir='./archive', archive=False, exclude_dirs=None, exclude_files=None, include_commented=False):
     # Print safety warning
     print("\n⚠️ SAFETY: Work in a git branch! Run without --archive first to preview.\n")
-
+
     # If no scan_dirs provided, auto-discover them
     if not scan_dirs:
         scan_dirs = find_scan_directories(exclude_dirs=exclude_dirs)
@@ -75,46 +75,107 @@ def find_unused_adoc(scan_dirs=None, archive_dir='./archive', archive=False, exc
             print("No 'modules' or 'assemblies' directories found containing .adoc files.")
             print("Please run this tool from your documentation repository root.")
             return
-
+
     # Detect repository type
     repo_type = detect_repo_type()
     print(f"Detected repository type: {repo_type}")
-
+
     # Collect all .adoc files in scan directories
     asciidoc_files = collect_files(scan_dirs, {'.adoc'}, exclude_dirs, exclude_files)
-
-    # Track which files are referenced
-    referenced_files = set()
-
+
+    # Track which files are referenced (uncommented and commented separately)
+    referenced_files = set()  # Files in uncommented includes
+    commented_only_files = {}  # Files referenced ONLY in commented lines: {basename: [(file, line_num, line_text)]}
+
     if repo_type == 'topic_map':
         # For OpenShift-docs style repos, get references from topic maps
         topic_references = get_all_topic_map_references()
         # Convert to basenames for comparison
         referenced_files.update(os.path.basename(ref) for ref in topic_references)
-
-    # Always scan for include:: directives in all .adoc files
+
+    # Patterns for finding includes (both commented and uncommented)
     include_pattern = re.compile(r'include::(.+?)\[')
+    commented_include_pattern = re.compile(r'^\s*//.*include::(.+?)\[')
+
     adoc_files = collect_files(['.'], {'.adoc'}, exclude_dirs, exclude_files)
-
+
     for file_path in adoc_files:
         try:
             with open(file_path, 'r', encoding='utf-8') as f:
-                content = f.read()
-                includes = include_pattern.findall(content)
-                # Extract just the filename from the include path
-                for include in includes:
-                    # Handle both relative and absolute includes
-                    include_basename = os.path.basename(include)
-                    referenced_files.add(include_basename)
+                lines = f.readlines()
+
+            for line_num, line in enumerate(lines, 1):
+                # Check if this is a commented include
+                commented_match = commented_include_pattern.search(line)
+                if commented_match:
+                    include_basename = os.path.basename(commented_match.group(1))
+                    # Track location of commented reference
+                    if include_basename not in commented_only_files:
+                        commented_only_files[include_basename] = []
+                    commented_only_files[include_basename].append((file_path, line_num, line.strip()))
+                else:
+                    # Check for uncommented includes
+                    uncommented_match = include_pattern.search(line)
+                    if uncommented_match:
+                        include_basename = os.path.basename(uncommented_match.group(1))
+                        referenced_files.add(include_basename)
+                        # If we found an uncommented reference, remove from commented_only tracking
+                        if include_basename in commented_only_files:
+                            del commented_only_files[include_basename]
         except Exception as e:
             print(f"Warning: could not read {file_path}: {e}")
-
-    # Find unused files by comparing basenames
-    unused_files = [f for f in asciidoc_files if os.path.basename(f) not in referenced_files]
+
+    # Determine which files are unused based on the include_commented flag
+    if include_commented:
+        # When --commented is used: treat files with commented-only references as unused
+        # Only files with uncommented references are considered "used"
+        unused_files = [f for f in asciidoc_files if os.path.basename(f) not in referenced_files]
+        commented_only_unused = []
+    else:
+        # Default behavior: files referenced only in commented lines are considered "used"
+        # They should NOT be in the unused list, but we track them for reporting
+        all_referenced = referenced_files.union(set(commented_only_files.keys()))
+        unused_files = [f for f in asciidoc_files if os.path.basename(f) not in all_referenced]
+
+        # Generate list of files referenced only in comments for the report
+        commented_only_unused = []
+        for basename, references in commented_only_files.items():
+            # Find the full path for this basename in asciidoc_files
+            matching_files = [f for f in asciidoc_files if os.path.basename(f) == basename]
+            for f in matching_files:
+                commented_only_unused.append((f, references))
+
     unused_files = list(dict.fromkeys(unused_files))  # Remove duplicates
-
+
+    # Print summary
     print(f"Found {len(unused_files)} unused files out of {len(asciidoc_files)} total files in scan directories")
-
+
+    # Generate detailed report for commented-only references
+    if commented_only_unused and not include_commented:
+        report_path = os.path.join(archive_dir, 'commented-references-report.txt')
+        os.makedirs(archive_dir, exist_ok=True)
+
+        with open(report_path, 'w', encoding='utf-8') as report:
+            report.write("Files Referenced Only in Commented Lines\n")
+            report.write("=" * 70 + "\n\n")
+            report.write(f"Found {len(commented_only_unused)} files that are referenced only in commented-out includes.\n")
+            report.write("These files are considered 'used' by default and will NOT be archived.\n\n")
+            report.write("To archive these files along with other unused files, use the --commented flag.\n\n")
+            report.write("-" * 70 + "\n\n")
+
+            for file_path, references in sorted(commented_only_unused):
+                report.write(f"File: {file_path}\n")
+                report.write(f"Referenced in {len(references)} commented line(s):\n")
+                for ref_file, line_num, line_text in references:
+                    report.write(f" {ref_file}:{line_num}\n")
+                    report.write(f" {line_text}\n")
+                report.write("\n")
+
+        print(f"\n📋 Found {len(commented_only_unused)} files referenced only in commented lines.")
+        print(f" Detailed report saved to: {report_path}")
+        print(f" These files are considered 'used' and will NOT be archived by default.")
+        print(f" To include them in the archive operation, use the --commented flag.\n")
+
     return write_manifest_and_archive(
         unused_files, archive_dir, 'to-archive', 'to-archive', archive=archive
     )
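
One thing to watch in the scanning loop of `find_unused_adoc`: an entry is removed from `commented_only_files` only when the uncommented include is seen after the commented one. If the commented line is scanned later, the file appears to stay in the commented-only report even though it is used. A possible order-independent variant is sketched below. `track_references` is a hypothetical function, and it takes an in-memory mapping of path to lines instead of reading from disk so that it stays self-contained.

```python
import os
import re

include_pattern = re.compile(r'include::(.+?)\[')
commented_include_pattern = re.compile(r'^\s*//.*include::(.+?)\[')


def track_references(files):
    """Hypothetical sketch. files: dict of path -> list of lines."""
    referenced = set()
    commented_only = {}
    for path, lines in files.items():
        for line_num, line in enumerate(lines, 1):
            match = commented_include_pattern.search(line)
            if match:
                name = os.path.basename(match.group(1))
                commented_only.setdefault(name, []).append((path, line_num, line.strip()))
                continue
            match = include_pattern.search(line)
            if match:
                referenced.add(os.path.basename(match.group(1)))
    # Reconcile once at the end so the scan order does not matter.
    for name in referenced:
        commented_only.pop(name, None)
    return referenced, commented_only
```

Reconciling after the scan keeps the per-line logic simple; the alternative is to skip recording a commented reference whenever the basename is already in the referenced set.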
