docs: clarify OCR requires markitdown-ocr plugin #1608

Open
Jah-yee wants to merge 1 commit into microsoft:main from Jah-yee:fix/ocr-documentation-clarity
Jah-yee commented Mar 11, 2026

Summary

This PR addresses issue #1601: OCR is not working for PDFs with embedded images.

The README currently lists "Images (EXIF metadata and OCR)" as a built-in feature, but OCR for PDFs actually requires the separate markitdown-ocr plugin to be installed and configured.

Changes

  • Updated feature list to clarify OCR requires markitdown-ocr plugin
  • Added CLI usage example for the OCR plugin
  • Addresses the confusion reported in issue #1601 ("OCR is not working")
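
For readers following along, here is a minimal sketch of what the clarified workflow might look like from Python. It assumes the plugin is importable as `markitdown_ocr` (a guess based on the package name, not confirmed in this PR) and that plugins are enabled on the CLI with `--use-plugins`. The helper names `ocr_plugin_installed` and `build_markitdown_command` are hypothetical and exist only for illustration. The plugin may also need its own configuration beyond this flag; see the plugin documentation.

```python
import importlib.util


def ocr_plugin_installed(module_name: str = "markitdown_ocr") -> bool:
    """Return True if a module with the given name can be found.

    ASSUMPTION: "markitdown_ocr" is a guessed import name for the
    markitdown-ocr plugin; adjust it if the real module name differs.
    """
    return importlib.util.find_spec(module_name) is not None


def build_markitdown_command(path: str, use_plugins: bool) -> list[str]:
    """Build the argument list for a markitdown CLI call.

    Plugins are off by default, so OCR for PDFs needs --use-plugins
    in addition to having the plugin installed.
    """
    cmd = ["markitdown"]
    if use_plugins:
        cmd.append("--use-plugins")
    cmd.append(path)
    return cmd


if __name__ == "__main__":
    if not ocr_plugin_installed():
        print("markitdown-ocr does not appear to be installed; "
              "PDF images will not be OCR'd.")
    # "scan.pdf" is a placeholder file name.
    print(" ".join(build_markitdown_command("scan.pdf", use_plugins=True)))
```

The resulting list can be passed to `subprocess.run` as-is. Checking for the plugin before converting makes the failure mode from #1601 (silently missing OCR output) visible to the user.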

Testing

  • Documentation builds correctly
  • The added CLI example matches the plugin documentation

- Update feature list to note OCR requires markitdown-ocr plugin
- Add CLI usage example for OCR plugin
- Addresses issue microsoft#1601: OCR is not working

Jah-yee mentioned this pull request Mar 11, 2026
