Automate PDF bookmark creation using OCR and AI processing. This tool allows you to generate bookmarks for PDFs by processing table of contents (TOC) images or manually inputting JSON data. It supports both text-based and scanned PDFs with OCR functionality.
- OCR Processing: Extract text from TOC images using Tesseract OCR.
- AI-Powered JSON Generation: Use OpenAI's GPT to refine OCR output into structured JSON.
- Manual JSON Input: Load or paste JSON data for bookmark creation.
- Scanned PDF Support: Enable OCR for scanned PDFs to locate bookmark positions accurately.
- Page Offset Adjustment: Adjust page numbers to match the PDF's actual content.
- Modern GUI: Built with CustomTkinter for a sleek and user-friendly interface.
- Cross-Platform: Works on Windows, macOS, and Linux.
- Python 3.8+: Ensure Python is installed on your system.
- Tesseract OCR: Install Tesseract from UB-Mannheim.
- Clone the repository:
    git clone https://github.com/AhmedMoustafaa/PDF-Bookmark-Automator.git
    cd PDF-Bookmark-Automator
- Install dependencies:
    pip install -r requirements.txt
- Configure Tesseract (needed to OCR image-based tables of contents):
  - Set the Tesseract path in the Settings tab of the application.
  - Alternatively, add Tesseract to your system PATH.
- Set the OpenAI API key (only needed if you want an LLM to refine the OCR output automatically):
  - Obtain an API key from OpenAI.
  - Enter the key in the Settings tab.
- Run the application:
    python app/main.py
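If OCR does not start after installation, it can help to check which Tesseract executable would be picked up. The following is a minimal sketch, not the application's own code: resolve_tesseract is a hypothetical helper that prefers an explicitly configured path and otherwise falls back to the system PATH, mirroring the two configuration options above.

```python
import os
import shutil
from typing import Optional


def resolve_tesseract(configured_path: Optional[str] = None) -> Optional[str]:
    """Return a usable Tesseract executable path, or None if none is found.

    Hypothetical helper: prefers the explicitly configured path, then falls
    back to whatever `tesseract` resolves to on the system PATH.
    """
    if configured_path and os.path.isfile(configured_path):
        return configured_path
    # shutil.which returns None when the executable is not on PATH.
    return shutil.which("tesseract")


if __name__ == "__main__":
    found = resolve_tesseract()
    print(found or "Tesseract not found: set its path in the Settings tab or add it to PATH")
```

Running the script prints either the resolved path or a hint about the two ways to configure it.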
- Go to the OCR Processing tab.
- Select images of your table of contents (PNG/JPG).
- Click Run OCR to extract text.
- Review the OCR output and click Refine with LLM to generate structured JSON.
- Go to the Manual JSON Input tab.
- Use the Copy LLM Prompt Template button to get a template for generating JSON.
- Paste the JSON or load it from a file.
- Click Validate JSON to ensure the structure is correct.
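The kind of check that Validate JSON performs can be sketched as follows. The schema here is an assumption made for illustration (the source does not spell out the expected structure): a list of objects, each with a "title", a "page", and an optional nested "children" list. The function name validate_bookmarks is likewise hypothetical.

```python
import json
from typing import Any, List


def validate_bookmarks(data: Any, path: str = "root") -> List[str]:
    """Return a list of problems found; an empty list means the data is valid.

    Assumed (hypothetical) schema: a list of objects, each with a non-empty
    string "title", a positive integer "page", and an optional "children"
    list that follows the same rules.
    """
    if not isinstance(data, list):
        return [f"{path}: expected a list of bookmark entries"]
    errors: List[str] = []
    for i, entry in enumerate(data):
        where = f"{path}[{i}]"
        if not isinstance(entry, dict):
            errors.append(f"{where}: expected an object")
            continue
        title = entry.get("title")
        if not isinstance(title, str) or not title.strip():
            errors.append(f"{where}: 'title' must be a non-empty string")
        page = entry.get("page")
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(page, bool) or not isinstance(page, int) or page < 1:
            errors.append(f"{where}: 'page' must be a positive integer")
        if "children" in entry:
            errors.extend(validate_bookmarks(entry["children"], f"{where}.children"))
    return errors


# One valid chapter with one invalid child (empty title, page 0).
sample = json.loads('[{"title": "Chapter 1", "page": 1, "children": [{"title": "", "page": 0}]}]')
problems = validate_bookmarks(sample)
for problem in problems:
    print(problem)
```

Collecting every problem instead of stopping at the first makes it easier to fix hand-edited or LLM-generated JSON in one pass.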
- Go to the PDF Operations tab.
- Select the target PDF file.
- Set the page offset if needed (e.g., if the TOC page numbers don't match the PDF's page numbers): offset = [PDF viewer page number] - [TOC page number].
- Enable OCR for Scanned Pages if working with scanned PDFs.
- Click Apply Bookmarks to generate the bookmarked PDF.
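As a worked example of the page offset, the sketch below shifts each TOC page by the offset and flattens nested entries into [level, title, page] rows. That row shape is the one PyMuPDF's Document.set_toc accepts (with 1-based page numbers); whether the application builds its outline this way is not stated in the source. The entry keys "title", "page", and "children" and the function name to_pdf_outline are assumptions for illustration.

```python
from typing import Any, Dict, List


def to_pdf_outline(entries: List[Dict[str, Any]], offset: int = 0, level: int = 1) -> List[list]:
    """Flatten nested entries into [level, title, pdf_page] rows.

    Each row uses PDF_page = JSON_page + offset. The entry keys are a
    hypothetical schema, not one confirmed by the project.
    """
    rows: List[list] = []
    for entry in entries:
        rows.append([level, entry["title"], entry["page"] + offset])
        # Children sit one outline level deeper than their parent.
        rows.extend(to_pdf_outline(entry.get("children", []), offset, level + 1))
    return rows


# Worked example: the TOC says Chapter 1 starts on page 1, but the PDF viewer
# shows that page as page 13, so offset = 13 - 1 = 12.
offset = 13 - 1
toc = [{"title": "Chapter 1", "page": 1, "children": [{"title": "1.1 Scope", "page": 4}]}]
outline = to_pdf_outline(toc, offset)
print(outline)  # [[1, 'Chapter 1', 13], [2, '1.1 Scope', 16]]
```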
- Tesseract Path: Set the path to the Tesseract executable.
- OpenAI API Key: Enter your OpenAI API key for AI processing.
- Default Offset: Set a default page offset for all operations.
The application saves settings in config.json:
{
  "tesseract_path": "/path/to/tesseract",
  "openai_api_key": "your-api-key",
  "default_offset": 0
}
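A settings file with these three keys could be read and written roughly as follows. This is a sketch under assumptions, not the application's actual loader: load_config, save_config, and the DEFAULTS values are hypothetical, and only the three key names come from the example above.

```python
import json
from pathlib import Path
from typing import Any, Dict

# Hypothetical fallback values used when config.json is missing or incomplete.
DEFAULTS: Dict[str, Any] = {"tesseract_path": "", "openai_api_key": "", "default_offset": 0}


def load_config(path: str = "config.json") -> Dict[str, Any]:
    """Read settings from `path`, falling back to DEFAULTS for missing keys."""
    config = dict(DEFAULTS)
    file = Path(path)
    if file.is_file():
        with file.open(encoding="utf-8") as fh:
            loaded = json.load(fh)
        # Only accept known keys so a stray entry cannot leak into the settings.
        config.update({k: v for k, v in loaded.items() if k in DEFAULTS})
    return config


def save_config(config: Dict[str, Any], path: str = "config.json") -> None:
    """Write settings back as indented JSON."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(config, fh, indent=2)
```

Because the API key is stored in plain text, it is probably wise to keep config.json out of version control.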
- OCR Errors:
  - Ensure Tesseract is installed and the path is correct.
  - Use high-quality images for better OCR accuracy.
- PDF Encryption:
  - Decrypt PDFs before processing.
- Page Offset Issues:
  - Calculate the offset from PDF_page = JSON_page + offset, i.e. offset = PDF_page - JSON_page.
- OpenAI API Errors:
  - Verify your API key and ensure you have sufficient credits.
We welcome contributions! Please follow these steps:
- Fork the repository.
- Create a new branch (git checkout -b feature/YourFeature).
- Commit your changes (git commit -m 'Add some feature').
- Push to the branch (git push origin feature/YourFeature).
- Open a pull request.
This project is licensed under the MIT License. See the LICENSE file for details.
- CustomTkinter for the modern GUI.
- PyMuPDF for PDF manipulation.
- Tesseract OCR for text recognition.
- OpenAI for AI-powered JSON generation.
- pdf-bookmark: an older repository written in Java that, together with the need for such a tool, inspired this project.
