This tool converts PDF files and images into editable PowerPoint presentations (.pptx) by leveraging structured data from the MinerU PDF Extractor. It accurately reconstructs text, images, and layout, providing a high-fidelity, editable version of the original document.
The application features a user-friendly graphical interface (GUI) and is designed for easy use.
As a user, you only need the packaged Windows release (CPU or GPU variant). You do not need to install Python or any libraries.
- Download the Application: Get the latest package from the project's Releases page.
  - MinerU2PPT-win64-cpu-setup.exe: CPU-only package (recommended default).
  - MinerU2PPT-win64-gpu-cu118-setup.exe: CUDA 11.8 GPU package.
  - MinerU2PPT-win64-gpu-cu126-setup.exe: CUDA 12.6 GPU package.
  - MinerU2PPT-win64-gpu-cu129-setup.exe: CUDA 12.9 GPU package.
- Get the MinerU JSON File:
  - Go to the MinerU PDF/Image Extractor.
  - Upload your PDF or image file and let it process.
  - Download the resulting JSON file. This file contains the structural information that this tool needs for the conversion.

- Run the Converter:
  - Double-click the executable to start the application.
  - Select Input File: Drag and drop your PDF or image file onto the first input field, or use the "Browse..." button.
  - Select JSON File: Drag and drop the JSON file you downloaded from MinerU onto the second input field.
  - Output Path: The output path for your new PowerPoint file is filled in automatically. You can change it by typing directly or by using the "Save As..." button.
  - Options:
    - Remove Watermark: Check this box to automatically erase elements like page numbers or footers.
    - Generate Debug Images: Keep this unchecked unless you are troubleshooting.
  - Click Start Conversion.
- Open Your File: Once the conversion is complete, click the "Open Output Folder" button to find your new .pptx file.
The application also supports converting multiple files at once in Batch Mode.
- Switch to Batch Mode: Click the "Batch Mode" button in the top-right corner of the application. The interface will switch to the batch processing view.
- Add Tasks:
  - Click the "Add Task" button. A new window will pop up.
  - In the popup, select the Input File, the corresponding MinerU JSON File, and specify the Output Path.
  - Set the Remove Watermark option for this specific task.
  - Click "OK" to add the task to the list.
- Manage Tasks: You can add multiple tasks to the list. If you need to remove a task, select it from the list and click "Delete Task".
- Start Batch Conversion: Once all your tasks are added, click "Start Batch Conversion". The application will process each task sequentially. A log will show the progress for each file.
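For developers running from source, the same sequential behaviour can be approximated by driving the CLI (main.py, described in the development notes) from a short script. This is only a sketch: the Task class and both function names are hypothetical, it uses only the --json, --input, and --output flags, it assumes main.py returns a non-zero exit code on failure, and it does not set the Remove Watermark option because no CLI flag for it is documented here.

```python
import subprocess
import sys
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Task:
    """One conversion job (hypothetical helper, not part of the project)."""
    input_path: str   # PDF or image file
    json_path: str    # MinerU JSON for that file
    output_path: str  # destination .pptx


def build_command(task: Task) -> List[str]:
    """Build the main.py invocation for one task using the documented flags."""
    return [
        sys.executable, "main.py",
        "--json", task.json_path,
        "--input", task.input_path,
        "--output", task.output_path,
    ]


def run_batch(tasks: Sequence[Task],
              runner: Callable[[List[str]], int] = subprocess.call) -> List[bool]:
    """Run tasks one at a time, in order, printing one log line per file."""
    results = []
    for index, task in enumerate(tasks, start=1):
        exit_code = runner(build_command(task))
        ok = exit_code == 0  # assumption: non-zero exit code means failure
        print(f"[{index}/{len(tasks)}] {task.input_path}: {'done' if ok else 'failed'}")
        results.append(ok)
    return results
```

Calling run_batch([Task("a.pdf", "a.json", "a.pptx")]) from the repository root would launch one conversion. The runner argument exists so the loop can be exercised without starting real conversions.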
This section provides instructions for running the application from source and packaging it for distribution.
- Clone the repository.
- It is recommended to use a virtual environment:
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
- Install dependencies based on your development target.
- Default (GPU CUDA 11.8 full dev):
pip install -r requirements.txt
- GPU CUDA 12.6 full dev:
pip install -r requirements-gpu-cu126.txt -r requirements-build.txt
- GPU CUDA 12.9 full dev:
pip install -r requirements-gpu-cu129.txt -r requirements-build.txt
- CPU full dev (for CI/other contributors):
pip install -r requirements-dev-cpu.txt
- To run the GUI application:
python gui.py
- To use the CLI:
python main.py --json <path_to_json> --input <path_to_pdf_or_image> --output <path_to_ppt> [OPTIONS]
- --ocr-device {auto,gpu,cpu}: OCR device policy. Default is auto (gpu -> cpu fallback).
- --ocr-model-root <path>: Optional local PaddleOCR model root. When omitted, PaddleOCR downloads models automatically on first run.
- --ocr-model-variant {auto,lite,server}: OCR model variant. Default auto picks server when a GPU is available, otherwise lite (mobile models).
- --ocr-font-distance-threshold <float>: Font sensitivity for OCR bbox refinement (default 60.0). Higher values tend to produce larger text boxes.
Example:
python main.py --json "demo/case1/MinerU_xxx.json" --input "demo/case1/PixPin_xxx.png" --output "out.pptx" --ocr-device auto --ocr-font-distance-threshold 60
If you want the regression run to also produce PPT files for direct visual review, run:
python -m pytest "tests/integration/test_case1_ocr.py" -k all_demo_cases_generate_ppt_outputs_for_manual_review
Generated PPT files will be saved to:
- tmp/regression_ppt_outputs/case1.pptx
- tmp/regression_ppt_outputs/case2.pptx
- tmp/regression_ppt_outputs/case3.pptx
- tmp/regression_ppt_outputs/case4.pptx
- tmp/regression_ppt_outputs/case5.pptx
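The auto defaults described for --ocr-device and --ocr-model-variant can be read as two small decision rules. The sketch below is illustrative only and is not taken from the project's code: the function names are hypothetical, the gpu_available flag is assumed to be supplied by the caller, and the "gpu -> cpu fallback" is modelled as a simple availability check, whereas the real tool may fall back at runtime in a different way.

```python
def resolve_ocr_device(requested: str, gpu_available: bool) -> str:
    """Map an --ocr-device value to a concrete device.

    'auto' prefers the GPU and falls back to the CPU when none is available;
    explicit 'gpu' or 'cpu' values are returned unchanged.
    """
    if requested not in {"auto", "gpu", "cpu"}:
        raise ValueError(f"unknown OCR device: {requested!r}")
    if requested == "auto":
        return "gpu" if gpu_available else "cpu"
    return requested


def resolve_ocr_variant(requested: str, gpu_available: bool) -> str:
    """Map an --ocr-model-variant value to a concrete variant.

    'auto' picks 'server' when a GPU is available, otherwise 'lite';
    explicit 'lite' or 'server' values are returned unchanged.
    """
    if requested not in {"auto", "lite", "server"}:
        raise ValueError(f"unknown OCR model variant: {requested!r}")
    if requested == "auto":
        return "server" if gpu_available else "lite"
    return requested
```

Under these rules, a machine without a GPU that keeps both defaults ends up on the CPU with the lite models, while an explicit choice such as --ocr-model-variant server is passed through as given.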
This project now recommends onedir/installer-style packaging over onefile for better runtime stability and easier model deployment.
- Install PyInstaller:
pip install pyinstaller
- OCR models (auto-download by default): By default, PaddleOCR downloads models automatically on first run. To provide local models, pass --ocr-model-root or set MINERU_OCR_MODEL_ROOT.
Optional local layout (when provided):
models/paddleocr/<variant>/<lang>/det
models/paddleocr/<variant>/<lang>/rec
models/paddleocr/<variant>/<lang>/cls  # optional if angle classification is enabled
Where <variant> is lite or server.
- Build the onedir package:
pyinstaller --clean gui.spec
- Find build output: The packaged app directory will be generated under dist/MinerU2PPT/.
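Before bundling local OCR models, it can help to confirm that the optional layout from the OCR models step is complete. The following is a minimal sketch under stated assumptions: both function names are hypothetical, the layout is assumed to sit under a base directory that contains models/ (how --ocr-model-root maps onto that directory is not specified here), and "ch" in the usage note is just an example language code.

```python
from pathlib import Path
from typing import List


def expected_model_dirs(base_dir, variant: str, lang: str,
                        use_angle_cls: bool = False) -> List[Path]:
    """List the directories the documented layout calls for.

    det and rec are always expected; cls is only expected when angle
    classification is enabled.
    """
    if variant not in ("lite", "server"):
        raise ValueError(f"variant must be 'lite' or 'server', got {variant!r}")
    lang_dir = Path(base_dir) / "models" / "paddleocr" / variant / lang
    names = ["det", "rec"] + (["cls"] if use_angle_cls else [])
    return [lang_dir / name for name in names]


def missing_model_dirs(base_dir, variant: str, lang: str,
                       use_angle_cls: bool = False) -> List[Path]:
    """Return the expected directories that do not exist on disk."""
    return [path
            for path in expected_model_dirs(base_dir, variant, lang, use_angle_cls)
            if not path.is_dir()]
```

For example, missing_model_dirs(".", "lite", "ch") returns an empty list when both models/paddleocr/lite/ch/det and models/paddleocr/lite/ch/rec exist under the current directory.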
- Documentation domains:
  - docs/architecture/
  - docs/testing/
  - docs/core-flow/
  - docs/api/
- Core flow docs:
  - docs/core-flow/font-size-normalization-pre-render.md
  - docs/core-flow/ocr-bbox-xy-refine-flow.md
  - docs/core-flow/watermark-ir-removal-flow.md
- Architecture docs:
  - docs/architecture/ocr-engine-configuration.md
- Testing docs:
  - docs/testing/font-size-normalization-testing.md
  - docs/testing/ocr-bbox-refine-testing.md
  - docs/testing/ocr-configuration-testing.md
  - docs/testing/watermark-ir-removal-testing.md
