Commit 41bc452

Add UI image
1 parent 463c8ec commit 41bc452

2 files changed

+103
-71
lines changed

README.md

Lines changed: 103 additions & 71 deletions
@@ -26,62 +26,30 @@
2626

2727
This project provides a powerful and flexible PDF analysis microservice built with **Clean Architecture** principles. The service enables OCR, segmentation, and classification of different parts of PDF pages, identifying elements such as texts, titles, pictures, tables, formulas, and more. Additionally, it determines the correct reading order of these identified elements and can convert PDFs to various formats including Markdown and HTML with **automatic translation support** powered by Ollama.
2828

29-
### ✨ Key Features
29+
The service offers both a **user-friendly Gradio web interface** for interactive use and a **comprehensive REST API** for programmatic access and integration.
3030

31-
- 🔍 **Advanced PDF Layout Analysis** - Segment and classify PDF content with high accuracy
32-
- 🖼️ **Visual & Fast Models** - Choose between VGT (Vision Grid Transformer) for accuracy or LightGBM for speed
33-
- 📝 **Multi-format Output** - Export to JSON, Markdown, HTML, and visualize PDF segmentations
34-
- 🌍 **Automatic Translation** - Translate documents to multiple languages using Ollama models
35-
- 🌐 **OCR Support** - 150+ language support with Tesseract OCR
36-
- 📊 **Table & Formula Extraction** - Extract tables as HTML and formulas as LaTeX
37-
- 🏗️ **Clean Architecture** - Modular, testable, and maintainable codebase
38-
- 🐳 **Docker-Ready** - Easy deployment with GPU support
39-
- **RESTful API** - Comprehensive API with 10+ endpoints
40-
41-
<table>
42-
<tr>
43-
<td>
44-
<img src="https://raw.githubusercontent.com/huridocs/pdf-document-layout-analysis/main/images/vgtexample1.png"/>
45-
</td>
46-
<td>
47-
<img src="https://raw.githubusercontent.com/huridocs/pdf-document-layout-analysis/main/images/vgtexample2.png"/>
48-
</td>
49-
<td>
50-
<img src="https://raw.githubusercontent.com/huridocs/pdf-document-layout-analysis/main/images/vgtexample3.png"/>
51-
</td>
52-
<td>
53-
<img src="https://raw.githubusercontent.com/huridocs/pdf-document-layout-analysis/main/images/vgtexample4.png"/>
54-
</td>
55-
</tr>
56-
</table>
57-
58-
### 🔗 Project Links
59-
60-
- **GitHub**: [pdf-document-layout-analysis](https://github.com/huridocs/pdf-document-layout-analysis)
61-
- **HuggingFace**: [pdf-document-layout-analysis](https://huggingface.co/HURIDOCS/pdf-document-layout-analysis)
62-
- **DockerHub**: [pdf-document-layout-analysis](https://hub.docker.com/r/huridocs/pdf-document-layout-analysis/)
31+
<div align="center">
32+
<img src="https://raw.githubusercontent.com/huridocs/pdf-document-layout-analysis/main/images/ui.png" alt="Gradio Web UI" width="800"/>
33+
<p><em>Gradio Web Interface - Easy-to-use UI for PDF analysis, conversion, and translation</em></p>
34+
</div>
6335

6436
---
6537

6638
## 🚀 Quick Start
6739

6840
### 1. Start the Service
6941

70-
**Standard PDF Analysis (recommended for most users):**
71-
```bash
72-
make start
73-
```
74-
75-
**With Translation Features (includes Ollama container):**
7642
```bash
77-
make start_translation
43+
just start
7844
```
7945

80-
The service will be available at `http://localhost:5060`
46+
The service provides two interfaces:
47+
- **🎨 Web UI (Gradio)**: `http://localhost:7860` - User-friendly interface for all features
48+
- **🔌 REST API**: `http://localhost:5060` - Programmatic access for integrations
8149

8250
**See all available commands:**
8351
```bash
84-
make help
52+
just --list
8553
```
8654

8755
**Check service status:**
@@ -90,7 +58,18 @@ make help
9058
curl http://localhost:5060/info
9159
```
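The status check above can also be scripted, for example to wait for the containers to finish starting before sending documents. The following is a minimal Python sketch using only the standard library. The helper names `http_status` and `wait_until_ready` are hypothetical, and the sketch assumes only that `GET /info` answers with HTTP 200 once the service is up; it does not rely on the shape of the response body.

```python
import time
import urllib.error
import urllib.request


def http_status(url, timeout=5.0):
    """Return the HTTP status code for a GET request, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # The server answered, but with an error status (4xx/5xx).
        return error.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar.
        return None


def wait_until_ready(url="http://localhost:5060/info", attempts=30, delay=2.0,
                     fetch=http_status, sleep=time.sleep):
    """Poll `url` until it answers with HTTP 200; return True on success.

    `fetch` and `sleep` are injectable so the loop can be tested offline.
    """
    for attempt in range(attempts):
        if fetch(url) == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

Calling `wait_until_ready()` with the defaults polls the REST API port shown above for up to about a minute. The attempt count and delay are illustrative values, not settings taken from the project.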
9260

93-
### 2. Basic PDF Analysis
61+
### 2. Using the Web UI
62+
63+
Simply open your browser and navigate to `http://localhost:7860` to access the intuitive web interface. The UI provides:
64+
65+
- 📄 **PDF Analysis** - Upload and analyze PDFs with visual results
66+
- 🔄 **Format Conversion** - Convert to Markdown or HTML
67+
- 🌍 **Translation** - Translate documents to multiple languages
68+
- 👁️ **Visualization** - View segmentation overlays on your PDFs
69+
- 🔍 **OCR Processing** - Apply OCR to scanned documents
70+
- 📑 **TOC Extraction** - Extract table of contents
71+
72+
### 3. Using the REST API
9473

9574
**Analyze a PDF document (VGT model - high accuracy):**
9675
```bash
@@ -102,18 +81,61 @@ curl -X POST -F 'file=@/path/to/your/document.pdf' http://localhost:5060
10281
curl -X POST -F 'file=@/path/to/your/document.pdf' -F "fast=true" http://localhost:5060
10382
```
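For integrations that cannot shell out to `curl`, the same upload can be sketched in Python with the standard library. This is an illustration under stated assumptions, not the project's client: the function names `build_multipart` and `analyze_pdf` are hypothetical, the form field names `file` and `fast` are taken from the `curl` commands above, and the response is returned as raw text because its structure is not shown here.

```python
import uuid
import urllib.request
from pathlib import Path


def build_multipart(fields, file_field, filename, file_bytes, boundary=None):
    """Encode form fields plus one file as multipart/form-data.

    Returns a tuple of (body bytes, Content-Type header value).
    """
    boundary = boundary or uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        text_part = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        )
        parts.append(text_part.encode("utf-8"))
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"\r\n'
        "Content-Type: application/pdf\r\n\r\n"
    )
    parts.append(file_header.encode("utf-8"))
    parts.append(file_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def analyze_pdf(pdf_path, fast=False, url="http://localhost:5060", timeout=600):
    """POST a PDF to the service and return the response body as text."""
    path = Path(pdf_path)
    # Mirrors `-F "fast=true"`; omitted for the default (VGT) request.
    fields = {"fast": "true"} if fast else {}
    body, content_type = build_multipart(
        fields, "file", path.name, path.read_bytes()
    )
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With the service running, `analyze_pdf("/path/to/your/document.pdf", fast=True)` corresponds to the second `curl` command. The 600-second timeout is an arbitrary illustrative value; adjust it to the size of the documents being processed.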
10483

105-
### 3. Stop the Service
84+
### 4. Stop the Service
10685

10786
```bash
108-
make stop
87+
just stop
10988
```
11089

111-
> 💡 **Tip**: Replace `/path/to/your/document.pdf` with the actual path to your PDF file. The service will return a JSON response with segmented content and metadata.
90+
> 💡 **Tip**: The Web UI at `http://localhost:7860` is the easiest way to get started. For automation and integration, use the REST API at `http://localhost:5060`.
11291
92+
---
93+
94+
## ✨ Key Features
95+
96+
- 🎨 **User-Friendly Web UI** - Intuitive Gradio interface for easy PDF processing
97+
- 🔍 **Advanced PDF Layout Analysis** - Segment and classify PDF content with high accuracy
98+
- 🖼️ **Visual & Fast Models** - Choose between VGT (Vision Grid Transformer) for accuracy or LightGBM for speed
99+
- 📝 **Multi-format Output** - Export to JSON, Markdown, HTML, and visualize PDF segmentations
100+
- 🌍 **Automatic Translation** - Translate documents to multiple languages using Ollama models
101+
- 🌐 **OCR Support** - 150+ language support with Tesseract OCR
102+
- 📊 **Table & Formula Extraction** - Extract tables as HTML and formulas as LaTeX
103+
- 🏗️ **Clean Architecture** - Modular, testable, and maintainable codebase
104+
- 🐳 **Docker-Ready** - Easy deployment with GPU support
105+
- **RESTful API** - Comprehensive API with 10+ endpoints
106+
107+
### 📸 Example Results
108+
109+
<table>
110+
<tr>
111+
<td>
112+
<img src="https://raw.githubusercontent.com/huridocs/pdf-document-layout-analysis/main/images/vgtexample1.png"/>
113+
</td>
114+
<td>
115+
<img src="https://raw.githubusercontent.com/huridocs/pdf-document-layout-analysis/main/images/vgtexample2.png"/>
116+
</td>
117+
<td>
118+
<img src="https://raw.githubusercontent.com/huridocs/pdf-document-layout-analysis/main/images/vgtexample3.png"/>
119+
</td>
120+
<td>
121+
<img src="https://raw.githubusercontent.com/huridocs/pdf-document-layout-analysis/main/images/vgtexample4.png"/>
122+
</td>
123+
</tr>
124+
</table>
125+
126+
### 🔗 Project Links
127+
128+
- **GitHub**: [pdf-document-layout-analysis](https://github.com/huridocs/pdf-document-layout-analysis)
129+
- **HuggingFace**: [pdf-document-layout-analysis](https://huggingface.co/HURIDOCS/pdf-document-layout-analysis)
130+
- **DockerHub**: [pdf-document-layout-analysis](https://hub.docker.com/r/huridocs/pdf-document-layout-analysis/)
131+
132+
---
113133

114134
## 📋 Table of Contents
115135

136+
- [🚀 Overview](#-overview)
116137
- [🚀 Quick Start](#-quick-start)
138+
- [✨ Key Features](#-key-features)
117139
- [⚙️ Dependencies](#-dependencies)
118140
- [📋 Requirements](#-requirements)
119141
- [📚 API Reference](#-api-reference)
@@ -601,37 +623,36 @@ For detailed information about the dataset, visit the [DocLayNet repository](htt
601623

602624
2. **Create virtual environment:**
603625
```bash
604-
make install_venv
626+
just install_venv
605627
```
606628

607629
3. **Activate environment:**
608630
```bash
609-
make activate
610-
# or manually: source .venv/bin/activate
631+
source .venv/bin/activate
611632
```
612633

613634
4. **Install dependencies:**
614635
```bash
615-
make install
636+
just install
616637
```
617638

618639
### Code Quality
619640

620641
**Format code:**
621642
```bash
622-
make formatter
643+
just formatter
623644
```
624645

625646
**Check formatting:**
626647
```bash
627-
make check_format
648+
just check_format
628649
```
629650

630651
### Testing
631652

632653
**Run tests:**
633654
```bash
634-
make test
655+
just test
635656
```
636657

637658
**Integration tests:**
@@ -642,26 +663,34 @@ python -m pytest src/tests/integration/test_end_to_end.py
642663

643664
### Docker Development
644665

645-
**Build and start (detached mode):**
666+
**Build and start:**
646667
```bash
647-
# With GPU
648-
make start_detached_gpu
668+
# Standard start (includes translation features)
669+
just start
670+
671+
# Start without translation support
672+
just start_no_translation
649673

650-
# Without GPU
651-
make start_detached
674+
# Start in detached mode (API only, no UI)
675+
just start_detached
652676

653-
# With translation features
654-
make start_translation
655-
make start_translation_no_gpu
677+
# Start in detached mode with GPU (API only, no UI)
678+
just start_detached_gpu
679+
680+
# Force CPU mode with translation
681+
just start_no_gpu
656682
```
657683

658684
**Clean up Docker resources:**
659685
```bash
686+
# Stop all services
687+
just stop
688+
660689
# Remove containers
661-
make remove_docker_containers
690+
just remove_docker_containers
662691

663692
# Remove images
664-
make remove_docker_images
693+
just remove_docker_images
665694
```
666695

667696
### Project Structure
@@ -673,12 +702,15 @@ pdf-document-layout-analysis/
673702
│ ├── use_cases/ # Application logic
674703
│ ├── adapters/ # External integrations
675704
│ ├── ports/ # Interface definitions
676-
│ └── drivers/ # Framework configurations
705+
│ ├── drivers/ # Framework configurations
706+
│ ├── app.py # FastAPI application
707+
│ └── gradio_app.py # Gradio web interface
677708
├── test_pdfs/ # Test PDF files
678709
├── models/ # ML model storage
679710
├── docker-compose.yml # Docker configuration
680-
├── Dockerfile # Container definition
681-
├── Makefile # Development commands
711+
├── Dockerfile # FastAPI container definition
712+
├── Dockerfile.gradio # Gradio container definition
713+
├── justfile # Development commands (just)
682714
├── pyproject.toml # Python project configuration
683715
└── requirements.txt # Python dependencies
684716
```
@@ -724,7 +756,7 @@ docker exec -it pdf-document-layout-analysis /bin/bash
724756

725757
**Free up disk space:**
726758
```bash
727-
make free_up_space
759+
just free_up_space
728760
```
729761

730762
### Order of Output Elements
@@ -927,8 +959,8 @@ We welcome contributions to improve the PDF Document Layout Analysis service!
927959

928960
3. **Set Up Development Environment**
929961
```bash
930-
make install_venv
931-
make install
962+
just install_venv
963+
just install
932964
```
933965

934966
4. **Make Your Changes**
@@ -938,8 +970,8 @@ We welcome contributions to improve the PDF Document Layout Analysis service!
938970

939971
5. **Run Tests and Quality Checks**
940972
```bash
941-
make test
942-
make check_format
973+
just test
974+
just check_format
943975
```
944976

945977
6. **Submit a Pull Request**

images/ui.png

443 KB