Commit 1e16e6d

docs: make AGENTS.md the single source of truth, CLAUDE.md links to it
- Remove Claude-specific references from AGENTS.md
- Slim down CLAUDE.md to just link to AGENTS.md for easier maintenance
1 parent 1a54d5d commit 1e16e6d

2 files changed, 3 additions and 92 deletions
AGENTS.md

Lines changed: 2 additions & 2 deletions
@@ -1,6 +1,6 @@
-# CLAUDE.md
+# AGENTS.md
 
-This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+This file provides guidance to AI coding agents when working with code in this repository.
 
 ## What This Is
 

CLAUDE.md

Lines changed: 1 addition & 90 deletions
@@ -1,92 +1,3 @@
 # CLAUDE.md
 
-This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
-
-## What This Is
-
-PaddleOCR is a production-ready OCR and document AI engine built on PaddlePaddle. It does text detection, recognition, document structure analysis, and information extraction.
-
-## Build & Verify
-
-```bash
-pip install -e ".[all]" # Dev install (paddlepaddle installed separately)
-pytest tests/ # Tests (resource-intensive skipped by default)
-pre-commit run --all-files # Lint/format
-```
-
-## Project Structure
-
-```
-PaddleOCR/
-├── paddleocr/ # Public API (3.x) — what users import
-│ ├── __init__.py # Top-level exports (__all__ is the source of truth)
-│ ├── _pipelines/ # High-level pipelines (OCR, PPStructureV3, etc.)
-│ ├── _models/ # Individual model wrappers (TextDetection, etc.)
-│ └── _cli.py # CLI entry point
-├── ppocr/ # Internal training framework (not user-facing)
-│ ├── modeling/ # Model architectures (Backbone, Neck, Head)
-│ ├── data/ # Data loading and augmentation
-│ ├── losses/ # Loss functions
-│ ├── metrics/ # Evaluation metrics
-│ └── postprocess/ # Post-processing
-├── tools/ # Train/infer/eval scripts (tools/train.py)
-├── configs/ # YAML configs organized by task (det/, rec/, table/, etc.)
-├── deploy/ # Deployment (C++, Docker, ONNX, mobile)
-├── tests/ # Tests (models/ + pipelines/)
-└── agent_docs/ # Detailed AI-readable documentation
-```
-
-Two layers — understand which you're working in:
-
-- **`paddleocr/`** — Public API (3.x). `_pipelines/` has high-level pipelines, `_models/` has individual model wrappers. Users import from here.
-- **`ppocr/`** — Internal training framework. Used by `tools/train.py`, not by end users.
-
-## Discovering Available Pipelines & Models
-
-**Do NOT rely on hardcoded lists.** Always discover dynamically from source:
-
-- **Pipelines**: Read `__all__` in `paddleocr/_pipelines/__init__.py`
-- **Models**: Read `__all__` in `paddleocr/_models/__init__.py`
-- **All public exports**: Read `__all__` in `paddleocr/__init__.py`
-
-Each pipeline inherits from `PaddleXPipelineWrapper` (in `_pipelines/base.py`).
-Each model inherits from `PaddleXPredictorWrapper` (in `_models/base.py`).
-
-To understand a specific pipeline or model, read its source file in the corresponding directory.
-
-## Critical: 3.x API Only
-
-PaddleOCR 3.x is **not backwards compatible** with 2.x. Never generate 2.x-style code:
-- Use `.predict()` not `.ocr()` (deprecated)
-- Results are objects with `.print()`, `.save_to_img()`, `.save_to_json()` — not nested lists
-- `PPStructure` is removed — use `PPStructureV3`
-- For single-task inference, use model classes (`TextDetection`, `TextRecognition`) not `det`/`rec` params
-
-## Code Style & Conventions
-
-- Follow existing patterns in the file you're modifying
-- Use type hints for function signatures
-- Use `pre-commit run --all-files` to lint before committing — this runs ruff, trailing whitespace fixes, and other checks
-- Error messages should be clear and actionable
-- No `eval()`, `exec()`, or `pickle` on user-controlled input
-
-## Testing
-
-- Tests live in `tests/` with subdirectories `models/` and `pipelines/`
-- Run with `pytest tests/` — resource-intensive tests are skipped by default
-- When adding a new pipeline or model, add corresponding tests
-- Test the public API (`.predict()`, result object methods), not internal implementation details
-
-## PR & Commit Guidelines
-
-- PR titles: concise, lowercase, descriptive of what changed
-- PR descriptions: explain the "why", not just the "what"
-- Keep PRs focused — one logical change per PR
-- Ensure `pre-commit run --all-files` passes before pushing
-
-## Detailed Docs
-
-Read these as needed — don't load them all upfront:
-- `agent_docs/inference_api.md` — Pipelines, models, constructor params, CLI, usage patterns
-- `agent_docs/training.md` — Training commands, config YAML structure, internal framework
-- `agent_docs/config_system.md` — YAML config structure, sections, overrides, transforms, builder flow
+See [AGENTS.md](AGENTS.md) for all project guidance.
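
The guidance removed from CLAUDE.md in this diff asks agents to discover pipelines and models by reading `__all__` from source rather than relying on hardcoded lists. A minimal sketch of one way to do that without importing the package, so PaddlePaddle does not need to be installed. It assumes each `__init__.py` assigns `__all__` as a plain literal list or tuple; the helper name `read_dunder_all` is hypothetical and not part of PaddleOCR.

```python
import ast
from pathlib import Path


def read_dunder_all(init_file: Path) -> list[str]:
    """Return the names in a module's __all__ by parsing its source.

    Assumption: __all__ is a top-level assignment of a literal list or
    tuple of strings. Dynamic forms such as `__all__ += [...]` are not
    handled, and an empty list is returned if no assignment is found.
    """
    tree = ast.parse(init_file.read_text(encoding="utf-8"))
    for node in tree.body:
        if not isinstance(node, ast.Assign):
            continue
        for target in node.targets:
            if isinstance(target, ast.Name) and target.id == "__all__":
                # literal_eval accepts an AST node and evaluates literals only.
                return list(ast.literal_eval(node.value))
    return []


if __name__ == "__main__":
    # Paths taken from the guidance text; run from the repository root.
    for rel in (
        "paddleocr/_pipelines/__init__.py",
        "paddleocr/_models/__init__.py",
        "paddleocr/__init__.py",
    ):
        path = Path(rel)
        if path.exists():
            print(rel, read_dunder_all(path))
```

Parsing with `ast` instead of importing keeps the lookup cheap and free of side effects, at the cost of missing any exports that are assembled at runtime.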
