eplt/clearcut

ClearCut

Local background removal ensemble for macOS — multiple AI models compete, the best output wins.

Python 3.11+ · License: MIT · macOS 14+


What It Does

Different background-removal models fail differently — one over-trims fine hair, another leaves halos, a third erases thin accessories. ClearCut runs multiple local AI models against each image, scores every output with five quality heuristics, and automatically selects the best transparent PNG. No cloud APIs, no data leaving your machine. Built for macOS with Apple Silicon support, it handles batch processing of entire folders with CSV/JSON reports and visual contact sheets for uncertain cases.

┌─────────────┐     ┌──────────────┐     ┌──────────────┐     ┌──────────┐
│  Discovery  │────▶│ Preprocess   │────▶│   Engines    │────▶│ Scoring  │
│  (scanner)  │     │ (EXIF, RGB)  │     │ rembg/BRIA/  │     │ (5-heur.)│
└─────────────┘     └──────────────┘     │   BiRefNet   │     └────┬─────┘
                                          └──────────────┘          │
                                                              ┌─────▼─────┐
                                                              │ Selection │
                                                              │ + Reports │
                                                              └───────────┘

Prerequisites

  • macOS 14+ (Sonoma or later)
  • Python 3.11+
  • 2 GB free disk space (for virtual environment and model weights)

Installation

git clone https://github.com/eplt/clearcut.git
cd clearcut
chmod +x scripts/bootstrap_macos.sh
./scripts/bootstrap_macos.sh
source .venv/bin/activate

For BRIA or BiRefNet engines (optional, higher quality):

pip install torch torchvision transformers

Quick Start

# Process a folder with rembg (fast, ~400ms/image)
PYTHONPATH=src python -m clearcut.cli run -i ./data/input -o ./data/output -e rembg

# Add BRIA engine for higher quality (~2-5s/image)
PYTHONPATH=src python -m clearcut.cli run -i ./data/input -o ./data/output -e rembg,bria

# Debug a single image with visual contact sheet
PYTHONPATH=src python -m clearcut.cli inspect -i ./photo.png -e rembg,bria

# Benchmark engine performance on a sample set
PYTHONPATH=src python -m clearcut.cli benchmark -i ./samples -o ./benchmark -e rembg,bria,birefnet

Usage

run — Batch process

Processes all images in an input folder, runs each enabled engine, scores outputs, and writes the best result per image.

PYTHONPATH=src python -m clearcut.cli run \
  --input ./data/input \
  --output ./data/output \
  --engines rembg,bria \
  --save-masks \
  --save-runnerups \
  --contact-sheets

Flag                                     Default               Description
--input, -i                              (required)            Input folder or single image
--output, -o                             (required)            Output directory
--engines, -e                            rembg,bria            Comma-separated engine names
--config, -c                             configs/default.yaml  Config YAML path
--save-masks / --no-save-masks           --save-masks          Save raw alpha mask PNGs
--save-runnerups / --no-save-runnerups   --save-runnerups      Save non-winning engine outputs
--contact-sheets / --no-contact-sheets   --contact-sheets      Generate visual review sheets
--verbose, -v                            False                 Debug-level logging

inspect — Single image debug

Processes one image and generates a contact sheet with all engine outputs, masks, and scores side by side.

PYTHONPATH=src python -m clearcut.cli inspect -i ./photo.png -e rembg,bria

benchmark — Compare engine performance

Processes a set of images and prints a ranking summary with mean scores per engine.

PYTHONPATH=src python -m clearcut.cli benchmark -i ./samples -o ./benchmark -e rembg,bria,birefnet

Output Structure

data/output/
├── selected/              ← Best PNG per image
│   └── photo001.png
├── per_engine/            ← Every engine's output + masks
│   └── photo001/
│       ├── rembg.png
│       └── rembg_mask.png
└── reports/
    ├── run_summary.csv    ← Full scoring breakdown
    ├── run_summary.json   ← Nested JSON report
    └── timing_by_engine.csv
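
To check a finished run, the layout above can be walked with the standard library. The sketch below is not part of ClearCut. The helper name summarize_output is hypothetical, and it assumes that each per_engine subfolder is named after the image stem and that mask files end in _mask, as the tree suggests.

```python
from pathlib import Path


def summarize_output(output_dir: str) -> dict[str, list[str]]:
    """Map each selected image stem to the engines that wrote a PNG for it.

    Assumes the layout shown in the tree above: selected/<stem>.png and
    per_engine/<stem>/<engine>.png, with masks named <engine>_mask.png.
    """
    root = Path(output_dir)
    summary: dict[str, list[str]] = {}
    for selected in sorted((root / "selected").glob("*.png")):
        engine_dir = root / "per_engine" / selected.stem
        engines: list[str] = []
        if engine_dir.is_dir():
            # Skip mask files so only the engine outputs are listed.
            engines = sorted(
                p.stem for p in engine_dir.glob("*.png")
                if not p.stem.endswith("_mask")
            )
        summary[selected.stem] = engines
    return summary


if __name__ == "__main__":
    for stem, engines in summarize_output("data/output").items():
        print(f"{stem}: {', '.join(engines) or 'no per-engine outputs found'}")
```

An image that appears under selected/ with an empty engine list probably means the run used --no-save-runnerups or the folder was cleaned up afterwards.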

Available Engines

Engine          Quality    Speed        License          Notes
rembg (u2net)   Good       ~400ms/img   Apache-2.0       Baseline, always available
BRIA RMBG 2.0   Excellent  ~2-5s/img    Non-commercial*  Higher quality, requires PyTorch
BiRefNet        Best       ~5-10s/img   MIT              High-res segmentation, optional

* BRIA RMBG 2.0 weights are licensed for non-commercial use unless a separate license is obtained.

Scoring System

Each engine output is scored from 0 to 100 on five heuristics, and the five scores are combined into a weighted composite:

Heuristic                 Weight  Measures
Foreground completeness   30%     Subject parts accidentally removed
Background suppression    20%     Stray background pixels remaining
Edge quality              25%     Halos, jagged boundaries, over-feathering
Center/crop sanity        10%     Subject wrongly truncated at borders
Consensus                 15%     Agreement with other engines

final_score = 0.30×foreground + 0.20×background + 0.25×edge + 0.10×center + 0.15×consensus
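
As a worked example of the weighted formula, here is a minimal sketch in Python. The weights come from the table above. The dictionary keys and the sample scores are made up for illustration, and this is not the code in scoring.py.

```python
# Weights copied from the scoring table; the key names are illustrative.
WEIGHTS = {
    "foreground": 0.30,
    "background": 0.20,
    "edge": 0.25,
    "center": 0.10,
    "consensus": 0.15,
}


def final_score(scores: dict[str, float]) -> float:
    """Combine five 0-100 heuristic scores into one weighted 0-100 score."""
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing heuristic scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores for one engine output.
example = {
    "foreground": 90.0,
    "background": 80.0,
    "edge": 70.0,
    "center": 100.0,
    "consensus": 60.0,
}
# 27.0 + 16.0 + 17.5 + 10.0 + 9.0
print(round(final_score(example), 2))  # 79.5
```

Because the weights sum to 1.0, the composite stays on the same 0 to 100 scale as the individual heuristics.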

Configuration

Edit configs/default.yaml:

Parameter                   Default                    Description
input_extensions            [".png", ".jpg", ".jpeg"]  File extensions to process
recursive                   true                       Scan subdirectories
save_masks                  true                       Write raw alpha mask PNGs
save_runnerups              true                       Write non-winning engine outputs
review_gap_threshold        3.0                        Score gap below which to flag for review
minimum_acceptable_score    60.0                       Minimum score to consider acceptable
preprocess.exif_transpose   true                       Apply EXIF orientation tag
preprocess.pad_border_px    0                          Add border padding (0 = disabled)
preprocess.max_dimension    null                       Downscale to max dimension (null = disabled)
engines.rembg.enabled       true                       Enable rembg engine
engines.bria.enabled        true                       Enable BRIA engine
engines.birefnet.enabled    false                      Enable BiRefNet engine
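
The two thresholds review_gap_threshold and minimum_acceptable_score are easier to reason about with a small example. The sketch below is one plausible reading of the descriptions in the table, not the logic in selection.py. The Selection class and the select_winner function are hypothetical names, and whether ClearCut combines the two checks this way is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Selection:
    winner: str
    score: float
    needs_review: bool


def select_winner(
    scores: dict[str, float],
    review_gap_threshold: float = 3.0,      # default from the table above
    minimum_acceptable_score: float = 60.0,  # default from the table above
) -> Selection:
    """Pick the top-scoring engine and flag uncertain results (assumed logic)."""
    if not scores:
        raise ValueError("no engine scores to choose from")
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    winner, best = ranked[0]
    # A close call: the runner-up is within the gap threshold of the winner.
    # With a single engine there is no runner-up, so this check is skipped.
    close_call = len(ranked) > 1 and (best - ranked[1][1]) < review_gap_threshold
    # Even a clear winner is suspect if its score is below the minimum.
    too_low = best < minimum_acceptable_score
    return Selection(winner, best, close_call or too_low)


print(select_winner({"rembg": 82.0, "bria": 84.5}))  # gap of 2.5, flagged
print(select_winner({"rembg": 70.0, "bria": 90.0}))  # gap of 20.0, not flagged
```

Under this reading, raising review_gap_threshold sends more images to manual review, while raising minimum_acceptable_score flags more low-quality winners regardless of the gap.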

Project Structure

clearcut/
├── pyproject.toml              # Package metadata
├── configs/
│   ├── default.yaml            # Main configuration
│   └── models.yaml             # Model metadata & licensing
├── src/clearcut/
│   ├── cli.py                  # Typer CLI (run, inspect, benchmark)
│   ├── config.py               # YAML config loading + overrides
│   ├── discovery.py            # Image folder scanning
│   ├── preprocess.py           # EXIF, conversion, padding
│   ├── pipeline.py             # Main orchestrator
│   ├── scoring.py              # 5-heuristic scoring
│   ├── selection.py            # Winner selection logic
│   ├── reporting.py            # CSV, JSON, timing reports
│   ├── contact_sheet.py        # Visual review sheet generation
│   ├── utils/                  # I/O, image ops, timing, hashing
│   └── engines/
│       ├── base.py             # Engine protocol + EngineResult
│       ├── rembg_engine.py     # rembg (u2net) adapter
│       ├── bria_engine.py      # BRIA RMBG 2.0 adapter
│       └── birefnet_engine.py  # BiRefNet adapter (optional)
├── scripts/
│   ├── bootstrap_macos.sh      # One-command setup
│   └── download_models.py      # Download model weights
└── tests/                      # 28 tests (discovery, preprocess, scoring, selection)

Security

All processing runs locally on your machine. No cloud APIs, no network calls during inference, no telemetry. Model weights are downloaded once from GitHub/HuggingFace on first run and cached locally. Input images never leave your disk.


Troubleshooting

"No engines are available/enabled!" — Check --engines flag lists valid names. For BRIA/BiRefNet: pip install torch torchvision transformers.

BRIA engine skipped: torch not installed — Install ML dependencies: pip install torch torchvision transformers.

Slow processing — BRIA on CPU takes 2–5s/image. Use only rembg for speed: -e rembg (~400ms/image).

Out of memory — Set preprocess.max_dimension: 1024 in configs/default.yaml, or disable the heavy engines and use rembg only.
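
Several of the issues above come down to the optional ML dependencies not being installed in the active virtual environment. A quick way to check is the standalone sketch below. It is not a ClearCut command; the function name missing_modules is hypothetical, and the module list mirrors the pip install line from the Installation section.

```python
from importlib.util import find_spec


def missing_modules(
    modules: tuple[str, ...] = ("torch", "torchvision", "transformers"),
) -> list[str]:
    """Return the top-level modules that cannot be found in this environment."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in modules if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing:", ", ".join(missing))
        print("Install with: pip install torch torchvision transformers")
    else:
        print("All optional ML dependencies were found.")
```

Run it with the same interpreter that runs ClearCut (after source .venv/bin/activate), since a package installed in a different environment will still be reported as missing.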


Roadmap

  • Image-level parallel processing (max_workers > 1)
  • HTML review dashboard for flagged images
  • Apple Silicon MPS acceleration for BRIA/BiRefNet
  • Result caching by source hash to skip unchanged images

Contributing

Contributions welcome — open an issue or submit a pull request.


License

MIT

Note: BRIA RMBG 2.0 model weights carry a separate non-commercial license. See BRIA's license terms.



Author

Edward Tsang — blockchain & AI engineer. Open to consulting → Email · LinkedIn
