Production-ready medical NLP toolkit powered by state-of-the-art transformers
Transform clinical text into structured insights with a single line of code. OpenMed delivers enterprise-grade entity extraction, assertion detection, and medical reasoning—no vendor lock-in, no compromise on accuracy.
from openmed import analyze_text

result = analyze_text(
    "Patient started on imatinib for chronic myeloid leukemia.",
    model_name="disease_detection_superclinical"
)

for entity in result.entities:
    print(f"{entity.label:<12} {entity.text:<35} {entity.confidence:.2f}")
# DISEASE      chronic myeloid leukemia            0.98
# DRUG         imatinib                            0.95

- Specialized Models: 12+ curated medical NER models outperforming proprietary solutions
- HIPAA-Compliant PII Detection: Smart de-identification with all 18 Safe Harbor identifiers
- One-Line Deployment: From prototype to production in minutes
- Dockerized REST API: FastAPI endpoints for service deployments
- Batch Processing: Multi-file workflows with progress tracking
- Production-Ready: Configuration profiles, profiling tools, and medical-aware tokenization
- Zero Lock-In: Apache 2.0 licensed, runs on your infrastructure
# Install with Hugging Face support
uv pip install "openmed[hf]"
# Or include REST service dependencies
uv pip install "openmed[hf,service]"

1️⃣ Python API: one-liner for scripts and notebooks

from openmed import analyze_text

result = analyze_text(
    "Patient received 75mg clopidogrel for NSTEMI.",
    model_name="pharma_detection_superclinical"
)

2️⃣ REST API Service: FastAPI endpoints for app backends

uvicorn openmed.service.app:app --host 0.0.0.0 --port 8080

3️⃣ Batch Processing: programmatic multi-document workflows
from openmed import BatchProcessor

processor = BatchProcessor(
    model_name="disease_detection_superclinical",
    confidence_threshold=0.55,
    group_entities=True,
)
result = processor.process_texts([
    "Patient started metformin for type 2 diabetes.",
    "Imatinib started for chronic myeloid leukemia.",
])

- Curated Model Registry: Metadata-rich catalog with 12+ specialized medical NER models
- PII Detection & De-identification: HIPAA-compliant de-identification with smart entity merging
- Medical-Aware Tokenization: Clean handling of clinical patterns (COVID-19, CAR-T, IL-6)
- Advanced NER Processing: Confidence filtering, entity grouping, and span alignment
- Multiple Output Formats: Dict, JSON, HTML, CSV for any downstream system
- Batch Processing: Multi-text and multi-file workflows with progress tracking
- Configuration Profiles: dev/prod/test/fast presets with flexible overrides
- Performance Profiling: Built-in inference timing and bottleneck analysis
- Dockerized REST API: GET /health, POST /analyze, POST /pii/extract, POST /pii/deidentify
- Service Reliability Hardening: Request validation, shared pipeline preload, and timeout/error envelopes
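To make the medical-aware tokenization item concrete, here is a minimal sketch of the idea using only the standard library. It is not OpenMed's tokenizer: the pattern `TOKEN_RE` and the helper `tokenize` are hypothetical names chosen for illustration, and the rule shown (keep hyphen-joined alphanumeric runs together) is just one plausible way to keep terms such as COVID-19, CAR-T, and IL-6 intact.

```python
import re

# Illustrative pattern, NOT OpenMed's implementation (assumption):
# a token is a run of letters/digits, optionally joined to further runs
# by single hyphens; any other non-space character is its own token.
TOKEN_RE = re.compile(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*|[^\sA-Za-z0-9]")


def tokenize(text: str) -> list[str]:
    """Split text into tokens while keeping hyphenated clinical terms whole."""
    return TOKEN_RE.findall(text)


tokens = tokenize("COVID-19 patient on CAR-T therapy; IL-6 elevated.")
print(tokens)
# ['COVID-19', 'patient', 'on', 'CAR-T', 'therapy', ';', 'IL-6', 'elevated', '.']
```

A naive split on punctuation would break COVID-19 into three pieces, which is the kind of fragmentation a downstream NER model then has to recover from.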
Comprehensive guides available at openmed.life/docs
Quick links:
- Getting Started — Installation and first analysis
- Analyze Text Helper — Python API reference
- PII Detection Guide — Complete de-identification tutorial (v0.5.0)
- Batch Processing — Multi-text and multi-file workflows
- Configuration Profiles — Environment-specific presets
- REST Service — FastAPI and Docker usage
- Model Registry — Browse available models
- Configuration — Settings and environment variables
OpenMed includes a Docker-friendly FastAPI service with reliability hardening:
- GET /health
- POST /analyze
- POST /pii/extract
- POST /pii/deidentify
uv pip install -e ".[hf,service]"
uvicorn openmed.service.app:app --host 0.0.0.0 --port 8080

Optional shared model warm-up:

OPENMED_SERVICE_PRELOAD_MODELS=disease_detection_superclinical,OpenMed/OpenMed-PII-SuperClinical-Small-44M-v1 \
  uvicorn openmed.service.app:app --host 0.0.0.0 --port 8080

docker build -t openmed:0.6.2 .
docker run --rm -p 8080:8080 -e OPENMED_PROFILE=prod openmed:0.6.2

curl -X POST http://127.0.0.1:8080/pii/extract \
  -H "Content-Type: application/json" \
  -d '{"text":"Paciente: Maria Garcia, DNI: 12345678Z","lang":"es"}'

See the full service guide in the REST Service docs.
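For callers that prefer Python to curl, the same request can be assembled with the standard library. The sketch below only builds the request object and does not send it. It assumes the service is running locally on port 8080 as in the commands above. `BASE_URL` and `build_pii_extract_request` are hypothetical names, and the response schema is deliberately not assumed.

```python
import json
import urllib.request

# Assumption: the service was started locally on port 8080 as shown above.
BASE_URL = "http://127.0.0.1:8080"


def build_pii_extract_request(text: str, lang: str) -> urllib.request.Request:
    """Build (but do not send) a POST /pii/extract request with a JSON body."""
    body = json.dumps({"text": text, "lang": lang}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/pii/extract",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_pii_extract_request("Paciente: Maria Garcia, DNI: 12345678Z", "es")
print(request.get_method(), request.full_url)

# To send it against a running service, uncomment:
# with urllib.request.urlopen(request, timeout=30) as response:
#     payload = json.loads(response.read())
```

Keeping request construction separate from sending makes the payload easy to inspect in tests without a live server.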
Non-2xx responses now use a unified envelope:
{
  "error": {
    "code": "validation_error",
    "message": "Request validation failed",
    "details": [
      {
        "field": "body.text",
        "message": "Text must not be blank",
        "type": "value_error"
      }
    ]
  }
}

OpenMed includes a curated registry of 12+ specialized medical NER models:
| Model | Specialization | Entity Types | Size |
|---|---|---|---|
| disease_detection_superclinical | Disease & Conditions | DISEASE, CONDITION, DIAGNOSIS | 434M |
| pharma_detection_superclinical | Drugs & Medications | DRUG, MEDICATION, TREATMENT | 434M |
| pii_detection_superclinical | PII & De-identification | NAME, DATE, SSN, PHONE, EMAIL, ADDRESS | 434M |
| anatomy_detection_electramed | Anatomy & Body Parts | ANATOMY, ORGAN, BODY_PART | 109M |
| gene_detection_genecorpus | Genes & Proteins | GENE, PROTEIN | 109M |
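When choosing a model from the table, it can help to look models up by the entity type you need. The sketch below hard-codes the rows listed above into a plain dictionary. `MODEL_ENTITY_TYPES` and `models_for` are hypothetical helpers for illustration and are not part of the OpenMed registry API.

```python
# Entity types copied from the table above. The dict and helper are
# hypothetical and do not query OpenMed's actual model registry.
MODEL_ENTITY_TYPES = {
    "disease_detection_superclinical": {"DISEASE", "CONDITION", "DIAGNOSIS"},
    "pharma_detection_superclinical": {"DRUG", "MEDICATION", "TREATMENT"},
    "pii_detection_superclinical": {"NAME", "DATE", "SSN", "PHONE", "EMAIL", "ADDRESS"},
    "anatomy_detection_electramed": {"ANATOMY", "ORGAN", "BODY_PART"},
    "gene_detection_genecorpus": {"GENE", "PROTEIN"},
}


def models_for(entity_type: str) -> list[str]:
    """Return the listed model names that emit the given entity type."""
    wanted = entity_type.upper()
    return sorted(name for name, types in MODEL_ENTITY_TYPES.items() if wanted in types)


print(models_for("drug"))  # ['pharma_detection_superclinical']
print(models_for("date"))  # ['pii_detection_superclinical']
```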
from openmed import extract_pii, deidentify

text = "Patient: John Doe, DOB: 01/15/1970, SSN: 123-45-6789"

# Extract PII entities with smart merging (default)
result = extract_pii(
    text,
    model_name="pii_detection_superclinical",
    use_smart_merging=True  # Prevents entity fragmentation
)

# De-identify with multiple methods
masked = deidentify(text, method="mask")        # [NAME], [DATE]
removed = deidentify(text, method="remove")     # Complete removal
replaced = deidentify(text, method="replace")   # Synthetic data
hashed = deidentify(text, method="hash")        # Cryptographic hashing
shifted = deidentify(text, method="shift_dates", date_shift_days=180)

Smart Entity Merging (new in v0.5.0): Fixes tokenization fragmentation by merging split entities such as dates (01/15/1970 instead of 01 + /15/1970), so each identifier is handled as a single span during de-identification.
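The fragmentation problem can be illustrated with a small standalone sketch. This is not OpenMed's merging algorithm: the `Span` class and `merge_adjacent` function are hypothetical, and the rule used here (merge same-label spans that touch or overlap) is the simplest variant that reproduces the date example.

```python
from dataclasses import dataclass


@dataclass
class Span:
    label: str
    start: int  # inclusive character offset
    end: int    # exclusive character offset


def merge_adjacent(spans: list[Span]) -> list[Span]:
    """Merge same-label spans that touch or overlap (illustrative rule only)."""
    merged: list[Span] = []
    for span in sorted(spans, key=lambda s: s.start):
        last = merged[-1] if merged else None
        if last is not None and last.label == span.label and span.start <= last.end:
            # Extend the previous span instead of emitting a fragment.
            last.end = max(last.end, span.end)
        else:
            merged.append(Span(span.label, span.start, span.end))
    return merged


text = "DOB: 01/15/1970"
fragments = [Span("DATE", 5, 7), Span("DATE", 7, 15)]  # "01" and "/15/1970"
for span in merge_adjacent(fragments):
    print(span.label, text[span.start:span.end])
# DATE 01/15/1970
```

Without the merge, masking the second fragment alone would leave "01" in the output, which is a partial leak of the original date.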
HIPAA Compliance: Covers all 18 Safe Harbor identifiers with configurable confidence thresholds.
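For intuition about what the mask, hash, and shift_dates methods do to a detected span, here is a standard-library sketch. The helpers `mask`, `hash_value`, and `shift_date` are hypothetical and do not reflect OpenMed internals. The truncated SHA-256 digest, the MM/DD/YYYY date format, and the forward direction of the shift are all assumptions made for the example.

```python
import hashlib
from datetime import datetime, timedelta


def mask(text: str, start: int, end: int, label: str) -> str:
    """Replace the characters in [start, end) with a bracketed label."""
    return text[:start] + f"[{label}]" + text[end:]


def hash_value(value: str, salt: str = "") -> str:
    """Return a short, deterministic SHA-256 digest (truncation is an assumption)."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()[:12]


def shift_date(value: str, days: int, fmt: str = "%m/%d/%Y") -> str:
    """Shift a date string forward by a fixed number of days."""
    return (datetime.strptime(value, fmt) + timedelta(days=days)).strftime(fmt)


text = "DOB: 01/15/1970"
print(mask(text, 5, 15, "DATE"))      # DOB: [DATE]
print(shift_date("01/15/1970", 180))  # 07/14/1970
print(hash_value("John Doe") == hash_value("John Doe"))  # True
```

Date shifting keeps intervals between events intact when the same offset is applied to every date in a record, whereas masking removes the value entirely.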
📓 Complete PII Notebook | 📖 Documentation
OpenMed now supports multilingual PII extraction and de-identification across en, fr, de, it, es, nl, hi, and te.
French, German, Italian, and Spanish expose the full 35-model family; Dutch, Hindi, and Telugu currently ship one flagship public model each, bringing the total PII catalog to 179 models.
from openmed import extract_pii
dutch = extract_pii(
"Patiënt: Eva de Vries, geboortedatum: 15 januari 1984, BSN: 123456782, telefoon: +31 6 12345678",
lang="nl",
model_name="OpenMed/OpenMed-PII-Dutch-SuperClinical-Large-434M-v1",
use_smart_merging=True,
)
hindi = extract_pii(
"रोगी: अनीता शर्मा, जन्मतिथि: 15 जनवरी 1984, फोन: +91 9876543210, पता: 12 गली संख्या 5, नई दिल्ली 110001",
lang="hi",
model_name="OpenMed/OpenMed-PII-Hindi-SuperClinical-Large-434M-v1",
use_smart_merging=True,
)
telugu = extract_pii(
"రోగి: సితా రెడ్డి, జన్మ తేదీ: 15 జనవరి 1984, ఫోన్: +91 9876543210, చిరునామా: 12 వీధి 5, హైదరాబాద్ 500001",
lang="te",
model_name="OpenMed/OpenMed-PII-Telugu-SuperClinical-Large-434M-v1",
use_smart_merging=True,
)
print([(e.label, e.text) for e in dutch.entities])
print([(e.label, e.text) for e in hindi.entities])
print([(e.label, e.text) for e in telugu.entities])

from openmed import BatchProcessor, OpenMedConfig

config = OpenMedConfig.from_profile("prod")
processor = BatchProcessor(
    model_name="disease_detection_superclinical",
    config=config,
    group_entities=True,
)
result = processor.process_texts([
    "Metastatic breast cancer treated with trastuzumab.",
    "Acute lymphoblastic leukemia diagnosed.",
])

from openmed import analyze_text

text = "Metastatic breast cancer treated with trastuzumab."

# Apply a profile programmatically
result = analyze_text(
    text,
    model_name="disease_detection_superclinical",
    config_profile="prod"  # High confidence, grouped entities
)

from openmed import analyze_text, profile_inference

with profile_inference() as profiler:
    result = analyze_text(text, model_name="disease_detection_superclinical")
print(profiler.summary())  # Inference time, bottlenecks, recommendations

We welcome contributions, whether bug reports, feature requests, or pull requests.
- 🐛 Found a bug? Open an issue
OpenMed is released under the Apache-2.0 License.
If you use OpenMed in your research, please cite:
@misc{panahi2025openmedneropensourcedomainadapted,
  title={OpenMed NER: Open-Source, Domain-Adapted State-of-the-Art Transformers for Biomedical NER Across 12 Public Datasets},
  author={Maziyar Panahi},
  year={2025},
  eprint={2508.01630},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2508.01630},
}

If you find OpenMed useful, consider giving it a star ⭐ to help others discover it!
Built with ❤️ by the OpenMed team