Part of the DocGemma project: docgemma-app (Docker deployment) | docgemma-frontend (Vue 3 UI)
Competition: The MedGemma Impact Challenge on Kaggle
Agentic medical AI backend with autonomous tool calling, powered by MedGemma via remote vLLM endpoint. Designed for resource-limited healthcare environments. Compatible with MedGemma 27B and MedGemma 1.5 4B.
DocGemma Connect is a FastAPI backend that orchestrates an AI agent capable of clinical decision support. It uses a LangGraph-based workflow with structured tool calling to query drug safety databases, search medical literature, manage electronic health records (FHIR R4), and analyze medical images — all with a human-in-the-loop approval system for write operations.
- Agentic reasoning — 7-node LangGraph workflow with binary intent classification, tool selection, and streamed synthesis
- 10 integrated tools — Drug safety (OpenFDA), drug interactions (RxNav), medical literature (PubMed), clinical trials (ClinicalTrials.gov), FHIR EHR operations, and medical image analysis
- Human-in-the-loop — Write tools (prescribe medication, add allergy, save note) require explicit user approval before execution
- Real-time streaming — WebSocket-based chat with incremental token streaming and agent status events
- Clinical trace — Full reasoning chain captured per turn (thinking, tool calls, synthesis) with step durations
- Local FHIR R4 store — JSON file-backed EHR with patient search, chart retrieval, and resource creation
| Layer | Technology |
|---|---|
| Framework | FastAPI + Uvicorn |
| Agent orchestration | LangGraph |
| Structured output | Outlines (JSON schema validation) |
| Model endpoint | vLLM-compatible (OpenAI API format) |
| Data validation | Pydantic v2 |
| Async HTTP | httpx |
| Real-time | WebSockets |
| Package manager | UV |
| Python | 3.12+ |
src/docgemma/
├── api/
│ ├── main.py # FastAPI app factory & lifespan
│ ├── config.py # Environment-based configuration
│ ├── models/ # Pydantic request/response/event schemas
│ ├── routers/
│ │ ├── health.py # GET /api/health
│ │ ├── sessions.py # Session CRUD + WebSocket chat
│ │ ├── patients.py # Patient/EHR endpoints
│ │ └── imaging.py # Medical image upload/serving
│ └── services/
│ ├── agent_runner.py # LangGraph execution with interrupt support
│ └── session_store.py # Disk-backed session persistence
├── agent/
│ ├── graph.py # 7-node LangGraph workflow definition
│ ├── nodes.py # Node implementations
│ ├── state.py # AgentState TypedDict
│ ├── prompts.py # Empirically-tuned prompts (856 experiments)
│ └── schemas.py # LLM output schemas
├── tools/
│ ├── registry.py # Tool dispatcher & registration
│ ├── drug_safety.py # OpenFDA API
│ ├── drug_interactions.py # RxNav API
│ ├── medical_literature.py # PubMed E-utilities
│ ├── clinical_trials.py # ClinicalTrials.gov v2
│ ├── image_analysis.py # Vision API (vLLM-compatible)
│ └── fhir_store/ # Local FHIR R4 JSON store
│ ├── store.py # FhirJsonStore client
│ ├── search.py # Patient search
│ ├── chart.py # Chart retrieval
│ ├── allergies.py # Allergy management
│ ├── medications.py # Medication prescribing
│ ├── notes.py # Clinical note creation
│ └── seed.py # Database seeding
└── model.py # DocGemma LLM client wrapper
data/
├── fhir/ # FHIR R4 resources (JSON by type)
├── imaging/ # Medical images
└── sessions/ # Session persistence
- Python 3.12+
- UV package manager (recommended)
- Access to a vLLM-compatible endpoint serving MedGemma
# Clone the repository
git clone https://github.com/galinilin/docgemma-connect.git
cd docgemma-connect
# Install dependencies
uv syncCreate a .env file in the project root:
# Required — remote model endpoint
DOCGEMMA_ENDPOINT=https://your-vllm-endpoint.com
DOCGEMMA_API_KEY=your-api-key
DOCGEMMA_MODEL=google/medgemma-27b-it
# Optional — server settings
DOCGEMMA_HOST=0.0.0.0
DOCGEMMA_PORT=8000
DOCGEMMA_DEBUG=false
# Optional — feature flags
DOCGEMMA_LOAD_MODEL=true
DOCGEMMA_TOOL_APPROVAL=true
# Optional — storage paths
DOCGEMMA_SESSIONS_DIR=data/sessions
# Optional — external API keys (for tools)
HF_TOKEN=your-huggingface-token# Via the CLI entrypoint
uv run docgemma-serve
# Or via uvicorn directly
uvicorn docgemma.api.main:app --host 0.0.0.0 --port 8000 --reloadThe server starts at http://localhost:8000 by default. API docs are available at /docs (Swagger UI).
GET /api/health
Returns server status, model loaded state, and version.
POST /api/sessions # Create a new chat session
GET /api/sessions # List all sessions
GET /api/sessions/{session_id} # Get session details
DELETE /api/sessions/{session_id} # Delete a session
WS /api/sessions/{session_id}/ws
Client messages:
send_message— Send a user message (with optionalimage_base64)approve_tool— Approve a pending tool call (with optionaledited_args)reject_tool— Reject a pending tool call (withreason)cancel— Cancel the current agent run
Server events:
node_start/node_end— Graph execution milestonesagent_status— Human-readable status updatestool_approval_request— Requests user approval for a write tooltool_execution_start/tool_execution_end— Tool invocation lifecyclestreaming_text— Incremental response tokenscompletion— Final response with full clinical trace
GET /api/patients # Search patients (?name=&dob=)
POST /api/patients # Create a patient
GET /api/patients/{patient_id} # Full patient chart
POST /api/patients/{patient_id}/allergies # Add allergy
POST /api/patients/{patient_id}/medications # Prescribe medication
POST /api/patients/{patient_id}/notes # Save clinical note
POST /api/patients/{patient_id}/imaging # Upload image (multipart)
GET /api/imaging/{media_id} # Serve image
DELETE /api/imaging/{media_id} # Delete image
The agent uses a 7-node LangGraph workflow with binary classification at each decision point:
input_assembly → preliminary_thinking → intent_classify
│
┌──────────────┴──────────────┐
DIRECT TOOL_NEEDED
│ │
│ tool_select
│ │
│ tool_execute (interrupt)
│ │
│ result_classify
│ │ │
│ SUFFICIENT INSUFFICIENT
│ │ │
│ │ (loop back to
│ │ tool_select)
└────────────┬───────────┘
│
synthesize (stream)
- Temperature 0.0 for classification nodes (deterministic)
- Temperature 0.5 for thinking and synthesis (controlled creativity)
- Outlines enforces valid JSON output on classification steps
- Prompts are empirically tuned from 856 MedGemma experiments
| Tool | Source | Type |
|---|---|---|
check_drug_safety |
OpenFDA | Read |
check_drug_interactions |
RxNav | Read |
search_medical_literature |
PubMed E-utilities | Read |
find_clinical_trials |
ClinicalTrials.gov v2 | Read |
search_patient |
Local FHIR | Read |
get_patient_chart |
Local FHIR | Read |
add_allergy |
Local FHIR | Write (requires approval) |
prescribe_medication |
Local FHIR | Write (requires approval) |
save_clinical_note |
Local FHIR | Write (requires approval) |
analyze_medical_image |
vLLM Vision API | Read |
| Repository | Description |
|---|---|
| docgemma-app | One-command Docker deployment — clone, configure, docker compose up |
| docgemma-frontend | Vue 3 web interface with real-time chat, EHR management, and tool approval UI |