# MLflow Tracking Server
Containerized MLflow tracking server deployment with PostgreSQL-backed metadata and optional S3-compatible artifact storage.
Watch the full step-by-step guide on YouTube:
⭐ If this repository is useful for your work or research, consider starring it to support visibility and future development.
This project exists because local MLflow is often not enough once experiments become heavier. A self-hosted server gives you a persistent web UI, cleaner experiment management, shared access, better PostgreSQL-backed metadata support, and less local machine clutter from artifacts and databases.
Referral link for Railway credits:
MLflow can run locally with `mlflow ui`, but local usage becomes limiting when you need:
- a persistent tracking server accessible from multiple projects
- PostgreSQL-backed experiment metadata instead of local SQLite constraints
- cleaner artifact management via S3-compatible storage
- fewer local resource issues related to storage, RAM usage, and clutter
- a hosted UI for prompt optimization, tracing, and experiment review
This repository captures the deployment shape used to support MLflow-based optimization and observability for companion projects such as AI agent backends.
The goals of this repository are to:

- host a shared MLflow server for experiment tracking
- support GenAI tracing, prompt registry, and prompt optimization workflows
- persist experiment metadata in PostgreSQL
- persist artifacts in S3-compatible object storage
- avoid messy local MLflow state and large artifact folders
The following two diagrams serve different purposes:
- Deployment architecture overview shows the infrastructure pieces needed to run MLflow reliably.
- Usage workflow overview shows how related application repositories interact with the tracking server.
This diagram explains the recommended hosted shape for the MLflow server.
```mermaid
%%{init: {'theme': 'base', 'themeVariables': { 'primaryColor': '#111827', 'primaryTextColor': '#F9FAFB', 'primaryBorderColor': '#60A5FA', 'lineColor': '#94A3B8', 'secondaryColor': '#1F2937', 'tertiaryColor': '#0F172A', 'fontSize': '15px'}}}%%
flowchart LR
    classDef deploy fill:#0F172A,stroke:#60A5FA,color:#F8FAFC,stroke-width:2px;
    classDef runtime fill:#111827,stroke:#34D399,color:#F8FAFC,stroke-width:2px;
    classDef store fill:#111827,stroke:#F59E0B,color:#F8FAFC,stroke-width:2px;
    classDef ext fill:#111827,stroke:#C084FC,color:#F8FAFC,stroke-width:2px;

    A[Developer or CI] --> B[Docker image]
    B --> C[Hosted MLflow server]
    C --> D[PostgreSQL backend store]
    C --> E[S3-compatible artifact store]
    U[Browser or client SDK] --> C

    class A,B deploy;
    class C runtime;
    class D,E store;
    class U ext;
```
This diagram focuses on how MLflow is used by related app repositories and optimization workflows.
```mermaid
%%{init: {'theme': 'base', 'themeVariables': { 'primaryColor': '#111827', 'primaryTextColor': '#F9FAFB', 'primaryBorderColor': '#F59E0B', 'lineColor': '#94A3B8', 'secondaryColor': '#1F2937', 'tertiaryColor': '#0F172A', 'fontSize': '15px'}}}%%
flowchart TD
    classDef input fill:#0F172A,stroke:#60A5FA,color:#F8FAFC,stroke-width:2px;
    classDef process fill:#111827,stroke:#34D399,color:#F8FAFC,stroke-width:2px;
    classDef output fill:#111827,stroke:#C084FC,color:#F8FAFC,stroke-width:2px;

    A[Application or agent repo] --> B[Log runs, traces, prompts, datasets]
    B --> C[MLflow tracking server]
    C --> D[MLflow UI]
    C --> E[PostgreSQL metadata]
    C --> F[S3 artifacts]

    class A input;
    class B,C process;
    class D,E,F output;
```
For this project family, hosted MLflow helps because:
- it stores experiment metadata in PostgreSQL, which is better suited for richer experiment data than local SQLite
- it keeps artifacts outside the local machine
- it allows you to turn the server on only when needed
- it gives a central UI for prompt optimization, evaluation, feedback, and traces
The deployment cost pattern described by the project owner is roughly:
- PostgreSQL on Railway: very low monthly cost
- S3 artifact storage: usually minimal cost at small scale
- MLflow runtime server: the main recurring cost
Operational note:
- if you want to pause the server on Railway, you can use a sleep-based custom start command such as `sleep infinity` when you intentionally want the service idle
By default, artifacts written inside the container are not persistent.
Without object storage, artifacts end up under `/app/mlruns`.
That means redeploys or restarts can lose locally stored artifacts.
For a better production experience, configure an S3-compatible bucket using:

```
BACKEND_S3=s3://mlflow-artifacts/
AWS_ACCESS_KEY_ID=
AWS_SECRET_ACCESS_KEY=
AWS_DEFAULT_REGION=
```

This repository's container startup uses:

```
--default-artifact-root ${BACKEND_S3:-$BACKEND_s3}
```
So if BACKEND_S3 is defined, MLflow will use that persistent object store automatically.
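The `${VAR:-fallback}` expansion used above can be illustrated in isolation; the bucket value and local fallback path here are examples, not the repository's actual configuration:

```shell
#!/bin/sh
# ${BACKEND_S3:-fallback} expands to BACKEND_S3 when it is set and
# non-empty, otherwise to the fallback after ':-'.
BACKEND_S3="s3://mlflow-artifacts/"   # example value
ARTIFACT_ROOT="${BACKEND_S3:-file:./mlruns}"
echo "$ARTIFACT_ROOT"   # prints s3://mlflow-artifacts/

unset BACKEND_S3
ARTIFACT_ROOT="${BACKEND_S3:-file:./mlruns}"
echo "$ARTIFACT_ROOT"   # prints file:./mlruns
```

This is why leaving `BACKEND_S3` unset silently falls back to non-persistent container storage.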
Current deployment runtime is defined in Dockerfile.
The image uses:

- `ghcr.io/mlflow/mlflow:v3.10.1-full`
- a hosted MLflow server process on port `8080`
- PostgreSQL via `BACKEND_STORE_URI`
- optional object storage via `BACKEND_S3`
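Putting those pieces together, the container's start command is roughly of this shape (a sketch, not the repository's exact Dockerfile entrypoint; host binding and flag ordering are assumptions):

```
mlflow server \
  --host 0.0.0.0 \
  --port 8080 \
  --backend-store-uri "$BACKEND_STORE_URI" \
  --default-artifact-root "${BACKEND_S3:-$BACKEND_s3}"
```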
Copy `.env.example` and set:

- `BACKEND_STORE_URI`
- optional `BACKEND_S3`
- optional AWS credentials for artifact storage
```
docker build -t mlflow-tracking-server .
docker run --rm -p 8080:8080 --env-file .env mlflow-tracking-server
```

Then open http://localhost:8080.
This repository is intentionally compatible with a simple hosted deployment model such as Railway.
Recommended production setup:
- add a PostgreSQL service for `BACKEND_STORE_URI`
- add an optional S3-compatible bucket for persistent artifacts
- deploy this containerized MLflow service
- expose the public MLflow UI URL to companion repositories through environment variables like `MLFLOW_TRACKING_URI`
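Companion repositories can then pick up the server location from the environment. A minimal sketch (the helper name and local fallback are illustrative; note the MLflow client itself also honors `MLFLOW_TRACKING_URI` automatically):

```python
import os

def resolve_tracking_uri(default: str = "file:./mlruns") -> str:
    """Prefer MLFLOW_TRACKING_URI, falling back to a local file store."""
    return os.environ.get("MLFLOW_TRACKING_URI") or default

# With the variable set, clients target the hosted server.
os.environ["MLFLOW_TRACKING_URI"] = "https://mlflow.example.com"  # placeholder URL
print(resolve_tracking_uri())  # -> https://mlflow.example.com

# Without it, logging stays local.
del os.environ["MLFLOW_TRACKING_URI"]
print(resolve_tracking_uri())  # -> file:./mlruns
```

Keeping the URI in the environment means the same application code runs unchanged against local and hosted tracking.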
This repository also contains guidance material derived from the MLflow workflows used in job-agent-backend. Those documents are included here so MLflow server usage and MLflow workflow guidance live closer to the tracking-server deployment itself.
See:
- `docs/JOB_AGENT_BACKEND_MLFLOW_GUIDE.md`
- `docs/MLFLOW_SERVER_DEPLOYMENT_GUIDE.md`
- `scripts/test-mlflow.py`
```
.
├── docs/
│   ├── JOB_AGENT_BACKEND_MLFLOW_GUIDE.md
│   └── MLFLOW_SERVER_DEPLOYMENT_GUIDE.md
├── scripts/
│   └── test-mlflow.py
├── .env.example
├── .gitignore
├── AGENTS.md
├── Dockerfile
├── LICENSE
└── README.md
```
Recommended GitHub repository name: mlflow-tracking-server
Alternative acceptable names:
- `mlflow-railway-server`
- `self-hosted-mlflow-server`
- `mlflow-server-deployment`
These terms help both search engines and LLM-based discovery systems understand the repository purpose.
This repository is licensed under the MIT License. See LICENSE.