
Conversation


@williamcaban williamcaban commented Nov 16, 2025

MLflow Prompt Registry Provider

Summary

This PR adds a new remote MLflow provider for the Prompts API, enabling centralized prompt management and versioning using MLflow's Prompt Registry (MLflow 3.4+).

What's New

Remote Provider: remote::mlflow

A production-ready provider that integrates Llama Stack's Prompts API with MLflow's centralized prompt registry, supporting:

  • Version Control: Immutable prompt versioning with full history
  • Default Version Management: Easy version switching via aliases
  • Auto Variable Extraction: Automatic detection of {{ variable }} placeholders (see the sketch after this list)
  • Centralized Storage: Team collaboration via shared MLflow server
  • Metadata Preservation: Llama Stack metadata stored as MLflow tags
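
To illustrate the auto variable extraction bullet, here is a minimal sketch of how {{ variable }} placeholder detection might work; the provider's actual extraction logic may differ (for example, around dotted or filtered Jinja2 expressions):

import re

# Matches Jinja2-style {{ variable }} placeholders, tolerating surrounding whitespace.
PLACEHOLDER_PATTERN = re.compile(r"\{\{\s*(\w+)\s*\}\}")

def extract_variables(prompt: str) -> list[str]:
    """Return placeholder names in first-seen order, without duplicates."""
    return list(dict.fromkeys(PLACEHOLDER_PATTERN.findall(prompt)))

# Prints ['num_sentences', 'text'] for the template used in the Quick Start below.
print(extract_variables(
    "Summarize the following text in {{ num_sentences }} sentences:\n\n{{ text }}"
))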

Quick Start

1. Configure Llama Stack

Basic configuration with SQLite (default):

prompts:
  - provider_id: reference-prompts
    provider_type: inline::reference
    config:
      run_config:
        storage:
          stores:
            prompts:
              type: sqlite
              db_path: ./prompts.db

With PostgreSQL:

prompts:
  - provider_id: postgres-prompts
    provider_type: inline::reference
    config:
      run_config:
        storage:
          stores:
            prompts:
              type: postgres
              url: postgresql://user:pass@localhost/llama_stack
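
The MLflow backend added by this PR uses provider_type remote::mlflow instead. A rough sketch of what that configuration might look like is shown below; the provider_id and tracking_uri names are illustrative assumptions, auth_credential is the field referenced in the provider code later in this PR, and remote_mlflow.mdx has the authoritative schema:

prompts:
  - provider_id: mlflow-prompts        # illustrative name
    provider_type: remote::mlflow
    config:
      tracking_uri: http://localhost:5000            # assumed field name for the MLflow server URI
      auth_credential: ${env.MLFLOW_TRACKING_TOKEN}  # token can also come from the environment at runtime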

2. Use the Prompts API

from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:5000")

# Create a prompt
prompt = client.prompts.create(
    prompt="Summarize the following text in {{ num_sentences }} sentences:\n\n{{ text }}",
    variables=["num_sentences", "text"]
)
print(f"Created prompt: {prompt.prompt_id} (v{prompt.version})")

# Retrieve prompt
retrieved = client.prompts.get(prompt_id=prompt.prompt_id)
print(f"Retrieved: {retrieved.prompt}")

# Update prompt (creates version 2)
updated = client.prompts.update(
    prompt_id=prompt.prompt_id,
    prompt="Summarize in exactly {{ num_sentences }} sentences:\n\n{{ text }}",
    version=1,
    set_as_default=True
)
print(f"Updated to version: {updated.version}")

# List all prompts
prompts = client.prompts.list()
print(f"Found {len(prompts.data)} prompts")

# Delete prompt
client.prompts.delete(prompt_id=prompt.prompt_id)

Collaborator

@mattf mattf left a comment


@williamcaban this is the right direction

a few things jump out for me:

  • the existing impl needs to be moved to be an inline::reference impl (let the llm know it's in src/llama_stack/core/prompts/prompts.py)
  • mlflow credential handling needs to be added, should follow the pattern in inference (provider-data backstopped by config)
  • tests don't need to be standalone

@williamcaban williamcaban force-pushed the feat/mlflow-prompt-registry branch from dd0758c to cb69e41 on November 23, 2025 17:17
@williamcaban williamcaban requested a review from cdoern as a code owner November 23, 2025 17:17
@williamcaban williamcaban marked this pull request as draft November 23, 2025 17:27
@williamcaban williamcaban force-pushed the feat/mlflow-prompt-registry branch from cb69e41 to 1e68fa8 on November 24, 2025 02:29
@williamcaban williamcaban marked this pull request as ready for review November 24, 2025 02:33
@williamcaban williamcaban changed the title from "WIP feat: Add MLflow Prompt Registry provider" to "feat: Add MLflow Prompt Registry provider" on Nov 24, 2025
@williamcaban
Author

PR is now ready for review and includes the following updates:

  • Moved the previous prompts.py to an inline provider (see inline_reference.mdx for details)
  • Defined a remote MLflow provider with authentication support (see remote_mlflow.mdx for details)
  • Removed any dependencies on prompt caching

"""Create ID mapper instance."""
return PromptIDMapper(use_metadata=True)

def test_to_mlflow_name_valid_id(self, mapper):
Collaborator


some of these tests could be shortened. i understand Claude often generates them easily but sometimes they're a little excessive.

Collaborator

@franciscojavierarceo franciscojavierarceo left a comment


some small nits but this is looking very exciting!

@@ -0,0 +1,92 @@
---
Collaborator


thank you for adding the prompts docs!!

we also have this in the UI, maybe we should mention it?

Author


Are you referring to making it available in the UI? If so, could that be a follow-up PR? I would prefer to avoid adding more to this one.

Collaborator

@mattf mattf left a comment


why do we need to maintain a mapping from prompt id to mlflow prompt name?

Author

williamcaban commented Nov 24, 2025

@mattf

why do we need to maintain a mapping from prompt id to mlflow prompt name?

The idea is to distinguish Llama Stack-managed prompts from other prompts that might exist in the same MLflow registry.
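
The commit message describes this as a deterministic, bidirectional mapping (pmpt_<hex> ↔ llama_prompt_<hex>), which suggests a simple prefix translation. A minimal sketch under that assumption follows; the actual PromptIDMapper in this PR may differ in naming and validation details:

import re

# Hedged sketch: pmpt_<hex> identifiers map to llama_prompt_<hex> registry names,
# so Llama Stack-managed prompts are distinguishable from other prompts in MLflow.
PROMPT_ID_PATTERN = re.compile(r"^pmpt_([0-9a-f]+)$")
MLFLOW_NAME_PATTERN = re.compile(r"^llama_prompt_([0-9a-f]+)$")

def to_mlflow_name(prompt_id: str) -> str:
    """Translate a Llama Stack prompt id into an MLflow registry name."""
    match = PROMPT_ID_PATTERN.match(prompt_id)
    if match is None:
        raise ValueError(f"not a Llama Stack prompt id: {prompt_id}")
    return f"llama_prompt_{match.group(1)}"

def to_prompt_id(mlflow_name: str) -> str:
    """Translate an MLflow registry name back into a Llama Stack prompt id."""
    match = MLFLOW_NAME_PATTERN.match(mlflow_name)
    if match is None:
        raise ValueError(f"not a Llama Stack-managed prompt: {mlflow_name}")
    return f"pmpt_{match.group(1)}"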

@williamcaban williamcaban force-pushed the feat/mlflow-prompt-registry branch from 041cd2e to 8f150ec on November 25, 2025 14:25
Collaborator

mattf commented Nov 25, 2025

why do we need to maintain a mapping from prompt id to mlflow prompt name?

The idea is to distinguish Llama Stack-managed prompts from other prompts that might exist in the same MLflow registry.

when would a deployer want to set use_metadata_tags=False? can we always use metadata and skip the id/name translations?

@williamcaban
Author

why do we need to maintain a mapping from prompt id to mlflow prompt name?

The idea is to distinguish Llama Stack-managed prompts from other prompts that might exist in the same MLflow registry.

when would a deployer want to set use_metadata_tags=False? can we always use metadata and skip the id/name translations?

We can remove that option because, in practice, setting it to false has significant downsides. Let me remove the option.

Collaborator

mattf commented Nov 26, 2025

why do we need to maintain a mapping from prompt id to mlflow prompt name?

The idea is to distinguish Llama Stack-managed prompts from other prompts that might exist in the same MLflow registry.

when would a deployer want to set use_metadata_tags=False? can we always use metadata and skip the id/name translations?

We can remove that option because, in practice, setting it to false has significant downsides. Let me remove the option.

thanks. do we still need to have the id mapping?

Collaborator

@mattf mattf left a comment


@williamcaban why create a func for token extraction and a client but then not use them?

by requesting pr review you're asking others to read all this code (over 3k lines). please review it all before requesting.

Comment on lines +120 to +102
if self.config.auth_credential is not None:
import os

# MLflow reads MLFLOW_TRACKING_TOKEN from environment
os.environ["MLFLOW_TRACKING_TOKEN"] = self.config.auth_credential.get_secret_value()
logger.debug("Set MLFLOW_TRACKING_TOKEN from config auth_credential")
Collaborator


because you use env.MLFLOW_TRACKING_TOKEN in the sample_run_config, the auth_credential will already be set from the MLFLOW_TRACKING_TOKEN.

is setting MLFLOW_TRACKING_TOKEN the only way to communicate the token to the client?

Author


It can be either: an environment variable set at runtime or a value in the config file.
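
A minimal sketch of the precedence being discussed, with an explicit config credential taking priority and the ambient environment variable as the fallback (per-request provider-data, mentioned earlier in the review, is omitted; the helper name is hypothetical):

import os

def resolve_mlflow_token(config_credential: str | None) -> str | None:
    """Prefer the credential from the provider config; otherwise fall back to
    the MLFLOW_TRACKING_TOKEN environment variable."""
    if config_credential:
        return config_credential
    return os.environ.get("MLFLOW_TRACKING_TOKEN")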

logger.debug("Set MLFLOW_TRACKING_TOKEN from config auth_credential")

# Initialize client
self.mlflow_client = MlflowClient()
Collaborator


this is unused.

Author


This is used in the edge case of managing prompts created outside Llama Stack. I'll update the code for clarity.

@williamcaban williamcaban force-pushed the feat/mlflow-prompt-registry branch from 79a3454 to 7705651 on November 26, 2025 14:21
Add a new remote provider that integrates MLflow's Prompt Registry with
Llama Stack's prompts API, enabling centralized prompt management and
versioning using MLflow as the backend.

Features:
- Full implementation of Llama Stack Prompts protocol
- Support for prompt versioning and default version management
- Automatic variable extraction from Jinja2-style templates
- MLflow tag-based metadata for efficient prompt filtering
- Flexible authentication (config, environment variables, per-request)
- Bidirectional ID mapping (pmpt_<hex> ↔ llama_prompt_<hex>)
- Comprehensive error handling and validation

Implementation:
- Remote provider: src/llama_stack/providers/remote/prompts/mlflow/
- Inline reference provider: src/llama_stack/providers/inline/prompts/reference/
- MLflow 3.4+ required for Prompt Registry API support
- Deterministic ID mapping ensures consistency across conversions

Testing:
- 15 comprehensive unit tests (config validation, ID mapping)
- 18 end-to-end integration tests (full CRUD workflows)
- GitHub Actions workflow for automated CI testing with MLflow server
- Integration test fixtures with automatic server setup

Documentation:
- Complete provider configuration reference
- Setup and usage examples with code samples
- Authentication options and security best practices

Signed-off-by: William Caban <[email protected]>
Co-Authored-By: Claude <[email protected]>