Bug Description
Custom Embedding Models: Setup and Known Issues
Adding a custom embedding model
To add a custom embedding model to the OpenAI Embeddings component dropdown, add the model name to OPENAI_EMBEDDING_MODEL_NAMES in lfx/base/models/openai_constants.py:
OPENAI_EMBEDDING_MODEL_NAMES = [
    "text-embedding-3-small",
    "text-embedding-3-large",
    "text-embedding-ada-002",
    "BAAI/bge-large-en-v1.5B",
    "koni",  # ← custom entry
]
The list comprehension that builds OPENAI_EMBEDDING_MODELS_DETAILED then generates a full metadata dict with model_type="embeddings" for each name automatically, so no further edits are needed.
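A minimal sketch of what such a comprehension might look like. Only the "name" and "model_type" keys come from this report; the "provider" key is an assumption for illustration, and the real entries in openai_constants.py may carry more fields or be built through a helper.

```python
# Illustrative only: the real metadata in openai_constants.py may differ.
OPENAI_EMBEDDING_MODEL_NAMES = [
    "text-embedding-3-small",
    "text-embedding-3-large",
    "text-embedding-ada-002",
    "koni",  # custom entry
]

# One metadata dict per name; every entry is tagged as an embedding model.
OPENAI_EMBEDDING_MODELS_DETAILED = [
    {
        "name": name,
        "provider": "OpenAI",  # assumed key, not confirmed by the source
        "model_type": "embeddings",
    }
    for name in OPENAI_EMBEDDING_MODEL_NAMES
]
```

Because the detailed list is derived from the name list, adding a name in one place should be enough for the entry to exist with the right model_type.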
Why embedding models require list entries
The OpenAI Chat Model component's dropdown has combobox=True, which lets users type arbitrary model names. The OpenAI Embeddings component does not — custom embedding models must be present in the list to be selectable.
The base_url is configured separately
The model catalog only controls which names appear in the dropdown. The actual endpoint is set per-component via the "OpenAI API Base" field or the OPENAI_API_BASE environment variable.
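To illustrate the separation between catalog and endpoint, here is a hedged sketch of how an endpoint could be resolved for one component. The helper name resolve_api_base and the precedence order (component field first, then OPENAI_API_BASE, then OpenAI's public endpoint) are assumptions, not Langflow's confirmed behavior.

```python
from __future__ import annotations

import os
from collections.abc import Mapping

DEFAULT_OPENAI_API_BASE = "https://api.openai.com/v1"  # OpenAI's public endpoint


def resolve_api_base(
    field_value: str | None,
    env: Mapping[str, str] | None = None,
) -> str:
    """Hypothetical helper: choose the endpoint for a single component."""
    env = os.environ if env is None else env
    # Assumed precedence: component field, then environment variable, then default.
    return field_value or env.get("OPENAI_API_BASE") or DEFAULT_OPENAI_API_BASE
```

Nothing in this lookup touches the model list, which is why a custom name such as "koni" still has to be added to the catalog even when the endpoint already points at a server that hosts it.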
Fixed: embedding entries dropped by models.dev override
Original bug (Patch 1)
apply_models_dev_overrides() replaced entire static model lists with models.dev data, silently discarding all custom entries. Fixed by merging custom entries back into the override result. See custom-models-override-bug-report.md for full details.
Follow-up bug (Patch 2 — composite-key dedup)
After Patch 1, a subtler issue remained: if the same model name appears in both a chat list and an embedding list (e.g. "koni" in OPENAI_MODELS_DETAILED with model_type="llm" and in OPENAI_EMBEDDING_MODELS_DETAILED with model_type="embeddings"), the embedding entry was still silently dropped.
Root cause: The merge logic used name alone as the dedup key. When the first group (chat models) was processed, the custom "koni" chat entry was added to the override. When the second group (embedding models) was processed, "koni" was already in the override's name set, so the embedding entry was filtered out — even though it's a different model type.
Fix: Changed the dedup key from name to (name, model_type):
Before (broken for same-name cross-type entries):
override_names = {m.get("name") for m in overrides[provider]}
custom_entries = [m for m in group if m.get("name") not in override_names]
After (correctly distinguishes chat vs embedding):
override_keys = {
    (m.get("name"), m.get("model_type", "llm")) for m in overrides[provider]
}
custom_entries = [
    m for m in group
    if (m.get("name"), m.get("model_type", "llm")) not in override_keys
]
This applies to both the first-group and second-group merge paths in apply_models_dev_overrides() (lfx/base/models/models_dev_catalog.py).
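The difference between the two dedup keys can be reproduced in isolation. The sketch below is not the code from models_dev_catalog.py: merge_by_name and merge_by_key are hypothetical stand-ins for the merge step, and the sample entries (including "gpt-4o") are made up for the demonstration.

```python
def merge_by_name(override, group):
    """Old behavior: dedup on name only."""
    names = {m.get("name") for m in override}
    return override + [m for m in group if m.get("name") not in names]


def merge_by_key(override, group):
    """Patched behavior: dedup on (name, model_type), defaulting to "llm"."""
    keys = {(m.get("name"), m.get("model_type", "llm")) for m in override}
    return override + [
        m for m in group
        if (m.get("name"), m.get("model_type", "llm")) not in keys
    ]


def embedding_names(models):
    return [m["name"] for m in models if m.get("model_type") == "embeddings"]


# Stand-in for overrides[provider] after the chat group has been merged.
override = [
    {"name": "gpt-4o", "model_type": "llm"},
    {"name": "koni", "model_type": "llm"},
]
# Second group: the static embedding list with the custom entry.
embedding_group = [
    {"name": "text-embedding-3-small", "model_type": "embeddings"},
    {"name": "koni", "model_type": "embeddings"},
]

before = embedding_names(merge_by_name(override, embedding_group))
after = embedding_names(merge_by_key(override, embedding_group))
print(before)  # ['text-embedding-3-small']
print(after)   # ['text-embedding-3-small', 'koni']
```

With the name-only key, the "koni" embedding entry collides with the chat entry of the same name and is filtered out; with the composite key, both survive.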
Verification
After applying both patches and restarting Langflow, a model named "koni" correctly appears in:
The OpenAI Chat Model dropdown (as model_type="llm")
The OpenAI Embeddings dropdown (as model_type="embeddings")
Reproduction
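As above: add "koni" to both the chat model list and OPENAI_EMBEDDING_MODEL_NAMES, restart Langflow, and open the OpenAI Embeddings dropdown. Without Patch 2, the "koni" embedding entry is missing.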
Expected behavior
As above: "koni" appears in both the OpenAI Chat Model dropdown and the OpenAI Embeddings dropdown.
Who can help?
@ogabrielluiz
Operating System
Ubuntu 22.0.4
Langflow Version
1.10.0rc0
Python Version
3.12
Screenshot
No response
Flow File
embeddings.md