| Property | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| `timeout-ms` | int | No | `30000` | HTTP request timeout in milliseconds (min: 1, max: 300000) |
| `api-version` | string | No | `2023-05-15` | Azure OpenAI API version (Azure only) |
| `chunking.enabled` | bool | No | `true` | Enable automatic chunking |
| `chunking.size-chars` | int | No | `800` | Chunk size in characters |
| `chunking.overlap-chars` | int | No | `100` | Overlap between chunks in characters |
| `cache.enabled` | bool | No | `true` | Enable embedding cache |
| `cache.level` | string | No | `L1` | Cache level: `L1` (in-memory) or `L1L2` (memory + distributed) |
| `cache.ttl-seconds` | int | No | `86400` | Cache time-to-live in seconds |
| `endpoint.enabled` | bool | No | `false` | Whether to expose the `/embed` REST endpoint |
| `endpoint.path` | string | No | `/embed` | Embed endpoint path |
| `endpoint.roles` | string[] | No | `["authenticated"]` | Roles allowed to access the endpoint |
| `health.enabled` | bool | No | `false` | Enable embedding health checks |
| `health.threshold-ms` | int | No | `1000` | Max acceptable latency for the health check in milliseconds |
| `health.test-text` | string | No | `"health check"` | Input text used to validate the embedding provider |
| `health.expected-dimensions` | int | No | - | Expected embedding vector size for validation |

\* Required only when `enabled: true`
## Rules

**runtime.embeddings**
- When `enabled: true`: `provider`, `base-url`, `api-key`, and `model` must all be present.

**provider-specific**
- `api-version` is valid only when `provider: azure-openai`.
- `api-version` should be ignored or rejected when `provider: openai`.

**chunking**
- `chunking.overlap-chars` must be less than `chunking.size-chars`.

**endpoint**
- `endpoint.path` must not contain a query string.
- `endpoint.path` must not contain a fragment.
- `endpoint.roles` must contain unique values.
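The chunking rule can be illustrated with a small sketch (a hypothetical helper, not DAB's actual implementation) that walks the text in `size-chars` windows, advancing by `size-chars - overlap-chars` each step; if the overlap were not smaller than the chunk size, the window would never advance:

```python
def chunk_text(text: str, size_chars: int = 800, overlap_chars: int = 100) -> list[str]:
    """Split text into fixed-size character chunks with overlap.

    Illustrative only: mirrors the documented constraint that
    chunking.overlap-chars must be less than chunking.size-chars.
    """
    if overlap_chars >= size_chars:
        raise ValueError("chunking.overlap-chars must be less than chunking.size-chars")
    step = size_chars - overlap_chars
    # Each chunk starts `step` characters after the previous one,
    # so consecutive chunks share `overlap_chars` characters.
    return [text[i:i + size_chars] for i in range(0, len(text), step)]
```

With the defaults (`800`/`100`), a 2000-character document yields chunks starting at offsets 0, 700, and 1400, each sharing its last 100 characters with the next chunk's first 100.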
## Endpoint

### Query Parameters (Override Defaults)

- `$chunking.size-chars=512` — Override chunk size
- `$chunking.overlap-chars=50` — Override overlap
- `$chunking.enabled=false` — Disable chunking (treat the entire text as one chunk)

```http
POST https://{server}/embed?$chunking.size-chars=512&$chunking.overlap-chars=50
Content-Type: application/json
```
### Request

```json
[
  {
    "key": "document-1",
    "text": "Long document text here that may exceed chunk size..."
  },
  {
    "key": "document-2",
    "text": "Another document..."
  }
]
```
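A small helper sketch that builds the request shown above, using only Python's standard library; the server name is a placeholder, and the helper itself is illustrative rather than part of DAB:

```python
import json
from urllib import request as urlrequest
from urllib.parse import urlencode

def build_embed_request(server, docs, size_chars=None, overlap_chars=None):
    """Build a POST /embed request carrying the documented JSON body.

    `docs` maps key -> text; the optional arguments become the
    `$chunking.*` per-request override query parameters.
    """
    params = {}
    if size_chars is not None:
        params["$chunking.size-chars"] = size_chars
    if overlap_chars is not None:
        params["$chunking.overlap-chars"] = overlap_chars
    url = f"https://{server}/embed"
    if params:
        url += "?" + urlencode(params)
    body = json.dumps([{"key": k, "text": t} for k, t in docs.items()]).encode()
    return urlrequest.Request(
        url, data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the built request to `urllib.request.urlopen` would send it; the sketch stops at construction so the payload shape stays visible.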
## Command Line

```sh
dab configure --runtime.embeddings.enabled true
dab configure --runtime.embeddings.provider "azure-openai"
dab configure --runtime.embeddings.base-url "@env('EMBEDDINGS_ENDPOINT')"
dab configure --runtime.embeddings.api-key "@env('EMBEDDINGS_API_KEY')"
dab configure --runtime.embeddings.model "@env('EMBEDDINGS_MODEL')"
dab configure --runtime.embeddings.dimensions 1536
dab configure --runtime.embeddings.timeout-ms 30000
dab configure --runtime.embeddings.api-version "@env('EMBEDDINGS_API_VERSION')"
dab configure --runtime.embeddings.chunking.enabled true
dab configure --runtime.embeddings.chunking.size-chars 800
dab configure --runtime.embeddings.chunking.overlap-chars 100
dab configure --runtime.embeddings.cache.enabled true
dab configure --runtime.embeddings.cache.level "L1"
dab configure --runtime.embeddings.cache.ttl-seconds 86400
dab configure --runtime.embeddings.endpoint.enabled false
dab configure --runtime.embeddings.endpoint.path /embed
dab configure --runtime.embeddings.endpoint.roles "authenticated"
```
## REST Endpoint

- Optionally exposes `/embed` for on-demand embedding, secured by role.
- When `host.mode` is `development`, `anonymous` is automatically allowed.

## Health, Hot Reload, Startup

- Startup validation fails fast if `enabled: true` and any required field is missing.
- Health check covers embedding provider connectivity.
- Listens for hot-reload events and reloads configuration.
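The health check described above can be sketched as follows. `embed` here is a stand-in for whatever provider call DAB makes internally (not DAB's actual API), and the parameters mirror the `health.*` settings: embed the probe text, then validate latency against `threshold-ms` and vector size against `expected-dimensions`:

```python
import time

def check_embedding_health(embed, test_text="health check",
                           threshold_ms=1000, expected_dimensions=None):
    """Sketch of the documented health check.

    `embed` is any callable taking one string and returning one
    embedding vector (a list of floats).
    """
    start = time.monotonic()
    try:
        vector = embed(test_text)
    except Exception as exc:
        # Connectivity / provider failure.
        return {"healthy": False, "reason": f"provider error: {exc}"}
    elapsed_ms = (time.monotonic() - start) * 1000
    if elapsed_ms > threshold_ms:
        return {"healthy": False,
                "reason": f"latency {elapsed_ms:.0f}ms exceeds {threshold_ms}ms"}
    if expected_dimensions is not None and len(vector) != expected_dimensions:
        return {"healthy": False,
                "reason": f"got {len(vector)} dimensions, expected {expected_dimensions}"}
    return {"healthy": True, "reason": "ok"}
```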
## What?

A subsystem in DAB to fetch embeddings from text.

> **Note**
> Handles chunking natively.

## Overview

- `/embed` endpoint

## Why?

Enables future work around caching and parameter substitution.

## Configuration (`runtime.embeddings`)

```jsonc
{
  "runtime": {
    "embeddings": {
      "enabled": true,
      "provider": "azure-openai",      // Enum: azure-openai | openai
      "base-url": "@env('EMBEDDINGS_ENDPOINT')",
      "api-key": "@env('EMBEDDINGS_API_KEY')",
      "model": "@env('EMBEDDINGS_MODEL')",
      "dimensions": 1536,
      "timeout-ms": 30000,
      "api-version": "@env('EMBEDDINGS_API_VERSION')",
      "chunking": {
        "enabled": true,
        "size-chars": 800,
        "overlap-chars": 100
      },
      "cache": {
        "enabled": true,
        "level": "L1",                 // Enum: L1 | L1L2
        "ttl-seconds": 86400
      },
      "endpoint": {
        "enabled": false,
        "path": "/embed",
        "roles": ["authenticated"]
      },
      "health": {
        "enabled": false,              // Enable embedding health checks
        "threshold-ms": 1000,          // Max acceptable latency in milliseconds
        "test-text": "health check",   // Input used to validate provider
        "expected-dimensions": 1536    // Expected embedding vector size
      }
    }
  }
}
```
### Response

```json
[
  {
    "key": "document-1",
    "data": [
      [0.123, 0.456, ...],
      [0.789, 0.101, ...]
    ]
  },
  {
    "key": "document-2",
    "data": [
      [0.111, 0.222, ...]
    ]
  }
]
```
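In the response, `data` holds one vector per chunk. When a downstream consumer needs a single vector per key, one common choice (not something DAB prescribes) is element-wise mean pooling across the chunk vectors:

```python
def mean_pool(chunk_vectors):
    """Collapse per-chunk embeddings into one document vector by
    element-wise averaging. Assumes all vectors share one dimension."""
    if not chunk_vectors:
        raise ValueError("no chunk vectors to pool")
    n = len(chunk_vectors)
    # zip(*...) transposes the list of vectors into per-dimension columns.
    return [sum(column) / n for column in zip(*chunk_vectors)]
```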
## REST API Reference

**Azure OpenAI**

```http
POST {base-url}/openai/deployments/{model}/embeddings?api-version={api-version}
api-key: {api-key}

{ "input": string | string[], "dimensions": 1536 }
```

**OpenAI**

```http
POST {base-url}/v1/embeddings
Authorization: Bearer {api-key}

{ "model": "{model}", "input": string | string[], "dimensions": 1536 }
```
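The two request shapes above can be assembled from the `runtime.embeddings` settings. A sketch (illustrative helper, with the provider-specific `api-version` rules from the Rules section enforced):

```python
def build_provider_request(provider, base_url, model, api_key, api_version=None):
    """Return (url, headers) for the documented provider request shapes.

    Illustrative only; parameter names mirror the config properties.
    """
    base = base_url.rstrip("/")
    if provider == "azure-openai":
        # Azure requires api-version and authenticates via the api-key header.
        if not api_version:
            raise ValueError("api-version is required for azure-openai")
        url = f"{base}/openai/deployments/{model}/embeddings?api-version={api_version}"
        headers = {"api-key": api_key}
    elif provider == "openai":
        # api-version is valid only for azure-openai.
        if api_version:
            raise ValueError("api-version is valid only when provider is azure-openai")
        url = f"{base}/v1/embeddings"
        headers = {"Authorization": f"Bearer {api_key}"}
    else:
        raise ValueError(f"unknown provider: {provider}")
    return url, headers
```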