[Enh]: Embedding Phase 1 & 2: Internal subsystem for embedding text. #3103

@JerryNixon

Description

What?

A subsystem in DAB that fetches embeddings for text from a configured provider.

Note

Handle chunking natively.

Overview

  1. Provide the embedding subsystem
  2. Provide the /embed endpoint
  3. Provide a chunking mechanism
  4. Cache embeddings in L1 & L2 cache

Why?

Lays the groundwork for planned features that consume embeddings, such as semantic cache lookups and parameter substitution.

Configuration (runtime.embeddings)

{
  "runtime": {
    "embeddings": {
      "enabled": true, 
      "provider": "azure-openai", // Enum: azure-openai | openai
      "base-url": "@env('EMBEDDINGS_ENDPOINT')",
      "api-key": "@env('EMBEDDINGS_API_KEY')",
      "model": "@env('EMBEDDINGS_MODEL')",
      "dimensions": 1536,
      "timeout-ms": 30000,
      "api-version": "@env('EMBEDDINGS_API_VERSION')",
      "chunking": {
        "enabled": true,
        "size-chars": 800,
        "overlap-chars": 100
      },
      "cache": {
        "enabled": true,
        "level": "L1", // Enum: L1 | L1L2
        "ttl-seconds": 86400 
      },
      "endpoint": {
        "enabled": false,
        "path": "/embed",
        "roles": ["authenticated"]
      },
      "health": {
        "enabled": false, // Enable embedding health checks
        "threshold-ms": 1000, // Max acceptable latency in milliseconds
        "test-text": "health check", // Input used to validate provider
        "expected-dimensions": 1536 // Expected embedding vector size
      }
    }
  }
}

Properties

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| enabled | bool | No | true | Master toggle for the embedding subsystem |
| provider | string | Yes* | - | `azure-openai` or `openai` |
| base-url | string | Yes* | - | Provider base URL |
| api-key | string | Yes* | - | Authentication key (supports `@env()`) |
| model | string | Yes* | - | Deployment name (Azure) or model name (OpenAI) |
| dimensions | int | No | 1536 | Embedding vector size |
| timeout-ms | int | No | 30000 | HTTP request timeout in milliseconds (min: 1, max: 300000) |
| api-version | string | No | 2023-05-15 | Azure OpenAI API version (Azure only) |
| chunking.enabled | bool | No | true | Enable automatic chunking |
| chunking.size-chars | int | No | 800 | Chunk size in characters |
| chunking.overlap-chars | int | No | 100 | Overlap between chunks in characters |
| cache.enabled | bool | No | true | Enable the embedding cache |
| cache.level | string | No | L1 | Cache level: `L1` (in-memory) or `L1L2` (memory + distributed) |
| cache.ttl-seconds | int | No | 86400 | Cache time-to-live in seconds |
| endpoint.enabled | bool | No | false | Whether to expose the `/embed` REST endpoint |
| endpoint.path | string | No | /embed | Embed endpoint path |
| endpoint.roles | string[] | No | ["authenticated"] | Roles allowed to access the endpoint |
| health.enabled | bool | No | false | Enable embedding health checks |
| health.threshold-ms | int | No | 1000 | Max acceptable health-check latency in milliseconds |
| health.test-text | string | No | "health check" | Input text used to validate the embedding provider |
| health.expected-dimensions | int | No | - | Expected embedding vector size for validation |

* Required only when enabled: true
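The cache settings above imply a lookup key that changes whenever the embedding output could change. The issue does not specify a key format, so the one below is purely illustrative: it combines provider, model, and dimensions with a hash of the input text.

```python
import hashlib

def embedding_cache_key(provider: str, model: str, dimensions: int, text: str) -> str:
    """Build a cache key that varies with everything that affects the vector.

    The key layout here is an assumption for illustration; the issue does
    not define one. Hashing the text keeps keys short and bounded.
    """
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return f"embed:{provider}:{model}:{dimensions}:{digest}"
```

Two identical inputs produce the same key (a cache hit), while changing the text, model, or dimensions produces a different key.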

Rules

runtime.embeddings
- when enabled=true:
  - provider, base-url, api-key, and model must all be present

provider-specific
- api-version is valid only when provider=azure-openai
- api-version should be ignored or rejected when provider=openai

chunking
- chunking.overlap-chars must be less than chunking.size-chars

endpoint
- endpoint.path should not contain a query string
- endpoint.path should not contain a fragment
- endpoint.roles should contain unique values
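The rules above can be sketched as a single validation pass. This is not DAB's actual validator, just a minimal illustration of the stated constraints, assuming endpoint.path may contain neither a query string nor a fragment.

```python
def validate_embeddings_config(cfg: dict) -> list[str]:
    """Return a list of rule violations for a runtime.embeddings section.

    An empty list means the configuration is valid. Field names mirror
    the configuration sample; defaults mirror the properties table.
    """
    errors = []
    if not cfg.get("enabled", True):
        return errors  # nothing else is required when the subsystem is off
    # provider, base-url, api-key, and model must all be present
    for field in ("provider", "base-url", "api-key", "model"):
        if not cfg.get(field):
            errors.append(f"{field} is required when enabled=true")
    # api-version is valid only for Azure OpenAI
    if cfg.get("provider") == "openai" and "api-version" in cfg:
        errors.append("api-version is valid only when provider=azure-openai")
    # overlap must be strictly smaller than the chunk size
    chunking = cfg.get("chunking", {})
    if chunking.get("overlap-chars", 100) >= chunking.get("size-chars", 800):
        errors.append("chunking.overlap-chars must be less than chunking.size-chars")
    # path must be a plain path; roles must be unique
    endpoint = cfg.get("endpoint", {})
    path = endpoint.get("path", "/embed")
    if "?" in path or "#" in path:
        errors.append("endpoint.path must not contain a query string or fragment")
    roles = endpoint.get("roles", ["authenticated"])
    if len(roles) != len(set(roles)):
        errors.append("endpoint.roles must contain unique values")
    return errors
```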

Endpoint

Query Parameters (Override Defaults)

  • $chunking.size-chars=512 — Override chunk size
  • $chunking.overlap-chars=50 — Override overlap
  • $chunking.enabled=false — Disable chunking (treat entire text as one)

POST https://{server}/embed?$chunking.size-chars=512&$chunking.overlap-chars=50
Content-Type: application/json

Request

[
  {
    "key": "document-1",
    "text": "Long document text here that may exceed chunk size..."
  },
  {
    "key": "document-2",
    "text": "Another document..."
  }
]

Response

[
  {
    "key": "document-1",
    "data": [ [0.123, 0.456, ...], [0.789, 0.101, ...] ]
  },
  {
    "key": "document-2",
    "data": [ [0.111, 0.222, ...] ]
  }
]
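The response returns one vector per chunk, which is why "document-1" above yields two vectors. The exact splitting strategy is not specified in this issue; the sketch below assumes a simple sliding window over characters driven by the two chunking parameters.

```python
def chunk_text(text: str, size_chars: int = 800, overlap_chars: int = 100) -> list[str]:
    """Split text into windows of size_chars characters, each overlapping
    the previous window by overlap_chars characters.

    A character-based sliding window is an assumption for illustration;
    real chunkers may also respect word or sentence boundaries.
    """
    if overlap_chars >= size_chars:
        raise ValueError("overlap-chars must be less than size-chars")
    step = size_chars - overlap_chars  # how far the window advances each time
    return [text[i:i + size_chars] for i in range(0, len(text), step)] if text else []
```

With the defaults (800/100), a 1000-character document produces two chunks, and therefore two vectors in the response.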

Command Line

dab configure --runtime.embeddings.enabled true
dab configure --runtime.embeddings.provider "azure-openai"
dab configure --runtime.embeddings.base-url "@env('EMBEDDINGS_ENDPOINT')"
dab configure --runtime.embeddings.api-key "@env('EMBEDDINGS_API_KEY')"
dab configure --runtime.embeddings.model "@env('EMBEDDINGS_MODEL')"
dab configure --runtime.embeddings.dimensions 1536
dab configure --runtime.embeddings.timeout-ms 30000
dab configure --runtime.embeddings.api-version "@env('EMBEDDINGS_API_VERSION')"

dab configure --runtime.embeddings.chunking.enabled true
dab configure --runtime.embeddings.chunking.size-chars 800
dab configure --runtime.embeddings.chunking.overlap-chars 100

dab configure --runtime.embeddings.cache.enabled true
dab configure --runtime.embeddings.cache.level "L1"
dab configure --runtime.embeddings.cache.ttl-seconds 86400

dab configure --runtime.embeddings.endpoint.enabled false
dab configure --runtime.embeddings.endpoint.path /embed
dab configure --runtime.embeddings.endpoint.roles "authenticated"

REST Endpoint

  • Optionally exposes /embed for on-demand embedding, secured by role
  • When host.mode is development, anonymous is automatically allowed

Health, Hot Reload, Startup

  • Startup validation fails fast if enabled: true and any required field is missing
  • Health check covers embedding provider connectivity
  • Listens for hot-reload events and reloads configuration
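The health check described above can be sketched as follows. The `embed` callable stands in for the real provider call (an assumption for illustration); the pass/fail criteria follow the `health.*` settings.

```python
import time

def embedding_health_check(embed, test_text="health check",
                           threshold_ms=1000, expected_dimensions=1536):
    """Probe the embedding provider and report a health verdict.

    Healthy means: the call completed within threshold_ms and returned a
    vector of the expected size. `embed` is a placeholder for the real
    provider client, passed in so the check stays testable.
    """
    start = time.monotonic()
    vector = embed(test_text)
    elapsed_ms = (time.monotonic() - start) * 1000
    healthy = elapsed_ms <= threshold_ms and len(vector) == expected_dimensions
    return {"healthy": healthy,
            "latency-ms": round(elapsed_ms, 1),
            "dimensions": len(vector)}
```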

REST API Reference

Azure OpenAI:

  • URL: POST {base-url}/openai/deployments/{model}/embeddings?api-version={api-version}
  • Headers: api-key: {api-key}
  • Body: { "input": string | string[], "dimensions": 1536 }

OpenAI:

  • URL: POST {base-url}/v1/embeddings
  • Headers: Authorization: Bearer {api-key}
  • Body: { "model": "{model}", "input": string | string[], "dimensions": 1536 }
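The two provider shapes above differ only in URL layout, auth header, and whether the model goes in the path or the body. A sketch of assembling the request (no network call is made; values come straight from the runtime.embeddings settings):

```python
def build_embedding_request(cfg: dict, inputs: list[str]) -> dict:
    """Assemble url/headers/body for either provider, per the reference above.

    Azure OpenAI puts the model (deployment) in the path and uses an
    api-key header; OpenAI uses a fixed path, a Bearer token, and puts
    the model in the body.
    """
    base = cfg["base-url"].rstrip("/")
    body = {"input": inputs, "dimensions": cfg.get("dimensions", 1536)}
    if cfg["provider"] == "azure-openai":
        url = (f"{base}/openai/deployments/{cfg['model']}/embeddings"
               f"?api-version={cfg.get('api-version', '2023-05-15')}")
        headers = {"api-key": cfg["api-key"]}
    else:  # openai
        url = f"{base}/v1/embeddings"
        headers = {"Authorization": f"Bearer {cfg['api-key']}"}
        body["model"] = cfg["model"]
    return {"url": url, "headers": headers, "body": body}
```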
