
Propagate trace context in outgoing requests #2154

@frzifus

Description


🚀 Describe the new functionality needed

When tracing is enabled and llama-stack makes a request to an external service, it should propagate the trace context header (see the OpenTelemetry docs on Context Propagation).

The header entry will look like the following:

traceparent: 00-cd7088c08c5a37ba3fc0e27248981a71-0d36bf7af9548756-01
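This is the header produced by OpenTelemetry's W3C TraceContext propagator. As a minimal sketch (assuming the standard opentelemetry-api package; the headers dict is only illustrative), filling a carrier from the currently active context looks like this:

from opentelemetry.propagate import inject

# Inject the active trace context into a plain dict carrier.
# With the default W3C TraceContext propagator this adds a "traceparent"
# entry of the form 00-<trace_id>-<span_id>-<trace_flags>.
headers = {}
inject(headers)
print(headers)  # e.g. {'traceparent': '00-...-...-01'}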

But when llama-stack makes a request to, e.g., vLLM, the outgoing request is missing this header:

POST /v1/chat/completions HTTP/1.1
Host: localhost:8000
Accept: application/json
Accept-Encoding: gzip, deflate
Authorization: Bearer fake
Content-Length: 143
Content-Type: application/json
User-Agent: AsyncOpenAI/Python 1.76.1
X-Stainless-Arch: x64
X-Stainless-Async: async:asyncio
X-Stainless-Lang: python
X-Stainless-Os: Linux
X-Stainless-Package-Version: 1.76.1
X-Stainless-Read-Timeout: 600
X-Stainless-Retry-Count: 0
X-Stainless-Runtime: CPython
X-Stainless-Runtime-Version: 3.10.17

{"messages":[{"role":"user","content":[{"type":"text","text":"Berlin is"}]}],"model":"vllm","max_tokens":2048,"stream":false,"temperature":0.0}

💡 Why is this needed? What if we don't build it?

When the trace information is not propagated further, we lose end-to-end visibility across services.

This is the counterpart to accepting trace information from incoming requests (#2097).

Other thoughts

More details: #2097 (comment)
