fix: nest root LLM runs under the flow trace in Langfuse (#13429) #13539
erichare wants to merge 1 commit into
A model invoked as the root LangChain run (no wrapping chain) — reproduced with Ollama — was emitted by the langfuse v3 CallbackHandler as a separate orphan trace: parent=None, userId=None, sessionId=None, with the token usage detached from the flow trace, breaking cost/usage attribution. Root cause: the SDK only applies the constructor `trace_context` on the chain path (`on_chain_start`); the generation path calls `start_observation` without it, so with no active OpenTelemetry span the generation starts a brand-new trace (metadata `is_langchain_root: true`). `get_langchain_callback` now returns a `CallbackHandler` subclass that, for root LLM runs only (`parent_run_id is None`), activates the flow's component (or root) span as the current OTel span while the SDK creates the generation. The generation then inherits the flow `trace_id` and nests under the component span, restoring user/session attribution and token metrics. Non-root runs (wrapping chain/agent present) are left untouched. Adds focused unit tests plus an end-to-end test that drives the real langfuse SDK with an in-memory OpenTelemetry exporter and asserts the generation shares the flow trace_id and parents under the component span.
Walkthrough
The PR fixes Langfuse orphan traces by introducing OpenTelemetry span re-parenting. New utilities compute and activate the flow's component span as the OTel parent context when instantiating the Langfuse LangChain callback, ensuring root LLM runs are nested under the flow trace and remain linked to sessions and users.
Changes
Langfuse OTel Span Re-parenting for Orphan Traces
Estimated code review effort: 3 (Moderate), ~25 minutes
Pre-merge checks: 8 passed, 1 failed (1 warning)
Test Coverage Advisor: No source changes detected without accompanying tests.
Codecov Report
All modified and coverable lines are covered by tests.
Your project check has failed because the head coverage (54.19%) is below the target coverage (60.00%). You can increase the head coverage or adjust the target coverage.
Additional details and impacted files
@@ Coverage Diff @@
## release-1.10.1 #13539 +/- ##
==================================================
+ Coverage 58.42% 58.45% +0.03%
==================================================
Files 2289 2289
Lines 219033 219063 +30
Branches 31120 32923 +1803
==================================================
+ Hits 127961 128051 +90
+ Misses 89616 89556 -60
Partials 1456 1456
Summary
Fixes #13429 (follow-up to #13319).
When a flow runs a model as the root LangChain run (no wrapping chain; reproduced with Ollama), the langfuse v3 `CallbackHandler` emitted the LLM generation as a separate, orphan trace instead of nesting it under the flow trace. Two traces appeared where one was expected: the flow trace (`<flow_id>`), and an orphan trace (`"Ollama"`) with `userId`/`sessionId` of `None`/`None` holding the generation (`model=llama3.2`, `parent=None`, tokens 32→2) with metadata `is_langchain_root: true`.
Because the orphan trace has no `userId`/`sessionId`, the token-usage metrics could not be attributed to a session or user, and the generation was invisible inside the flow trace.
Root cause
The langfuse v3 LangChain `CallbackHandler` only applies its constructor `trace_context` on the chain path (`on_chain_start` → `start_observation(trace_context=...)`). The generation path (`__on_llm_action`, used by `on_chat_model_start` / `on_llm_start`) calls `start_observation(as_type="generation", ...)` without `trace_context`. When the model is the root run (`parent_run_id is None`) there is no active OpenTelemetry span in context, so the SDK starts a brand-new root trace for the generation, which is the `is_langchain_root: true` condition the issue identified.
This is purely SDK behavior; the prior tracing fixes (#13266, #13341, #13344) do not address it.
The fix
`LangFuseTracer.get_langchain_callback` now returns a small `CallbackHandler` subclass that, for root LLM runs only (`parent_run_id is None`), activates the flow's component (or root) span as the current OpenTelemetry span while the SDK creates the generation span. The generation then:
- inherits the flow `trace_id` (no longer orphaned), and
- nests under the component span,
so it shares the parent trace's `userId`/`sessionId` and its token usage is attributed correctly.
Mechanics:
- `_build_otel_parent_span(trace_id, parent_span_id)` builds a non-recording OTel parent from values we already hold, mirroring the SDK's own `_create_remote_parent_span` used on the chain path, and uses only public OTel API (no langfuse private attributes). It degrades to `None` (default SDK behavior) if the ids aren't valid hex.
- The subclass wraps `super().on_chat_model_start` / `super().on_llm_start` in `opentelemetry.trace.use_span(...)`. The handler sets `run_inline = True`, so these callbacks run synchronously inside the model invocation and the activation reliably wraps span creation.
- `end_on_exit=False` / `record_exception=False` ensure the parent span is never closed or mutated.
Non-root runs are untouched: when a wrapping chain/agent is present the LLM fires with a non-`None` `parent_run_id`, and the SDK already nests it correctly under the chain span. Only the bare-model case changes.
Test plan
- `test_langfuse_orphan_generation.py`: end-to-end test driving the real langfuse SDK with an in-memory OpenTelemetry exporter (a pure mock cannot catch this, because the orphaning happens inside the SDK). Asserts the root LLM generation:
  - shares the flow `trace_id` (the core of #13429, "Langfuse trace contents / Hierarchy issues"), and
  - parents under the component span as a `generation` (carries token usage).
  The test fails without the fix (`SAME TRACE: False`, `gen.parent: None`) and passes with the fix.
- Unit tests for `_build_otel_parent_span` (hex / non-hex / missing ids) and the re-parenting handler (activates parent for root chat-model and llm runs; does not activate for non-root runs; safe when the parent is unresolvable).
- Updated the existing `get_langchain_callback` tests to subclass a real fake base (the handler is now subclassed, so a bare `MagicMock` base no longer works).
- 298 passed, 5 skipped.
- `ruff check` / `ruff format` clean; pre-commit hooks pass.