[EXPERIMENTAL] feat(tracer): add TLA+ formal verification of concurrency invariants #4426
Draft
kakkoyun wants to merge 4 commits into DataDog:main
Adds Specula-based formal verification for two critical concurrency protocols in dd-trace-go:
- Phase 1 (Span Lifecycle): verifies lock ordering (span.mu → trace.mu), the finish-guard pattern, partial flush lock inversion safety, and deadlock freedom. TLC explored 4,437 distinct states across 2 goroutines and 3 spans with all 6 invariants passing.
- Phase 2 (GLS Push/Pop): verifies push/pop pairing, stack depth correctness, no-leak-on-finish, LIFO ordering, and goroutine isolation. TLC explored 159 distinct states with all 8 invariants passing. Complements the kakkoyun/orchestrion_gls_leak branch work.
Includes extracted Go source files, hand-written TLA+ specs with TLC configs, Specula pipeline scripts, and documentation.
Runs TLC model checker on both specs (span lifecycle and GLS push/pop) when formal-verification/ files change. Validates Go source extraction compiles. Uses pinned action SHAs and cached tla2tools.jar.
- actions/setup-java v4.7.1 -> v5.2.0
- actions/cache v4.2.3 -> v5.0.3
- actions/setup-go v5.5.0 -> v6.2.0
- Add Apache 2.0 copyright headers to extracted Go source files
- Fix gofmt comment formatting (3-space → 2-space numbered lists)
- Bump go.mod directive to go 1.25.0 to match CI toolchain
- Fix tla2tools.jar verification (tlc2.TLC -h exits non-zero)
felixge (Member) reviewed on Feb 17, 2026 and left a comment:
Please ping me for approval once this is ready to look at.
kakkoyun (Member, Author) replied:
Of course. But it is far from ready :)
WIP: This still requires a lot of work and validation.
Using https://github.com/specula-org/Specula
What does this PR do?
Adds TLA+ formal verification for two critical concurrency protocols in dd-trace-go:
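- Phase 1 (Span Lifecycle): verifies lock ordering (span.mu → trace.mu), the finish-guard pattern, partial flush lock inversion safety (#incident-46344), and deadlock freedom. TLC explored 4,437 distinct states across 2 goroutines and 3 spans with all 6 invariants passing.
- Phase 2 (GLS Push/Pop): verifies push/pop pairing, stack depth correctness, no-leak-on-finish, LIFO ordering, and goroutine isolation. TLC explored 159 distinct states with all 8 invariants passing. Complements the kakkoyun/orchestrion_gls_leak branch work.
Includes:
- formal-verification/
Motivation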
dd-trace-go's tracer has complex concurrency invariants (lock ordering, finish-guard, GLS stack pairing) that are currently enforced only by code comments, runtime assertions, and -race testing. The checklocks CI job is disabled pending annotation coverage. Formal verification with TLA+/TLC exhaustively checks that these invariants hold for every interleaving of the bounded model, not just those exercised by tests.
This is experimental infrastructure to explore whether Specula-based formal verification is practical for dd-trace-go's concurrency model.
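For readers less familiar with the lock-ordering and finish-guard invariants, the following is a minimal Go sketch of the pattern being modeled. It is an illustration under stated assumptions, not the dd-trace-go implementation: the `trace` and `span` types, their fields, and the `finishConcurrently` helper are all hypothetical. The only properties it tries to show are that span.mu is always taken before trace.mu and that a second finish call is a no-op.

```go
package main

import (
	"fmt"
	"sync"
)

// trace and span are hypothetical stand-ins, not the dd-trace-go types.
type trace struct {
	mu       sync.Mutex
	finished int // number of spans recorded as finished
}

type span struct {
	mu       sync.Mutex
	finished bool // finish guard: set once, under span.mu
	trace    *trace
}

// finish marks the span finished at most once. It acquires span.mu first and
// trace.mu second, and reports whether this call did the finishing.
func (s *span) finish() bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.finished { // finish guard: later calls return without touching the trace
		return false
	}
	s.finished = true

	s.trace.mu.Lock() // taken only while holding span.mu, never the reverse
	s.trace.finished++
	s.trace.mu.Unlock()
	return true
}

// finishConcurrently calls finish from n goroutines and returns how many
// calls actually finished the span, plus the trace's finished count.
func finishConcurrently(n int) (winners, traceCount int) {
	t := &trace{}
	s := &span{trace: t}
	var wg sync.WaitGroup
	var mu sync.Mutex // protects winners
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if s.finish() {
				mu.Lock()
				winners++
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	t.mu.Lock()
	defer t.mu.Unlock()
	return winners, t.finished
}

func main() {
	winners, count := finishConcurrently(8)
	fmt.Println(winners, count) // prints "1 1"
}
```

A test like this exercises only the schedules the runtime happens to produce, which is the gap the TLC model check is meant to close.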
Related: #4384 (orchestrion GLS leak fix)
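The GLS push/pop properties named in Phase 2 (pairing, stack depth, LIFO ordering) can be pictured with a small Go model. This is a sketch only: `contextStack` and its methods are hypothetical names, and the real GLS mechanism is not reproduced here. Each goroutine is assumed to own its own stack instance, which stands in for goroutine isolation.

```go
package main

import (
	"errors"
	"fmt"
)

// contextStack is a hypothetical model of one goroutine's GLS stack.
// Each goroutine owns its own instance, so no locking is used here.
type contextStack struct {
	items []string
}

// push adds an entry on top of the stack.
func (s *contextStack) push(v string) { s.items = append(s.items, v) }

// pop removes the top entry, but only if it is the entry the caller expects.
// It returns an error for an unpaired pop or an out-of-order (non-LIFO) pop.
func (s *contextStack) pop(want string) error {
	if len(s.items) == 0 {
		return errors.New("pop without matching push")
	}
	top := s.items[len(s.items)-1]
	if top != want {
		return fmt.Errorf("LIFO violation: top is %q, want %q", top, want)
	}
	s.items = s.items[:len(s.items)-1]
	return nil
}

// depth reports how many entries are currently on the stack.
func (s *contextStack) depth() int { return len(s.items) }

func main() {
	var s contextStack
	s.push("parent")
	s.push("child")
	fmt.Println(s.depth())       // 2
	fmt.Println(s.pop("parent")) // LIFO violation: top is "child", want "parent"
	fmt.Println(s.pop("child"))  // <nil>
	fmt.Println(s.pop("parent")) // <nil>
	fmt.Println(s.depth())       // 0: nothing leaked once every push is popped
}
```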
Reviewer's Checklist
- make lint locally.
- make test locally.
- make generate locally.
- make fix-modules locally.
Unsure? Have a question? Request a review!