Analysis: server-local analysis window + null-safe blocking analyzer #1068

Merged
erikdarlingdata merged 1 commit into dev from fix/analysis-window-server-clock
Jun 6, 2026

Conversation

@erikdarlingdata
Owner

The bug (server-local window)

AnalysisService built the analysis window from DateTime.UtcNow, but every collector stamps rows with SYSDATETIME() (server-local). So all 102 windowed reads (wait/CPU/blocking facts, anomaly detection, every drill-down enrichment) filtered server-local data against a UTC window and silently matched nothing on any non-UTC server — the entire windowed half of the analysis engine was dark there.

Proven on sql2022 (UTC−7): the UTC window saw 0 wait-stats and 0 CPU rows; the server-local window saw 11,050 wait rows + 293 CPU rows.
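
A minimal sketch of the mismatch, written in Python rather than the project's own code. The one-hour window length and the timestamps are assumptions for illustration only; the real window length is not stated in this PR.

```python
from datetime import datetime, timedelta

# Hypothetical values for illustration only.
SERVER_OFFSET = timedelta(hours=-7)        # sql2022 is UTC-7
utc_now = datetime(2026, 6, 6, 18, 0, 0)   # stand-in for the host's DateTime.UtcNow
window = timedelta(hours=1)                # assumed window length

# Collectors stamp rows with server-local time (SYSDATETIME()).
row_times = [utc_now + SERVER_OFFSET - timedelta(minutes=m) for m in (5, 20, 45)]


def count_in_window(rows, end, length):
    """Count rows whose timestamp falls in [end - length, end]."""
    start = end - length
    return sum(1 for t in rows if start <= t <= end)


# Old behavior: window built from UTC, compared against server-local stamps.
utc_hits = count_in_window(row_times, utc_now, window)
# New behavior: window built from the server's local clock.
local_hits = count_in_window(row_times, utc_now + SERVER_OFFSET, window)

print(utc_hits, local_hits)  # 0 3
```

With these assumed values the UTC window sits seven hours ahead of every stamped row, so nothing falls inside it.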

The fix

  • AnalyzeAsync / CollectAndScoreFactsAsync probe the monitored server's clock (SELECT SYSDATETIME(), SYSUTCDATETIME()) and build a server-local window, capturing the UTC offset on AnalysisContext. The 102 windowed reads are unchanged (they consume the context window).
  • SqlServerFindingStore converts the window back to UTC for persistence, so stored time_range_*/analysis_time stay UTC — the reader's AsUtc, the deep-link offset math, and the retention purge are all unchanged.
  • compare_analysis uses the server clock too. Falls back to host UTC if the probe fails (prior behavior; no new hard-failure mode).
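
The three bullets above can be sketched as follows. This is a Python illustration under stated assumptions, not the actual C# in AnalysisService or SqlServerFindingStore. `Window`, `build_window`, `to_utc`, and the `probe` callable are hypothetical names, and `probe` stands in for running `SELECT SYSDATETIME(), SYSUTCDATETIME()` against the monitored server.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable, NamedTuple, Tuple


class Window(NamedTuple):
    start: datetime        # server-local, naive
    end: datetime          # server-local, naive
    utc_offset: timedelta  # server-local minus UTC


def build_window(probe: Callable[[], Tuple[datetime, datetime]],
                 length: timedelta) -> Window:
    """Build a server-local window; fall back to host UTC if the probe fails."""
    try:
        local_now, utc_now = probe()
        offset = local_now - utc_now
    except Exception:
        # Prior behavior: host UTC with a zero offset, no new hard failure.
        local_now = datetime.now(timezone.utc).replace(tzinfo=None)
        offset = timedelta(0)
    return Window(local_now - length, local_now, offset)


def to_utc(w: Window) -> Tuple[datetime, datetime]:
    """Convert the window back to UTC before persisting it."""
    return (w.start - w.utc_offset, w.end - w.utc_offset)


# Usage with a fake probe for a UTC-7 server (made-up timestamps).
def fake_probe():
    return datetime(2026, 6, 6, 11, 0), datetime(2026, 6, 6, 18, 0)


w = build_window(fake_probe, timedelta(hours=1))
print(w.utc_offset == timedelta(hours=-7))  # True
print(to_utc(w)[1])                         # 2026-06-06 18:00:00
```

Keeping the offset on the window lets the windowed reads use server-local bounds directly while the store subtracts the offset once at persistence time, so stored values remain UTC.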

Bundled: null-safe blocking_deadlock_analyzer (install/26)

total_blocking_duration_ms = SUM(wait_time_ms) / MAX(...) and total_deadlock_wait_time_ms = SUM(wait_time) feed NOT NULL columns, but those source columns are nullable. A report missing the duration made the aggregate NULL and failed the whole insert (it was failing on sql2022). Both are now wrapped in ISNULL(..., 0).
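
The failure mode can be reproduced in miniature. The sketch below uses Python's built-in SQLite as a stand-in for SQL Server, with hypothetical table names (`reports`, `summary`) and `COALESCE` in place of T-SQL's `ISNULL`. It illustrates the pattern only and is not the analyzer's actual SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE reports (wait_time_ms INTEGER);       -- nullable source column
    CREATE TABLE summary (total_ms INTEGER NOT NULL);  -- NOT NULL target column
    INSERT INTO reports VALUES (NULL);                 -- report missing its duration
""")

# Unguarded aggregate: SUM over only-NULL input yields NULL, so the insert fails.
try:
    conn.execute("INSERT INTO summary SELECT SUM(wait_time_ms) FROM reports")
    unguarded_failed = False
except sqlite3.IntegrityError:
    unguarded_failed = True

# Guarded aggregate: a NULL sum becomes 0 and the insert succeeds.
conn.execute(
    "INSERT INTO summary SELECT COALESCE(SUM(wait_time_ms), 0) FROM reports"
)
total = conn.execute("SELECT total_ms FROM summary").fetchone()[0]

print(unguarded_failed, total)  # True 0
```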

Verification

  • get_analysis_facts now returns 292 wait facts (vs 0 before); persisted time_range_end = server SYSUTCDATETIME (UTC), not server-local.
  • Analyzer aggregates a NULL-wait_time_ms row to 0 instead of erroring (deployed + tested on sql2022).
  • Dashboard + Lite build clean; 346 Dashboard.Tests pass.

🤖 Generated with Claude Code

Two correctness fixes found while debugging why windowed analysis surfaced
nothing on a non-UTC server (sql2022 is UTC-7).

Server-local analysis window:
- AnalysisService built the window from DateTime.UtcNow, but every collector
  stamps rows with SYSDATETIME() (server-local). So all 102 windowed reads
  (wait/CPU/blocking facts, anomaly detection, every drill-down enrichment)
  filtered server-local data against a UTC window and silently matched nothing
  on any non-UTC server -- the entire windowed half of the engine was dark
  there (proven: 0 rows in the UTC window vs 11,050 wait rows in the
  server-local window on sql2022).
- Fix at the source: AnalyzeAsync / CollectAndScoreFactsAsync probe the
  monitored server's clock (SELECT SYSDATETIME(), SYSUTCDATETIME()) and build a
  SERVER-LOCAL window, capturing the UTC offset on AnalysisContext. The 102
  windowed reads are unchanged. SqlServerFindingStore converts the window back
  to UTC for persistence, so stored time_range_*/analysis_time stay UTC and the
  reader's AsUtc, the deep-link offset math, and the retention purge are all
  unchanged. compare_analysis uses the server clock too. Falls back to host UTC
  if the probe fails (prior behavior; no new hard-failure mode).

Null-safe blocking_deadlock_analyzer (install/26):
- total_blocking_duration_ms = SUM(wait_time_ms) / MAX(...) and
  total_deadlock_wait_time_ms = SUM(wait_time) feed NOT NULL columns, but those
  source columns are nullable -- a report missing the duration made the SUM/MAX
  NULL and failed the whole insert. Wrapped in ISNULL(..., 0).

Verified on sql2022: windowed facts return real data (292 wait facts vs 0
before); persisted time_range stays UTC; the analyzer aggregates a NULL-wait
row to 0 instead of erroring. Dashboard + Lite build clean; 346 Dashboard.Tests
pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@erikdarlingdata erikdarlingdata merged commit 40fa807 into dev Jun 6, 2026