Fix #1048: stop alert engine fabricating 100% host CPU on Linux #1055

Merged
erikdarlingdata merged 1 commit into dev from feature/1048-alert-engine-linux-host-cpu
Jun 4, 2026

Conversation

@erikdarlingdata
Owner

Problem

The reporter on #1048 installed the #1049 nightly and still saw the host-CPU alert on their OpenSUSE-hosted SQL Server.

#1049 corrected every path that reads CPU from the collected table (install/18 collector, install/47 views, Overview.cs/ResourceMetrics.cs chart reads). But the alert engine doesn't read that table: DatabaseService.NocHealth.GetCpuPercentAsync runs its own live query against sys.dm_os_ring_buffers and was never touched.

It computes other_cpu_percent = 100 - SystemIdle - ProcessUtilization. Since SystemIdle is always 0 on SQL Server on Linux, that returns 100 - sqlcpu. AlertHealthResult.TotalCpuPercent then sums to a permanent 100%, so AlertStateService's TotalCpuPercent >= CpuThresholdPercent check fires the host-CPU alert forever. The chart was fixed; the alert badge was not.
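
The arithmetic can be reproduced with a minimal Python sketch. The function names and the 90% threshold are illustrative assumptions, not the Dashboard's actual C# code; only the formula and the `>=` comparison come from the description above.

```python
def other_cpu_percent(system_idle: int, process_utilization: int) -> int:
    """Pre-fix formula: everything that is neither idle nor SQL Server."""
    return 100 - system_idle - process_utilization


def total_cpu_percent(sql_cpu: int, other_cpu: int) -> int:
    """Host total as the alert sees it: SQL Server CPU plus other CPU."""
    return sql_cpu + other_cpu


CPU_THRESHOLD_PERCENT = 90  # hypothetical threshold, for illustration only

# On Linux the ring buffer reports SystemIdle = 0, whatever the real load is.
for sql_cpu in (0, 5, 40):
    other = other_cpu_percent(system_idle=0, process_utilization=sql_cpu)
    total = total_cpu_percent(sql_cpu, other)
    # The total is 100 for every sql_cpu value, so the check always fires.
    print(sql_cpu, other, total, total >= CPU_THRESHOLD_PERCENT)
```

Whatever the SQL Server figure is, the "other" share absorbs the remainder, so the sum cannot drop below any threshold of 100 or less.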

Fix

Apply the same Linux guard already used by install/18, RemoteCollectorService.Cpu, and FinOps.Inventory:

  • Detect host_platform via sp_executesql behind an OBJECT_ID(N'sys.dm_os_host_info', N'V') check, so SQL 2016 never binds the 2017+ DMV.
  • Return NULL for other_cpu_percent on Linux.

The existing TotalCpuPercent getter already falls back to the SQL-only figure when OtherCpuPercent is null, so the alert clears. Windows behavior is unchanged.
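
The effect of the guard plus the existing null fallback can be sketched in Python as follows. `guarded_other_cpu` and `AlertHealthSketch` are hypothetical stand-ins for the query logic and for AlertHealthResult; treating a missing sys.dm_os_host_info as "not Linux" is an assumption based on the DMV being 2017+ only.

```python
from dataclasses import dataclass
from typing import Optional


def guarded_other_cpu(host_platform: Optional[str],
                      system_idle: int,
                      process_utilization: int) -> Optional[int]:
    """Return None on Linux, where SystemIdle is not trustworthy.

    host_platform is None when sys.dm_os_host_info does not exist
    (assumed here to mean a pre-2017 instance, which is not Linux).
    """
    if host_platform == "Linux":
        return None
    return 100 - system_idle - process_utilization


@dataclass
class AlertHealthSketch:  # hypothetical stand-in for AlertHealthResult
    sql_cpu_percent: int
    other_cpu_percent: Optional[int]

    @property
    def total_cpu_percent(self) -> int:
        # Fall back to the SQL-only figure when the other-CPU value is missing.
        if self.other_cpu_percent is None:
            return self.sql_cpu_percent
        return self.sql_cpu_percent + self.other_cpu_percent


linux = AlertHealthSketch(5, guarded_other_cpu("Linux", 0, 5))
windows = AlertHealthSketch(5, guarded_other_cpu("Windows", 80, 5))
print(linux.total_cpu_percent)    # 5: SQL-only figure, alert clears
print(windows.total_cpu_percent)  # 20: 5 SQL + 15 other, unchanged path
```

Returning a null rather than 0 keeps "unknown" distinguishable from "idle host" for any other consumer of the value.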

Scope

Dashboard-C#-only — no schema or installer change. The reporter can verify with just the next nightly Dashboard (no DB re-install needed).

Verification

  • Rewritten query parses & runs on SQL 2022 — Windows path unchanged (SQL 0% / other 1%; Linux would yield NULL).
  • Dashboard builds clean (0 warnings, 0 errors).
  • Confirmed NocHealth.cs was the only remaining live SystemIdle computation in the Dashboard.
  • Linux runtime behavior is the reporter's to confirm (no Linux SQL host available here), same as #1049 (Fix #1048: stop reporting 100% host CPU on SQL Server on Linux).

🤖 Generated with Claude Code

The #1049 fix corrected every path that reads CPU from the collected
table (collector, views, chart reads), but the alert engine doesn't read
that table — DatabaseService.NocHealth.GetCpuPercentAsync runs its own
live query against sys.dm_os_ring_buffers and was never touched. It
computes other_cpu_percent = 100 - SystemIdle - ProcessUtilization, and
since SystemIdle is always 0 on SQL Server on Linux, that returns
100 - sqlcpu. AlertHealthResult.TotalCpuPercent then sums to a permanent
100%, so AlertStateService's TotalCpuPercent >= CpuThresholdPercent check
fires the host-CPU alert forever — exactly what the reporter still saw
after installing the nightly.

Fix: apply the same Linux guard used by install/18, RemoteCollectorService.Cpu,
and FinOps.Inventory — detect host_platform via sp_executesql behind an
OBJECT_ID(N'sys.dm_os_host_info', N'V') check (so SQL 2016 never binds the
2017+ DMV) and return NULL for other_cpu_percent on Linux. The existing
TotalCpuPercent getter already falls back to the SQL-only figure when
OtherCpuPercent is null, so the alert clears. Windows behavior is unchanged.

Dashboard-only change — no schema or installer impact.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@erikdarlingdata erikdarlingdata merged commit 70ecfbe into dev Jun 4, 2026
2 checks passed
@erikdarlingdata erikdarlingdata deleted the feature/1048-alert-engine-linux-host-cpu branch June 4, 2026 13:10