Fix #1048: stop reporting 100% host CPU on SQL Server on Linux #1049
Merged
On SQL Server on Linux the SCHEDULER_MONITOR ring buffer reports SystemIdle = 0 (a documented platform limitation: Microsoft's own sample query carries the comment "SystemIdle on Linux will be 0"). The CPU collector derives other_process_cpu_utilization as 100 - SystemIdle - ProcessUtilization, so on Linux that becomes 100 - 0 - sqlcpu and the host total (sqlserver + other) pins at 100% forever. SQL Server's own CPU number is correct; only the host/other figure is fabricated, and no DMV exposes true host CPU on Linux.

Fix: detect Linux (sys.dm_os_host_info.host_platform, via sp_executesql so SQL 2016 never binds the 2017+ DMV) and store NULL for other_process_cpu_utilization on Linux instead of a false value. Every consumer then degrades to the correct SQL-only figure:

- install/18 + Lite RemoteCollectorService.Cpu: store NULL on Linux.
- install/02: other_process_cpu_utilization made nullable; NULL propagates to the total_cpu_utilization computed column.
- upgrades/2.11.0-to-2.12.0/03: migrate existing tables (drop the persisted computed column, widen the base column to NULL, re-add the computed column; ALTER COLUMN is blocked while the computed column depends on it). Idempotent and partial-failure safe.
- install/47 views + Dashboard ResourceMetrics/Overview reads: coalesce total to ISNULL(total, sqlserver) so charts, MCP, and high-CPU detection fall back to SQL CPU rather than NULL/0/100 on Linux.

Windows behavior is unchanged (host_platform != Linux). Fact collectors, anomaly/baseline/drilldown, and FinOps already key on sqlserver_cpu_utilization or coalesce NULLs to 0, so they degrade consistently.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
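The derivation and guard described in the commit message above can be modeled in a few lines. This is only an illustrative Python sketch, not the project's T-SQL or C#: the function names are hypothetical, `None` stands in for SQL `NULL`, and the `is_linux` flag is assumed to come from the `host_platform` check.

```python
from typing import Optional


def other_process_cpu(system_idle: int, process_utilization: int,
                      is_linux: bool) -> Optional[int]:
    """Model of the other-process derivation with the Linux guard applied."""
    if is_linux:
        # SystemIdle is always 0 on Linux, so the subtraction would be
        # meaningless; None stands in for the NULL the fix stores instead.
        return None
    return 100 - system_idle - process_utilization


def total_cpu(sqlserver_cpu: int, other_cpu: Optional[int]) -> Optional[int]:
    """Model of the computed column (sqlserver + other): NULL in, NULL out."""
    if other_cpu is None:
        return None
    return sqlserver_cpu + other_cpu


def reported_total(sqlserver_cpu: int, other_cpu: Optional[int]) -> int:
    """Model of the reader-side ISNULL(total, sqlserver) coalesce."""
    total = total_cpu(sqlserver_cpu, other_cpu)
    return sqlserver_cpu if total is None else total


# Unguarded arithmetic on a Linux sample (SystemIdle = 0, SQL CPU = 7):
unguarded_other = 100 - 0 - 7            # 93, so sqlserver + other = 100
# Guarded: other is None and readers fall back to the SQL-only figure.
linux_total = reported_total(7, other_process_cpu(0, 7, is_linux=True))
# Windows sample (SystemIdle = 80, SQL CPU = 7) is unaffected by the guard.
windows_total = reported_total(7, other_process_cpu(80, 7, is_linux=False))
```

With these sample values the unguarded Linux total is 100 regardless of load, while the guarded path reports 7 on Linux and 20 on Windows.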
This was referenced Jun 2, 2026
MisterZeus pushed a commit to MisterZeus/PerformanceMonitor that referenced this pull request on Jun 5, 2026
… on Linux

The erikdarlingdata#1049 fix corrected every path that reads CPU from the collected table (collector, views, chart reads), but the alert engine doesn't read that table: DatabaseService.NocHealth.GetCpuPercentAsync runs its own live query against sys.dm_os_ring_buffers and was never touched. It computes other_cpu_percent = 100 - SystemIdle - ProcessUtilization, and since SystemIdle is always 0 on SQL Server on Linux, that returns 100 - sqlcpu. AlertHealthResult.TotalCpuPercent then sums to a permanent 100%, so AlertStateService's TotalCpuPercent >= CpuThresholdPercent check fires the host-CPU alert forever, which is exactly what the reporter still saw after installing the nightly.

Fix: apply the same Linux guard used by install/18, RemoteCollectorService.Cpu, and FinOps.Inventory: detect host_platform via sp_executesql behind an OBJECT_ID(N'sys.dm_os_host_info', N'V') check (so SQL 2016 never binds the 2017+ DMV) and return NULL for other_cpu_percent on Linux. The existing TotalCpuPercent getter already falls back to the SQL-only figure when OtherCpuPercent is null, so the alert clears. Windows behavior is unchanged.

Dashboard-only change: no schema or installer impact.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
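The alert-path behavior described in the commit message above can be sketched as follows. This is a hedged Python model rather than the Dashboard's C#: `HealthSample`, `cpu_alert_fires`, and the threshold of 90 are hypothetical stand-ins for `AlertHealthResult`, the `AlertStateService` check, and `CpuThresholdPercent`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HealthSample:
    """Hypothetical stand-in for the alert engine's CPU result."""
    sql_cpu_percent: int
    other_cpu_percent: Optional[int]  # None models NULL on Linux after the fix

    @property
    def total_cpu_percent(self) -> int:
        # Fall back to the SQL-only figure when other-process CPU is unknown.
        if self.other_cpu_percent is None:
            return self.sql_cpu_percent
        return self.sql_cpu_percent + self.other_cpu_percent


def cpu_alert_fires(sample: HealthSample, threshold_percent: int) -> bool:
    """Model of the TotalCpuPercent >= CpuThresholdPercent check."""
    return sample.total_cpu_percent >= threshold_percent


THRESHOLD = 90  # assumed value, for illustration only

# Before the fix on Linux: other = 100 - 0 - 7 = 93, so the total is 100.
before = HealthSample(sql_cpu_percent=7, other_cpu_percent=100 - 0 - 7)
# After the fix on Linux: other is unknown, so the total is the SQL figure.
after = HealthSample(sql_cpu_percent=7, other_cpu_percent=None)
# A genuinely busy SQL Server on Linux still trips the alert.
busy = HealthSample(sql_cpu_percent=95, other_cpu_percent=None)
```

In this model the idle Linux host fires the alert before the fix and clears after it, while high SQL Server CPU alone is still enough to cross the threshold.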
Problem
Reported in #1048: on SQL Server 2019 on Linux, the Dashboard's host CPU reads 100% constantly while the host is barely busy. SQL Server's own CPU number is correct — only the host/other-process figure is wrong.
Root cause (web-confirmed)
The CPU collector reads `RING_BUFFER_SCHEDULER_MONITOR` and derives `other_process = 100 - SystemIdle - ProcessUtilization`. On Linux, `SystemIdle` is always `0`, a documented platform limitation. Microsoft's own sample query carries the literal comment `-- SystemIdle on Linux will be 0`. So `other = 100 - 0 - sqlcpu`, host total = `100 - SystemIdle` = 100% forever. No DMV exposes true host CPU on Linux.

Fix
Detect Linux (`sys.dm_os_host_info.host_platform`, via `sp_executesql` so SQL 2016 never binds the 2017+ DMV) and store `NULL` for `other_process_cpu_utilization` on Linux instead of fabricating it. Every consumer degrades to the correct SQL-only figure.

- `install/18_collect_cpu_utilization_stats.sql`: `NULL` for other-process on Linux
- `Lite/Services/RemoteCollectorService.Cpu.cs`: `NULL` to DuckDB
- `install/02_create_tables.sql`: `other_process_cpu_utilization` → nullable (fresh installs); `NULL` propagates to the `total_cpu_utilization` computed column
- `upgrades/2.11.0-to-2.12.0/03_make_other_process_cpu_nullable.sql`: migrates existing tables (drop the persisted computed column, widen the base column to NULL, re-add the computed column)
- `install/47_create_reporting_views.sql`: `cpu_spikes` + `daily_summary` coalesce `total → ISNULL(total, sqlserver)` so high-CPU is still detected via SQL CPU on Linux
- `Dashboard/Services/DatabaseService.ResourceMetrics.cs`, `…/Overview.cs`: `ISNULL` coalesce in the chart query, MCP query, and daily-summary predicate

All other CPU consumers (fact collectors, anomaly/baseline/drilldown, FinOps) already key on `sqlserver_cpu_utilization` or coalesce NULLs to 0, so they degrade consistently. The installer re-runs `install/*` on upgrade, so the collector/view edits reach existing installs automatically.

Testing
Validated on sql2022 (live):

- `NULL` other-process propagates to `NULL` total through the computed column
- `0` on Windows → Windows behavior unchanged

Not tested: the actual Linux `NULL` path. All available SQL instances (2016–2025) are Windows and the behavior can't be faked without the Linux DMV. The logic (`host_platform = 'Linux'` → NULL) is straightforward; a real Linux instance is the only true confirmation.

Scope note: uses `NULL` + SQL-only fallback rather than an explicit "host CPU N/A on Linux" chart badge (WPF work, not visually verifiable in this environment). On Linux the Dashboard CPU chart's "Total" line sits on top of the "SQL" line.

🤖 Generated with Claude Code