Merged
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -17,6 +17,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
### Changed

- **Trace files are now bounded at the source** ([#972]) — `collect.trace_management_collector` creates the long-query trace with a rollover file-count cap (`@filecount`, via the new `@max_files` parameter, default 5), so SQL Server itself deletes the oldest `.trc` file as the trace rolls. The scheduled collector also now issues `START` instead of `RESTART`: it keeps one trace running rather than tearing it down and spawning a fresh timestamped trace — and a fresh batch of orphaned files — every cycle
- **Blocked-process reports expose blocker-side fields as typed columns** — `collect.blocking_BlockedProcessReport` now carries `blocking_spid`, `blocking_last_tran_started`, `blocking_status`, `blocked_sql_text`, and `blocking_sql_text` populated at insert time from `blocked_process_report_xml`. The Dashboard analysis path will read these typed columns directly instead of re-parsing the report XML on every `BLOCKING_CHAIN` fact (up to 5000 `XElement.Parse` calls per analysis cycle). Existing rows are backfilled idempotently by the 2.11.0 → 2.12.0 upgrade script
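
The extraction behind this entry can be tried in isolation. The following is a minimal sketch, assuming a hand-written sample report (the spids, statements, and timestamp are made up, not collected data) wrapped in the same `<event>` envelope the scripts describe. It contrasts the descendant-axis path used in this PR with a leading-slash path:

```sql
/* Hypothetical sample report; values are illustrative only. */
DECLARE
    @report xml = N'
<event name="blocked_process_report">
  <data name="blocked_process">
    <value>
      <blocked-process-report>
        <blocked-process>
          <process spid="61" status="suspended">
            <inputbuf>  UPDATE dbo.t SET c = 1;  </inputbuf>
          </process>
        </blocked-process>
        <blocking-process>
          <process spid="55" status="sleeping" lasttranstarted="2026-01-15T10:30:00.123">
            <inputbuf>  BEGIN TRAN; UPDATE dbo.t SET c = 2;  </inputbuf>
          </process>
        </blocking-process>
      </blocked-process-report>
    </value>
  </data>
</event>';

SELECT
    /* Descendant axis finds the report under the <event> wrapper: 55 */
    spid_descendant =
        @report.value
        (
            N'(//blocked-process-report/blocking-process/process/@spid)[1]',
            N'integer'
        ),
    /* Leading slash expects the report at the root, so this is NULL */
    spid_leading_slash =
        @report.value
        (
            N'(/blocked-process-report/blocking-process/process/@spid)[1]',
            N'integer'
        ),
    blocking_status =
        @report.value
        (
            N'(//blocked-process-report/blocking-process/process/@status)[1]',
            N'nvarchar(10)'
        ),
    /* Surrounding spaces are stripped by LTRIM/RTRIM */
    blocking_sql_text =
        LTRIM(RTRIM(@report.value
        (
            N'(//blocked-process-report/blocking-process/process/inputbuf/text())[1]',
            N'nvarchar(max)'
        )));
```

Running this against a real row from `collect.blocking_BlockedProcessReport` instead of the sample is the quickest way to confirm the path still matches the stored shape.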

### Added

12 changes: 12 additions & 0 deletions install/02_create_tables.sql
@@ -1106,6 +1106,18 @@ BEGIN
login_name nvarchar(256) NULL,
transaction_id bigint NULL,
blocked_process_report_xml xml NULL,
/*
Blocker-side fields parsed from blocked_process_report_xml at insert
time so the analysis path does not re-parse XML on every BLOCKING_CHAIN
fact. Populated only on activity = 'blocked' rows; NULL on activity =
'blocking' rows (those rows describe the blocker side via their own
spid/status/last_transaction_started columns).
*/
blocking_spid integer NULL,
blocking_last_tran_started datetime2(7) NULL,
blocking_status nvarchar(10) NULL,
blocked_sql_text nvarchar(max) NULL,
blocking_sql_text nvarchar(max) NULL,
CONSTRAINT
PK_collect_blocking_BlockedProcessReport
PRIMARY KEY CLUSTERED
112 changes: 112 additions & 0 deletions install/23_process_blocked_process_xml.sql
@@ -51,6 +51,7 @@ BEGIN
@rows_deleted bigint = 0,
@rows_marked bigint = 0,
@rows_parsed bigint = 0,
@rows_typed bigint = 0,
@start_time datetime2(7) = SYSDATETIME(),
@utc_offset_minutes integer = DATEDIFF(MINUTE, GETUTCDATE(), SYSDATETIME()),
@start_date_local datetime2(7) = NULL,
@@ -245,6 +246,117 @@ BEGIN

IF @rows_parsed > 0
BEGIN
/*
Populate blocker-side typed columns on the rows just parsed by
sp_HumanEventsBlockViewer so the Dashboard analysis path can
read structured columns instead of re-parsing the XML on every
BLOCKING_CHAIN fact. Only activity='blocked' rows carry the
full XML; activity='blocking' rows stay NULL on the new
columns (they describe the blocker side via their own
spid/status columns).

XQuery uses the descendant axis (//blocked-process-report/...)
because the stored XML is <event>-rooted with the report
nested two levels deep at
/event/data[@name="blocked_process"]/value/blocked-process-report.
The descendant axis sidesteps the wrap and was empirically
validated; a leading-slash (/blocked-process-report/...)
returns NULL on every row.

LTRIM/RTRIM matches the C# parser's .Trim() for spaces only
(not CR/LF/TAB); the reconstructor keys on session pair, not
SQL text, so the divergence is cosmetic.

Runs BEFORE the is_processed=1 mark below so a crash here
rolls back inside the surrounding transaction and the raw XML
rows stay unmarked - the next run retries them.
*/
UPDATE
b
SET
b.blocking_spid =
b.blocked_process_report_xml.value
(
N'(//blocked-process-report/blocking-process/process/@spid)[1]',
N'integer'
),
b.blocking_last_tran_started =
b.blocked_process_report_xml.value
(
N'(//blocked-process-report/blocking-process/process/@lasttranstarted)[1]',
N'datetime2(7)'
),
b.blocking_status =
b.blocked_process_report_xml.value
(
N'(//blocked-process-report/blocking-process/process/@status)[1]',
N'nvarchar(10)'
),
b.blocked_sql_text =
LTRIM(RTRIM(b.blocked_process_report_xml.value
(
N'(//blocked-process-report/blocked-process/process/inputbuf/text())[1]',
N'nvarchar(max)'
))),
b.blocking_sql_text =
LTRIM(RTRIM(b.blocked_process_report_xml.value
(
N'(//blocked-process-report/blocking-process/process/inputbuf/text())[1]',
N'nvarchar(max)'
)))
FROM collect.blocking_BlockedProcessReport AS b
WHERE b.event_time >= @start_date_local
AND b.event_time <= @end_date_local
AND b.activity = 'blocked'
AND b.blocking_spid IS NULL
AND b.blocked_process_report_xml IS NOT NULL;

SELECT
@rows_typed = ROWCOUNT_BIG();

IF @debug = 1
BEGIN
RAISERROR(N'Populated blocker-side typed columns for %I64d rows', 0, 1, @rows_typed) WITH NOWAIT;
END;

/*
Defense-in-depth: if sp_HumanEventsBlockViewer wrote rows
(@rows_parsed > 0) but the typed-column UPDATE touched none
(@rows_typed = 0), the XQuery may be silently failing, most
likely because of a future wire-format change in
sp_HumanEventsBlockViewer or the upstream XE shape. Log it
so it surfaces in config.collection_log instead of only
showing up when the analysis path returns garbage chains.
This is recorded rather than raised as a hard error because
the UPDATE can also legitimately touch 0 rows when every row
in the window already has a non-NULL blocking_spid
(re-processing the same window).
*/
IF @rows_parsed > 0 AND @rows_typed = 0
BEGIN
INSERT INTO
config.collection_log
(
collector_name,
collection_status,
rows_collected,
duration_ms,
error_message
)
VALUES
(
N'process_blocked_process_xml',
N'TYPED_COLUMNS_EMPTY',
@rows_parsed,
DATEDIFF(MILLISECOND, @start_time, SYSDATETIME()),
N'sp_HumanEventsBlockViewer wrote '
+ CAST(@rows_parsed AS nvarchar(20))
+ N' rows but XQuery extraction populated 0 blocker-side typed columns - '
+ N'likely a wire-format change in blocked_process_report_xml; '
+ N'check //blocked-process-report path against a sample row.'
);
END;

/*
Mark raw XML rows as processed
Only mark the rows in the date range we just processed
171 changes: 171 additions & 0 deletions upgrades/2.11.0-to-2.12.0/01_extend_blocked_process_report_columns.sql
@@ -0,0 +1,171 @@
/*
Copyright 2026 Darling Data, LLC
https://www.erikdarling.com/

Upgrade from 2.11.0 to 2.12.0
Adds typed blocker-side columns to collect.blocking_BlockedProcessReport so
the Dashboard analysis path (BLOCKING_CHAIN fact + drill-down) can read
structured columns instead of re-parsing blocked_process_report_xml on every
analysis cycle. Backfills existing activity='blocked' rows from their stored
XML in one pass.
*/

SET ANSI_NULLS ON;
SET ANSI_PADDING ON;
SET ANSI_WARNINGS ON;
SET ARITHABORT ON;
SET CONCAT_NULL_YIELDS_NULL ON;
SET QUOTED_IDENTIFIER ON;
SET NUMERIC_ROUNDABORT OFF;
SET IMPLICIT_TRANSACTIONS OFF;
SET STATISTICS TIME, IO OFF;
GO

USE PerformanceMonitor;
GO

/*
Add columns idempotently. Each column gets its own guarded ALTER so a re-run
after a partial failure resumes cleanly. Separate GO batches are required
because the backfill below references the new columns by name and the parser
needs them to exist at compile time.
*/
IF NOT EXISTS
(
SELECT
1/0
FROM sys.columns AS c
WHERE c.object_id = OBJECT_ID(N'collect.blocking_BlockedProcessReport')
AND c.name = N'blocking_spid'
)
BEGIN
ALTER TABLE collect.blocking_BlockedProcessReport
ADD blocking_spid integer NULL;

PRINT 'Added blocking_spid to collect.blocking_BlockedProcessReport';
END;
GO

IF NOT EXISTS
(
SELECT
1/0
FROM sys.columns AS c
WHERE c.object_id = OBJECT_ID(N'collect.blocking_BlockedProcessReport')
AND c.name = N'blocking_last_tran_started'
)
BEGIN
ALTER TABLE collect.blocking_BlockedProcessReport
ADD blocking_last_tran_started datetime2(7) NULL;

PRINT 'Added blocking_last_tran_started to collect.blocking_BlockedProcessReport';
END;
GO

IF NOT EXISTS
(
SELECT
1/0
FROM sys.columns AS c
WHERE c.object_id = OBJECT_ID(N'collect.blocking_BlockedProcessReport')
AND c.name = N'blocking_status'
)
BEGIN
ALTER TABLE collect.blocking_BlockedProcessReport
ADD blocking_status nvarchar(10) NULL;

PRINT 'Added blocking_status to collect.blocking_BlockedProcessReport';
END;
GO

IF NOT EXISTS
(
SELECT
1/0
FROM sys.columns AS c
WHERE c.object_id = OBJECT_ID(N'collect.blocking_BlockedProcessReport')
AND c.name = N'blocked_sql_text'
)
BEGIN
ALTER TABLE collect.blocking_BlockedProcessReport
ADD blocked_sql_text nvarchar(max) NULL;

PRINT 'Added blocked_sql_text to collect.blocking_BlockedProcessReport';
END;
GO

IF NOT EXISTS
(
SELECT
1/0
FROM sys.columns AS c
WHERE c.object_id = OBJECT_ID(N'collect.blocking_BlockedProcessReport')
AND c.name = N'blocking_sql_text'
)
BEGIN
ALTER TABLE collect.blocking_BlockedProcessReport
ADD blocking_sql_text nvarchar(max) NULL;

PRINT 'Added blocking_sql_text to collect.blocking_BlockedProcessReport';
END;
GO

/*
One-time backfill of existing activity='blocked' rows. Idempotent: the WHERE
filter targets only rows where blocking_spid IS NULL (i.e., not yet
populated). Safe to re-run; once a row is populated the predicate excludes it.

XQuery uses the descendant axis (//blocked-process-report/...) because the
stored XML is <event>-rooted with the report nested two levels deep at
/event/data[@name="blocked_process"]/value/blocked-process-report - both
upstream writers preserve the outer <event> wrap. The descendant axis
sidesteps the wrap and is empirically validated against
sql2022.PerformanceMonitor; a leading-slash (/blocked-process-report/...)
returns NULL on every row.

LTRIM/RTRIM on the inputbuf text() matches the C# parser's .Trim() for space
characters only; T-SQL one-arg LTRIM/RTRIM does NOT strip CR/LF/TAB while C#
.Trim() does. Treated as close-enough - the reconstructor keys on session
pair (SPID + tran start), not SQL text, so whitespace divergence is cosmetic
and appears (if at all) in drill-down JSON only.
*/
UPDATE
b
SET
b.blocking_spid =
b.blocked_process_report_xml.value
(
N'(//blocked-process-report/blocking-process/process/@spid)[1]',
N'integer'
),
b.blocking_last_tran_started =
b.blocked_process_report_xml.value
(
N'(//blocked-process-report/blocking-process/process/@lasttranstarted)[1]',
N'datetime2(7)'
),
b.blocking_status =
b.blocked_process_report_xml.value
(
N'(//blocked-process-report/blocking-process/process/@status)[1]',
N'nvarchar(10)'
),
b.blocked_sql_text =
LTRIM(RTRIM(b.blocked_process_report_xml.value
(
N'(//blocked-process-report/blocked-process/process/inputbuf/text())[1]',
N'nvarchar(max)'
))),
b.blocking_sql_text =
LTRIM(RTRIM(b.blocked_process_report_xml.value
(
N'(//blocked-process-report/blocking-process/process/inputbuf/text())[1]',
N'nvarchar(max)'
)))
FROM collect.blocking_BlockedProcessReport AS b
WHERE b.activity = 'blocked'
AND b.blocking_spid IS NULL
AND b.blocked_process_report_xml IS NOT NULL;

PRINT 'Backfilled blocker-side typed columns for existing activity=''blocked'' rows';
GO
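
The idempotency claim for the backfill can be checked without touching the real table. This is a minimal sketch, assuming a table variable named `@reports` as a hypothetical stand-in for `collect.blocking_BlockedProcessReport` (only the columns the predicate needs) and a made-up one-row sample. It runs the same guarded UPDATE twice and reports how many rows each pass touched:

```sql
/* Hypothetical stand-in table; not the project's real schema. */
DECLARE
    @reports table
(
    id integer NOT NULL PRIMARY KEY,
    activity varchar(8) NOT NULL,
    blocked_process_report_xml xml NULL,
    blocking_spid integer NULL
);

INSERT
    @reports
(
    id,
    activity,
    blocked_process_report_xml
)
VALUES
    (1, 'blocked', N'<event><data name="blocked_process"><value><blocked-process-report><blocking-process><process spid="55" /></blocking-process></blocked-process-report></value></data></event>'),
    (2, 'blocking', NULL);

DECLARE
    @pass integer = 1,
    @rows bigint = 0;

WHILE @pass <= 2
BEGIN
    /* Same predicate shape as the backfill: only unpopulated 'blocked' rows */
    UPDATE
        r
    SET
        r.blocking_spid =
            r.blocked_process_report_xml.value
            (
                N'(//blocked-process-report/blocking-process/process/@spid)[1]',
                N'integer'
            )
    FROM @reports AS r
    WHERE r.activity = 'blocked'
    AND   r.blocking_spid IS NULL
    AND   r.blocked_process_report_xml IS NOT NULL;

    SELECT
        @rows = ROWCOUNT_BIG();

    /* Pass 1 touches the single 'blocked' row; pass 2 touches none */
    RAISERROR(N'Pass %d updated %I64d row(s)', 0, 1, @pass, @rows) WITH NOWAIT;

    SET @pass += 1;
END;
```

One consequence of keying the guard on `blocking_spid IS NULL`: a row whose XML yields no blocking spid stays NULL and keeps matching the predicate, so it is re-evaluated on every run. That does not affect correctness, but it is worth knowing when reading row counts.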
1 change: 1 addition & 0 deletions upgrades/2.11.0-to-2.12.0/upgrade.txt
@@ -0,0 +1 @@
01_extend_blocked_process_report_columns.sql
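
The whitespace note in the script comments (one-argument `LTRIM`/`RTRIM` strips spaces but not CR/LF/TAB) can be seen with a small self-contained check. This is a sketch under stated assumptions: the sample string is made up, and the `TRIM(... FROM ...)` form shown for comparison needs SQL Server 2017 or later. It is not part of this PR and is shown only as one option if closer parity with C# `.Trim()` were ever wanted:

```sql
DECLARE
    /* Hypothetical input: CR/LF/TAB around a space-padded statement */
    @raw nvarchar(max) =
        NCHAR(13) + NCHAR(10) + NCHAR(9) +
        N'  SELECT 1;  ' +
        NCHAR(9) + NCHAR(13) + NCHAR(10),
    /* Explicit strip list: TAB, LF, CR, space */
    @strip nvarchar(4) =
        NCHAR(9) + NCHAR(10) + NCHAR(13) + N' ';

SELECT
    raw_chars =
        DATALENGTH(@raw) / 2,
    /* Unchanged: the outermost characters are not spaces */
    ltrim_rtrim_chars =
        DATALENGTH(LTRIM(RTRIM(@raw))) / 2,
    /* Shorter: every listed character is removed from both ends */
    trim_list_chars =
        DATALENGTH(TRIM(@strip FROM @raw)) / 2;
```

The first two columns come back equal, while the third is shorter. Even the explicit list only approximates C# `.Trim()`, which removes every Unicode white-space character, so the scripts' choice to treat the difference as cosmetic still applies.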