Commit ab9bd11

Merge pull request #28 from mcarpendale/add-vm-insights-kql
Add vm insights kql
2 parents faa5697 + b98c647 commit ab9bd11

2 files changed

+170
-0
lines changed

Lines changed: 66 additions & 0 deletions
@@ -0,0 +1,66 @@
# VM Disk Capacity & Performance Report

A KQL query for Azure Monitor Logs that produces a per-drive, per-VM report of disk capacity and peak performance metrics, using data collected by [VM Insights](https://learn.microsoft.com/en-us/azure/azure-monitor/vm/vminsights-overview).

## What it does

The query correlates `InsightsMetrics` data to produce a single row per drive per VM showing:

- **Capacity**: disk size, used/free space, percent used
- **Peak Write snapshot**: the moment of highest write throughput, with all other metrics (read MB/s, read/write IOPS, read/write latency) captured at that same timestamp
- **Peak Read snapshot**: the same idea, anchored to the moment of highest read throughput

The `PW_` and `PR_` column prefixes indicate which peak the correlated values belong to: `PW_` for peak write, `PR_` for peak read. A null value means no sample of that metric was recorded at the exact timestamp of the peak.
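
The peak-snapshot idea can be tried in isolation. The following is a minimal sketch, not the full query: the `samples` table is hypothetical data standing in for `InsightsMetrics`, and only one correlated metric (`PW_ReadMBps`) is joined. It uses the same `arg_max` aggregation and timestamp-equality join as the query.

```kusto
// Hypothetical sample data standing in for InsightsMetrics rows of one drive.
let samples = datatable(TimeGenerated: datetime, Computer: string, Drive: string, Name: string, Val: real) [
    datetime(2026-03-23 10:00:00), "vm01", "C:", "WriteBytesPerSecond", 10485760.0,
    datetime(2026-03-23 10:00:00), "vm01", "C:", "ReadBytesPerSecond",  2097152.0,
    datetime(2026-03-23 10:01:00), "vm01", "C:", "WriteBytesPerSecond", 52428800.0,
    datetime(2026-03-23 10:01:00), "vm01", "C:", "ReadBytesPerSecond",  4194304.0,
    datetime(2026-03-23 10:02:00), "vm01", "C:", "WriteBytesPerSecond", 20971520.0
];
// Step 1: find the timestamp and value of the highest write throughput per drive.
let peakWrite = samples
    | where Name == "WriteBytesPerSecond"
    | summarize arg_max(Val, TimeGenerated) by Computer, Drive
    | project Computer, Drive, PeakWriteTime = TimeGenerated, MaxWriteMBps = round(Val / 1048576, 2);
// Step 2: pick up the read throughput recorded at exactly that timestamp.
peakWrite
| join kind=leftouter (
    samples
    | where Name == "ReadBytesPerSecond"
    | project Computer, Drive, TimeGenerated, PW_ReadMBps = round(Val / 1048576, 2)
) on Computer, Drive, $left.PeakWriteTime == $right.TimeGenerated
| project Computer, Drive, PeakWriteTime, MaxWriteMBps, PW_ReadMBps
```

With this sample data the peak write is 50 MB/s at 10:01, and `PW_ReadMBps` is 4 because a read sample exists at that same timestamp. Had the peak fallen at 10:02, where the sample data has no read row, `PW_ReadMBps` would be null.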
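
Drives are classified as `OS`, `Temp`, or `Data` based on mount point and size: `/` and `C:` are `OS`, a `D:` drive of 16 GB or less is `Temp`, and everything else is `Data`. Ephemeral and system mounts (`/mnt`, `/mnt/resource`, and any mount point starting with `/snap/`, `/boot`, or `/sys/`) are excluded automatically.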
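
To see how the exclusion and classification rules behave, the filters and the `case()` expression from the query can be run against a small table. This is only a sketch: the mount points and sizes below are made up for illustration and do not come from a real workspace.

```kusto
// Hypothetical mount points and sizes, for illustration only.
datatable(Drive: string, DiskSizeGB: real) [
    "/",              30.0,
    "C:",             127.0,
    "D:",             16.0,
    "D:",             512.0,
    "/data",          1024.0,
    "/mnt/resource",  16.0,
    "/snap/core/123", 0.1,
    "/boot/efi",      0.5
]
// Exclusion filters, copied from the query.
| where Drive !in ("", "/mnt", "/mnt/resource")
| where Drive !startswith "/snap/"
| where Drive !startswith "/boot"
| where Drive !startswith "/sys/"
// Classification, copied from the query.
| extend DriveType = case(
    Drive == "/" or Drive == "C:", "OS",
    Drive == "D:" and DiskSizeGB <= 16, "Temp",
    "Data")
```

The last three sample rows are filtered out. Of the remaining rows, `/` and `C:` come back as `OS`, the 16 GB `D:` as `Temp`, and the 512 GB `D:` and `/data` as `Data`.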

## Prerequisites

- **VM Insights** must be enabled on target VMs: this is where the `InsightsMetrics` table comes from
- **Log Analytics workspace(s)** receiving the VM Insights data
- Permissions to query the workspace(s) via Azure Monitor Logs
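
One way to confirm the prerequisites are met before running the report is a quick look at what `LogicalDisk` data is arriving. This check is a suggestion rather than part of the commit, and the one-hour window is an arbitrary assumption.

```kusto
// Count recent LogicalDisk samples per metric name and show the newest sample time.
InsightsMetrics
| where TimeGenerated > ago(1h)
| where Namespace == "LogicalDisk"
| summarize Samples = count(), Computers = dcount(Computer), LastSeen = max(TimeGenerated) by Name
| order by Name asc
```

If VM Insights is collecting data, the result should list the metric names the report relies on, such as `FreeSpaceMB`, `ReadBytesPerSecond`, and `WriteBytesPerSecond`. An empty result suggests the scope, the permissions, or the VM Insights onboarding needs another look.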

## How to run

1. In the Azure portal, navigate to **Monitor → Logs**
2. Switch the editor to **KQL mode** (drop-down in the query toolbar)
3. Paste the contents of [`vm-disk-performance-capacity.kql`](vm-disk-performance-capacity.kql)
   <img width="2520" height="1597" alt="Screenshot 2026-03-23 at 3 42 51 pm" src="https://github.com/user-attachments/assets/b1cd45ef-060a-4b92-a556-b80b3208b7e0" />

4. **Set the scope**: click the kebab menu (⋯) on the query tab and select **Change scope**
   - To query across all subscriptions: select each subscription
   - To narrow results: filter **Resource types** to `Log Analytics workspace` and select only the relevant workspace(s)
5. Set the **Time range** (e.g. Last 24 hours) and click **Run**
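
The query itself contains no time filter, so the window comes entirely from the time range picker. As a hypothetical variation, not something the committed query does, the window can instead be pinned in the query text. The `lookback` name and the reduced query below are illustrative only.

```kusto
// Hypothetical variation: pin the window in the query text instead of the picker.
let lookback = 24h;
InsightsMetrics
| where TimeGenerated > ago(lookback)
| where Namespace == "LogicalDisk" and Name == "WriteBytesPerSecond"
| summarize MaxWriteMBps = round(max(Val) / 1048576, 2) by Computer
```

When a query filters on `TimeGenerated` like this, the portal should show the time range as set in the query. Applying the same idea to the report would mean adding the filter to both the `baseData` and `driveInfo` blocks.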

## Exporting results

Click **Share → Export to CSV (all columns)** to download the full result set for offline analysis or import into a TCO model.

## Output columns

| Column | Description |
|--------|-------------|
| `SubscriptionId` | Azure subscription GUID |
| `ResourceGroup` | VM resource group |
| `Computer` | VM hostname |
| `Drive` | Mount point / drive letter |
| `DriveType` | `OS`, `Temp`, or `Data` |
| `DiskSizeGB` | Total disk capacity (GB) |
| `UsedSpaceGB` / `FreeSpaceGB` | Average used and free space over the time range (GB) |
| `PctUsed` | Percent of disk capacity used |
| `PeakWriteTime` | Timestamp of maximum write throughput |
| `MaxWriteMBps` | Peak write throughput (MB/s) |
| `PW_ReadMBps` | Read throughput at peak write time (MB/s) |
| `PW_WriteIOPS` / `PW_ReadIOPS` | IOPS at peak write time |
| `PW_WriteLatMs` / `PW_ReadLatMs` | Latency at peak write time (ms) |
| `PeakReadTime` | Timestamp of maximum read throughput |
| `MaxReadMBps` | Peak read throughput (MB/s) |
| `PR_WriteMBps` | Write throughput at peak read time (MB/s) |
| `PR_ReadIOPS` / `PR_WriteIOPS` | IOPS at peak read time |
| `PR_ReadLatMs` / `PR_WriteLatMs` | Latency at peak read time (ms) |
| `_ResourceId` | Full Azure resource ID of the VM |
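
The identity and capacity columns are derived with simple string splitting and arithmetic. The sketch below reproduces those steps on one made-up row. The `ResourceId` column is a stand-in for `_ResourceId`, the GUID and names are placeholders, and `real` is used for the sizes where the query uses `todecimal`.

```kusto
// Hypothetical resource ID and capacity figures for a single drive.
datatable(ResourceId: string, DiskSizeMB: real, FreeSpaceMB: real) [
    "/subscriptions/00000000-0000-0000-0000-000000000000/resourcegroups/rg-demo/providers/microsoft.compute/virtualmachines/vm01", 131072.0, 32768.0
]
// Segment 2 of the ID is the subscription GUID, segment 4 the resource group.
| extend SubscriptionId = tostring(split(ResourceId, "/")[2])
| extend ResourceGroup = tostring(split(ResourceId, "/")[4])
// Capacity: used = size - free, MB to GB by dividing by 1024.
| extend UsedSpaceMB = DiskSizeMB - FreeSpaceMB
| extend DiskSizeGB = round(DiskSizeMB / 1024, 2)
| extend UsedSpaceGB = round(UsedSpaceMB / 1024, 2)
| extend FreeSpaceGB = round(FreeSpaceMB / 1024, 2)
| extend PctUsed = round((UsedSpaceMB / DiskSizeMB) * 100, 1)
| project SubscriptionId, ResourceGroup, DiskSizeGB, UsedSpaceGB, FreeSpaceGB, PctUsed
```

For this row the result is a 128 GB disk with 96 GB used, 32 GB free, and `PctUsed` of 75. Throughput columns follow the same pattern, dividing bytes per second by 1048576 to get MB/s.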

## Files

| File | Description |
|------|-------------|
| [`vm-disk-performance-capacity.kql`](vm-disk-performance-capacity.kql) | The KQL query; paste directly into Azure Monitor Logs |
Lines changed: 104 additions & 0 deletions
@@ -0,0 +1,104 @@
//======================================================================
// VM Disk Performance & Capacity Report
// Scope: all subscriptions or workspaces selected via "Change scope" in Azure Monitor Logs
// Shows correlated performance metrics at the moment of peak read
// and peak write throughput for each drive on each VM
//
// COLUMNS:
//   PW_ prefix = correlated value at peak WRITE time
//   PR_ prefix = correlated value at peak READ time
//   Null values = metric sample didn't align at that exact timestamp
//======================================================================
let baseData = InsightsMetrics // All perf metrics, filtered to real drives
    | where Namespace == "LogicalDisk" and Name in ("ReadBytesPerSecond", "WriteBytesPerSecond", "ReadsPerSecond", "WritesPerSecond", "ReadLatencyMs", "WriteLatencyMs")
    | extend DiskDetails = parse_json(Tags)
    | extend Drive = tostring(DiskDetails["vm.azm.ms/mountId"])
    | where Drive !in ("", "/mnt", "/mnt/resource")
    | where Drive !startswith "/snap/"
    | where Drive !startswith "/boot"
    | where Drive !startswith "/sys/";
let driveInfo = InsightsMetrics // Capacity data (size, used, free, pct used)
    | where Namespace == "LogicalDisk" and Name == "FreeSpaceMB"
    | extend DiskDetails = parse_json(Tags)
    | extend Drive = tostring(DiskDetails["vm.azm.ms/mountId"])
    | extend DiskSizeMB = todecimal(DiskDetails["vm.azm.ms/diskSizeMB"])
    | where Drive !in ("", "/mnt", "/mnt/resource")
    | where Drive !startswith "/snap/"
    | where Drive !startswith "/boot"
    | where Drive !startswith "/sys/"
    | summarize FreeSpaceMB = avg(Val), DiskSizeMB = max(DiskSizeMB) by Computer, Drive, _ResourceId
    | extend UsedSpaceMB = DiskSizeMB - FreeSpaceMB
    | extend FreeSpaceGB = round(FreeSpaceMB / 1024, 2)
    | extend UsedSpaceGB = round(UsedSpaceMB / 1024, 2)
    | extend DiskSizeGB = round(DiskSizeMB / 1024, 2)
    | extend PctUsed = round((UsedSpaceMB / DiskSizeMB) * 100, 1);
let peakWrite = baseData // Timestamp and value of max write throughput
    | where Name == "WriteBytesPerSecond"
    | summarize arg_max(Val, TimeGenerated) by Computer, Drive, _ResourceId
    | project Computer, Drive, _ResourceId, PeakWriteTime = TimeGenerated, MaxWriteMBps = round(Val / 1048576, 2);
let peakRead = baseData // Timestamp and value of max read throughput
    | where Name == "ReadBytesPerSecond"
    | summarize arg_max(Val, TimeGenerated) by Computer, Drive, _ResourceId
    | project Computer, Drive, _ResourceId, PeakReadTime = TimeGenerated, MaxReadMBps = round(Val / 1048576, 2);
let allMetrics = baseData // Flattened lookup table for timestamp correlation
    | extend MBps = round(Val / 1048576, 2)
    | extend RawVal = Val
    | project Computer, Drive, _ResourceId, TimeGenerated, Name, MBps, RawVal;
peakWrite // Assembly: join all blocks and correlate metrics at peak times
| join kind=leftouter peakRead on Computer, Drive, _ResourceId
| join kind=leftouter driveInfo on Computer, Drive, _ResourceId
| extend SubscriptionId = tostring(split(_ResourceId, "/")[2])
| extend ResourceGroup = tostring(split(_ResourceId, "/")[4])
| extend DriveType = case(
    Drive == "/" or Drive == "C:", "OS",
    Drive == "D:" and DiskSizeGB <= 16, "Temp",
    "Data"
)
// Correlated metrics at PEAK WRITE time (PW_ prefix)
| join kind=leftouter (
    allMetrics | where Name == "ReadBytesPerSecond"
    | project Computer, Drive, _ResourceId, TimeGenerated, PW_ReadMBps = MBps
) on Computer, Drive, _ResourceId, $left.PeakWriteTime == $right.TimeGenerated
| join kind=leftouter (
    allMetrics | where Name == "ReadsPerSecond"
    | project Computer, Drive, _ResourceId, TimeGenerated, PW_ReadIOPS = round(RawVal, 0)
) on Computer, Drive, _ResourceId, $left.PeakWriteTime == $right.TimeGenerated
| join kind=leftouter (
    allMetrics | where Name == "WritesPerSecond"
    | project Computer, Drive, _ResourceId, TimeGenerated, PW_WriteIOPS = round(RawVal, 0)
) on Computer, Drive, _ResourceId, $left.PeakWriteTime == $right.TimeGenerated
| join kind=leftouter (
    allMetrics | where Name == "ReadLatencyMs"
    | project Computer, Drive, _ResourceId, TimeGenerated, PW_ReadLatMs = round(RawVal, 2)
) on Computer, Drive, _ResourceId, $left.PeakWriteTime == $right.TimeGenerated
| join kind=leftouter (
    allMetrics | where Name == "WriteLatencyMs"
    | project Computer, Drive, _ResourceId, TimeGenerated, PW_WriteLatMs = round(RawVal, 2)
) on Computer, Drive, _ResourceId, $left.PeakWriteTime == $right.TimeGenerated
// Correlated metrics at PEAK READ time (PR_ prefix)
| join kind=leftouter (
    allMetrics | where Name == "WriteBytesPerSecond"
    | project Computer, Drive, _ResourceId, TimeGenerated, PR_WriteMBps = MBps
) on Computer, Drive, _ResourceId, $left.PeakReadTime == $right.TimeGenerated
| join kind=leftouter (
    allMetrics | where Name == "ReadsPerSecond"
    | project Computer, Drive, _ResourceId, TimeGenerated, PR_ReadIOPS = round(RawVal, 0)
) on Computer, Drive, _ResourceId, $left.PeakReadTime == $right.TimeGenerated
| join kind=leftouter (
    allMetrics | where Name == "WritesPerSecond"
    | project Computer, Drive, _ResourceId, TimeGenerated, PR_WriteIOPS = round(RawVal, 0)
) on Computer, Drive, _ResourceId, $left.PeakReadTime == $right.TimeGenerated
| join kind=leftouter (
    allMetrics | where Name == "ReadLatencyMs"
    | project Computer, Drive, _ResourceId, TimeGenerated, PR_ReadLatMs = round(RawVal, 2)
) on Computer, Drive, _ResourceId, $left.PeakReadTime == $right.TimeGenerated
| join kind=leftouter (
    allMetrics | where Name == "WriteLatencyMs"
    | project Computer, Drive, _ResourceId, TimeGenerated, PR_WriteLatMs = round(RawVal, 2)
) on Computer, Drive, _ResourceId, $left.PeakReadTime == $right.TimeGenerated
// Output: Identity > Capacity > Peak Write snapshot > Peak Read snapshot
| project SubscriptionId, ResourceGroup, Computer, Drive, DriveType, DiskSizeGB, UsedSpaceGB, FreeSpaceGB, PctUsed,
    PeakWriteTime, MaxWriteMBps, PW_ReadMBps, PW_WriteIOPS, PW_ReadIOPS, PW_WriteLatMs, PW_ReadLatMs,
    PeakReadTime, MaxReadMBps, PR_WriteMBps, PR_ReadIOPS, PR_WriteIOPS, PR_ReadLatMs, PR_WriteLatMs,
    _ResourceId
| order by SubscriptionId asc, Computer asc, Drive asc
