Commit 0fbac4d

pykello and claude committed
Add io-profile: bpftrace-based IO + CPU profiling wrapper
Wraps any command and produces a standardized report with block IO metrics, syscall tracking (fsync, O_DIRECT), CPU utilization, and per-thread breakdown. Supports optional JSON output.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent accde4c commit 0fbac4d

2 files changed: +756 -0 lines changed

README.md

Lines changed: 72 additions & 0 deletions
@@ -2,6 +2,8 @@

Minimal-overhead I/O benchmarking tool using [libblkio](https://gitlab.com/libblkio/libblkio). Designed for benchmarking vhost-user-blk backends (e.g., ubiblk) with busy-loop polling and direct libblkio calls.

Also includes `io-profile`, a bpftrace-based IO + CPU profiling wrapper that can wrap any command (including blkbench) to produce a system-level profile.

## Build

Prerequisites: libblkio (installed with pkg-config support), GCC, pthreads.
@@ -128,6 +130,76 @@ blkbench: rw=randread, bs=4k, iodepth=32, numjobs=4, runtime=10s
ios: total=3200000, errors=0, flushes=0
```

## io-profile

A reusable IO + CPU profiling wrapper. Runs any command and produces a standardized report with block IO metrics, syscall tracking, CPU utilization, and per-thread breakdown.

Requires: `bpftrace` (root), `iostat`, `mpstat`, `awk`.

### Usage

```bash
sudo ./io-profile [options] -- command [args...]
```

### Options

| Arg | Default | Description |
|-----|---------|-------------|
| `-o, --output DIR` | `./io-profile-results` | Output directory for reports |
| `-d, --device DEV` | auto-detect | Block device to monitor |
| `-p, --pid PID` | | Only trace this PID and children |
| `--json` | off | Also emit machine-readable JSON summary |
| `-v, --verbose` | off | Show progress messages |

### Examples

Profile a blkbench run:
```bash
sudo ./io-profile --json -o results/ -- \
./blkbench --driver io_uring --path /tmp/test-disk.raw --rw randread --bs 4k --iodepth 32 --runtime 10
```

Profile any command:
```bash
sudo ./io-profile -d nvme0n1 -- fio job.fio
sudo ./io-profile -- dd if=/dev/zero of=/tmp/test bs=1M count=1000
```

### What it collects

- **Block IO**: throughput, IOPS, queue depth distribution, block size distribution, IO latency percentiles, sequential vs random ratio, read/write split
- **Syscalls**: fsync/fdatasync/sync_file_range counts and rates, O_DIRECT detection
- **CPU**: utilization percentiles, iowait, user/system split, context switch rate
- **Per-thread**: IOPS, read/write bytes, fsync count, sequential percentage per thread

176+
### Example output
177+
178+
```
179+
=== IO Profile: ./blkbench --driver io_uring --path /tmp/test-disk.raw --rw randread ... ===
180+
Duration: 10.2s | Device: nvme0n1 | Kernel: 6.8.0-94-generic
181+
182+
IO Summary:
183+
Throughput: Read 1250.3 MB/s | Write 0.0 MB/s
184+
IOPS: Read 320,012 | Write 0
185+
IO Threads: 4
186+
R/W Ratio: 100% read / 0% write
187+
Sequential: 12%
188+
fsync calls: 0 (0.0/s)
189+
O_DIRECT: Yes (4 opens)
190+
191+
Histograms:
192+
Queue Depth: p25=28 p50=31 p75=32 p99=32 max=32
193+
Block Size: p25=4K p50=4K p75=4K p99=4K max=4K
194+
IO Latency: p25=8us p50=12us p75=18us p99=45us max=850us
195+
196+
CPU Summary:
197+
CPU Usage: p25=22% p50=25% p75=28% p99=35%
198+
IOWait: p25=0% p50=0% p75=1% p99=2%
199+
User/System: 25% user / 0% system
200+
Ctx Switches: 12,340/s
201+
```
202+
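Two kinds of figures in the sample report can be checked by hand. The sketch below is illustrative and not taken from io-profile: `mib_per_sec` multiplies IOPS by block size, and `percentile` applies the nearest-rank method to raw samples. Both helper names are hypothetical, and this README does not state how io-profile derives its own percentiles.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for sanity-checking a report; not part of io-profile.

# Throughput in MiB/s from IOPS and block size in bytes.
mib_per_sec() {
    awk -v iops="$1" -v bs="$2" 'BEGIN { printf "%.1f\n", iops * bs / (1024 * 1024) }'
}

# Nearest-rank percentile (integer p in 1..100) of numbers on stdin.
percentile() {
    sort -n | awk -v p="$1" '
        { v[NR] = $1 }
        END {
            if (NR == 0) exit 1
            rank = int((p * NR + 99) / 100)   # ceil(p * NR / 100) for integer p
            if (rank < 1) rank = 1
            print v[rank]
        }'
}

mib_per_sec 320012 4096                           # prints 1250.0
printf '%s\n' 12 850 8 45 18 | percentile 50      # prints 18
```

With 320,012 IOPS at 4 KiB this gives 1250.0, close to the reported 1250.3 MB/s. That suggests the report counts a MB as 2^20 bytes, though this reading is an inference and not something the README confirms.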
## Architecture

- **Threading**: One thread per job, each with its own libblkio queue — no shared mutable state during the benchmark.
