Description of the bug:
While benchmarking an NVMe SSD with ramp_time set to a value greater than runtime, there is an unexpected ~500MB/s spike in read I/Os. This is observed with a concurrent RD and WR job, but also with a pure WR job. It has been confirmed with SEQ and RND workloads, and with both libaio and io_uring. The read spike lasts for the time delta between ramp_time and runtime: if ramp_time=60s and runtime=45s, the unexpected reads occur for 15 seconds. This also holds if you widen the gap between ramp_time and runtime even further. (See charted examples below.)
Expectation: job = ramp_time → runtime → stop
The observed I/O behavior shows this is not how it is currently implemented. It appears that the ramp_time and runtime clocks start at the same time. Because they start together, runtime expires first, and that is the phase where we see the I/O spike, lasting until ramp_time expires. Once ramp_time expires, runtime kicks back in and the job continues as normal.
Environment:
RHEL 8.5
Kernel 6.1.35
fio version: fio-3.41-90-gbfe3
Confirmed the issue is present in this latest version, and it is present in previous versions as well.
Reproduction steps:
Attached are the job files used to reproduce the issue consistently.
rd+wr_qd_256_128k_1w.txt
wr_qd_256_128k_1w.txt
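The attached job files are not inlined here, so the following is only a sketch of a job along the lines suggested by the file names (engine, block size, queue depth, and job count are inferred assumptions, not the exact attachment contents). Any job where ramp_time exceeds runtime should exhibit the spike:

```ini
; Sketch only -- parameter values inferred from the attachment names,
; not copied from the actual job files.
[global]
ioengine=libaio        ; bug also reproduced with io_uring
direct=1
bs=128k                ; 128K block size, per the file names
iodepth=256            ; QD 256
numjobs=1              ; single worker
time_based=1
runtime=45s            ; runtime shorter than ramp_time triggers the spike
ramp_time=60s          ; expected: ramp first, then runtime; observed: clocks overlap
filename=/dev/nvme0n1

[seq-rd]
rw=read

[seq-wr]
rw=write
```

With these values the unexpected reads should appear for the final ~15 seconds (the 60s − 45s delta), matching the behavior described above.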
Collected I/O stats using the iostat command:
iostat -dx 1 110 /dev/nvme0n1 > io_stats_60sec_ramp_libaio_128K_seq_rd+wr_rate-limited.txt