
Multi-job based data verification #2025

Open
minwooim wants to merge 3 commits into axboe:master from minwooim:shared-verify-table-rebased

Conversation

@minwooim
Contributor

@minwooim minwooim commented Dec 2, 2025

See the first commit description for the background and details.

The following jobfile is an example of multi-job based data verification:
16 jobs are going to write [zeroes, uncor, trim] to the same area [0, 1M).

[global]
ioengine=io_uring_cmd
cmd_type=nvme
filename=/dev/ng0n1
size=1M
thread=1

norandommap

ignore_error=0x2281  # to mask uncorrectable (errored) offsets
iodepth=64

group_reporting=1

bs=4k
numjobs=4

verify=crc32
verify_header_seed=0
do_verify=1

verify_table_id=1

[trim]
rw=randtrim

[uncor]
rw=randwrite
write_mode=uncor

[write]
rw=randwrite

[zeroes]
rw=randwrite
write_mode=zeroes
nonvectored=0  ; to mask BAD ADDRESS error in kernel
                                       

After running this jobfile, we can see that the READ (verify) phase was issued for just 1MB (256 commands):

     issued rwts: total=256,3072,1024,0 short=0,0,0,0 dropped=0,0,0,0

@minwooim minwooim force-pushed the shared-verify-table-rebased branch from a3aa301 to b02ee38 Compare December 2, 2025 13:12
@vincentkfu
Collaborator

Do you believe that multi-job verification is a sound test? Once requests are submitted to the device, they can be completed in an arbitrary order. For even a single job we can run into problems with repeated writes to the same offset completing in an order different from how they were submitted, leading to verification errors.

For example see:

commit b5ab9ba8e73692360e8ea5a5587724ecf74270f4
Author: Ankit Kumar <ankit.kumar@samsung.com>
Date:   Thu Jan 30 00:04:58 2025 +0530

    verify: disable write sequence checks with norandommap and iodepth > 1

    With norandommap for async I/O engines specifying I/O depth > 1, it is
    possible that two or more writes with the same offset are queued at once.
    When fio tries to verify the block, it may find a numberio mismatch
    because the writes did not land on the media in the order that they were
    queued. Avoid these spurious failures by disabling sequence number
    checking. Users will still be able to enable sequence number checking
    if they explicitly set the verify_header_sequence option.

    fio -name=verify -ioengine=libaio -rw=randwrite -verify=sha512 -direct=1 \
    -iodepth=32 -filesize=16M -bs=512 -norandommap=1 -debug=io,verify

    Below is the truncated log for the above command demonstrating the issue.
    This includes extra log entries when write sequence number is saved and
    retrieved.

    set: io_u->numberio=28489, off=0x5f2400
    queue: io_u 0x5b8039e30d40: off=0x5f2400,len=0x200,ddir=1,file=verify.0.0
    set: io_u->numberio=28574, off=0x5f2400
    iolog: overlap 6235136/512, 6235136/512
    queue: io_u 0x5b8039e75500: off=0x5f2400,len=0x200,ddir=1,file=verify.0.0
    complete: io_u 0x5b8039e75500: off=0x5f2400,len=0x200,ddir=1,file=verify.0.0
    complete: io_u 0x5b8039e30d40: off=0x5f2400,len=0x200,ddir=1,file=verify.0.0

    retrieve: io_u->numberio=28574, off=0x5f2400
    queue: io_u 0x5b8039e1db40: off=0x5f2400,len=0x200,ddir=0,file=verify.0.0

    bad header numberio 28489, wanted 28574 at file verify.0.0 offset 6235136, length 512 (requested block: offset=6235136, length=512)

    Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>

@sitsofe
Collaborator

sitsofe commented Jan 24, 2026

@vincentkfu I would argue that if you have a single job that may issue concurrent overlapping writes and you want to verify the result then you should use serialize_overlap (and that would have also prevented the issue generated by the job in the commit you referenced). However your point stands - without a thing that can serialize ALL overlapping writes regardless of which concurrent job submitted them, you open yourself up to regions where you cannot know what value they have...

@minwooim
Contributor Author

minwooim commented Feb 3, 2026

Do you believe that multi-job verification is a sound test?

Yes, I do. Dedicated offsets per job limit the variety of commands that can be tested. If fio can manage overlapping offset ranges during command submission/completion, multi-job based data verification can test data integrity.

Once requests are submitted to the device, they can be completed in an arbitrary order. For even a single job we can run into problems with repeated writes to the same offset completing in an order different from how they were submitted, leading to verification errors.

That's why this change plugs incoming submissions whose offsets overlap, so only a single write command to a given range can be outstanding on the device at a time.


@minwooim
Contributor Author

minwooim commented Feb 3, 2026

However your point stands - without a thing that can serialize ALL overlapping writes regardless of which concurrent job submitted them, you open yourself up to regions where you cannot know what value they have...

Indeed, so this PR serializes all overlapping incoming submissions to guarantee that only a single write command to a given range is outstanding on the device at a time.

Background
  fio has supported data-verification integrity testing based on a
per-job methodology.  Each job must have its own dedicated offset area
to make sure that other jobs do not corrupt the written offsets, keeping
consistency from the job's point of view.  The NVMe spec supports
various I/O commands that update the written data pattern (e.g., zeroes)
or status (e.g., unmap, uncor).  These commands should necessarily be
candidates for the data verification test.  This leads to a demand for
multiple jobs issuing various write-family commands to the same area and
seeing the *latest* snapshot of the written data pattern or status.

This patch adds a new option `--verify_table_id=<n>` to represent the
verification table identifier for multiple jobs.  If one or more jobs
have the same value for this option, they share the verification table,
which is a lock-free concurrent skiplist keeping the `ipo` (io piece)
entries.  This patch does not change any functionality of the previous
dedicated-area data verification if the `--verify_table_id=` option is
not given; the classical data-verify methodology remains a flist or
rb-tree.

This shared verify table also tracks trimmed offsets by setting
`ipo->flags` with IP_F_TRIMMED; when the read phase starts, the block is
treated as trimmed.

Normally, each fio verify job runs in its own job context (thread or
process).  But when jobs share the same verify table id, one of the jobs
wins the race via an atomic operation and only that single job performs
the verify phase.

Currently this patch only supports thread-based usage (--thread=1) for
simplicity.

Signed-off-by: Minwoo Im <minwoo.im@samsung.com>
As with TRIM, some I/O requests can cause the written data pattern
(e.g., zeroed) or status (e.g., uncor) to be updated.  This patch, for
example, adds support for IO_U_F_TRIMMED, IO_U_F_ZEROED, and
IO_U_F_ERRORED for NVMe commands in the io_uring ioengine.

Signed-off-by: Minwoo Im <minwoo.im@samsung.com>
For example, in io_uring_cmd with the nvme cmd_type, a read of a trimmed
offset might return an NVMe status code indicating that the given
offsets are unmapped.  In this case, the failure might be expected.
This patch adds a new option, like the other trim_verify_* series, to
mask the expected error value from the ioengine.

Signed-off-by: Minwoo Im <minwoo.im@samsung.com>
@minwooim minwooim force-pushed the shared-verify-table-rebased branch from b02ee38 to c916ef8 Compare March 18, 2026 23:53