Long running compute kernels in wgpu? #2353
Hey guys. I have some compute kernels that can take up to a minute to complete. Unfortunately, Windows TDR (Timeout Detection and Recovery) kills the GPU if a kernel runs for too long (a couple of seconds). I am now splitting the big kernel into smaller workloads, but I am still getting killed by the TDR. Here's what I have tried so far:
// One pass.
let mut encoder = device.create_command_encoder(..);
{
    let mut pass = encoder.begin_compute_pass(..);
    for _ in 0..many {
        pass.dispatch(..);
    }
}
queue.submit(Some(encoder.finish()));
// Many passes.
let mut encoder = device.create_command_encoder(..);
for _ in 0..many {
    let mut pass = encoder.begin_compute_pass(..);
    pass.dispatch(..);
}
queue.submit(Some(encoder.finish()));
// Many command buffers.
for _ in 0..many {
    let mut encoder = device.create_command_encoder(..);
    {
        let mut pass = encoder.begin_compute_pass(..);
        pass.dispatch(..);
    }
    queue.submit(Some(encoder.finish()));
}
I couldn't find any way to sync or flush the queue in the WebGPU spec. I tried adding a couple of [...] Of course, I could change the TDR limit in the registry, but that's not a very satisfactory solution 🙂; I can't force users to do it, for example. If it matters, I am running headless, so no windows or surfaces. Basically I took your [...] I am running [...]
Replies: 1 comment 2 replies
My best guess is that TDR applies to queue submissions, not individual dispatches, so you should try to split the work into submissions that each take a reasonable amount of time (maybe 100 ms). That leaves lots of headroom if you underestimate how long things will take. As an extension, you could use timestamp queries to dynamically adjust the load per submit based on feedback.
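To make the feedback idea concrete, here is a minimal sketch of the sizing logic only, written without any wgpu calls so it stays self-contained. `ChunkSizer`, `run_chunked`, and the `submit_and_wait` callback are hypothetical names, not part of wgpu. The callback is assumed to encode one compute pass covering `count` work items starting at `offset`, submit it, wait until the GPU has finished that submission, and return how long it took (from timestamp queries or a wall-clock measurement). The 0.5x to 2x limit on how fast the chunk size may change is an arbitrary damping choice.

```rust
use std::time::Duration;

/// Chooses how many work items go into the next queue submission so that
/// each submission takes roughly `target` to run.
pub struct ChunkSizer {
    target: Duration,
    min_items: u32,
    max_items: u32,
    items: u32,
}

impl ChunkSizer {
    pub fn new(target: Duration, initial_items: u32, min_items: u32, max_items: u32) -> Self {
        assert!(min_items >= 1 && min_items <= max_items);
        Self {
            target,
            min_items,
            max_items,
            items: initial_items.clamp(min_items, max_items),
        }
    }

    /// Number of items to put into the next submission.
    pub fn items(&self) -> u32 {
        self.items
    }

    /// Feedback step: a submission of `count` items took `elapsed`.
    pub fn record(&mut self, count: u32, elapsed: Duration) {
        // Guard against a zero measurement before dividing.
        let elapsed_s = elapsed.as_secs_f64().max(1e-6);
        // Scale proportionally towards the target, but change the chunk
        // size by at most a factor of two per step (assumed damping).
        let ratio = (self.target.as_secs_f64() / elapsed_s).clamp(0.5, 2.0);
        let next = (count as f64 * ratio).round() as u32;
        self.items = next.clamp(self.min_items, self.max_items);
    }
}

/// Processes `total_items` in several submissions and returns how many
/// submissions were made. `submit_and_wait(offset, count)` is a placeholder
/// for the real GPU work (see the assumptions above).
pub fn run_chunked<F>(total_items: u32, sizer: &mut ChunkSizer, mut submit_and_wait: F) -> u32
where
    F: FnMut(u32, u32) -> Duration,
{
    let mut done = 0;
    let mut submissions = 0;
    while done < total_items {
        // The last chunk may be smaller than the current chunk size.
        let count = sizer.items().min(total_items - done);
        let elapsed = submit_and_wait(done, count);
        sizer.record(count, elapsed);
        done += count;
        submissions += 1;
    }
    submissions
}
```

Starting with a deliberately small chunk and letting it grow keeps the first submissions well under the timeout even when the per-item cost is unknown. Note that the callback has to actually wait for the GPU between submissions (for example with a blocking `device.poll(..)`), otherwise the measured time only reflects how long it took to enqueue the work.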