Commit fedd3b4

scx_p2dq: Add DHQ support and fix migration-disabled task errors
Integrate Double Helix Queue (DHQ) as an alternative to ATQ for LLC-aware
task migration, and fix critical race condition causing migration-disabled
task errors.

DHQ Integration:
- Add --dhq-enabled flag to enable DHQ mode for LLC migration
- Add --dhq-max-imbalance parameter (default: 3) to control strand balance
- Create one DHQ per pair of LLCs in same NUMA node
- Map each LLC to a specific strand (A or B) for cache affinity
- Each CPU inherits strand from its LLC for proper load distribution
- DHQ provides cache-aware migration with controlled cross-LLC movement

Strand-Specific DHQ Operations:
Use scx_dhq_peek_strand() and scx_dhq_pop_strand() instead of generic
operations to ensure CPUs only consume from their designated strand. This
preserves cache locality and prevents load imbalance.

Data Structure Changes:
- Add mig_dhq and dhq_strand to cpu_ctx and llc_ctx
- Add llc_pair_dhqs[] for shared DHQs between LLC pairs
- Add llcs_per_node[] to track LLCs per NUMA node
- Add P2DQ_ENQUEUE_PROMISE_DHQ_VTIME enqueue promise type
- Add enqueue_promise_dhq struct for DHQ-specific metadata

Configuration:
- p2dq_config.dhq_enabled: Enable DHQ mode
- p2dq_config.dhq_max_imbalance: Control strand pairing (0 = unlimited)
- Priority mode: lowest vtime wins across strands

Build System:
- Add lib/dhq.bpf.c to scx_p2dq and scx_chaos builds

scx_chaos Compatibility:
- Update enqueue promise handling to recognize DHQ type
- Error message updated to mention both ATQs and DHQs not supported

Benefits:
- Cache affinity: Tasks stay on origin LLC (strand)
- Controlled migration: max_imbalance prevents migration storms
- Race-free: Atomic affinity handling eliminates migration-disabled errors
- Work conservation: Cross-strand stealing when priority demands
- Scalable: Lock contention distributed across DHQ strands

Signed-off-by: Daniel Hodges <[email protected]>
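The pairing rule in the "DHQ Integration" list (one DHQ per pair of LLCs in a NUMA node, each LLC mapped to strand A or B) can be illustrated with a small userspace sketch. This is not the code from this commit: `struct llc_map`, `pair_llcs()`, `MAX_LLCS`, and the `STRAND_A`/`STRAND_B` names are hypothetical, and pairing consecutive LLCs and leaving an odd LLC unpaired are both assumptions made only for illustration.

```c
#include <assert.h>

#define MAX_LLCS 16	/* hypothetical upper bound for this sketch */

/* Hypothetical strand identifiers; the commit only names them "A" and "B". */
enum strand { STRAND_A = 0, STRAND_B = 1 };

/* Hypothetical per-LLC result: which shared DHQ it uses and on which strand. */
struct llc_map {
	int dhq_id;		/* index of the shared DHQ, or -1 if unpaired */
	enum strand strand;	/* strand this LLC (and its CPUs) consumes from */
};

/*
 * Pair the LLCs of one NUMA node. Assumption: LLC 2k and LLC 2k+1 share
 * DHQ k; the even LLC takes strand A and the odd LLC takes strand B.
 * With an odd LLC count the last LLC stays unpaired (dhq_id == -1).
 * Returns the number of DHQs needed for the node.
 */
static int pair_llcs(int nr_llcs, struct llc_map *out)
{
	int nr_dhqs = 0;

	assert(nr_llcs >= 0 && nr_llcs <= MAX_LLCS);

	for (int i = 0; i < nr_llcs; i++) {
		if ((i | 1) < nr_llcs) {
			/* Both members of the pair exist. */
			out[i].dhq_id = i / 2;
			out[i].strand = (i & 1) ? STRAND_B : STRAND_A;
			if (i & 1)
				nr_dhqs++;	/* count each pair once */
		} else {
			/* No partner LLC on this node. */
			out[i].dhq_id = -1;
			out[i].strand = STRAND_A;
		}
	}
	return nr_dhqs;
}
```

Under these assumptions a node with four LLCs needs two DHQs, and a node with three LLCs needs one and leaves the third LLC without a migration DHQ. How the actual scheduler handles odd LLC counts is not stated in the commit message.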
1 parent f11fd26 commit fedd3b4

7 files changed, +367 -59 lines changed

lib/dhq.bpf.c

Lines changed: 0 additions & 1 deletion
@@ -1,4 +1,3 @@
-#include "scxtest/scx_test.h"
 #include <scx/common.bpf.h>
 #include <lib/sdt_task.h>
 

scheds/rust/scx_chaos/build.rs

Lines changed: 1 addition & 0 deletions
@@ -10,6 +10,7 @@ fn main() {
         .add_source("src/bpf/lib/arena.bpf.c")
         .add_source("src/bpf/lib/atq.bpf.c")
         .add_source("src/bpf/lib/bitmap.bpf.c")
+        .add_source("src/bpf/lib/dhq.bpf.c")
         .add_source("src/bpf/lib/minheap.bpf.c")
         .add_source("src/bpf/lib/rbtree.bpf.c")
         .add_source("src/bpf/lib/sdt_alloc.bpf.c")

scheds/rust/scx_chaos/src/bpf/main.bpf.c

Lines changed: 2 additions & 1 deletion
@@ -505,7 +505,8 @@ complete_p2dq_enqueue_move(struct enqueue_promise *pro,
 		break;
 	case P2DQ_ENQUEUE_PROMISE_ATQ_FIFO:
 	case P2DQ_ENQUEUE_PROMISE_ATQ_VTIME:
-		scx_bpf_error("chaos: ATQs not supported");
+	case P2DQ_ENQUEUE_PROMISE_DHQ_VTIME:
+		scx_bpf_error("chaos: ATQs/DHQs not supported");
 		break;
 	case P2DQ_ENQUEUE_PROMISE_FAILED:
 		scx_bpf_error("chaos: delayed async_p2dq_enqueue failed");
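The hunk above adds the new DHQ promise type to the existing group of stacked `case` labels, so all three unsupported types reach the same error call. A minimal userspace sketch of that fall-through pattern follows. The enumerator names and message strings are taken from the diff; the enum name `promise_kind`, its numeric values, the omission of the other promise types, and the helper `promise_error()` are hypothetical and stand in for the real `scx_bpf_error()` path.

```c
#include <stddef.h>

/*
 * Enumerator names come from the diff; the enum name and values are
 * hypothetical, and promise types not shown in the hunk are omitted.
 */
enum promise_kind {
	P2DQ_ENQUEUE_PROMISE_ATQ_FIFO,
	P2DQ_ENQUEUE_PROMISE_ATQ_VTIME,
	P2DQ_ENQUEUE_PROMISE_DHQ_VTIME,
	P2DQ_ENQUEUE_PROMISE_FAILED,
};

/*
 * Hypothetical helper: returns the message the scheduler would report
 * for a promise type, or NULL for a value outside the enum.
 */
static const char *promise_error(enum promise_kind kind)
{
	switch (kind) {
	case P2DQ_ENQUEUE_PROMISE_ATQ_FIFO:
	case P2DQ_ENQUEUE_PROMISE_ATQ_VTIME:
	case P2DQ_ENQUEUE_PROMISE_DHQ_VTIME:
		/* Stacked labels fall through to one shared error path. */
		return "chaos: ATQs/DHQs not supported";
	case P2DQ_ENQUEUE_PROMISE_FAILED:
		return "chaos: delayed async_p2dq_enqueue failed";
	}
	return NULL;
}
```

Grouping the labels keeps a single message for every unsupported queue type, which is why the diff also rewords the string to mention both ATQs and DHQs.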

scheds/rust/scx_p2dq/build.rs

Lines changed: 1 addition & 0 deletions
@@ -10,6 +10,7 @@ fn main() {
         .enable_skel("src/bpf/main.bpf.c", "bpf")
         .add_source("src/bpf/lib/arena.bpf.c")
         .add_source("src/bpf/lib/atq.bpf.c")
+        .add_source("src/bpf/lib/dhq.bpf.c")
         .add_source("src/bpf/lib/bitmap.bpf.c")
         .add_source("src/bpf/lib/cpumask.bpf.c")
         .add_source("src/bpf/lib/minheap.bpf.c")
