fix(aprender-train): CUDA forward path applies Q/K/V biases (H4D root-cause discharge) #1604

Closed
noahgift wants to merge 11 commits into main from fix/cuda-forward-parity-qwen-biases