
Fix CUDA init races and add safety checks #1

Merged
sethuiyer merged 1 commit into main from codex/explain-codebase-structure-and-pointers on Feb 7, 2026


@sethuiyer
Owner

Motivation

  • Prevent nondeterministic behavior and hard-to-debug races in the CUDA Ramsey kernel initialization by removing concurrent RNG usage across threads.
  • Fail fast on invalid host-side launch parameters to avoid undefined GPU behavior when the state size is unsupported or the grid size is zero or negative.
  • Make the sample GPU host code more robust by surfacing allocation and copy failures immediately.

Description

  • Replaced the per-thread cooperative RNG initialization in ramsey_baha_kernel_optimized with an initialization loop run by a single thread (tid == 0) and followed by __syncthreads(), so that no two threads use the RNG concurrently. Also added #include <climits> for INT_MAX usage (file: include/baha/baha.cuh).
  • Added input validation checks in launch_ramsey_optimizer to return cudaErrorInvalidValue when num_blocks <= 0 or words_per_state is out of the supported range (file: include/baha/baha.cuh).
  • Hardened the GPU demo (src/baha_gpu.cu) by adding missing standard headers (<cmath>, <ctime>) and replacing raw cudaMalloc/cudaMemcpy/cudaFree calls with the existing cudaCheckError wrapper so allocation/copy/free errors are detected and reported.
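
The three changes above can be illustrated with a minimal sketch. This is not the code from include/baha/baha.cuh or src/baha_gpu.cu. The names init_state_kernel, launch_init_sketch, CHECK_CUDA, kMaxWordsPerState, and kThreadsPerBlock, the xorshift32 generator, and the bound of 32 words are all assumptions made for illustration, since the PR text does not give the actual RNG, the supported range, or the definition of cudaCheckError.

```cuda
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical limits; the PR does not state the real supported range.
constexpr int kMaxWordsPerState = 32;
constexpr int kThreadsPerBlock = 128;

// Hypothetical stand-in for the repository's cudaCheckError wrapper:
// report the failing call site and abort on any non-success status.
#define CHECK_CUDA(call)                                             \
    do {                                                             \
        cudaError_t err_ = (call);                                   \
        if (err_ != cudaSuccess) {                                   \
            std::fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,  \
                         cudaGetErrorString(err_));                  \
            std::exit(EXIT_FAILURE);                                 \
        }                                                            \
    } while (0)

// Assumed RNG for the sketch (xorshift32); the real kernel may differ.
__device__ unsigned int xorshift32(unsigned int& s) {
    s ^= s << 13;
    s ^= s >> 17;
    s ^= s << 5;
    return s;
}

__global__ void init_state_kernel(unsigned int* out, int words_per_state,
                                  unsigned int seed) {
    __shared__ unsigned int state[kMaxWordsPerState];
    const int tid = threadIdx.x;

    // Only thread 0 advances the RNG, so the RNG state is never shared
    // between concurrently running threads.
    if (tid == 0) {
        unsigned int rng = seed ^ (0x9E3779B9u * (blockIdx.x + 1u));
        if (rng == 0u) rng = 1u;  // xorshift must not start from zero
        for (int w = 0; w < words_per_state; ++w) state[w] = xorshift32(rng);
    }
    __syncthreads();  // make the initialized state visible to the whole block

    // After the barrier, all threads may read the shared state safely.
    for (int w = tid; w < words_per_state; w += blockDim.x) {
        out[static_cast<size_t>(blockIdx.x) * words_per_state + w] = state[w];
    }
}

// Validate launch parameters on the host before touching the GPU.
cudaError_t launch_init_sketch(unsigned int* d_out, int num_blocks,
                               int words_per_state, unsigned int seed) {
    if (num_blocks <= 0) return cudaErrorInvalidValue;
    if (words_per_state <= 0 || words_per_state > kMaxWordsPerState) {
        return cudaErrorInvalidValue;
    }
    init_state_kernel<<<num_blocks, kThreadsPerBlock>>>(d_out, words_per_state,
                                                        seed);
    return cudaGetLastError();  // surfaces launch-configuration errors
}

// Host-side demo: every allocation, copy, and free is checked.
int run_demo() {
    constexpr int num_blocks = 4;
    constexpr int words = 8;
    constexpr size_t bytes = sizeof(unsigned int) * num_blocks * words;

    // Invalid parameters are rejected before any kernel launch.
    assert(launch_init_sketch(nullptr, 0, words, 1u) == cudaErrorInvalidValue);

    unsigned int* d_out = nullptr;
    unsigned int h_out[num_blocks * words];
    CHECK_CUDA(cudaMalloc(&d_out, bytes));
    CHECK_CUDA(launch_init_sketch(d_out, num_blocks, words, 1234u));
    CHECK_CUDA(cudaMemcpy(h_out, d_out, bytes, cudaMemcpyDeviceToHost));
    CHECK_CUDA(cudaFree(d_out));
    std::printf("first word of block 0: %u\n", h_out[0]);
    return 0;
}
```

Calling run_demo() from main exercises all three pieces on a machine with a CUDA device. Single-thread initialization serializes the setup loop, which is a reasonable cost only if the per-block state is small; the sketch assumes that is the case here.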

Testing

  • No automated tests were run for this change.


@sethuiyer sethuiyer merged commit 870206a into main Feb 7, 2026
2 of 6 checks passed