feat(gpu): add rerand without keyswitch#3384

Draft
pdroalves wants to merge 1 commit into main from pa/feat/rerand_without_ks

Conversation

@pdroalves
Contributor

closes: https://github.com/zama-ai/tfhe-rs-internal/issues/1310

PR content/description

Check-list:

  • Tests for the changes have been added (for bug fixes / features)
  • Docs have been added / updated (for bug fixes / features)
  • Relevant issues are marked as resolved/closed, related issues are linked in the description
  • Check for breaking changes (including serialization changes) and add them to commit message following the conventional commit specification

@cla-bot added the cla-signed label on Mar 11, 2026
@pdroalves force-pushed the pa/feat/rerand_without_ks branch 4 times, most recently from 0d44308 to 14734a4, on March 11, 2026 17:05

Torus *tmp_zero_lwes;
Torus *tmp_ksed_zero_lwes;
Torus *tmp_expanded_zero_lwes;
Contributor

please initialize to nullptr

Contributor Author

Done.

Torus *tmp_zero_lwes;
Torus *tmp_ksed_zero_lwes;
Torus *tmp_expanded_zero_lwes;
Torus *tmp_ksed_expanded_zero_lwes;
Contributor

Please initialize to nullptr (this one is only conditionally set in the constructor).

Contributor Author

Done.
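
For readers following the two nullptr requests above, here is a minimal host-side sketch of the pattern being asked for. Only the four pointer member names come from the diff; the struct name `rerand_buffer_sketch`, the `needs_ksed_expanded` flag, and the use of `std::calloc`/`std::free` in place of the backend's device allocation helpers are assumptions made for illustration, not the PR's actual code.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical stand-in for the PR's buffer struct. Only the four pointer
// member names are taken from the diff; everything else is an assumption.
template <typename Torus> struct rerand_buffer_sketch {
  // Default member initializers: every pointer starts as nullptr, so a
  // member that the constructor skips is still in a well-defined state.
  Torus *tmp_zero_lwes = nullptr;
  Torus *tmp_ksed_zero_lwes = nullptr;
  Torus *tmp_expanded_zero_lwes = nullptr;
  Torus *tmp_ksed_expanded_zero_lwes = nullptr;

  // `needs_ksed_expanded` is a hypothetical flag standing in for whatever
  // condition the real constructor checks.
  rerand_buffer_sketch(std::size_t num_elems, bool needs_ksed_expanded) {
    tmp_zero_lwes = alloc(num_elems);
    tmp_ksed_zero_lwes = alloc(num_elems);
    tmp_expanded_zero_lwes = alloc(num_elems);
    if (needs_ksed_expanded) {
      // Conditionally set: without the initializer above, this member
      // would hold an indeterminate value when the branch is skipped.
      tmp_ksed_expanded_zero_lwes = alloc(num_elems);
    }
  }

  // Frees every buffer. std::free(nullptr) is a no-op, so members that
  // were never allocated are handled without extra checks.
  void release() {
    free_and_reset(tmp_zero_lwes);
    free_and_reset(tmp_ksed_zero_lwes);
    free_and_reset(tmp_expanded_zero_lwes);
    free_and_reset(tmp_ksed_expanded_zero_lwes);
  }

private:
  // Host allocation used here only so the sketch runs without a GPU.
  static Torus *alloc(std::size_t n) {
    return static_cast<Torus *>(std::calloc(n, sizeof(Torus)));
  }
  static void free_and_reset(Torus *&ptr) {
    std::free(ptr);
    ptr = nullptr;
  }
};
```

With the initializers in place, `release()` can run unconditionally, whether or not the conditional branch in the constructor was taken.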

lwe_trivial_indexes, ksk, input_dimension, output_dimension, ks_base_log,
ks_level, num_lwes, true, mem_ptr->ks_tmp_buf_vec);
// Keyswitch
execute_keyswitch_async<Torus>(streams.get_ith(0), lwes_to_be_added,
Contributor

Only on GPU 0? Is there a multi-GPU approach to rerand?

Contributor Author

No, we never ran it on multiple GPUs. Honestly, I don't remember why.

Contributor Author

I think what we concluded was that it's not worth it. The bottleneck was the encryption of zeroes, not the KS. KS is quite fast on a single GPU, so going multi-GPU would probably only slow it down.

Contributor Author

@guillermo-oyarzun do you think we could benefit from making this multi-GPU?

Member

@guillermo-oyarzun Mar 12, 2026

The KS alone wouldn't benefit much, because the overhead of using many GPUs would be almost equivalent to the benefit of the parallelization. This is always 64-bit, right?

Contributor Author

Yes, I don't recall seeing any 32-bit instantiation.
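
The overhead-versus-parallelism argument in this thread can be made concrete with a rough break-even estimate. The sketch below is purely illustrative: the function name `estimate_multi_gpu`, the struct, and the simple model (perfect work split plus a fixed overhead for scatter, gather, and synchronization) are assumptions, not measurements or code from tfhe-rs.

```cpp
// Illustrative cost model only. Names and the model itself are assumptions;
// no timing here comes from the PR or from any benchmark.
struct multi_gpu_estimate {
  double single_gpu_time; // time of the keyswitch on one GPU
  double multi_gpu_time;  // modelled time when split across GPUs
  bool worth_it;          // true only if the split is strictly faster
};

// Models the multi-GPU time as an ideal split of the single-GPU time plus a
// fixed overhead (data scatter/gather and stream synchronization).
inline multi_gpu_estimate estimate_multi_gpu(double single_gpu_time,
                                             unsigned num_gpus,
                                             double overhead_time) {
  // Guard against a zero GPU count by treating it as a single GPU.
  const unsigned gpus = num_gpus == 0 ? 1u : num_gpus;
  const double multi = single_gpu_time / gpus + overhead_time;
  return {single_gpu_time, multi, multi < single_gpu_time};
}
```

Under this model, splitting across `n` GPUs pays off only when the overhead is smaller than `single_gpu_time * (1 - 1/n)`. A keyswitch that is already fast on one GPU leaves very little room for that.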

@pdroalves force-pushed the pa/feat/rerand_without_ks branch from 14734a4 to 49333c1 on March 12, 2026 14:58
@pdroalves force-pushed the pa/feat/rerand_without_ks branch from 49333c1 to 9bcfabd on March 12, 2026 15:01
3 participants