Skip to content

Conversation

jpdoyle
Copy link
Contributor

@jpdoyle jpdoyle commented Apr 2, 2025

This is still WIP. So far it is basically a PoC of what is needed to perform these changes.
The overall checklist before merging is:

  • Restore the deleted regression tests
  • Correct handling of the full 256GB key stream in Ietf ChaCha20
  • Correct handling of 64-bit counters in legacy and xchacha variants
  • ChaCha20 software backend cipher-mode support
  • ChaCha20 SSE2 backend cipher-mode support
  • ChaCha20 AVX2 support
  • ChaCha20 NEON support
  • Salsa20 software backend support
  • Salsa20 SSE2 backend support
  • Salsa20 AVX2 backend support
  • Salsa20 NEON backend support
  • Update traits so that there is an explicit representation of a looped/saturated counter. Currently the implementation of StreamCipherCoreWrapper does not allow me to directly implement a cipher that stops once it would loop; in the current implementation I get around this by effectively using a 64-bit counter for the 32-bit counter mode, which allows me to distinguish between a just-initialized cipher (block_pos = 0, pos = 64) and a looped cipher (block_pos = 2**32, pos = 64). Note that this does not currently allow correct behavior at the end of a 64-bit-counter keystream.
  • Convert the Ietf variant to use a 32-bit counter again, fixing the bad_overflow_check{6,7} seek-focused tests
  • Update the arithmetic in SeekNum::from_block_byte in traits to match the looped counter representation, thereby fixing cipher: Seeking near the end of the keystream causes try_current_pos to return an error traits#1808. (NOTE: the test in that issue is present in this PR, and succeeds; that is because I use a 64-bit overall counter even in 32-bit Ietf mode)

This would close #334, #391.

@dhardy
Copy link
Contributor

dhardy commented Aug 14, 2025

This PR could remove the warning in src/legacy.rs.

Copy link
Contributor

@dhardy dhardy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't review the SSE2 code (arguably the most complex part of this PR) but can if it's helpful.

I'd like to see 64-bit counter support merged for usage in rand.

Comment on lines -31 to +39
self.0.state[12] = self.0.state[12].wrapping_add(1);
self.0.counter = self.0.counter.saturating_add(1);
if let Some(ctr) = self.0.state[12].checked_add(1) {
self.0.state[12] = ctr;
} else if V::COUNTER_SIZE > 1 {
if let Some(ctr) = self.0.state[13].checked_add(1) {
self.0.state[12] = 0;
self.0.state[13] = ctr;
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Old behaviour when COUNTER_SIZE=1 was to wrap. New behaviour is to saturate, but I don't think this is justified?

This also saturates when COUNTER_SIZE=2. Arguably it should wrap just in case someone set the block position to something just below the end.

Hint: it may be easier to convert to u64, increment and convert back, or what you did a few lines below.

Comment on lines +218 to +219
/// Current counter position
counter: u64,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need to add additional state to ChaChaCore (at least not for these variants).

This is just for testing?

Comment on lines -268 to +272
type Counter = u32;
type Counter = u64;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should depend on the variant.

Comment on lines +276 to +286
let le_pos = self.counter.to_le_bytes();
let max_val = V::MAX_USABLE_COUNTER;
if self.counter <= max_val {
for i in 0..V::COUNTER_SIZE {
debug_assert_eq!(
self.state[12 + i],
u32::from_le_bytes(<_>::try_from(&le_pos[4 * i..(4 * (i + 1))]).unwrap())
);
}
}
self.counter
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming COUNTER_SIZE <= 2, it's much simpler to do this:

let mut counter = self.state[12];
if V::COUNTER_SIZE > 1 {
    counter |= self.state[13] << 32;
}

} else {
let ctr = _mm_cvtsi128_si64(backend.v[3]) as u64;

state[12] = (ctr&(u32::MAX as u64)) as u32;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the and operation here is unnecessary, given that the next line works

@nstilt1
Copy link
Contributor

nstilt1 commented Aug 15, 2025

If you want to see some code that works for all backends, I have a working branch with code for all of the backends.

https://github.com/nstilt1/stream-ciphers-rng/blob/counter_support_2.0/chacha20/src/backends/sse2.rs

Just search for size_of in the backend files and you'll find all of the code that is specific to the size of the counter.

Edit: my current PR contains all of the necessary 64-bit counter code. You could just look at the diff and see what all needs to be changed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

chacha20: 64-bit counter support
3 participants