fix(tests): slice freqs_cis to sequence length in unit tests #42

Closed
sjswerdloff wants to merge 2 commits into kyegomez:main from sjswerdloff:fix/freqs-cis-slice-in-tests

@sjswerdloff

Summary

  • Unit tests calling GQAttention, MLAttention, TransformerBlock, and RecurrentBlock directly were passing full-length freqs_cis (shape (max_seq_len, dim//2)) instead of slicing to the actual sequence length T
  • This caused RuntimeError: The size of tensor a (T) must match the size of tensor b (max_seq_len) in apply_rope
  • The model's forward() method slices correctly — only the unit tests were affected

Changes

  • Pass self.freqs[:T] instead of the full self.freqs at all 13 affected test call sites
  • No changes to model code
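
The mismatch and the fix can be reproduced outside the test suite. What follows is a minimal sketch, not the repository's code: this apply_rope is an assumed complex-multiplication implementation, precompute_freqs_cis is a hypothetical helper, and the sizes are illustrative.

```python
import torch


def precompute_freqs_cis(dim: int, max_seq_len: int, theta: float = 10000.0) -> torch.Tensor:
    """Hypothetical helper: complex RoPE table of shape (max_seq_len, dim // 2)."""
    inv_freq = 1.0 / (theta ** (torch.arange(0, dim, 2).float() / dim))
    angles = torch.outer(torch.arange(max_seq_len).float(), inv_freq)
    return torch.polar(torch.ones_like(angles), angles)


def apply_rope(x: torch.Tensor, freqs_cis: torch.Tensor) -> torch.Tensor:
    """Assumed implementation: rotate x (B, T, H, D) by freqs_cis (T, D // 2)."""
    x_c = torch.view_as_complex(x.float().reshape(*x.shape[:-1], -1, 2))  # (B, T, H, D/2)
    rotated = x_c * freqs_cis.unsqueeze(0).unsqueeze(2)  # broadcast over batch and heads
    return torch.view_as_real(rotated).flatten(-2).type_as(x)


B, T, H, D, max_seq_len = 2, 8, 4, 16, 32  # illustrative sizes, with T < max_seq_len
freqs_cis = precompute_freqs_cis(D, max_seq_len)
x = torch.randn(B, T, H, D)

# Passing the full table fails: dimension 1 is T on one side, max_seq_len on the other.
raised_on_full_table = False
try:
    apply_rope(x, freqs_cis)
except RuntimeError:
    raised_on_full_table = True

# Slicing to the actual sequence length makes the shapes broadcast.
out = apply_rope(x, freqs_cis[:T])
```

Slicing at the call site mirrors what the summary says forward() already does, so the components themselves keep expecting a table of exactly T rows.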

Test plan

  • python -m pytest test_main.py -q — 66 passed, 1 flaky (test_spectral_radius_stable_after_large_grad_step — float32 boundary, pre-existing)
  • Previously: 53 passed, 14 failed

🤖 Generated with Claude Code

Co-Authored-By: clement-7074f29f <clement-7074f29f@sjstargetedsolutions.co.nz>

Tests that call GQAttention, MLAttention, TransformerBlock, and
RecurrentBlock directly were passing full-length freqs_cis
(max_seq_len) instead of slicing to the actual sequence length T.
This caused a tensor size mismatch in apply_rope when the RoPE
frequencies didn't match the input sequence dimension.

The model's forward() method slices correctly — only the unit
tests that call components directly were affected.

Fixes 13 test failures:
- TestGQAttention: 3 tests
- TestMLAttention: 4 tests
- TestTransformerBlock: 3 tests
- TestRecurrentBlock: 3 tests

The ZOH discretization exp(-exp(clamp(x, -20, 20))) reaches exactly
1.0 in float32 at the clamp boundary. This is the neutral point (no
decay), not divergence — the stability guarantee is A ∈ (0, 1].

The test with lr=1e3 intentionally pushes parameters to extremes
and was intermittently failing with assert 1.0 < 1.0.
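
The boundary behaviour can be checked without PyTorch. The sketch below emulates float32 by rounding through the struct module, which is an assumption of this example and may differ from the model's kernels by a rounding step; to_float32 and zoh_decay are hypothetical names.

```python
import math
import struct


def to_float32(value: float) -> float:
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", value))[0]


def zoh_decay(x: float) -> float:
    """exp(-exp(clamp(x, -20, 20))), rounded to float32 after each exp."""
    clamped = max(-20.0, min(20.0, x))
    inner = to_float32(math.exp(clamped))
    return to_float32(math.exp(-inner))


at_boundary = zoh_decay(-20.0)  # lower clamp boundary
below_clamp = zoh_decay(-1e3)   # clamped to -20, so same value as the boundary
mid_range = zoh_decay(0.0)      # exp(-1), well inside (0, 1)

print(at_boundary == 1.0)       # True: 1 - exp(-20) is not representable in float32
print(at_boundary < 1.0)        # False: a strict "< 1.0" check fails here
print(0.0 < mid_range < 1.0)    # True
```

A comparison of the form A <= 1.0 agrees with the stated guarantee A ∈ (0, 1], whereas a strict A < 1.0 fails whenever a parameter lands on the lower clamp boundary.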
@sjswerdloff sjswerdloff mentioned this pull request Apr 22, 2026
@sjswerdloff
Author

PR #41 has a more comprehensive approach.
