
Conversation

@bwasti (Contributor) commented on Nov 10, 2025

As in the title; the text can be found in the PR content.

Downloaded GitHub-hosted images to local assets directory and updated
all image references to use local paths. Converted standalone images to
markdown syntax while keeping centered images as HTML img tags for
proper rendering.
Signed-off-by: Bram Wasti <[email protected]>

Added Jekyll frontmatter with layout, title, and author metadata to
properly render the blog post.
Signed-off-by: Bram Wasti <[email protected]>

Restored the original width and height attributes (340x130 and 480x355)
for the two centered images to maintain their fixed sizing.
Signed-off-by: Bram Wasti <[email protected]>

Added horizontal rule and italic formatting to the acknowledgements
section for better visual separation and styling.
Signed-off-by: Bram Wasti <[email protected]>

In the septillions of flops used to pre-train models, this mismatch between values has largely been avoidable. Pre-training typically runs at a fixed batch size, which causes the same reduction kernels to be run, often side-stepping the issue entirely.

Reinforcement learning, on the other hand, seems to almost exclusively run different reduction algorithms due to its inference-heavy (and thus largely latency and memory-bound) nature. Kernels optimized for low-batch size inference typically run reductions all at once, whereas kernels for training models parallelize heavily to reuse data and amp up compute utilization. That means the generators and the trainers are typically running completely different kernels!
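
A minimal sketch of the effect (plain PyTorch with made-up sizes and tile counts, not vLLM's actual kernels): the same elementwise products, reduced in two different association orders, can disagree in the low-order bits.

```python
import torch

torch.manual_seed(0)
prod = torch.randn(4096, dtype=torch.float32) * torch.randn(4096, dtype=torch.float32)

# One association order: sum over the full vector in one reduction.
full_sum = prod.sum()

# A different order: reduce 128-element tiles first, then combine the
# per-tile partial sums (the kind of split a parallelized kernel makes).
tiled_sum = prod.reshape(32, 128).sum(dim=1).sum()

# Same math, different rounding along the way; the results are often
# not bitwise identical.
print(full_sum.item(), tiled_sum.item())
print("bitwise equal:", torch.equal(full_sum, tiled_sum))
```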
Collaborator commented:


"Kernels optimized for low-batch size inference typically run reductions all at once"

I don't understand this part. Are you talking about reductions like in RMS norm?

@bwasti (author) replied:


in the kernels, they don't tile. let me use the word "tile" for clarity
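
A rough illustration of that distinction (illustrative Python with an arbitrary tile size, not the actual GPU kernels): an untiled reduction walks one accumulator over every element, while a tiled reduction produces per-tile partial sums and combines them afterwards.

```python
import numpy as np

def reduce_untiled(values):
    # One accumulator walked over every element, the "all at once"
    # shape described above for low-batch inference reductions.
    acc = np.float32(0.0)
    for v in values:
        acc += np.float32(v)
    return acc

def reduce_tiled(values, tile=128):
    # Each tile is reduced independently (so tiles can be processed in
    # parallel and reuse data), then the partial sums are combined.
    partials = [reduce_untiled(values[i:i + tile])
                for i in range(0, len(values), tile)]
    return reduce_untiled(np.array(partials, dtype=np.float32))

vals = np.random.default_rng(0).standard_normal(4096).astype(np.float32)
# The two orderings agree mathematically but typically differ in the
# last bits, which is exactly the generator/trainer mismatch above.
print(reduce_untiled(vals), reduce_tiled(vals))
```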

Added links to #sig-post-training and #sig-batch-invariant Slack channels
in the blog post to invite readers to contribute to future developments.

Signed-off-by: Bram Wasti <[email protected]>
@youkaichao merged commit b56f9ce into vllm-project:main on Nov 12, 2025 (4 checks passed).
