[perf] refactor: optimizes DataProtoFuture to use fractional lazy fetching of futures #6234

Open
yurun00 wants to merge 1 commit into verl-project:main from yurun00:v0.dev

@yurun00 yurun00 commented May 2, 2026

What does this PR do?

This PR optimizes DataProtoFuture by replacing the brittle collect_fn and dispatch_fn mechanisms with native chunking logic that associates the futures with a start_fraction and an end_fraction.

Key changes:

  • Removed obsolete collect_fn and dispatch_fn properties from DataProtoFuture.
  • Added properties start_fraction and end_fraction to track chunking offsets in start and end futures directly.
  • Implemented fractional range calculations in chunk to accurately track boundaries across multiple chunks.
  • Streamlined the get method to materialize data using integer offsets, avoiding the overhead of fetching futures that fall outside the requested range.
  • Updated tests/test_protocol_on_cpu.py to cover the new fractional splitting logic.
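
The fractional bookkeeping described in the list above can be sketched as follows. This is a minimal illustration, not the code in verl/protocol.py: the helper names chunk_range, to_offsets, and futures_needed are hypothetical, and the sketch assumes every underlying future holds an equal share of the rows.

```python
import math
from fractions import Fraction


def chunk_range(start, end, chunks):
    """Split the fractional range [start, end) into `chunks` equal sub-ranges."""
    width = (end - start) / chunks
    return [(start + i * width, start + (i + 1) * width) for i in range(chunks)]


def to_offsets(start, end, total_len):
    """Map a fractional range onto integer row offsets for `total_len` rows."""
    lo, hi = start * total_len, end * total_len
    if lo.denominator != 1 or hi.denominator != 1:
        raise ValueError("chunk boundary does not fall on a whole row")
    return int(lo), int(hi)


def futures_needed(start, end, num_futures):
    """Indices of the futures that overlap [start, end).

    Assumption: each future covers an equal 1/num_futures share of the data.
    """
    first = math.floor(start * num_futures)
    last = math.ceil(end * num_futures) - 1
    return list(range(first, last + 1))


# Chunk the full range in two, then chunk the second half in two again.
halves = chunk_range(Fraction(0), Fraction(1), 2)
quarters = chunk_range(*halves[1], 2)  # [(1/2, 3/4), (3/4, 1)]

# The last quarter of an 8-row batch spread over 4 futures.
last_start, last_end = quarters[1]
print(to_offsets(last_start, last_end, total_len=8))        # (6, 8)
print(futures_needed(last_start, last_end, num_futures=4))  # [3]
```

Because the ranges are exact fractions, repeated chunking never accumulates rounding error, and only the futures that overlap the requested range would need to be fetched.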

Co-authored-by: Antigravity

Checklist Before Starting

  • Search for similar PRs. Paste at least one query link here: is:pr is:open DataProtoFuture
  • Format the PR title as [{modules}] {type}: {description} (This will be checked by the CI)
    • {modules} include single_controller
    • {type} is refactor

Test

Ran the updated CPU unit tests locally to validate the new fractional splitting logic:

pytest tests/test_protocol_on_cpu.py

cpu unit test: https://github.com/yurun00/verl/actions/runs/25240216728
sgl: https://github.com/yurun00/verl/actions/runs/25240216710
vllm: https://github.com/yurun00/verl/actions/runs/25240216727

API and Usage Example

No API change

Design & Code Changes

As shown in key changes above

Checklist Before Submitting

Important

Please check all the following items before requesting a review, otherwise the reviewer might deprioritize this PR for review.

CLAassistant commented May 2, 2026

CLA assistant check
All committers have signed the CLA.

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request refactors the DataProtoFuture class in verl/protocol.py to use fractional offsets (start_fraction and end_fraction) for data chunking, replacing the previous collect_fn and dispatch_fn mechanism. The changes include a more precise chunk implementation using the fractions module and an updated get method that slices data based on these fractions. Additionally, the PR includes minor cleanups to np.reshape calls and introduces comprehensive unit tests for DataProtoFuture in tests/test_protocol_on_cpu.py. A potential precision issue was identified in the get method where floating-point multiplication followed by integer truncation could lead to off-by-one errors in indexing.
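
The precision concern raised in the review can be reproduced in isolation. The snippet below is a standalone sketch, not the code under review: the values 0.29 and 100 are chosen only for illustration, and the variable names are hypothetical.

```python
from fractions import Fraction

total_len = 100

# Float path: 0.29 has no exact binary representation, so the product lands
# just below 29 and int() truncates it to the previous row.
float_offset = int(0.29 * total_len)

# Exact path: Fraction keeps 29/100 as a ratio of integers, so the product
# is exactly 29 before the conversion to int.
exact_offset = int(Fraction(29, 100) * total_len)

print(float_offset, exact_offset)  # 28 29
```

Keeping the boundaries as Fraction values until the final integer conversion is one way to avoid this kind of off-by-one error in the computed index.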

Comment thread verl/protocol.py
This commit optimizes the `DataProtoFuture` by replacing the brittle `collect_fn` and `dispatch_fn` mechanisms with robust, native chunking logic associating futures with `start_fraction` and `end_fraction`.

Key changes:
- Removed obsolete `collect_fn` and `dispatch_fn` properties from `DataProtoFuture`.
- Added properties `start_fraction` and `end_fraction` to track chunking offsets in start and end futures directly.
- Implemented fractional range calculations in `chunk` to accurately track boundaries across multiple chunks.
- Streamlined the `get` method to materialize data efficiently using integer offsets, avoiding overhead of fetching unnecessary futures
- Updated `tests/test_protocol_on_cpu.py` to cover the new fractional splitting logic.

Co-authored-by: Antigravity
Signed-off-by: Run Yu <yurun00@gmail.com>
@yurun00 yurun00 changed the title [single_controller] refactor: optimizes DataProtoFuture to use fractional lazy fetching of futures [perf] refactor: optimizes DataProtoFuture to use fractional lazy fetching of futures May 3, 2026