eagle3 cb impl with top-1 proposal #3055
Pull Request Overview
This PR implements EAGLE3 speculative decoding with continuous batching support for improved inference performance. The changes add a new speculative decoding variant that uses hidden state passing between main and draft models for more efficient token generation.
Key Changes
- Introduced Eagle3DecodingImpl for EAGLE3-specific speculative decoding logic
- Extended model runner to support hidden state import/export for EAGLE3
- Added test coverage for EAGLE3 speculative decoding scenarios
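The flow described above (a draft model proposing a top-1 chain, the main model verifying it, and hidden states handed from one to the other) can be sketched in a few lines. This is a toy illustration under stated assumptions, not the PR's code: `speculative_step`, `toy_main`, `toy_draft`, and the `(tokens, hidden) -> (token, hidden)` interface are all hypothetical, and a real implementation would verify the whole proposal in one batched forward pass rather than token by token.

```python
from typing import Callable, List, Tuple

# Hypothetical toy types: "hidden" stands in for the exported hidden state.
Hidden = Tuple[int, ...]
StepFn = Callable[[List[int], Hidden], Tuple[int, Hidden]]


def speculative_step(main_step: StepFn, draft_step: StepFn,
                     tokens: List[int], hidden: Hidden,
                     num_draft: int) -> Tuple[List[int], Hidden]:
    """One round of greedy speculative decoding with a top-1 proposal chain."""
    # Draft phase: starting from the main model's hidden state, the draft
    # proposes num_draft tokens, one top-1 candidate per position.
    proposal: List[int] = []
    ctx, draft_hidden = list(tokens), hidden
    for _ in range(num_draft):
        tok, draft_hidden = draft_step(ctx, draft_hidden)
        proposal.append(tok)
        ctx.append(tok)

    # Verify phase: keep the main model's token at each position and stop
    # at the first position where it disagrees with the draft.
    accepted: List[int] = []
    ctx, main_hidden = list(tokens), hidden
    for tok in proposal:
        expected, main_hidden = main_step(ctx, main_hidden)
        accepted.append(expected)
        ctx.append(expected)
        if expected != tok:
            break
    else:
        # Every draft token matched, so the main model adds one more token.
        bonus, main_hidden = main_step(ctx, main_hidden)
        accepted.append(bonus)
    # main_hidden is what the next round would hand to the draft model.
    return accepted, main_hidden


def toy_main(ctx: List[int], hidden: Hidden) -> Tuple[int, Hidden]:
    nxt = (ctx[-1] + 1) % 10
    return nxt, (nxt,)


def toy_draft(ctx: List[int], hidden: Hidden) -> Tuple[int, Hidden]:
    # Agrees with toy_main except when the last token is 1.
    step = 2 if ctx[-1] == 1 else 1
    nxt = (ctx[-1] + step) % 10
    return nxt, (nxt,)
```

With greedy verification the accepted tokens always equal what the main model alone would have produced; the draft only changes how many tokens are confirmed per round.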
Reviewed Changes
Copilot reviewed 21 out of 21 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| tests/python_tests/utils/hugging_face.py | Adds eagle3 model detection and handles tokenizer conditionally |
| tests/python_tests/test_continuous_batching.py | Adds EAGLE3 test cases and refactors test helper functions |
| tests/python_tests/samples/test_speculative_decoding_lm.py | Extracts common test logic and adds EAGLE3 sample tests |
| tests/python_tests/samples/conftest.py | Adds model configurations for EAGLE3 models |
| src/cpp/src/speculative_decoding/update_request_structs.hpp | Extends GeneratedSequence to store hidden states |
| src/cpp/src/speculative_decoding/speculative_decoding_impl.hpp | Refactors generate logic into template helper and exposes internal state |
| src/cpp/src/speculative_decoding/speculative_decoding_impl.cpp | Extracts scheduler initialization and refactors generate using strategy pattern |
| src/cpp/src/speculative_decoding/speculative_decoding_eagle3_impl.hpp | Defines EAGLE3 implementation with model transformations |
| src/cpp/src/speculative_decoding/speculative_decoding_eagle3_impl.cpp | Implements EAGLE3 decoding with hidden state management |
| src/cpp/src/speculative_decoding/continuous_batching_for_speculative_decoding_impl.hpp | Adds ContinuousBatchingForEagle3DecodingImpl class |
| src/cpp/src/speculative_decoding/continuous_batching_for_speculative_decoding_impl.cpp | Implements hidden state handling in update_requests |
| src/cpp/src/sequence_group.hpp | Adds hidden state storage and accessor methods to Sequence |
| src/cpp/src/sampling/sampler.hpp | Adds draft-to-target mapping for EAGLE decoding |
| src/cpp/src/sampling/sampler.cpp | Implements token index adjustment using draft2target mapping |
| src/cpp/src/llm/pipeline.cpp | Adds apply_eagle_rt_info helper and draft model configuration |
| src/cpp/src/continuous_batching/pipeline.cpp | Integrates EAGLE3 mode detection and instantiation |
| src/cpp/src/continuous_batching/model_runner.hpp | Adds hidden state flag system and sequence mapping structures |
| src/cpp/include/openvino/genai/continuous_batching_pipeline.hpp | Declares EAGLE3 implementation classes as friends |
| .github/workflows/windows.yml | Excludes eagle3 tests from main suite and adds dedicated test job |
| .github/workflows/manylinux_2_28.yml | Excludes eagle3 tests from main suite and adds dedicated test job |
| .github/workflows/linux.yml | Excludes eagle3 tests from main suite and adds dedicated test job |
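The sampler rows in the table mention adjusting token indices through a draft-to-target mapping. A minimal sketch of that idea follows, assuming the draft model uses a reduced vocabulary and the mapping is stored as per-index offsets; the PR's actual representation is not shown here, and `draft_to_target_ids` and `d2t_offsets` are hypothetical names.

```python
from typing import List, Sequence


def draft_to_target_ids(draft_ids: Sequence[int],
                        d2t_offsets: Sequence[int]) -> List[int]:
    """Translate draft-vocabulary indices into target-vocabulary token ids.

    Assumption: d2t_offsets[i] is the difference between the target id and
    the draft index i, so the target id is i + d2t_offsets[i].
    """
    return [i + d2t_offsets[i] for i in draft_ids]


# A 4-entry draft vocabulary whose entries correspond to target ids 0, 5, 9, 42.
example_offsets = [0, 4, 7, 39]
example_targets = draft_to_target_ids([1, 3], example_offsets)
```

Without such a translation step, tokens sampled from the draft model could not be compared against, or appended to, sequences expressed in the main model's vocabulary.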
Signed-off-by: fishbell <bell.song@intel.com>
1b08202 to 9d82589
Pull request overview
Copilot reviewed 23 out of 23 changed files in this pull request and generated 4 comments.
Comments suppressed due to low confidence (1)
src/cpp/src/speculative_decoding/speculative_decoding_eagle3_impl.cpp:1
- Corrected spelling of 'implementation' in URL comment.
// Copyright (C) 2023-2025 Intel Corporation
Pull request overview
Copilot reviewed 23 out of 23 changed files in this pull request and generated 3 comments.
Pull request overview
Copilot reviewed 23 out of 23 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (1)
src/cpp/src/speculative_decoding/speculative_decoding_eagle3_impl.cpp:1
- Corrected spelling of 'useage' to 'usage'.
// Copyright (C) 2023-2025 Intel Corporation
Co-authored-by: Vladimir Zlobin <vladimir.zlobin@intel.com>
Pull request overview
Copilot reviewed 23 out of 23 changed files in this pull request and generated 5 comments.
Comments suppressed due to low confidence (1)
src/cpp/src/speculative_decoding/speculative_decoding_eagle3_impl.cpp:1
- Corrected spelling of 'useage' to 'usage'.
// Copyright (C) 2023-2025 Intel Corporation
aaa5612
eagle3 CB impl
Tickets: CVS-173358
ref code: https://github.com/SafeAILab/EAGLE