hetero pd2 #1980

hsubramony · 2025-09-25T00:03:30Z

No description provided.


hsubramony · 2025-09-25T00:14:46Z

vllm/distributed/kv_transfer/kv_connector/v1/nixl_connector.py

                remote_engine_id, len(meta.local_block_ids),
                len(meta.remote_block_ids))
-            if self.use_host_buffer:
+            is_hetero = True


What's the plan for setting this variable?

We need the flag to enable _recving_metadata.

Are you going to add it as an env flag?

Can we reuse the DECODE_TP_RATIO env variable?

That seems good.

Check whether this can be moved to hpu_model_runner.
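A minimal sketch of the env-flag idea discussed in this thread, replacing the hardcoded `is_hetero = True`. The helper name `is_hetero_from_env` is hypothetical, and the interpretation of `DECODE_TP_RATIO` (a positive integer, where any value other than 1 means a heterogeneous setup) is an assumption, not something this PR confirms.

```python
import os
from typing import Mapping, Optional


def is_hetero_from_env(env: Optional[Mapping[str, str]] = None) -> bool:
    """Derive the hetero flag from DECODE_TP_RATIO (hypothetical helper).

    Assumption: the variable holds a positive integer and a value
    other than 1 marks a heterogeneous prefill/decode setup.
    """
    env = os.environ if env is None else env
    raw = env.get("DECODE_TP_RATIO")
    if raw is None:
        # Variable not set: treat the deployment as homogeneous.
        return False
    ratio = int(raw)  # raises ValueError if the value is not an integer
    if ratio < 1:
        raise ValueError(f"DECODE_TP_RATIO must be >= 1, got {ratio}")
    return ratio != 1
```

Passing a mapping instead of reading `os.environ` directly keeps the helper easy to test, for example `is_hetero_from_env({"DECODE_TP_RATIO": "2"})`.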

hsubramony · 2025-09-25T00:15:23Z

vllm/distributed/kv_transfer/kv_connector/v1/nixl_connector.py

        self.vllm_config = vllm_config
        self.block_size = vllm_config.cache_config.block_size
-
+        self.block_factor = 8 # A100.block_size/G2.block_size


Is this going to be a hardcoded value?

It's hardcoded for now; that may be OK since this number won't change.

It would be better to check the block size on both sides.

We can check whether remote.block_size is as expected. We don't know remote.block_size here because the handshake occurs afterwards.
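One way the post-handshake check suggested above could look, as a sketch only. The function `check_block_factor` and its parameter names are hypothetical; the ratio follows the diff comment (`A100.block_size/G2.block_size`), and the default of 8 is taken from the hardcoded value in the diff.

```python
def check_block_factor(a100_block_size: int,
                       g2_block_size: int,
                       expected_factor: int = 8) -> int:
    """Validate the block-size ratio once both sizes are known.

    Hypothetical helper: intended to run after the handshake, when the
    remote block size has been received.
    """
    if a100_block_size <= 0 or g2_block_size <= 0:
        raise ValueError("block sizes must be positive")
    if a100_block_size % g2_block_size != 0:
        raise ValueError(
            f"{a100_block_size} is not a multiple of {g2_block_size}")
    factor = a100_block_size // g2_block_size
    if factor != expected_factor:
        raise ValueError(
            f"expected block factor {expected_factor}, got {factor}")
    return factor
```

Failing loudly here would surface a mismatched deployment at handshake time instead of as silently misindexed KV blocks later.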

libinta · 2025-09-25T16:51:33Z

vllm/distributed/kv_transfer/kv_connector/v1/nixl_connector.py

                "Rank %s, get_finished: %s requests done sending "
                "and %s requests done recving", self.tp_rank,
                len(done_sending), len(done_recving))
+        #import remote_pdb; remote_pdb.set_trace()


Can you add a check that the remote is GPU attention?

Yes, I can add this.
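A rough sketch of what such a check might look like. Everything here is an assumption: the helper name, the idea that the remote's backend name is available as a string, and the placeholder value `"FLASH_ATTN"` standing in for whatever name the GPU side actually reports.

```python
from typing import FrozenSet

# Placeholder set: the real GPU attention backend names are not given
# in this thread, so this value is purely illustrative.
GPU_ATTENTION_BACKENDS: FrozenSet[str] = frozenset({"FLASH_ATTN"})


def remote_is_gpu_attention(
        remote_backend_name: str,
        gpu_backends: FrozenSet[str] = GPU_ATTENTION_BACKENDS) -> bool:
    """Return True if the remote's backend name is a known GPU backend."""
    return remote_backend_name in gpu_backends
```

The hetero-specific path could then be entered only when this returns True, rather than unconditionally.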

libinta · 2025-10-06T21:51:42Z

vllm/distributed/kv_transfer/kv_connector/v1/nixl_connector.py

-        if num_local_blocks < num_remote_blocks:
-            remote_block_ids = remote_block_ids[-num_local_blocks:]
+        #if num_local_blocks < num_remote_blocks:
+        #    remote_block_ids = remote_block_ids[-num_local_blocks:]


Add a check for hetero.
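Instead of commenting the truncation out, it could be guarded by the hetero flag, roughly as sketched below. The standalone function `trim_remote_block_ids` and the `is_hetero` parameter are hypothetical; the non-hetero branch mirrors the two removed lines in the diff.

```python
from typing import List, Sequence


def trim_remote_block_ids(local_block_ids: Sequence[int],
                          remote_block_ids: Sequence[int],
                          is_hetero: bool) -> List[int]:
    """Keep only the trailing remote blocks in the homogeneous case.

    Assumption: in the hetero case the local and remote block counts
    differ by design, so the remote list is left untouched.
    """
    if is_hetero:
        return list(remote_block_ids)
    num_local_blocks = len(local_block_ids)
    num_remote_blocks = len(remote_block_ids)
    if num_local_blocks == 0:
        # Guard: a slice of [-0:] would return the whole list.
        return []
    if num_local_blocks < num_remote_blocks:
        return list(remote_block_ids[-num_local_blocks:])
    return list(remote_block_ids)
```

This keeps the original behavior for homogeneous deployments while letting the hetero path skip the truncation.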

bukeao added 2 commits September 15, 2025 10:59

first commit for heterogeneous PD: accommodate block_size gap

ed7fe6b

fix indexing bug and enable the 1P1D mode, where P uses TP=1 and D uses TP=2

40a53a5

hsubramony changed the title ~~Buke hetero pd2~~ hetero pd2 Sep 25, 2025

hsubramony changed the base branch from habana_main to libint/debug_ttft September 25, 2025 00:07

hsubramony commented Sep 25, 2025


libinta reviewed Sep 25, 2025


add backend_name check

1744167

libinta reviewed Oct 6, 2025

3 participants