
Fix tune-visual multi gpu finetuning and provide http server impl #257

Merged
youliangtan merged 3 commits into main from fix/ddp-tune-visual-and-http-server on Jul 16, 2025

@youliangtan youliangtan commented Jul 10, 2025

  1. Fix the DDP error that occurs when tune-visual is enabled on the backbone during multi-GPU training.

Related to: #247 and #243

  2. Add an HTTP inference server, which makes it easier to expose inference through a public HTTP server.

WIP
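
For readers who want a feel for item 2, here is a minimal sketch of an HTTP inference endpoint built only on the Python standard library. It is not the server added in this PR: the `/act` route, the `predict` placeholder, and the JSON payload shape are all assumptions made up for illustration.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def predict(observation: dict) -> dict:
    """Hypothetical placeholder for the policy's inference call."""
    return {"action": [0.0] * len(observation.get("state", []))}


class InferenceHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        if self.path != "/act":  # hypothetical route name
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            observation = json.loads(self.rfile.read(length))
        except json.JSONDecodeError:
            self.send_error(400, "body must be JSON")
            return
        body = json.dumps(predict(observation)).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args) -> None:
        pass  # keep the sketch quiet


def start_server(host: str = "127.0.0.1", port: int = 0) -> ThreadingHTTPServer:
    """Serve in a background thread; port 0 lets the OS pick a free port."""
    server = ThreadingHTTPServer((host, port), InferenceHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def query(port: int, observation: dict) -> dict:
    """POST an observation as JSON and return the decoded JSON reply."""
    request = urllib.request.Request(
        f"http://127.0.0.1:{port}/act",
        data=json.dumps(observation).encode(),
        headers={"Content-Type": "application/json"},
    )
    # Bypass any proxy configured in the environment for this local call.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request, timeout=5) as response:
        return json.loads(response.read())
```

Binding to `127.0.0.1` keeps the sketch local. Exposing such a server publicly would also need authentication and input validation, which are left out here.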

Signed-off-by: youliangt <youliangt@nvidia.com>
for param in self.eagle_model.vision_model.parameters():
    if param.requires_grad:
        dummy_term = dummy_term + 0.0 * param.sum()
eagle_embeds = eagle_embeds + dummy_term

This is currently a hack to unblock things. We need a better solution.

Issue is now tracked here: #265
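
To illustrate what the dummy-term snippet above does, here is a small self-contained sketch. It assumes the DDP error came from trainable vision parameters receiving no gradient in some forward passes, which is a common trigger for DDP's unused-parameter error but is not confirmed in this thread. `TinyBackbone` and `vision_grads` are hypothetical names, not part of the repository.

```python
import torch
from torch import nn


class TinyBackbone(nn.Module):
    """Hypothetical stand-in for a backbone whose vision branch may be skipped."""

    def __init__(self) -> None:
        super().__init__()
        self.vision = nn.Linear(4, 4)  # stands in for a trainable vision tower
        self.text = nn.Linear(4, 4)

    def forward(self, x: torch.Tensor, use_vision: bool, add_dummy: bool) -> torch.Tensor:
        out = self.text(x)
        if use_vision:
            out = out + self.vision(x)
        if add_dummy:
            # Zero-weighted term: leaves `out` numerically unchanged but links
            # every trainable vision parameter into the autograd graph.
            dummy_term = torch.zeros((), dtype=out.dtype, device=out.device)
            for param in self.vision.parameters():
                if param.requires_grad:
                    dummy_term = dummy_term + 0.0 * param.sum()
            out = out + dummy_term
        return out


def vision_grads(add_dummy: bool) -> list:
    """Run one backward pass that skips the vision branch; return its grads."""
    torch.manual_seed(0)
    model = TinyBackbone()
    loss = model(torch.randn(2, 4), use_vision=False, add_dummy=add_dummy).sum()
    loss.backward()
    return [p.grad for p in model.vision.parameters()]
```

Without the dummy term, the skipped parameters end up with `grad` set to `None`. With it, they receive all-zero gradients, so every parameter takes part in the backward pass. PyTorch's `DistributedDataParallel` also offers `find_unused_parameters=True` for this situation; whether that option suits this model is not discussed here.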

Signed-off-by: youliangt <youliangt@nvidia.com>
@youliangtan youliangtan merged commit a2ec903 into main Jul 16, 2025
3 checks passed
@youliangtan youliangtan deleted the fix/ddp-tune-visual-and-http-server branch July 16, 2025 04:25
ddebenedittis pushed a commit to Borg-Robotics/Isaac-GR00T that referenced this pull request Oct 7, 2025
…IDIA#257)

* Fix tunevisual multi gpu and provide http server impl

Signed-off-by: youliangt <youliangt@nvidia.com>

* nit style comments

Signed-off-by: youliangt <youliangt@nvidia.com>

---------

Signed-off-by: youliangt <youliangt@nvidia.com>