aflah02 commented Nov 24, 2025

This PR adds examples for running DPO, SFT, and RM training with OpenRLHF on SkyPilot. I verified that the training runs work on GCP.

Author

aflah02 commented Nov 24, 2025

I am not sure why the PR breaks. After some discussion with AI tools, the likely cause seems to be some Sphinx logic that is not well suited to Windows, and the recommendation was to use MyST Markdown syntax to fix it. I'll take another look at this tomorrow, but the tests pass for now with the MyST syntax.

Comment on lines 24 to 30
docker run --name openrlhf_tmp --runtime=nvidia -v $PWD:/openrlhf nvcr.io/nvidia/pytorch:25.02-py3 bash -c "
pip uninstall xgboost transformer_engine flash_attn pynvml opencv-python-headless -y

pip install git+https://github.com/OpenRLHF/OpenRLHF.git@1bfcc334692f4d1e0ec0817465d3d9d495fb162f
"
docker commit openrlhf_tmp openrlhf:custom
docker rm openrlhf_tmp
Collaborator

Have we considered setting image_id to the Docker image directly? https://docs.skypilot.co/en/latest/examples/docker-containers.html#using-containers-as-runtime-environments
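
For illustration, a minimal sketch of what the image_id route could look like, with the container used as the runtime environment instead of being built through `docker run` / `docker commit`. The image and the pinned OpenRLHF commit are taken from the snippet above; the accelerator choice and the placeholder `run` command are assumptions and not part of this PR.

```yaml
# Hypothetical SkyPilot task file; resource choices below are assumptions.
resources:
  accelerators: A100:8  # assumption: pick whatever the training run needs
  # Run setup and run commands inside this container image.
  image_id: docker:nvcr.io/nvidia/pytorch:25.02-py3

setup: |
  # Same package changes as the docker run/commit snippet above.
  pip uninstall xgboost transformer_engine flash_attn pynvml opencv-python-headless -y
  pip install git+https://github.com/OpenRLHF/OpenRLHF.git@1bfcc334692f4d1e0ec0817465d3d9d495fb162f

run: |
  # Placeholder: the actual DPO/SFT/RM training command goes here.
  echo "start training"
```

With this layout there is no intermediate `openrlhf:custom` image to commit and clean up; the package changes move into `setup`.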

Author

Ah, I thought both routes were fine, given that the docs show examples of both. I'll test out an image_id-based approach.

Collaborator

Michaelvll left a comment

Thanks @aflah02 for submitting the PR! This is awesome! Just a quick question, see the comment above : )

Author

aflah02 commented Nov 25, 2025

Hi @Michaelvll,
I updated the examples to use the image_id-based Docker setup and tested them again.

Re the docs/source/examples/training/openrlhf.md file: I am still trying to figure out why it does not work if I simply have ../../generated-examples/openrlhf.md in the file. I'll look at it more closely tomorrow, but I wanted to ask whether you have any ideas or experience with this. My debugging so far leads me to believe it is caused by some Windows issue, but I might be wrong.
