Add OpenRLHF Example #8077
I'm not sure why the PR breaks, but after some discussion with AI tools, the cause appears to be some Sphinx logic that is not well suited to Windows, so they recommended switching to MyST Markdown syntax to fix it. I'll take another look at this tomorrow, but the tests pass for now with the MyST syntax.
llm/openrlhf/openrlhf_sft.yaml
Outdated
docker run --name openrlhf_tmp --runtime=nvidia -v $PWD:/openrlhf nvcr.io/nvidia/pytorch:25.02-py3 bash -c "
pip uninstall xgboost transformer_engine flash_attn pynvml opencv-python-headless -y
pip install git+https://github.com/OpenRLHF/OpenRLHF.git@1bfcc334692f4d1e0ec0817465d3d9d495fb162f
"
docker commit openrlhf_tmp openrlhf:custom
docker rm openrlhf_tmp
Have we considered setting image_id to the Docker image directly? https://docs.skypilot.co/en/latest/examples/docker-containers.html#using-containers-as-runtime-environments
Ah, I thought both routes were fine, given that the docs show examples of both. I'll test out an image_id-based approach.
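For reference, here is a rough sketch of what the image_id route could look like, assuming SkyPilot's `docker:` prefix for `image_id` as described in the linked docs. This is not the file from this PR: the accelerator spec and the `run` command are placeholders, and only the image name, the uninstall list, and the pinned OpenRLHF commit are taken from the snippet above.

```yaml
# Sketch only, not the YAML from this PR.
resources:
  # Assumption: the PR's actual accelerator choice is not shown in this thread.
  accelerators: A100:8
  # Use the container as the runtime environment instead of `docker run` + `docker commit`.
  image_id: docker:nvcr.io/nvidia/pytorch:25.02-py3

setup: |
  # Same package cleanup and pinned install as in the original docker-based snippet.
  pip uninstall xgboost transformer_engine flash_attn pynvml opencv-python-headless -y
  pip install git+https://github.com/OpenRLHF/OpenRLHF.git@1bfcc334692f4d1e0ec0817465d3d9d495fb162f

run: |
  # Placeholder: the actual training command is not shown in this thread.
  echo "launch OpenRLHF training here"
```

With this layout the `setup` section runs inside the container, so the temporary `openrlhf_tmp` container and the `openrlhf:custom` image would no longer be needed.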
Michaelvll left a comment
Thanks @aflah02 for submitting the PR! This is awesome! Just a quick question, see the comment above : )
Hi @Michaelvll Re the
This PR adds examples for running DPO, SFT, and RM training using OpenRLHF on SkyPilot. Verified that the training runs on GCP.