Merge --base-docker-image and --docker-image flag #585
base: develop
3d26153 to 954377f
36b0cfc to 95e6fa0
95e6fa0 to b70f1b8
Hmm, I'm rethinking if we should be merging these flags together. I think we should still support both of these flags, but when we're using the benchmark runner in maxtext, we should support Pathways being able to use |
https://github.com/AI-Hypercomputer/xpk/blob/chzheng/docker_image_flag/src/xpk/core/docker_image.py#L228 will check |
Thanks Zheng!
Once the commented-out code is removed, it looks good to me.
6bf4461 to 08a7f36
Done
@scaliby Would you be able to take a look at this PR?
Fixes / Features
Merge --base-docker-image and --docker-image flag
Testing / Documentation
Tested with https://github.com/AI-Hypercomputer/maxtext/blob/wstcliyu/pw-405b-scale-test/benchmarks/recipes/pw_mcjax_benchmark_recipe.py for both mcjax and pathways.
Changed https://github.com/AI-Hypercomputer/maxtext/blob/wstcliyu/pw-405b-scale-test/benchmarks/maxtext_xpk_runner.py#L624 so that mcjax uses RUNNER = "maxtext_base_image" and pathways uses RUNNER = "gcr.io/tpu-prod-env-multipod/wstcliyu_latest:latest" as the runner image.
mcjax will push the local maxtext_base_image to the remote for pods to pull, and pathways will pull images directly from the remote.
XPK log:
Pod log:
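For reviewers, the difference between the two runner images in the testing notes can be sketched as below. This is only an illustration, not the logic in xpk's docker_image.py: the function names `has_registry_host` and `plan_image_action` are hypothetical, and the rule used (a first path component containing "." or ":", or equal to "localhost", is treated as a registry host) is the usual Docker image-reference convention, assumed here rather than taken from this PR.

```python
def has_registry_host(image: str) -> bool:
    """Return True if the image reference starts with a registry hostname.

    Assumption: follows the common Docker convention that the first path
    component is a registry host only if it contains '.' or ':' or is
    'localhost'. This is NOT taken from xpk's implementation.
    """
    first, sep, _ = image.partition("/")
    if not sep:
        # No '/' at all, e.g. "maxtext_base_image": a bare local name.
        return False
    return "." in first or ":" in first or first == "localhost"


def plan_image_action(image: str) -> str:
    """Hypothetical helper: decide how pods would obtain the image."""
    if has_registry_host(image):
        # Remote reference (the pathways case in the testing notes).
        return "pull"
    # Local name (the mcjax case): push it first so pods can pull it.
    return "push-then-pull"


if __name__ == "__main__":
    for runner in (
        "maxtext_base_image",
        "gcr.io/tpu-prod-env-multipod/wstcliyu_latest:latest",
    ):
        print(runner, "->", plan_image_action(runner))
```

With the two RUNNER values from the testing notes, the local name maps to the push path and the gcr.io reference maps to the direct pull path.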