
support qwen3 on nvidia #3302

Open: icyxp wants to merge 2 commits into main
Conversation

@icyxp (Contributor) commented Jul 23, 2025

Support Qwen3 on Nvidia

@NikiBase

The URL is wrong: https://huggingface.co/collections/Qwen/qwen3-67c6c6f89c4f76621268bb6d
I think it should be this one: https://huggingface.co/collections/Qwen/qwen3-67dd247413f0e2e4f653967f

Commit: fix qwen3 url
@NikiBase

Hi @icyxp, I have tried your pull request locally, using the image built with the llamacpp backend, and it does not work:
https://huggingface.co/docs/text-generation-inference/main/en/backends/llamacpp#build-docker-image
I updated the llama.cpp version in Dockerfile_llamacpp to the latest release, but it is still not working for the Qwen3 model; maybe I need to change something else, but I am not sure. Could you please explain how you ran your pull request?
Thank you in advance.
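
For reference, a minimal sketch of the build step the linked docs describe, assuming a local checkout of the PR branch; the tgi-llamacpp image tag is an arbitrary placeholder:

```sh
# Build the llama.cpp backend image from the repository root of the local checkout.
# Dockerfile_llamacpp is the file referenced in the linked docs; the tag is arbitrary.
docker build -t tgi-llamacpp -f Dockerfile_llamacpp .
```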

@icyxp (Contributor, Author) commented Jul 29, 2025

@NikiBase Please use the main Dockerfile; this was not tested on llamacpp. Maybe you can check out this: https://qwen.readthedocs.io/en/latest/run_locally/llama.cpp.html
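
A hedged sketch of that path, building the NVIDIA image from the PR checkout and serving a Qwen3 checkpoint; the tgi-qwen3 tag and the Qwen/Qwen3-8B model id are illustrative assumptions, not taken from the PR:

```sh
# Build the main (NVIDIA/CUDA) image from the repository root of the PR checkout.
docker build -t tgi-qwen3 -f Dockerfile .

# Serve a Qwen3 model with the freshly built image; adjust the model id and volume as needed.
docker run --gpus all --shm-size 1g -p 8080:80 \
    -v "$PWD/data:/data" \
    tgi-qwen3 --model-id Qwen/Qwen3-8B
```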

@NikiBase commented Jul 29, 2025

I am trying to run the docker build from the base Dockerfile and I got the following error at this stage:

ERROR [base 10/26] RUN cd server &&  uv sync --frozen --extra gen --extra bnb --extra accelerate --extra compressed-tensors --extra quantize --extra peft --extra outlines --extra t

...

583.8 requests.exceptions.HTTPError: 429 Client Error: Too Many Requests for url: https://huggingface.co/api/resolve-cache/models/kernels-community/moe/e3efab933893cde20c5417ba185fa3b7cc811b24/build%2Ftorch27-cxx11-cu128-x86_64-linux%2Fmoe%2Fconfigs%2FE%3D8%2CN%3D8192%2Cdevice_name%3DAMD_Instinct_MI325X%2Cdtype%3Dfp8_w8a8.json

Did you experience this issue during the build phase?

I am running it again to see if it is a temporary connection issue; I will let you know if it resolves this way.

Thank you!
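
The 429 suggests Hugging Face Hub rate limiting while the build fetches prebuilt kernel configs, so re-running may be enough. A hedged sketch of such a retry (the tgi-qwen3 tag and the number of attempts are arbitrary choices):

```sh
# Hypothetical retry loop for a transient 429 from the Hub during the image build.
# The tgi-qwen3 tag and the three attempts are arbitrary illustrative choices.
for attempt in 1 2 3; do
    docker build -t tgi-qwen3 -f Dockerfile . && break
    echo "Build attempt ${attempt} failed; waiting before retrying..."
    sleep 60
done
```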
