OpenCL: add initial FA support #14987


Open: wants to merge 3 commits into base: master

rmatif (Collaborator) commented Jul 31, 2025

This PR introduces F16/F32 flash attention (FA) support for the OpenCL backend. Achieving good performance on this kind of hardware has been extremely challenging, but I believe it is now decent enough to serve as a baseline that we can iterate on. I also believe there is room for improvement in text generation (tg).
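For readers unfamiliar with FA, here is a minimal sketch of the general idea for a single query row: K/V are consumed in blocks while a running maximum and running sum keep the softmax numerically stable, so the full score row is never materialized. This is plain Python for illustration only, not the OpenCL kernel in this PR; the function names, the block size, and the single-row layout are all assumptions.

```python
import math

def naive_attention(q, K, V):
    """Reference: softmax(q . K^T / sqrt(d)) . V for one query row."""
    d = len(q)
    scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    return [sum(w * v[j] for w, v in zip(weights, V)) / total
            for j in range(len(V[0]))]

def flash_attention_row(q, K, V, block=2):
    """Same result, computed block by block (hypothetical helper, block size is arbitrary)."""
    scale = 1.0 / math.sqrt(len(q))
    m = float("-inf")          # running maximum of the scores seen so far
    l = 0.0                    # running sum of exp(score - m)
    acc = [0.0] * len(V[0])    # running unnormalized output
    for start in range(0, len(K), block):
        k_blk = K[start:start + block]
        v_blk = V[start:start + block]
        s = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in k_blk]
        m_new = max(m, max(s))
        # Rescale the previous state to the new maximum; this is 0.0 on the first block.
        correction = math.exp(m - m_new)
        p = [math.exp(x - m_new) for x in s]
        l = l * correction + sum(p)
        acc = [a * correction + sum(pi * v[j] for pi, v in zip(p, v_blk))
               for j, a in enumerate(acc)]
        m = m_new
    return [a / l for a in acc]
```

Both functions should agree to within floating-point rounding; the blocked version only ever holds one block of scores at a time.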

Results on Adreno 830:

| model | size | params | backend | ngl | fa | test | t/s |
| --- | --- | --- | --- | --- | --- | --- | --- |
| llama 1B F16 | 2.30 GiB | 1.24 B | OpenCL | 99 | 0 | pp512 | 198.69 ± 0.59 |
| llama 1B F16 | 2.30 GiB | 1.24 B | OpenCL | 99 | 0 | tg128 | 21.88 ± 0.85 |
| llama 1B F16 | 2.30 GiB | 1.24 B | OpenCL | 99 | 1 | pp512 | 274.75 ± 1.22 |
| llama 1B F16 | 2.30 GiB | 1.24 B | OpenCL | 99 | 1 | tg128 | 21.58 ± 0.39 |

Adreno 750:

| model | size | params | backend | ngl | fa | test | t/s |
| --- | --- | --- | --- | --- | --- | --- | --- |
| llama 1B F16 | 2.30 GiB | 1.24 B | OpenCL | 99 | 0 | pp512 | 139.96 ± 0.51 |
| llama 1B F16 | 2.30 GiB | 1.24 B | OpenCL | 99 | 0 | tg128 | 19.70 ± 0.11 |
| llama 1B F16 | 2.30 GiB | 1.24 B | OpenCL | 99 | 1 | pp512 | 151.22 ± 0.85 |
| llama 1B F16 | 2.30 GiB | 1.24 B | OpenCL | 99 | 1 | tg128 | 17.94 ± 0.15 |
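
To make the fa=0 vs fa=1 comparison easier to read, the relative change can be computed from the mean t/s values reported above. This is a small sketch: the dictionary layout and the `relative_change` helper are just for illustration, and the ± spreads are ignored.

```python
# Mean t/s from the tables above as (fa=0, fa=1); the ± spreads are ignored here.
results = {
    "Adreno 830": {"pp512": (198.69, 274.75), "tg128": (21.88, 21.58)},
    "Adreno 750": {"pp512": (139.96, 151.22), "tg128": (19.70, 17.94)},
}

def relative_change(fa_off, fa_on):
    """Percent change in throughput when FA is enabled (hypothetical helper)."""
    return (fa_on / fa_off - 1.0) * 100.0

for device, tests in results.items():
    for test, (off, on) in tests.items():
        print(f"{device} {test}: {relative_change(off, on):+.1f}%")
```

With these numbers, pp512 improves by roughly 38% on the Adreno 830 and roughly 8% on the Adreno 750, while tg128 drops by about 1% and 9% respectively.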

rmatif requested review from max-krasnyansky and lhez July 31, 2025 12:35
github-actions bot added the ggml (changes relating to the ggml tensor library for machine learning) and OpenCL (issues specific to the OpenCL backend) labels Jul 31, 2025

lhez (Collaborator) commented Aug 1, 2025

@rmatif Very cool, thank you!

lhez (Collaborator) commented Aug 10, 2025

Sorry, got distracted during the past week. Will come back to this asap.
