Releases: ggml-org/llama.cpp
b5477
releases : bundle llvm omp library in windows release (#13763)
b5476
releases : enable openmp in windows cpu backend build (#13756)
b5475
ggml-cpu : set openmp wait time if not set (#13758)
b5474
Move GLM4 f32 attention fix to the correct function (#13750)
b5473
ggml : add ggml_gelu_erf() CUDA kernel (#13719)
b5472
vocab : fix ugm tokenizer precision (#13743)
b5471
CUDA: fix race condition in FA vector kernels (#13742)
b5468
hparams : initialize arrays (#13728)
b5466
server : support audio input (#13714)
* add audio support on webui
b5465
CANN: Support MUL_MAT_ID for q8_0 and q4_0 (#13705)