Releases · ggml-org/llama.cpp
b5311
context : remove logits_all flag (#13284)
* context : remove logits_all flag
* llama : remove logits_all flag + reorder llama_context_params
b5310
ci : move release workflow to a separate file (#13362)
b5309
llama : print size and type of overridden tensors (#13364)
b5308
sycl: addressing non-contiguous src1 mul_mats (nc and batched) (#13343)
* sycl: fixed non-contiguous src1 mul_mats (nc and batched)
* fixed wrong static_cast inside kernel
b5306
sync : ggml
b5303
llama : deci : support ffn-free with attention (#13296)
b5302
common : Add a warning when we can't match samplers from a string or …
b5301
cuda : remove nrows_x in mul_mat_q_process_tile (#13325) Signed-off-by: Xiaodong Ye <[email protected]>
b5300
examples : remove infill (#13283)
b5299
llama : support tie embedding for chatglm models (#13328)
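Tied ("tie") embeddings mean the model's output projection reuses the token-embedding matrix transposed, instead of storing a separate `lm_head` tensor. A minimal numpy sketch of the idea (all names and sizes here are illustrative, not llama.cpp internals):

```python
import numpy as np

# Hypothetical sizes for illustration only.
vocab_size, d_model = 8, 4
rng = np.random.default_rng(0)

# One shared embedding matrix E: row t is the embedding of token t.
E = rng.standard_normal((vocab_size, d_model))

def embed(token_ids):
    # Input side: look up rows of E.
    return E[token_ids]

def output_logits(hidden):
    # Output side: reuse E transposed in place of a separate lm_head,
    # so logits[i, t] = hidden[i] . E[t].
    return hidden @ E.T

h = embed(np.array([1, 3]))   # shape (2, d_model)
logits = output_logits(h)     # shape (2, vocab_size)
```

With tying, loading a model only needs the embedding tensor once; the commit above extends that handling to chatglm-family models.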