Pull requests: ggml-org/llama.cpp
- llama: fix leaked buffers for mmap + split files (#16765, opened Oct 25, 2025 by JohannesGaessler)
- model : add LightOnOCR-1B model [examples, python] (#16764, opened Oct 24, 2025 by ngxson)
- Add LFM2 tool handling [testing] (#16763, opened Oct 24, 2025 by ykhrustalev)
- convert: Handle mmproj model output filename properly [python] (#16760, opened Oct 24, 2025 by Galunid)
- webui: add HTML/JS preview support to MarkdownContent with sandboxed iframe [examples, server] (#16757, opened Oct 24, 2025 by ServeurpersoCom)
- qwen3-coder tool call parser [testing] (#16755, opened Oct 24, 2025 by marceldev89)
- rpc: use XXHash64 instead of FNV-1a for hashing tensors [ggml]
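As background for the entry above: FNV-1a is the simple multiply-and-xor hash being replaced. A minimal sketch of the classic 64-bit FNV-1a algorithm, in Python for illustration only (this mirrors the published algorithm, not llama.cpp's actual tensor-hashing code):

```python
def fnv1a_64(data: bytes) -> int:
    # Classic 64-bit FNV-1a: xor each byte in, then multiply by the FNV prime.
    h = 0xCBF29CE484222325  # FNV-1a 64-bit offset basis
    for b in data:
        h ^= b
        h = (h * 0x100000001B3) & 0xFFFFFFFFFFFFFFFF  # FNV prime, wrap to 64 bits
    return h
```

XXHash64 trades this byte-at-a-time loop for wide, stripe-based mixing, which is substantially faster on large buffers such as tensor data.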
- cann: improve device ID handling and aclnnArange checks [Ascend NPU, ggml] (#16752, opened Oct 24, 2025 by noemotiovon)
- llama : disable pipeline parallelism if compute buffer allocation fails (#16748, opened Oct 23, 2025 by slaren)
- llama: consistent ctx <-> buf order for KV cache [ggml] (#16746, opened Oct 23, 2025 by JohannesGaessler)
- ggml: fix cuda kernel launch configuration for k_compute_batched_ptrs to support large batch [ggml, Nvidia GPU, testing] (#16744, opened Oct 23, 2025 by leejet)
- get_rows & dequantize function implementation for repacked weights of type q6_K (q6_Kx8) [ggml] (#16743, opened Oct 23, 2025 by swetha097)
- ggml-cpu: arm64: q4_K repack gemm and gemv implementations [ggml] (#16739, opened Oct 23, 2025 by Alcpz)
- sycl: add REPEAT_BACK operation support [ggml, SYCL] (#16734, opened Oct 23, 2025 by shani-f)
- llama-server : Reduce log level of a debug message leaking prompt contents [examples, server] (#16727, opened Oct 22, 2025 by l-austenfeld)
- CUDA: General GEMV fusion [ggml, Nvidia GPU, python, testing] (#16715, opened Oct 22, 2025 by am17an)
- CUDA: support for weight clamp in top-k norm [ggml, Nvidia GPU, testing] (#16702, opened Oct 21, 2025 by am17an)
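For context on the entry above: clamping bounds each selected weight to a range before the weights are normalized. A generic hedged sketch of that pattern in Python (the function name, signature, and exact semantics are assumptions for illustration, not the PR's actual CUDA kernel):

```python
def clamp_and_norm(weights: list[float], lo: float, hi: float) -> list[float]:
    # Clamp each weight into [lo, hi], then renormalize so the result sums to 1.
    clamped = [min(max(w, lo), hi) for w in weights]
    total = sum(clamped)
    return [w / total for w in clamped]
```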
- ggml : fix interpolate with align-corners and ne=1 [ggml, Nvidia GPU, OpenCL, testing, Vulkan] (#16700, opened Oct 21, 2025 by Acly)
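For context on the entry above: with align-corners interpolation the source coordinate is computed as `dst_i * (n_src - 1) / (n_dst - 1)`, which divides by zero when the output dimension has extent 1. A hedged sketch of the coordinate mapping with that edge case handled (the ne=1 fallback shown is a common convention and an assumption here, not necessarily the PR's exact fix):

```python
def align_corners_src_coord(dst_i: int, n_src: int, n_dst: int) -> float:
    # Align-corners maps the first and last output samples onto the first
    # and last input samples. With a single output sample (n_dst == 1) the
    # formula would divide by zero, so fall back to coordinate 0.0.
    if n_dst == 1:
        return 0.0
    return dst_i * (n_src - 1) / (n_dst - 1)
```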
- fix[readme]: Update docs/build.md to match the new GPU_TARGETS [documentation] (#16698, opened Oct 21, 2025 by catan2001)
- llama-context: fix build fails with -Werror=missing-braces (#16692, opened Oct 21, 2025 by otegami)
- convert : enable expert group selection for all models with it [python] (#16691, opened Oct 20, 2025 by CISC)