Releases · ggml-org/llama.cpp

22 May 23:26

3079e9a

b5460

release : fix windows hip release (#13707)

* release : fix windows hip release

* make single hip release with multiple targets

Assets 18

22 May 19:38

github-actions

b5459

8a1d206


tts : fix n_ubatch + make WavTokenizer cache-less (#13713)

ggml-ci

Assets 20

22 May 19:02

github-actions

b5458

797990c


mtmd : add ultravox audio input (#13623)

* convert ok, load ok

* warmup ok

* test

* still does not work?

* fix padding

* temporary give up

* fix merge conflict

* build_ultravox()

* rm test

* fix merge conflict

* add necessary mtmd APIs

* first working version (only 4s of audio)

* will this monster compile?

* fix compile

* please compile

* fPIC

* fix windows

* various fixes

* clean up audio_helpers

* fix conversion

* add some debug stuff

* long audio input ok

* adapt the api

* add --audio arg

* final touch UX

* add miniaudio to readme

* fix typo

* refactor kv metadata

* mtmd_default_marker()

Assets 20

22 May 14:15

github-actions

b5456

cc74d5b


server : pad small embedding batches (#13692)

ggml-ci

Assets 20

22 May 12:51

github-actions

b5454

d394a9a


sycl : Remove waits from function calls (#13702)

* removes the waits in async memcpy functions

Assets 20

22 May 09:01

github-actions

b5453

6b56a64


SYCL: Avoid using with SYCL-Graph for unsupported nodes (#13587)

Currently, on the CUDA backend of SYCL, running
`GGML_SYCL_DISABLE_GRAPH=0 ./bin/test-backend-ops -b SYCL0` triggers
two operations that throw an exception from blocking
waits during queue recording.

* `-o CONCAT` : Use of blocking waits on a queue that's being recorded https://github.com/ggml-org/llama.cpp/blob/master/ggml/src/ggml-sycl/concat.cpp#L185-L187
* `-o MUL_MAT_ID`: Blocking wait on a recording queue for a copy to host memory https://github.com/ggml-org/llama.cpp/blob/master/ggml/src/ggml-sycl/ggml-sycl.cpp#L3072-L3074

We've noticed that `ggml-cuda.cu` has the
[check_node_graph_compatibility_and_refresh_copy_ops](https://github.com/ggml-org/llama.cpp/blob/39e73ae0d69f882d7e29cecc6dd8f5052fca6731/ggml/src/ggml-cuda/ggml-cuda.cu#L2458-L2458)
method for checking whether a graph can be used, even when graphs are enabled.
I've taken a similar approach in this PR by adding a method to `ggml-sycl.cpp`
that checks whether a graph can be used for the operations, even if the user
has asked for it to be enabled.
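
As a rough illustration of the compatibility check described above, here is a minimal C++ sketch. It is not the code from the PR: the `Op` enum, the `Node` struct, and the names `op_blocks_during_recording` and `can_use_graph` are hypothetical stand-ins, and the only ops treated as unsupported are the two named in the description (`CONCAT` and `MUL_MAT_ID`).

```cpp
#include <cassert>
#include <vector>

// Hypothetical op tags for illustration; the real backend works on ggml ops.
enum class Op { Add, MulMat, Concat, MulMatId };

// Hypothetical, simplified stand-in for a compute-graph node.
struct Node {
    Op op;
};

// Ops that, per the description above, perform blocking waits and
// therefore cannot run while a queue is being recorded.
inline bool op_blocks_during_recording(Op op) {
    return op == Op::Concat || op == Op::MulMatId;
}

// A graph is used only if the user asked for it AND every node is recordable.
inline bool can_use_graph(const std::vector<Node> & nodes, bool graph_requested) {
    if (!graph_requested) {
        return false;
    }
    for (const Node & node : nodes) {
        if (op_blocks_during_recording(node.op)) {
            return false; // fall back to eager execution for this graph
        }
    }
    return true;
}
```

The point of the design is that the user's setting becomes a request rather than a guarantee: a graph containing an unsupported node falls back to normal execution instead of throwing.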

Assets 20

21 May 23:34

github-actions

b5452

a4e8912


opencl: Add support for multiple devices (#12622)

* opencl: Add support for multiple devices

... but limited to one platform. A platform with a GPU will be preferred.

Additionally:

* Filter out devices that lack capabilities needed by the backend
  implementation (half support, OpenCL 2.0+, etc).

* Make ggml_backend_opencl_reg() thread-safe.

* fixup: fix an error in sync_with_other_backends

... when there is only one OpenCL device available.
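
The device selection described above could look roughly like the following C++ sketch. Everything here is an assumption for illustration: `DeviceInfo`, `PlatformInfo`, `device_is_usable`, and `choose_platform` are hypothetical names, the capability test covers only the two examples the commit message mentions (half support and OpenCL 2.0+), and treating "a platform with a GPU" as "a platform with a usable GPU" is a guess, not something the message states.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, pre-queried device properties (not the real OpenCL structs).
struct DeviceInfo {
    std::string name;
    bool        is_gpu;
    bool        has_half;  // half-precision support
    int         cl_major;  // major OpenCL version reported by the device
};

struct PlatformInfo {
    std::string             name;
    std::vector<DeviceInfo> devices;
};

// Keep only devices with the capabilities the backend needs.
inline bool device_is_usable(const DeviceInfo & d) {
    return d.has_half && d.cl_major >= 2;
}

// Pick exactly one platform: the first with a usable GPU, otherwise the
// first with any usable device. Returns -1 if no platform qualifies.
inline int choose_platform(const std::vector<PlatformInfo> & platforms) {
    int fallback = -1;
    for (std::size_t i = 0; i < platforms.size(); ++i) {
        bool any_usable = false;
        for (const DeviceInfo & d : platforms[i].devices) {
            if (!device_is_usable(d)) {
                continue;
            }
            if (d.is_gpu) {
                return static_cast<int>(i);
            }
            any_usable = true;
        }
        if (any_usable && fallback < 0) {
            fallback = static_cast<int>(i);
        }
    }
    return fallback;
}
```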

Assets 20

21 May 20:56

github-actions

b5451

edbf42e


opencl: fix couple crashes (#12795)

* opencl: fix couple crashes

* fix kernel launches that failed on devices which do not support
  non-uniform work-groups. When non-uniform work-groups are not
  supported, set `local_work_size` to NULL (= let the driver choose the
  work-group sizes). This patch does not cover everything - just the
  cases tested by test-backend-ops.

* fix sub-buffer creation that failed because `cl_buffer_region::origin`
  was not aligned to `CL_DEVICE_MEM_BASE_ADDR_ALIGN`.

* OpenCL: query non-uniform WG sizes only on OpenCL 3.0+
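
The two crash fixes above can be sketched in plain C++ without the OpenCL headers. This is an illustration under stated assumptions, not the patch itself: `pick_local_work_size`, `SplitOrigin`, and `split_origin` are hypothetical names, and rounding the origin down while carrying the remainder as a separate offset is just one possible way to satisfy the alignment requirement. The sketch does rely on one documented fact: `CL_DEVICE_MEM_BASE_ADDR_ALIGN` is reported in bits, so it is converted to bytes first.

```cpp
#include <cassert>
#include <cstddef>

// Returns the pointer to pass as local_work_size when enqueueing a kernel.
// Without non-uniform work-group support, nullptr lets the driver choose.
inline const std::size_t * pick_local_work_size(const std::size_t * requested,
                                                bool non_uniform_supported) {
    return non_uniform_supported ? requested : nullptr;
}

// Hypothetical result type: an aligned sub-buffer origin plus the
// remaining byte offset to apply inside the sub-buffer.
struct SplitOrigin {
    std::size_t base;   // aligned value usable as cl_buffer_region::origin
    std::size_t extra;  // leftover offset in bytes
};

// Split a byte offset into an aligned base and a remainder.
// align_bits is the device's base address alignment, in bits.
inline SplitOrigin split_origin(std::size_t origin, std::size_t align_bits) {
    const std::size_t align_bytes = align_bits / 8;
    // The bit mask below only works for power-of-two alignments.
    assert(align_bytes != 0 && (align_bytes & (align_bytes - 1)) == 0);
    const std::size_t base = origin & ~(align_bytes - 1);
    return SplitOrigin{base, origin - base};
}
```

For example, with an alignment of 1024 bits (128 bytes), an origin of 1000 splits into an aligned base of 896 and a leftover offset of 104.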

Assets 20

21 May 20:42

github-actions

b5450

d643bb2


releases : build CPU backend separately (windows) (#13642)

Assets 20

21 May 17:57

github-actions

b5449

8e186ef


hparams : support models for which all layers use SWA (#13682)

ggml-ci

Assets 20
