v1.2.3
- Rebased so that llama-cpp-python picks up the upstream CVE fix (GHSA-56xg-wfcc-g829)
- Experimental support for the Q8R16 quantized format, with optimized matrix multiplication kernels
- Updated CMake files to build llama.aio on AmpereOne
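
The Q8R16 internals are not documented in these notes, but the general idea behind block-wise 8-bit quantization can be sketched as follows. This is a minimal illustration, assuming int8 values with one float scale per block of 16 elements (the block size and layout here are assumptions, not the actual Q8R16 specification):

```python
import numpy as np

def quantize_q8_block16(x: np.ndarray):
    """Quantize a float32 vector to int8 in blocks of 16, one scale per block.
    Illustrative sketch only: the real Q8R16 layout is not specified here."""
    assert x.size % 16 == 0, "length must be a multiple of the block size"
    blocks = x.reshape(-1, 16)
    # One scale per block, chosen so the largest magnitude maps to 127
    scales = np.abs(blocks).max(axis=1) / 127.0
    scales[scales == 0] = 1.0  # avoid division by zero for all-zero blocks
    q = np.round(blocks / scales[:, None]).astype(np.int8)
    return q, scales.astype(np.float32)

def dequantize_q8_block16(q: np.ndarray, scales: np.ndarray) -> np.ndarray:
    """Reverse the quantization: scale each int8 block back to float32."""
    return (q.astype(np.float32) * scales[:, None]).reshape(-1)

x = np.linspace(-1.0, 1.0, 32, dtype=np.float32)
q, s = quantize_q8_block16(x)
x_hat = dequantize_q8_block16(q, s)
# Per-element error is bounded by half a quantization step (scale / 2)
```

Keeping the scales in blocks like this lets a matmul kernel multiply int8 values directly and apply the scales once per block, which is where most of the speedup in such kernels comes from.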