v1.2.3
- Rebased so that llama-cpp-python picks up the upstream CVE fix (GHSA-56xg-wfcc-g829)
- Experimental support for the Q8R16 quantized format, with optimized matrix multiplication kernels
- Updated CMake files to build llama.aio on AmpereOne
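
The Q8R16 internals are not documented in these notes, but the general idea behind block-wise 8-bit quantization can be sketched as follows. This is a minimal illustration, assuming int8 values with one float scale per block of 16 elements (the block size and layout here are assumptions, not the actual Q8R16 specification):

```python
import numpy as np

def quantize_q8_block16(x: np.ndarray):
    """Quantize a float32 vector to int8 in blocks of 16, one scale per block.
    Illustrative sketch only: the real Q8R16 layout is not specified here."""
    assert x.size % 16 == 0, "length must be a multiple of the block size"
    blocks = x.reshape(-1, 16)
    # One scale per block, chosen so the largest magnitude maps to 127
    scales = np.abs(blocks).max(axis=1) / 127.0
    scales[scales == 0] = 1.0  # avoid division by zero for all-zero blocks
    q = np.round(blocks / scales[:, None]).astype(np.int8)
    return q, scales.astype(np.float32)

def dequantize_q8_block16(q: np.ndarray, scales: np.ndarray) -> np.ndarray:
    """Reverse the quantization: scale each int8 block back to float32."""
    return (q.astype(np.float32) * scales[:, None]).reshape(-1)

x = np.linspace(-1.0, 1.0, 32, dtype=np.float32)
q, s = quantize_q8_block16(x)
x_hat = dequantize_q8_block16(q, s)
# Per-element error is bounded by half a quantization step (scale / 2)
```

Keeping the scales in blocks like this lets a matmul kernel multiply int8 values directly and apply the scales once per block, which is where most of the speedup in such kernels comes from.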