Releases: SciSharp/LLamaSharp
v0.5.1 - GGUF, grammar and semantic-kernel integration
What's Changed
- Remove native libraries from LLama.csproj and replace it with a targets file. by @drasticactions in #32
- Update libllama.dylib by @SignalRT in #36
- update webapi example by @xbotter in #39
- MacOS metal support by @SignalRT in #47
- Basic ASP.NET Core website example by @saddam213 in #48
- fix breaking change in llama.cpp; bind to latest version llama.cpp to… by @fwaris in #51
- Documentation Spelling/Grammar by @martindevans in #52
- XML docs fixes by @martindevans in #53
- Cleaned up unnecessary extension methods by @martindevans in #55
- Memory Mapped LoadState/SaveState by @martindevans in #56
- Larger states by @martindevans in #57
- Instruct & Stateless web example implemented by @saddam213 in #59
- Fixed Multiple Enumeration by @martindevans in #54
- Fixed More Multiple Enumeration by @martindevans in #63
- Low level new loading system by @martindevans in #64
- Fixed Memory pinning in Sampling API by @martindevans in #68
- Fixed Spelling Mirostate -> Mirostat by @martindevans in #69
- Fixed Mirostate Sampling by @martindevans in #72
- GitHub actions by @martindevans in #74
- Update llama.cpp binaries to 5f631c2 and align the LlamaContext by @SignalRT in #77
- Expose some native classes by @saddam213 in #80
- feat: update the llama backends. by @AsakusaRinne in #78
- ModelParams & InferenceParams abstractions by @saddam213 in #79
- Cleaned up multiple enumeration in FixedSizeQueue by @martindevans in #83
- Improved Tensor Splits by @martindevans in #81
- fix: antiprompt does not work in stateless executor. by @AsakusaRinne in #84
- Access to IModelParamsExtensions by @saddam213 in #86
- Utils Cleanup by @martindevans in #82
- Fixed ToLlamaContextParams using the wrong parameter for use_mmap by @martindevans in #89
- Fix serialization error due to NaN by @martindevans in #88
- Add native logging output by @saddam213 in #95
- Minor quantizer improvements by @martindevans in #96
- Improved NativeApi file a bit by @martindevans in #99
- Logger Comments by @martindevans in #100
- llama_sample_classifier_free_guidance by @martindevans in #101
- Potential fix for .Net Framework issues by @zombieguy98 in #103
- Add missing semi-colon to README sample code by @zerosoup in #104
- Multi Context by @martindevans in #90
- Updated Demos by @martindevans in #105
- renamed some arguments in ModelParams constructor so that class can be serialized easily by @erinloy in #108
- Stateless Executor Fix by @martindevans in #107
- Grammar basics by @martindevans in #102
- Re-renaming some arguments to allow for easy deserialization from appsettings.json. by @erinloy in #111
- Added native symbol for CFG by @martindevans in #112
- Minor Code Cleanup by @martindevans in #114
- Changed type conversion by @zombieguy98 in #116
- OldVersion obsoletion notices by @martindevans in #117
- Embedder Test by @martindevans in #97
- Improved Cloning by @martindevans in #119
- ModelsParams record class by @martindevans in #115
- ReSharper code warnings cleanup by @martindevans in #120
- Two small improvements to the native sampling API by @martindevans in #124
- Removed unnecessary parameters from some low level sampler methods by @martindevans in #125
- Dependency Building In Github Action by @martindevans in #126
- Fixed paths by @martindevans in #127
- Fixed cuda paths again by @martindevans in #130
- Linux cublas by @martindevans in #131
- Fixed linux cublas filenames by @martindevans in #132
- fixed linux cublas paths in final step by @martindevans in #133
- Fixed the cublas linux paths again by @martindevans in #134
- Fixed those cublas paths again by @martindevans in #135
- Translating the grammar parser by @Mihaiii in #136
- Higher Level Grammar System by @martindevans in #137
- Enable Semantic kernel support by @drasticactions in #138
- grammar_exception_types by @martindevans in #140
- GGUF by @martindevans in #122
- docs: update the docs to follow new version. by @AsakusaRinne in #141
- Update MacOS Binaries by @SignalRT in #143
- Remove LLamaNewlineTokens from InteractiveExecutorState by @martindevans in #144
- refactor: remove old version files. by @AsakusaRinne in #142
- Disable test parallelism by @martindevans in #145
- Removed duplicate llama_sample_classifier_free_guidance method by @martindevans in #146
- Swapped to llama-7b-chat by @martindevans in #147
New Contributors
- @drasticactions made their first contribution in #32
- @xbotter made their first contribution in #39
- @saddam213 made their first contribution in #48
- @fwaris made their first contribution in #51
- @martindevans made their first contribution in #52
- @zombieguy98 made their first contribution in #103
- @zerosoup made their first contribution in #104
- @erinloy made their first contribution in #108
- @Mihaiii made their first contribution in #136
Full Changelog: v0.4.0...v0.5.0
v0.4.2-preview: new backends
What's Changed
- update webapi example by @xbotter in #39
- MacOS metal support by @SignalRT in #47
- Basic ASP.NET Core website example by @saddam213 in #48
- fix breaking change in llama.cpp; bind to latest version llama.cpp to… by @fwaris in #51
- Documentation Spelling/Grammar by @martindevans in #52
- XML docs fixes by @martindevans in #53
- Cleaned up unnecessary extension methods by @martindevans in #55
- Memory Mapped LoadState/SaveState by @martindevans in #56
- Larger states by @martindevans in #57
- Instruct & Stateless web example implemented by @saddam213 in #59
- Fixed Multiple Enumeration by @martindevans in #54
- Fixed More Multiple Enumeration by @martindevans in #63
- Low level new loading system by @martindevans in #64
- Fixed Memory pinning in Sampling API by @martindevans in #68
- Fixed Spelling Mirostate -> Mirostat by @martindevans in #69
- Fixed Mirostate Sampling by @martindevans in #72
- GitHub actions by @martindevans in #74
- Update llama.cpp binaries to 5f631c2 and align the LlamaContext by @SignalRT in #77
- Expose some native classes by @saddam213 in #80
- feat: update the llama backends. by @AsakusaRinne in #78
New Contributors
- @xbotter made their first contribution in #39
- @saddam213 made their first contribution in #48
- @fwaris made their first contribution in #51
- @martindevans made their first contribution in #52
Full Changelog: v0.4.1-preview...v0.4.2-preview
v0.4.1-preview - follow up llama.cpp latest commit
This is a preview release that tracks the latest changes in llama.cpp.
The CUDA backend does not work properly yet; we will release v0.4.1 once that is resolved.
v0.4.0 - Executor and ChatSession
Version 0.4.0 introduces many breaking changes. However, we strongly recommend upgrading to 0.4.0 because the refactored framework provides better abstractions and stability. The backends v0.3.0 and v0.3.1 still work with LLamaSharp v0.4.0.
The main changes:
- Add three levels of abstraction: LLamaModel, LLamaExecutor and ChatSession.
- Fix the bug in saving and loading state.
- Support saving/loading chat session directly.
- Add more flexible APIs in the chat session.
- Add detailed documentation: https://scisharp.github.io/LLamaSharp/0.4/
Acknowledgements
During development we received a lot of help from @TheTerrasque, whose fork gave us much inspiration. Many thanks also to the following contributors:
- MacOS Arm64 support by @SignalRT in #24
- Fixed a typo in FixedSizeQueue by @mlof in #25
- Document interfaces by @mlof in #26
v0.3.0 - Load and save state
- Support loading and saving state.
- Support tokenization and detokenization.
- Fix bugs in instruct mode.
- Breaking change: the n_parts param is removed.
- Breaking change: LLamaModelV1 is dropped.
- Remove dependencies on third-party loggers.
- Add a verified model repo on Hugging Face.
- Optimize the examples.
v0.2.3 - Inference BUG Fix
Fix some strange behaviors of model inference.
v0.2.2 - Embedder
- Sync with the latest llama.cpp master branch.
- Add LLamaEmbedder to support getting embeddings only.
- Add the n_gpu_layers and prompt_cache_all params.
- Split the package into a main package and a backend package.
v0.2.1 - Chat session, quantization and Web API
- Add basic APIs and chat session.
- Support quantization.
- Add Web API support.