v3.8.0 #7364
mudler announced in Announcements
I have a suggestion. I believe this engine is running on Whisper, right? If it's for speech-to-text, I would recommend Vosk, as it is more lightweight and designed for low-powered devices. I have tried both, and Vosk is definitely better.
Welcome to LocalAI 3.8.0!
LocalAI 3.8.0 focuses on smoothing out the user experience and exposing more power to the user without requiring restarts or complex configuration files. This release introduces a new onboarding flow and a universal model loader that handles everything from HF URLs to local files.
We’ve also improved the chat interface, addressed long-standing requests regarding OpenAI API compatibility (specifically SSE streaming standards) and exposed more granular controls for some backends (llama.cpp) and backend management.
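As background for the SSE compatibility work mentioned above: OpenAI-style streaming responses are newline-delimited `data:` events terminated by a `data: [DONE]` sentinel, with the final content chunk carrying `finish_reason`. A minimal client-side parser sketch (the sample payload and function name are illustrative, not LocalAI code):

```python
import json

def parse_sse_chunks(raw: str):
    """Parse an OpenAI-style SSE body into a list of JSON chunk objects."""
    chunks = []
    for line in raw.splitlines():
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines between events
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel, not JSON
        chunks.append(json.loads(payload))
    return chunks

# Illustrative two-chunk stream: a content delta, then the final chunk
# whose finish_reason signals why generation stopped.
sample = (
    'data: {"choices":[{"delta":{"content":"Hi"},"finish_reason":null}]}\n\n'
    'data: {"choices":[{"delta":{},"finish_reason":"stop"}]}\n\n'
    'data: [DONE]\n'
)
chunks = parse_sse_chunks(sample)
finish = chunks[-1]["choices"][0]["finish_reason"]  # "stop"
```

Clients like `openai-node` and `LangChain` rely on exactly this shape, which is why strict adherence to the spec matters.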
📌 TL;DR
- Advanced `llama.cpp` tuning: `context_shift`, `cache_ram`, and `parallel` workers via YAML options.

Feature Breakdown
🚀 Universal Model Import (URL-based)
We have refactored how models are imported. You no longer need to manually write configuration files for common use cases. The new importer accepts URLs from Hugging Face, Ollama, and OCI registries, as well as local file paths, directly from the Web interface.
[Video: import.mp4]
- Detects the backend type (e.g., `llama.cpp` vs `diffusers`) and applies native chat templates (e.g., `llama-3`, `mistral`) automatically by reading the model metadata.
- Prefers `vLLM` over `transformers` where appropriate.
- Multimodal projector files (`mmproj`) are detected and configured automatically.

🎨 Complete UI Overhaul
The web interface has been redesigned for better usability and clearer management.
[Video: index.mp4]
[Video: manage.mp4]
🤖 Agentic Ecosystem & MCP Live Streaming
LocalAI 3.8.0 significantly upgrades support for agentic workflows using the Model Context Protocol (MCP).
[Video: mcp.mp4]
Configuring MCP via the interface is now simplified:
[Video: mcp_configuration.mp4]
🔁 Runtime System Settings
A new Settings > System panel exposes configuration options that previously required environment variables or a restart.
[Video: settings.mp4]
⚙️ Advanced `llama.cpp` Configuration

For power users running large context windows or high-throughput setups, we've exposed additional underlying `llama.cpp` options in the YAML config. You can now tune context shifting (`context_shift`), RAM limits for the KV cache (`cache_ram`), and parallel worker slots (`parallel`).
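As a sketch, these knobs might appear in a model's YAML like this. The option names (`context_shift`, `cache_ram`, `parallel`) come from this release; the surrounding layout, model name, and values are illustrative assumptions, so check the LocalAI documentation for exact key placement:

```yaml
# Illustrative model config; names and values are placeholders.
name: my-large-context-model        # hypothetical model name
backend: llama-cpp
parameters:
  model: my-model.Q4_K_M.gguf       # hypothetical local GGUF file
context_size: 65536
options:
  - context_shift:true              # shift context instead of failing at the window edge
  - cache_ram:8192                  # cap KV-cache RAM usage
  - parallel:4                      # number of parallel worker slots
```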
📊 Logprobs & Logitbias Support

This release adds full support for `logit_bias` and `logprobs`, following the OpenAI specification. This is critical for advanced agentic logic, Self-RAG, and evaluating model confidence and hallucination rates.
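For illustration, a request payload exercising these fields against the OpenAI-compatible `/v1/chat/completions` endpoint might look like the following; the model name and token id are hypothetical:

```python
import json

# Hypothetical model name; use any model served by your LocalAI instance.
payload = {
    "model": "llama-3-8b-instruct",
    "messages": [{"role": "user", "content": "Answer yes or no: is 7 prime?"}],
    "logprobs": True,              # attach log-probabilities to generated tokens
    "top_logprobs": 5,             # also return the 5 most likely alternatives per token
    "logit_bias": {"1939": -100},  # hypothetical token id, strongly suppressed
}
body = json.dumps(payload)  # POST this to <host>/v1/chat/completions
```

The returned `logprobs` structure mirrors the OpenAI spec, so confidence metrics computed against OpenAI endpoints should carry over unchanged.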
🛠️ Fixes & Improvements
OpenAI Compatibility:
- Streaming responses now follow the OpenAI SSE specification (including a proper `finish_reason`). This resolves integration issues with `openai-node`, `LangChain`, and `LlamaIndex`.
- Reranker: `top_n` can now be omitted or set to `0` to return all results, rather than defaulting to an arbitrary limit.

General Fixes:

- Stopping a model now works consistently across backends (`llama.cpp`, `vLLM`, `transformers`, and `diffusers`). This immediately stops generation and frees up resources.

🚀 The Complete Local Stack for Privacy-First AI
LocalAI
The free, Open Source OpenAI alternative. Drop-in replacement REST API compatible with OpenAI specifications for local AI inferencing. No GPU required.
Link: https://github.com/mudler/LocalAI
LocalAGI
Local AI agent management platform. Drop-in replacement for OpenAI's Responses API, supercharged with advanced agentic capabilities and a no-code UI.
Link: https://github.com/mudler/LocalAGI
LocalRecall
RESTful API and knowledge base management system providing persistent memory and storage capabilities for AI agents. Works alongside LocalAI and LocalAGI.
Link: https://github.com/mudler/LocalRecall
❤️ Thank You
Over 35,000 stars and growing. LocalAI is a true FOSS movement — built by contributors, powered by community.
If you believe in privacy-first AI, your support keeps this stack alive.
✅ Full Changelog
What's Changed
Bug fixes 🐛
- fix(reranker): respect `top_n` in the request by @mkhludnev in #7025
- fix(reranker): tests and `top_n` check fix #7212 by @mkhludnev in #7284

Exciting New Features 🎉
🧠 Models
📖 Documentation and examples
👒 Dependencies
Other Changes
- chore: ⬆️ Update ggml-org/whisper.cpp to `999a7e0cbf8484dc2cea1e9f855d6b39f34f7ae9` by @localai-bot in #6997
- chore: ⬆️ Update ggml-org/llama.cpp to `2f68ce7cfd20e9e7098514bf730e5389b7bba908` by @localai-bot in #6998
- chore: ⬆️ Update ggml-org/llama.cpp to `cd5e3b57541ecc52421130742f4d89acbcf77cd4` by @localai-bot in #7023
- chore: ⬆️ Update ggml-org/llama.cpp to `c5023daf607c578d6344c628eb7da18ac3d92d32` by @localai-bot in #7069
- chore: ⬆️ Update ggml-org/llama.cpp to `ad51c0a720062a04349c779aae301ad65ca4c856` by @localai-bot in #7098
- chore: ⬆️ Update ggml-org/llama.cpp to `a44d77126c911d105f7f800c17da21b2a5b112d1` by @localai-bot in #7125
- chore: ⬆️ Update ggml-org/llama.cpp to `7f09a680af6e0ef612de81018e1d19c19b8651e8` by @localai-bot in #7156
- chore: ⬆️ Update ggml-org/llama.cpp to `65156105069fa86a4a81b6cb0e8cb583f6420677` by @localai-bot in #7184
- chore: ⬆️ Update ggml-org/llama.cpp to `333f2595a3e0e4c0abf233f2f29ef1710acd134d` by @localai-bot in #7201
- chore: ⬆️ Update ggml-org/llama.cpp to `b8595b16e69e3029e06be3b8f6635f9812b2bc3f` by @localai-bot in #7210
- chore: ⬆️ Update ggml-org/whisper.cpp to `a1867e0dad0b21b35afa43fc815dae60c9a139d6` by @localai-bot in #7231
- chore: ⬆️ Update ggml-org/llama.cpp to `13730c183b9e1a32c09bf132b5367697d6c55048` by @localai-bot in #7232
- chore: ⬆️ Update ggml-org/llama.cpp to `7d019cff744b73084b15ca81ba9916f3efab1223` by @localai-bot in #7247
- chore: ⬆️ Update ggml-org/whisper.cpp to `d9b7613b34a343848af572cc14467fc5e82fc788` by @localai-bot in #7268
- chore(deps): bump llama.cpp to `c4abcb2457217198efdd67d02675f5fddb7071c2` by @mudler in #7266
- chore: ⬆️ Update ggml-org/llama.cpp to `9b17d74ab7d31cb7d15ee7eec1616c3d825a84c0` by @localai-bot in #7273
- chore: ⬆️ Update ggml-org/llama.cpp to `662192e1dcd224bc25759aadd0190577524c6a66` by @localai-bot in #7277
- chore: ⬆️ Update ggml-org/llama.cpp to `80deff3648b93727422461c41c7279ef1dac7452` by @localai-bot in #7287
- chore: ⬆️ Update ggml-org/whisper.cpp to `b12abefa9be2abae39a73fa903322af135024a36` by @localai-bot in #7300
- chore: ⬆️ Update ggml-org/llama.cpp to `cb623de3fc61011e5062522b4d05721a22f2e916` by @localai-bot in #7301
- chore: ⬆️ Update ggml-org/llama.cpp to `7d77f07325985c03a91fa371d0a68ef88a91ec7f` by @localai-bot in #7314
- chore: ⬆️ Update ggml-org/whisper.cpp to `19ceec8eac980403b714d603e5ca31653cd42a3f` by @localai-bot in #7321
- chore: ⬆️ Update ggml-org/llama.cpp to `dd0f3219419b24740864b5343958a97e1b3e4b26` by @localai-bot in #7322
- chore: ⬆️ Update ggml-org/llama.cpp to `23bc779a6e58762ea892eca1801b2ea1b9050c00` by @localai-bot in #7331
- chore: ⬆️ Update ggml-org/llama.cpp to `3f3a4fb9c3b907c68598363b204e6f58f4757c8c` by @localai-bot in #7336
- chore: ⬆️ Update ggml-org/llama.cpp to `0c7220db56525d40177fcce3baa0d083448ec813` by @localai-bot in #7337

New Contributors
- @mkhludnev made their first contribution in fix(reranker): respect `top_n` in the request #7025

Full Changelog: v3.7.0...v3.8.0
This discussion was created from the release v3.8.0.