v3.8.0 #7364
mudler announced in Announcements
I have a suggestion. I believe this engine is running on Whisper, right? If it's for speech-to-text, I would recommend Vosk, as it is more lightweight and designed for low-powered devices. I have tried both, and Vosk is definitely better.
Welcome to LocalAI 3.8.0!
LocalAI 3.8.0 focuses on smoothing out the user experience and exposing more power to the user without requiring restarts or complex configuration files. This release introduces a new onboarding flow and a universal model loader that handles everything from HF URLs to local files.
We’ve also improved the chat interface, addressed long-standing requests regarding OpenAI API compatibility (specifically SSE streaming standards) and exposed more granular controls for some backends (llama.cpp) and backend management.
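As background for the SSE compatibility work mentioned above: OpenAI-style streaming responses are newline-delimited `data:` events terminated by a `data: [DONE]` sentinel, with the final content chunk carrying `finish_reason`. A minimal client-side parser sketch (the sample payload and function name are illustrative, not LocalAI code):

```python
import json

def parse_sse_chunks(raw: str):
    """Parse an OpenAI-style SSE body into a list of JSON chunk objects."""
    chunks = []
    for line in raw.splitlines():
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines between events
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel, not JSON
        chunks.append(json.loads(payload))
    return chunks

# Illustrative two-chunk stream: a content delta, then the final chunk
# whose finish_reason signals why generation stopped.
sample = (
    'data: {"choices":[{"delta":{"content":"Hi"},"finish_reason":null}]}\n\n'
    'data: {"choices":[{"delta":{},"finish_reason":"stop"}]}\n\n'
    'data: [DONE]\n'
)
chunks = parse_sse_chunks(sample)
finish = chunks[-1]["choices"][0]["finish_reason"]  # "stop"
```

Clients like `openai-node` and `LangChain` rely on exactly this shape, which is why strict adherence to the spec matters.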
📌 TL;DR
- Advanced `llama.cpp` tuning: `context_shift`, `cache_ram`, and `parallel` workers via YAML options.

Feature Breakdown
🚀 Universal Model Import (URL-based)
We have refactored how models are imported. You no longer need to manually write configuration files for common use cases. The new importer accepts URLs from Hugging Face, Ollama, and OCI registries, as well as local file paths, directly from the Web interface.
[Video: import.mp4]
- Detects the backend type (e.g., `llama.cpp` vs `diffusers`) and applies native chat templates (e.g., `llama-3`, `mistral`) automatically by reading the model metadata.
- Prefers `vLLM` over `transformers` where appropriate.
- Multimodal projector files (`mmproj`) are detected and configured automatically.

🎨 Complete UI Overhaul
The web interface has been redesigned for better usability and clearer management.
[Video: index.mp4]
[Video: manage.mp4]
🤖 Agentic Ecosystem & MCP Live Streaming
LocalAI 3.8.0 significantly upgrades support for agentic workflows using the Model Context Protocol (MCP).
[Video: mcp.mp4]
Configuring MCP via the interface is now simplified:
[Video: mcp_configuration.mp4]
🔁 Runtime System Settings
A new Settings > System panel exposes configuration options that previously required environment variables or a restart.
[Video: settings.mp4]
⚙️ Advanced `llama.cpp` Configuration

For power users running large context windows or high-throughput setups, we've exposed additional underlying `llama.cpp` options in the YAML config. You can now tune context shifting (`context_shift`), RAM limits for the KV cache (`cache_ram`), and parallel worker slots (`parallel`).
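As a sketch, these knobs might appear in a model's YAML like this. The option names (`context_shift`, `cache_ram`, `parallel`) come from this release; the surrounding layout, model name, and values are illustrative assumptions, so check the LocalAI documentation for exact key placement:

```yaml
# Illustrative model config; names and values are placeholders.
name: my-large-context-model        # hypothetical model name
backend: llama-cpp
parameters:
  model: my-model.Q4_K_M.gguf       # hypothetical local GGUF file
context_size: 65536
options:
  - context_shift:true              # shift context instead of failing at the window edge
  - cache_ram:8192                  # cap KV-cache RAM usage
  - parallel:4                      # number of parallel worker slots
```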
📊 Logprobs & Logitbias Support

This release adds full support for `logit_bias` and `logprobs`, following the OpenAI specification. This is critical for advanced agentic logic, Self-RAG, and evaluating model confidence and hallucination rates.
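For illustration, a request payload exercising these fields against the OpenAI-compatible `/v1/chat/completions` endpoint might look like the following; the model name and token id are hypothetical:

```python
import json

# Hypothetical model name; use any model served by your LocalAI instance.
payload = {
    "model": "llama-3-8b-instruct",
    "messages": [{"role": "user", "content": "Answer yes or no: is 7 prime?"}],
    "logprobs": True,              # attach log-probabilities to generated tokens
    "top_logprobs": 5,             # also return the 5 most likely alternatives per token
    "logit_bias": {"1939": -100},  # hypothetical token id, strongly suppressed
}
body = json.dumps(payload)  # POST this to <host>/v1/chat/completions
```

The returned `logprobs` structure mirrors the OpenAI spec, so confidence metrics computed against OpenAI endpoints should carry over unchanged.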
🛠️ Fixes & Improvements
OpenAI Compatibility:
- Streaming responses now follow the OpenAI SSE specification (including a proper `finish_reason`). This resolves integration issues with `openai-node`, `LangChain`, and `LlamaIndex`.
- Reranker: `top_n` can now be omitted or set to `0` to return all results, rather than defaulting to an arbitrary limit.

General Fixes:

- Stopping a model now works consistently across backends (`llama.cpp`, `vLLM`, `transformers`, and `diffusers`). This immediately stops generation and frees up resources.

🚀 The Complete Local Stack for Privacy-First AI
LocalAI
The free, Open Source OpenAI alternative. Drop-in replacement REST API compatible with OpenAI specifications for local AI inferencing. No GPU required.
Link: https://github.com/mudler/LocalAI
LocalAGI
Local AI agent management platform. Drop-in replacement for OpenAI's Responses API, supercharged with advanced agentic capabilities and a no-code UI.
Link: https://github.com/mudler/LocalAGI
LocalRecall
RESTful API and knowledge base management system providing persistent memory and storage capabilities for AI agents. Works alongside LocalAI and LocalAGI.
Link: https://github.com/mudler/LocalRecall
❤️ Thank You
Over 35,000 stars and growing. LocalAI is a true FOSS movement — built by contributors, powered by community.
If you believe in privacy-first AI, your support keeps this stack alive.
✅ Full Changelog
What's Changed
Bug fixes 🐛
- fix(reranker): respect `top_n` in the request by @mkhludnev in #7025
- fix(reranker): tests and `top_n` check fix #7212 by @mkhludnev in #7284

Exciting New Features 🎉
🧠 Models
📖 Documentation and examples
👒 Dependencies
Other Changes
- chore: ⬆️ Update ggml-org/whisper.cpp to `999a7e0cbf8484dc2cea1e9f855d6b39f34f7ae9` by @localai-bot in #6997
- chore: ⬆️ Update ggml-org/llama.cpp to `2f68ce7cfd20e9e7098514bf730e5389b7bba908` by @localai-bot in #6998
- chore: ⬆️ Update ggml-org/llama.cpp to `cd5e3b57541ecc52421130742f4d89acbcf77cd4` by @localai-bot in #7023
- chore: ⬆️ Update ggml-org/llama.cpp to `c5023daf607c578d6344c628eb7da18ac3d92d32` by @localai-bot in #7069
- chore: ⬆️ Update ggml-org/llama.cpp to `ad51c0a720062a04349c779aae301ad65ca4c856` by @localai-bot in #7098
- chore: ⬆️ Update ggml-org/llama.cpp to `a44d77126c911d105f7f800c17da21b2a5b112d1` by @localai-bot in #7125
- chore: ⬆️ Update ggml-org/llama.cpp to `7f09a680af6e0ef612de81018e1d19c19b8651e8` by @localai-bot in #7156
- chore: ⬆️ Update ggml-org/llama.cpp to `65156105069fa86a4a81b6cb0e8cb583f6420677` by @localai-bot in #7184
- chore: ⬆️ Update ggml-org/llama.cpp to `333f2595a3e0e4c0abf233f2f29ef1710acd134d` by @localai-bot in #7201
- chore: ⬆️ Update ggml-org/llama.cpp to `b8595b16e69e3029e06be3b8f6635f9812b2bc3f` by @localai-bot in #7210
- chore: ⬆️ Update ggml-org/whisper.cpp to `a1867e0dad0b21b35afa43fc815dae60c9a139d6` by @localai-bot in #7231
- chore: ⬆️ Update ggml-org/llama.cpp to `13730c183b9e1a32c09bf132b5367697d6c55048` by @localai-bot in #7232
- chore: ⬆️ Update ggml-org/llama.cpp to `7d019cff744b73084b15ca81ba9916f3efab1223` by @localai-bot in #7247
- chore: ⬆️ Update ggml-org/whisper.cpp to `d9b7613b34a343848af572cc14467fc5e82fc788` by @localai-bot in #7268
- chore(deps): bump llama.cpp to `c4abcb2457217198efdd67d02675f5fddb7071c2` by @mudler in #7266
- chore: ⬆️ Update ggml-org/llama.cpp to `9b17d74ab7d31cb7d15ee7eec1616c3d825a84c0` by @localai-bot in #7273
- chore: ⬆️ Update ggml-org/llama.cpp to `662192e1dcd224bc25759aadd0190577524c6a66` by @localai-bot in #7277
- chore: ⬆️ Update ggml-org/llama.cpp to `80deff3648b93727422461c41c7279ef1dac7452` by @localai-bot in #7287
- chore: ⬆️ Update ggml-org/whisper.cpp to `b12abefa9be2abae39a73fa903322af135024a36` by @localai-bot in #7300
- chore: ⬆️ Update ggml-org/llama.cpp to `cb623de3fc61011e5062522b4d05721a22f2e916` by @localai-bot in #7301
- chore: ⬆️ Update ggml-org/llama.cpp to `7d77f07325985c03a91fa371d0a68ef88a91ec7f` by @localai-bot in #7314
- chore: ⬆️ Update ggml-org/whisper.cpp to `19ceec8eac980403b714d603e5ca31653cd42a3f` by @localai-bot in #7321
- chore: ⬆️ Update ggml-org/llama.cpp to `dd0f3219419b24740864b5343958a97e1b3e4b26` by @localai-bot in #7322
- chore: ⬆️ Update ggml-org/llama.cpp to `23bc779a6e58762ea892eca1801b2ea1b9050c00` by @localai-bot in #7331
- chore: ⬆️ Update ggml-org/llama.cpp to `3f3a4fb9c3b907c68598363b204e6f58f4757c8c` by @localai-bot in #7336
- chore: ⬆️ Update ggml-org/llama.cpp to `0c7220db56525d40177fcce3baa0d083448ec813` by @localai-bot in #7337

New Contributors
- @mkhludnev made their first contribution in fix(reranker): respect `top_n` in the request #7025

Full Changelog: v3.7.0...v3.8.0
This discussion was created from the release v3.8.0.