Releases: unslothai/unsloth
New Important Updates!
Hey guys, it's only been 2 days since our last release, but we’ve got a lot more important updates:
- Inference is now 20–30% faster. Previously, tool-calling and repeat penalty could slow inference below normal speeds. Inference tokens/s should now be on par with llama-server/llama.cpp.
- Auto-detects older or pre-existing models downloaded from LM Studio, Hugging Face, and similar sources.
- Inference tokens/s is now calculated correctly. Previously, tokens/s included startup time, which made the displayed speed look slower than it actually was. It now reflects true inference speed.
- CPU usage no longer spikes. Previously, the inline querier identity changed on every render, causing `useLiveQuery` to resubscribe continuously.
- Unsloth Studio now has a shutdown (x) button and shuts down properly. Previously, closing it after opening from the desktop icon would not close it properly. Now, launching from the shortcut also opens the terminal, and closing that terminal fully exits Unsloth Studio. If you still have it open from a previous session, restart your computer or run `lsof -i :8888`, then `kill -9 <PID>`.
- Even better tool calling and web search with reduced errors.
- Updated documentation with lots of new info on deleting models, uninstalling, and more.
- Cleaner, smarter install and setup logging across Windows and Linux. Output is now easier to read with consistent formatting, quieter by default for a smoother experience, and supports richer `--verbose` diagnostics when you want full technical detail.
- You can now view your training history
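The tokens/s fix above amounts to excluding startup (time-to-first-token) from the throughput denominator. A minimal sketch of that calculation (a hypothetical helper, not Studio's actual code):

```python
def tokens_per_second(n_tokens, t_first_token, t_end):
    """Throughput over the decode phase only: startup time before the
    first token (model load, prompt processing) is excluded."""
    decode_time = t_end - t_first_token
    return n_tokens / decode_time if decode_time > 0 else 0.0

# 100 tokens, first token at t=2.0s, finished at t=7.0s:
# 100 / (7.0 - 2.0) = 20 tok/s, whereas startup-inclusive math
# would have reported 100 / 7.0 ≈ 14.3 tok/s.
```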
What's Changed
- Bump installer min version to 2026.3.12 by @danielhanchen in #4600
- Fix Colab Studio launch and setup.ps1 box alignment by @danielhanchen in #4601
- Fix Colab huggingface-hub conflict, ensurepip fallback, bump to 2026.3.14 by @danielhanchen in #4603
- Update README.md by @rolandtannous in #4604
- fix: skip flex_attention for models with non-zero attention_dropout by @Abhinavexists in #4605
- Fix Colab setup skipping llama.cpp installation by @rolandtannous in #4618
- fix: show recommended models in search results by @Shine1i in #4615
- studio: align Dataset/Parameters/Training cards, fix expandable height, animate LoRA settings by @Imagineer99 in #4614
- fix: Windows installer fails on _yaml.pyd Access Denied (os error 5) by @Etherll in #4617
- studio: humanize ETA display for long training runs by @RadouaneElhajali in #4608
- fix: add python-json-logger to data-designer-deps by @Shine1i in #4627
- [Studio] Colab fix - Allow install_python_stack to run on Colab by @rolandtannous in #4633
- Fix repetition_penalty default causing 24% TPS drop in GGUF inference by @danielhanchen in #4634
- fix: install.sh Mac Intel compatibility + Studio no-torch support by @danielhanchen in #4624
- tests: add no-torch / Intel Mac test suite by @danielhanchen in #4646
- fix: use unsloth[huggingfacenotorch] instead of --no-deps in no-torch mode by @danielhanchen in #4647
- Fix Gemma3N audio training stride assertion with non-reentrant checkpointing by @danielhanchen in #4629
- Fix missing num_items_in_batch in unsloth_prediction_step by @danielhanchen in #4616
- Make Studio shortcuts launch in a visible terminal by @danielhanchen in #4638
- studio: setup log styling by @Imagineer99 in #4494
- Fix ~1.2s TTFT penalty when tools are enabled in Studio by @danielhanchen in #4639
- Fix GGUF GPU fit check to account for KV cache VRAM by @danielhanchen in #4623
- feat: update app icons to rounded logo by @Shine1i in #4640
- Streaming tool detection: guard late tool_calls, filter incomplete fragments by @danielhanchen in #4648
- fix: install no-torch runtime deps via requirements file by @danielhanchen in #4649
- Fix orphan server cleanup killing user's own llama-server by @danielhanchen in #4622
- fix: add auth + UX improvements to shutdown button by @Shine1i in #4642
- Fix inference failing for transformers 5.x models (trust_remote_code) by @danielhanchen in #4652
- fix: no-torch install deps without pulling torch transitively by @danielhanchen in #4650
- Detect always-on reasoning models and show Think button as locked-on by @danielhanchen in #4654
- fix: replace navbar shutdown text button with icon-only button by @Shine1i in #4655
- Fall back to parsing model name when HF API has no param count by @danielhanchen in #4656
- fix: disable OCR in pymupdf4llm PDF extraction by @Shine1i in #4659
- Fix HF cache default and show LM Studio models in chat/inference by @rolandtannous in #4653
- Bump minimum unsloth version to 2026.3.16 in install scripts by @danielhanchen in #4663
New Contributors
- @Abhinavexists made their first contribution in #4605
- @RadouaneElhajali made their first contribution in #4608
Full Changelog: v0.1.2-beta...v0.1.25-beta
First Release post Unsloth Studio!
Hey guys, this is our first release since we launched Unsloth Studio last week. From now on you can directly access all our updates through our changelog here: https://unsloth.ai/docs/new/changelog
You can now update Unsloth Studio! Just run `unsloth studio update`. Please update to use all the newest fixes and features.
- Tool calling improved. Better llama.cpp parsing, no raw tool markup in chat, faster inference, a new Tool Outputs panel, and timers.
- Windows CPU or GPU now works seamlessly. Please reinstall!
- App shortcuts. Once installed, you can launch on Windows, MacOS and Linux via a shortcut icon in the Start menu / launcher and on the Desktop.
- Pre-compiled `llama.cpp` binaries and `mamba_ssm` for finetuning - 6x faster installs! Binaries are also <300MB in size.
- 50% reduced installation sizes (7GB or more in savings), 2x faster installs and faster resolving. 50% smaller PyPI sizes.
- Colab with free T4 GPUs with Unsloth Studio now fixed! Try it here. Due to pre-compiled binaries, it's also 20x faster!
- You can now properly use old GGUFs from Hugging Face or LM Studio
- MacOS and CPU now have Data Recipes enabled with multi-file uploading.
- Preliminary AMD support for Linux-only machines - auto-detected.
- Settings sidebar redesign. Settings are now grouped into Model, Sampling, Tools, and Preferences
- Context length now adjustable. Keep in mind this is not needed, as llama.cpp smartly uses the exact context you need via `--fit on`.
- Persistent system prompts and presets. Custom system prompts and chat presets now persist across reloads and page changes.
- Multi-file upload. Data recipes now support multiple drag-and-drop uploads for PDF, DOCX, TXT, and MD, with backend extraction, saved uploads, and improved previews.
- Better chat observability. Studio now shows `llama-server` timings and usage, a context-window usage bar, and richer source hover cards.
- Better UX overall - clickable links, better LaTeX parsing, tool / code / web tooltips for default cards and much more!
- LiteLLM - Unsloth Studio and Unsloth were NOT affected by the recent LiteLLM compromise. Nemo Data Designer used LiteLLM only up to `1.80`, not the affected `1.82.7` or `1.82.8`, and has since removed it entirely.
- We now have a new one-line install command, just run:

```shell
curl -fsSL https://unsloth.ai/install.sh | sh
```
Fixes:
- Windows/setup improvements. Fixed silent Windows exits, Anaconda/conda-forge startup crashes, broken non-NVIDIA Windows installs, and missing early CUDA/stale-venv setup checks.
- System prompts fixed. They work again for non-GGUF text and vision inference.
- GGUF export expanded. Full fine-tunes, not just LoRA/PEFT, can now export to GGUF. Base model resolution is more reliable, and unsupported export options are disabled in the UI.
- Chat scroll/layout fixes. Fixed scroll-position issues during generation, thinking-panel layout shift, and viewport jumps when collapsing reasoning panels.
- Smarter port conflict detection. Studio now detects loopback conflicts, can identify the blocking process when possible, and gives clearer fallback-port messages.
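A loopback conflict check like the one above can be approximated with a plain socket probe; this is a generic illustration, not Studio's implementation (`port_in_use` and `pick_port` are hypothetical helpers):

```python
import socket

def port_in_use(port, host="127.0.0.1"):
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(0.5)
        return s.connect_ex((host, port)) == 0

def pick_port(preferred=8888, attempts=20):
    """Fall back to the next free port if the preferred one is taken."""
    for port in range(preferred, preferred + attempts):
        if not port_in_use(port):
            return port
    raise RuntimeError("no free port found")
```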
Example of automatic parameter settings for context length etc:
super.final.mp4
What's Changed
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #4542
- fix: store embedding_learning_rate on self in UnslothTrainingArguments by @GoldenGrapeGentleman in #4531
- studio: persist system prompt and preset settings across navigation by @Imagineer99 in #4538
- studio: stop scroll hijack during generation and fix thinking panel layout shift by @Imagineer99 in #4543
- Fix Studio port conflict detection for loopback addresses by @danielhanchen in #4532
- fix(studio): show Windows-specific reset-password command by @Shine1i in #4529
- fix(studio): restore scroll lock on reasoning panel collapse by @danielhanchen in #4545
- fix: always show chat tool icons by @Shine1i in #4525
- fix: system prompt ignored in unsloth inference by @Shine1i in #4528
- fix: handle prompt/completion datasets in slow-path BOS detection by @danielhanchen in #4548
- fix: give @0xKushwaha git history credit for completion_only_loss fix by @danielhanchen in #4552
- ⚠️ Remove quarantined `litellm` for precaution -- Unsloth Studio NOT affected by @danielhanchen in #4553
- fix: pin unsloth>=2026.3.11 in install scripts by @danielhanchen in #4556
- Regroup chat settings sidebar into focused sections by @Shine1i in #4551
- Add GRPO resume vLLM cleanup guard by @MagellaX in #4411
- fix: prevent UnicodeEncodeError on Windows CP1252 consoles in studio setup by @Krishnachaitanyakc in #4563
- studio: windows desktop shortcut launcher by @Imagineer99 in #4558
- Remove duplicate frontend assets from wheel (~31 MB savings) by @danielhanchen in #4567
- feat(studio): training history persistence and past runs viewer by @Shine1i in #4501
- fix: remove auto wandb.finish() after train() to allow post-training evaluate() by @Krishnachaitanyakc in #4564
- feat: Implement Q-GaLore optimizer and custom embedding learning rate… by @OnePunchMonk in #4511
- Bump Data Designer to 0.5.4 (removes litellm dependency) by @danielhanchen in #4569
- feat(chat): cleaner tool UI, inline LaTeX, clickable links by @Shine1i in #4561
- [Studio] Try installing causal-conv1d from prebuilt wheels if available by @Datta0 in #4547
- Feature/add dependabot and codeql security checks by @pkloehn1 in #4479
- build(deps): bump the actions group with 2 updates by @dependabot[bot] in #4570
- build(deps): bump oxc-parser from 0.116.0 to 0.121.0 in /studio/backend/core/data_recipe/oxc-validator in the npm-oxc-validator group by @dependabot[bot] in #4571
- Remove advanced CodeQL workflow (conflicts with default setup) by @danielhanchen in #4584
- Add macOS and Linux desktop shortcuts to install.sh by @danielhanchen in #4568
- perf(studio): upgrade to Vite 8 + auto-install bun for faster frontend builds by @Etherll in #4522
- feat(tokenizer): add get_tokenizer_info() diagnostic helper by @cz-03 in #4436
- Add ROCm (AMD GPU) support to studio setup by @danielhanchen in #4585
- Consolidate dual venvs and separate install from update by @rolandtannous in #4530
- studio: stabilize reasoning panel scroll behavior and prevent composer overlap by @Imagineer99 in #4587
- Use prebuilt llama.cpp for unsloth studio setup by @mmathew23 in #4562
- fix(studio): add -ngl flag for GPU offloading in llama-server by @danielhanchen in #4588
- fix(studio): add pip nvidia CUDA libs to LD_LIBRARY_PATH for llama-server by @danielhanchen in #4590
- fix(studio): validate bun install and retry from official source on failure by @danielhanchen in #4589
- fix(studio): clear bun cache on failure and retry before falling back to npm by @danielhanchen in #4594
- Pin torch>=2.4,<2.11.0 in Studio installers by @danielhanchen in #4595
- fix(studio): source-build fallback prefers Unsloth's tested tag over upstream latest by @danielhanchen in #4593
- fix(studio): add bun cache validation to Windows setup.ps1 by @danielhanchen in #4596
- feat: multi-source model discovery (HF default, legacy cache, LM Studio) by @rolandtannous in #4591
- Add unsloth to User PATH on Windows after install by @danielhanchen in #4597
- Add PID file tracking and `unsloth studio stop` command by @danielhanchen in #4598
- feat(studio): editable context length with Apply/Reset for GGUF settings by @danielhanchen in #4592
New Contributors
- @MagellaX made their first contribution in #4411
- @Krishnachaitanyakc made their first contribution in #4563
- @OnePunchMonk made their first contribution in #4511
- @pkloehn1 made their fir...
llama.cpp prebuilt b8475
Install-ready Unsloth Studio llama.cpp bundles for b8475.
Introducing Unsloth Studio (Beta)!
Hey guys, we're super excited to launch Unsloth Studio (Beta), a new open-source web UI to train and run LLMs.
Blog + everything you need to know: https://unsloth.ai/docs/new/studio
- Run models locally on Mac, Windows, Linux
- Compare and battle models side-by-side
- Train 500+ models 2x faster with 70% less VRAM
- Supports GGUF, vision, audio, embedding models
- Self-healing Tool calling / web search + code execution
- Auto-create datasets from PDF, CSV, DOCX
- Export models to GGUF, safetensor and more formats
MacOS, Linux, WSL:
For MacOS, ensure you have cmake installed. If not, run brew install cmake.
```shell
curl -fsSL https://unsloth.ai/install.sh | sh
```

Then to launch every time:

```shell
source unsloth_studio/bin/activate
unsloth studio -H 0.0.0.0 -p 8888
```

Windows:
Run in Windows Powershell:
```shell
irm https://unsloth.ai/install.ps1 | iex
```

Then to launch every time:

```shell
.\unsloth_studio\Scripts\activate
unsloth studio -H 0.0.0.0 -p 8888
```

Docker
Use our `unsloth/unsloth` Docker image. Run:

```shell
docker run -d -e JUPYTER_PASSWORD="mypassword" \
  -p 8888:8888 -p 8000:8000 -p 2222:22 \
  -v $(pwd)/work:/workspace/work \
  --gpus all \
  unsloth/unsloth
```

unsloth.studio.video.mp4
What's Changed
- Update CODEOWNERS for studio and cli by @danielhanchen in #4266
- [Feature] Support Sequence Classification by @danielhanchen in #4264
- [Feature] VLMs support for GRPO by @danielhanchen in #4265
- [Fix] Respect llm_int8_skip_modules for VLM by @danielhanchen in #4249
- ROCM support by @danielhanchen in #4271
- Remove Blackwell flex attention disable workaround from studio by @danielhanchen in #4273
- ROCM support by @danielhanchen in #4272
- fix: prevent ai-assist model config RCE via untrusted Hugging Face repos by @danielhanchen in #4274
- fix(seed): disable remote code execution in seed inspect dataset loads by @danielhanchen in #4275
- Update CODEOWNERS by @danielhanchen in #4279
- fix: install data-designer plugin non-editable for Colab compatibility by @LeoBorcherding in #4268
- Arch/mixtral by @danielhanchen in #4283
- Improve documentation on how to export model from Colab by @danielhanchen in #4284
- feat: Add Mixtral model support by @danielhanchen in #4285
- Initial changes: Refactor Attention by @danielhanchen in #4286
- patch vlm trainer to resize images by @danielhanchen in #4287
- [WIP] add support for mixtral by @danielhanchen in #4288
- studio: speed up setup -- uv for installs (8x), Ninja for llama.cpp (1.7x) by @danielhanchen in #4289
- fix: remove old comments by @Shine1i in #4292
- PR: Windows Setup Improvements by @rolandtannous in #4299
- miscellaneous studio by @Shine1i in #4293
- Fix: Compare Mode Deadlock, Cancel Event Poisoning & IPC Optimization by @rolandtannous in #4303
- studio: fix GGUF inference -- reasoning tokens, max_tokens, server flags, GPU allocation by @danielhanchen in #4290
- chat only with gguf for mac devices by @Manan17 in #4300
- studio: add max steps and epochs toggle switch by @Imagineer99 in #4296
- Fix/colab plugin editable install by @LeoBorcherding in #4281
- Graceful shutdown on Windows (signal handlers for Ctrl+C) by @rolandtannous in #4306
- studio: simplify auth UX to password-only login by @Imagineer99 in #4305
- studio: preserve save_steps when toggling to epochs mode by @Imagineer99 in #4308
- Fix studio frontend build producing empty Tailwind CSS by @danielhanchen in #4311
- Fix setup.sh crash on Mac with empty gitignore array by @danielhanchen in #4313
- [Feature] studio: user can upload eval dataset by @Manan17 in #4307
- fix: Ctrl+C not terminating backend on Linux by @rolandtannous in #4316
- Add download progress bar for non-GGUF models in Chat by @danielhanchen in #4314
- Apply use_reentrant removal to all TRL trainer configs by @danielhanchen in #4321
- Fix VLM GRPO matmul shape mismatch in _get_per_token_logps_and_entropies by @danielhanchen in #4301
- Improve AI Assist: Update default model, model output parsing, logging, and dataset mapping UX by @rolandtannous in #4323
- studio: per-model inference defaults, GGUF slider fix, reasoning toggle by @danielhanchen in #4325
- fix: Resolve CUDA toolkit mismatch on multi-CUDA Windows systems by @rolandtannous in #4324
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #4332
- Fix/colab comment edits by @LeoBorcherding in #4317
- fix: add Qwen3.5 version gate in loader dispatch by @danielhanchen in #4335
- Fix xformers Blackwell guard: broader coverage and root cause docs by @danielhanchen in #4338
- studio: improve Colab notebook, redesign ready popup, and clean up install output by @LeoBorcherding in #4339
- Add check to disable xformers on newer GPUs by @pluesclues in #4342
- studio: training progress, CUDA lib path, dataset_num_proc fix by @danielhanchen in #4336
- studio: fix stale GGUF metadata, update helper model, auth improvements by @danielhanchen in #4346
- studio: show "Off" for repetition penalty = 1 by @danielhanchen in #4349
- studio: update Creative/Precise presets, show "Off" for disabled samplers by @danielhanchen in #4350
- studio: fix slow cancellation of GGUF generation by @danielhanchen in #4352
- Fix: Remove unused `warmupToastShown` variable (TS6133) by @rolandtannous in #4353
- Studio: SVG preview, fix streaming and model selector bugs by @danielhanchen in #4354
- fix: comment out debug print statements by @rolandtannous in #4357
- fix(llm_assist): disable thinking mode for helper model JSON output by @rolandtannous in #4358
- studio: improve onboarding UX, tooltips, and training defaults by @danielhanchen in #4355
New Contributors
- @LeoBorcherding made their first contribution in #4268
- @Shine1i made their first contribution in #4292
- @Manan17 made their first contribution in #4300
- @Imagineer99 made their first contribution in #4296
Full Changelog: https://github.com/unslothai/unsloth/commits/March-2026
llama.cpp prebuilt b8457
Install-ready Unsloth Studio llama.cpp bundles for b8457.
12x Faster MoE Training + Embedding support!
Our first release of 2026! This year we’ve got a lot of exciting things coming and to kick things off, we’re introducing faster MoE training, embedding model support, and ultra long context for Reinforcement Learning. We’ll also be launching our brand new UI very soon.
We’d like to thank all of you for 50K stars on GitHub! ⭐
We’ve also added support for many new models that you can now run and fine-tune locally, including DeepSeek-OCR 2, GLM-4.7-Flash, Kimi-2.5, and more.
🚀 Faster MoE training
You can now train MoE models 12× faster with 35% less VRAM and 6x longer context via our new Triton and math kernels (no accuracy loss). gpt-oss-20b works on 12.8GB VRAM. Qwen3-30B-A3B (16-bit LoRA) uses 63GB.
Unsloth supports fast training for gpt-oss, Qwen3 (30B, 235B, VL, Coder), DeepSeek R1/V3 arch and GLM (4.7, Flash) models.
🔎 Embedding models now train 2× faster
We collaborated with Hugging Face to enable 1.8-3.3x faster embedding, BERT and classifier model training with 20% less VRAM, 2x longer context & no accuracy loss vs. FA2 setups.
💡 Ultra Long Context RL is here
We’re introducing new batching algorithms to enable ~7x longer context (can be more than 12x) RL training with no accuracy or speed degradation vs. other optimized setups that use FA3, kernels & chunked losses.
Unsloth now trains gpt-oss QLoRA with 380K context on a single 192GB NVIDIA B200 GPU
🔮 New models
- 🐳 DeepSeek-OCR 2 - Run and fine-tune the new OCR model.
- 🥝 Kimi 2.5 - Run the SOTA model locally with Unsloth GGUFs.
- ⚡ GLM-4.7-Flash - Run and fine-tune the best-in-class 30B LLM.
🎉 Extra Updates
- As part of our MoE release, we also made Gemma-3 use Flex-Attention by default, and this works in float16 settings as well (there were infinities, which we solved a while back). Gemma-3 now uses O(N) memory instead of O(N^2), and trains >3x faster (scaling even better with context length). Previous Unsloth versions would OOM.
- Vision fine-tuning now accepts mixed datasets containing both image examples and text-only examples!
- `trl==0.27.1` and `transformers==5.1.0` are supported well - previous coverage was 30% of all our 120 notebooks, but now we have >80% coverage - we plan to make it 100% over the next few days.
- And many, many other bug fixes and updates!
📖 New Guides
- </> How To Use Claude Code + Codex with local LLMs: Guide
- 👾 Train & deploy to LM Studio for local inference: Guide
- 🎨 Run Diffusion image models with Unsloth GGUFs: Guide
Tip
Update Unsloth via `pip install --upgrade --force-reinstall --no-cache-dir --no-deps unsloth unsloth_zoo`
If you want PyTorch 2.9: `pip install --upgrade unsloth unsloth_zoo`
February is shaping up to be an amazing month for LLM releases, and we hope you’re just as excited as we are. 😊
What's Changed
- [FIX] [Transformers] VLM input embeds fix for gradients by @Datta0 in #3715
- [fbgemm] Silence tma fbgemm by @Datta0 in #3735
- [hf_hub] Token login by @Datta0 in #3739
- Do not overwrite slots by @Datta0 in #3752
- Fix VLM + DDP checkpointing by @djsaunde in #3751
- Enable 4-bit quantization on AMD Radeon GPUs by @sstamenk in #3748
- Nightly by @danielhanchen in #3753
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #3760
- Nightly by @danielhanchen in #3767
- Add missing import of inspect by @sstamenk in #3778
- Clarify NotImplementedError for fast_inference with full_finetuning by @Fizza-Mukhtar in #3768
- Update FUNDING.yml by @danielhanchen in #3792
- fix(trainer): import psutil to prevent NameError in _prepare_dataset by @alkinun in #3780
- fastrope fix for zero strided tensors by @f14-bertolotti in #3782
- Fix crash when trl.experimental.openenv is unavailable by @Fizza-Mukhtar in #3787
- Fix Boolean value of Tensor ambiguity error in mistral.py by @yurekami in #3790
- fix: add support for init_lora_weights="corda" in get_peft_model by @majiayu000 in #3794
- Fix correctness bugs in rl.py, rl_replacements.py, and vision.py by @danielhanchen in #3811
- Fix correctness bugs across multiple model files by @danielhanchen in #3813
- Fix 3D tensor support for bitsandbytes 8-bit matmul in forward pass by @Fizza-Mukhtar in #3806
- FIX: weight tying for LoRA embeddings and lm_head by @oKatanaaa in #3711
- Fix Gemma3 QAT training instability with int8-int4 scheme by @danielhanchen in #3818
- Add helpful error messages for fast_generate when fast_inference=False by @danielhanchen in #3820
- Bug fixes by @danielhanchen in #3821
- Make llama.cpp CURL dependency optional when building from source by @Fizza-Mukhtar in #3822
- remove redundant code of has_block by @ykaitao in #3832
- rl.py fixes: buffer reset, safer attribute access, typo fix by @danielhanchen in #3834
- Respect user quantization_config by @danielhanchen in #3835
- Fix vLLM PDL bug on Blackwell GPUs (B200/B100) by @danielhanchen in #3841
- Sync chat_template from tokenizer to vLLM by @danielhanchen in #3842
- remove unused variable BlockDiagonalCausalMask by @ykaitao in #3836
- Replace GitHub API check with vLLM version check for PDL fix by @danielhanchen in #3849
- GRPO: restore model mode after generate (stacked on #3754) by @danielhanchen in #3851
- Fix model training state restoration in GRPO trainer by @numb3r33 in #3754
- Unify Version usage and fix TRL version handling by @danielhanchen in #3843
- [ModelScope] Disable stats when modelscope is being used by @Datta0 in #3857
- Fix FBGEMM/CUTLASS errors on SM100 (Blackwell) GPUs by @danielhanchen in #3863
- Feature/raw text dataprep by @Vangmay in #3612
- Fix Kaggle telemetry misclassification when COLAB_ keys exist by @hnxnq7 in #3869
- reduce code duplication by _offload_frozen_module_for_training by @ykaitao in #3865
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #3881
- wrong number of dimensions by @f14-bertolotti in #3880
- Disable gradient checkpointing when explicitly off for vision by @ducviet00 in #3879
- [trl] use non lora model as base for RL by @Datta0 in #3895
- Chunk Across Batch and Context length for logprob calculations for grpo by @pluesclues in #3628
- add weight-only int8 QAT scheme and update tests for torchao 0.15.0 by @electroglyph in #3859
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #3905
- Fix vllm ipykernel patch by @pluesclues in #3907
- Handle Transformers 5 vLLM import errors by @danielhanchen in #3908
- add FastSentenceTransformer for easily finetuning SentenceTransformer models by @electroglyph in #3719
- Guard torch.compile on ROCm when triton_key is missing by @hnxnq7 in #3923
- Grpo compile settings update by @pluesclues in #3927
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #3937
- chore: Update outdated GitHub Actions version by @pgoslatara in #3936
- [trl] vllm trl topk fixup by @Datta0 in #3935
- [fix] qwen3-guard tokenizer by @Datta0 in #3959
- fix for intel devices torch compile configs by @l...
December Release + 3x Faster Training
Thanks for all the love and support this year! We're wishing you all a lovely Christmas. Please update Unsloth & our Docker to use the latest updates! 🦥

- Introducing 3x faster training & 30% less VRAM. New Triton kernels, padding-free & packing. Blog
- 500K Context training and reinforcement learning is now possible on a single 80GB GPU. Blog • Notebook
- Fine-tune then Deploy LLMs on your Phone with PyTorch and Unsloth. Tweet • Read Guide
- 🤗 Transformers v5 is now supported! It's not enabled by default due to possible instability issues.
- Preliminary multi-GPU support: DDP Guide (not representative of the official release early next year)
- More: Sudoku RL nb • Paddle-OCR nb • New NVIDIA blog
- Lots of bug fixes! See further below.
🔮 New Models + Guides
- ✨FunctionGemma: Google's new 270M tool-calling LLM. Guide • Notebook
- Nemotron 3: NVIDIA's new 30B reasoning model. Guide • GGUF
- Mistral: new coding & instruct VLMs. Ministral 3 • Devstral 2
- GLM-4.6V: new vision models. Guide • 4.6V • 4.6V-Flash
- More: Qwen3-Next • Mistral Large 3 • FLUX.2-dev
Tip
Update Unsloth via `pip install --upgrade --force-reinstall --no-cache-dir --no-deps unsloth unsloth_zoo`
If you want PyTorch 2.9: `pip install --upgrade unsloth unsloth_zoo`
Bug Fixes and Enhancements
- Supports `rollout_func`, allowing multi-turn RL to work
- Supports `vllm>=0.12.0` and efficient GRPO for it
- Supports `transformers>=5.0.0`, first shown via our Ministral notebooks
- Fix HuggingFace token logins not working for private repos
- Fixes TorchAO and QAT not working during saving
- Fixed DeepSeek OCR finetuning not loading finetuned models
- Improved vision utilities for vision VLM finetuning
What's Changed
- Fix llama tokenizer padding_side when using model.generate in inference mode by @dmsuehir in #3644
- Fix indefinite article usage in comments and docstrings by @mk0walsk in #3648
- fix rope_theta -> rope_parameters['rope_theta'] by @mmathew23 in #3651
- Fix broken link for advanced pip installation in README by @gitpullpull in #3652
- Fix: prevent load_in_fp8 kwarg from reaching Qwen3MoeForCausalLM constructor (Fix #3649) by @bhuvanprakash in #3654
- make unsloth_tiled_mlp a from_pretrained arg by @mmathew23 in #3655
- FIX set default [128, 128] instead of none by @ved1beta in #3658
- Fix: Pass gradient_checkpointing parameter to model.for_training() by @sbhavani in #3659
- [FIX] Vllm guided decoding params by @Datta0 in #3662
- Vllm guided decoding by @Datta0 in #3663
- Nightly by @danielhanchen in #3664
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #3666
- Update transformers version constraint in pyproject.toml by @noah1510 in #3689
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #3694
- Remove reload_weights rpc call from grpo trainer by @Datta0 in #3673
- [Fix] [TRL] load_lora for multi line llm.chat/generate by @Datta0 in #3696
- Nightly by @danielhanchen in #3698
- SFT sample packing by @djsaunde in #3566
- Auto-enable padding-free SFT by @djsaunde in #3672
- [FIX] fbgemm version check by @Datta0 in #3704
- Nightly by @danielhanchen in #3706
- update TRL filter by @djsaunde in #3707
- [intel] skip xpu fbgemm fp8 by @leizhenyuan in #3625
- Mistral packing, train on completions only, simplifications by @djsaunde in #3709
- Update torchao save by @metascroy in #3679
- Nightly by @danielhanchen in #3720
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #3731
- Bug fixes by @danielhanchen in #3734
- Update FUNDING.yml by @danielhanchen in #3736
- Nightly by @danielhanchen in #3737
- Fix Deepseek OCR Lora Model Load by @mmathew23 in #3738
Unsloth Zoo Changes
- updates for vLLM compatibility with lora by @danielhanchen in unslothai/unsloth-zoo#359
- Nightly by @danielhanchen in unslothai/unsloth-zoo#355
- Add logging to tiled mlp and fix target chunk size calculation by @mmathew23 in unslothai/unsloth-zoo#361
- Remove include_buffers from init_empty_weights by @pluesclues in unslothai/unsloth-zoo#363
- packed seq lengths token count correction by @djsaunde in unslothai/unsloth-zoo#348
- Configure ce target gb by @mmathew23 in unslothai/unsloth-zoo#365
- [FIX] vLLM LoRA extra vocab by @Datta0 in unslothai/unsloth-zoo#367
- Nightly by @danielhanchen in unslothai/unsloth-zoo#368
- [FIX] vLLM local lora tensor loading by @Datta0 in unslothai/unsloth-zoo#370
- vllm lora_dir rename and make embedding padding optional by @danielhanchen in unslothai/unsloth-zoo#373
- Bug fixes by @danielhanchen in unslothai/unsloth-zoo#375
- Update e to error by @ChetanKrishna07 in unslothai/unsloth-zoo#374
- Vision utils decode image improvement by @mmathew23 in unslothai/unsloth-zoo#372
- [FIX] [DDP] Fix compile for distributed training by @Datta0 in unslothai/unsloth-zoo#379
- Nightly by @danielhanchen in unslothai/unsloth-zoo#382
- update compiler for XLMRobertaModel by @electroglyph in unslothai/unsloth-zoo#383
- Fix Deepseek OCR Lora Model Load by @mmathew23 in unslothai/unsloth-zoo#386
- fix for non-generation models in transformers 5 by @electroglyph in unslothai/unsloth-zoo#388
New Contributors
- @dmsuehir made their first contribution in #3644
- @gitpullpull made their first contribution in #3652
- @bhuvanprakash made their first contribution in #3654
- @ved1beta made their first contribution in #3658
- @sbhavani made their first contribution in #3659
- @noah1510 made their first contribution in #3689
- @ChetanKrishna07 made their first contribution in unslothai/unsloth-zoo#374
- @electroglyph made their first contribution in unslothai/unsloth-zoo#383
Full Changelog: November-2025...December-2025
November Release + FP8 Training!
We’re getting close to our final release of 2025! Thanks so much for sticking with us this year. We’ve got lots of new features so please update Unsloth & our Docker to use the latest updates! 🦥

- Introducing FP8 Reinforcement Learning in Unsloth! Train on any FP8 supported GPU and get 1.4x faster with 60% less VRAM: Read our Blog/Guide • Notebooks: Qwen3-8B FP8 GRPO and Llama-3.2-1B FP8 GRPO
- You may notice Unsloth now uses much less VRAM than before, enabling even longer context. We’re also implementing faster training very soon and we’ll share all the details in an upcoming blog.
- DeepSeek-OCR fine-tuning is here! We fine-tuned DeepSeek-OCR, improving its language understanding by 89%. Read our Blog • Free notebook
- Qwen3-VL models supported including GGUFs to run locally: Blogpost + fixes • GGUFs
- We analyzed RL training-inference mismatch for FP16 vs. BF16 and concluded that Unsloth does not have this issue: Analysis and Results
- We’ve partnered with Docker to let you run LLMs locally with zero setup. Docker GGUFs are now powered by Unsloth Dynamic.
  Example: `docker model run hf.co/unsloth/gpt-oss-20b-GGUF:F16` Read guide
- Baidu ERNIE models are now supported. Notebooks coming soon.
- Unsloth now supports SGLang. Read our guide
- We wrote guides for LoRA Hot Swapping and vLLM Engine Arguments
- Run Kimi-K2-Thinking, the most powerful open model, locally. Kimi-K2 Guide
- Lots of bug fixes! See further below.
Tip
Update Unsloth via `pip install --upgrade --force-reinstall --no-cache-dir --no-deps unsloth unsloth_zoo`
If you want PyTorch 2.9: `pip install --upgrade unsloth unsloth_zoo`
Bug Fixes and Enhancements
- Supports `trl>=0.25.0`, `vllm>=0.11.2`, and `transformers>=4.57.1`
- Fixed gpt-oss GRPO and RL excessive re-compilations on `torch>=2.9.0`
- Fixes Sleep mode and reduces memory usage by a further 5 to 15% for RL, GRPO
- Fix propagation of `trust_remote_code = True`
- Fix Unsloth offloaded gradient checkpointing not offloading on 1st step - reduces VRAM by >20%
- Add `logits.detach()` to GRPO to solve double backwards on some pathways
- Add `int64` kernels & fixed RoPE embeddings to allow ultra-long context training
- Fixed 📓 OpenEnv gpt-oss RL notebook
- DGX Spark docker image fixed
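The minimum-version requirements listed above can be verified before training starts. Below is a hypothetical stdlib-only sketch (not Unsloth's actual code) of such a check; the `MINIMUMS` table and helper names are illustrative:

```python
# Hypothetical sketch: verify installed packages meet the minimum versions
# listed above, using only the standard library.
from importlib.metadata import version, PackageNotFoundError

MINIMUMS = {"trl": "0.25.0", "vllm": "0.11.2", "transformers": "4.57.1"}

def parse(v):
    # Keep only the leading numeric dotted part, e.g. "4.57.1.dev0" -> (4, 57, 1)
    parts = []
    for piece in v.split("."):
        if piece.isdigit():
            parts.append(int(piece))
        else:
            break
    return tuple(parts)

def check_requirements(minimums = MINIMUMS):
    # Return a list of human-readable problems; empty list means all good.
    problems = []
    for name, minimum in minimums.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            problems.append(f"{name} is not installed (need >= {minimum})")
            continue
        if parse(installed) < parse(minimum):
            problems.append(f"{name} {installed} is too old (need >= {minimum})")
    return problems
```

Note the tuple comparison naturally handles pre-release suffixes by ignoring them, which is a simplification compared to a full PEP 440 parser.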
What's Changed
- Grpo gradient accumulation edits by @pluesclues in #3390
- Nightly by @danielhanchen in #3532
- Handle TRL version compatibility in rl_replacements.py by @pluesclues in #3540
- Bug fixes by @danielhanchen in #3546
- Sleep trl patch by @Datta0 in #3517
- Detach logits before returning from function by @pluesclues in #3554
- Fix typos in comment by @mk0walsk in #3557
- Formatting & bug fixes by @danielhanchen in #3563
- DeepseekOCR: add trust_remote_code kwarg by @mmathew23 in #3564
- pre-commit CI config by @djsaunde in #3565
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #3576
- Resize rope embeddings for long sequence training by @mmathew23 in #3586
- Patch in tiled mlp by @mmathew23 in #3584
- Support for out-of-source quantizers by @Giuseppe5 in #3534
- Fix: prevent rope_embedding AssertionError by checking kv_seq_len before reuse by @jarrycyx in #3578
- Extend TorchAOConfig to support mobile usecases by @metascroy in #3587
- fix qwen3 vl gradient accumulation by @mmathew23 in #3598
- Do not force set beta to 0 for DAPO by @Datta0 in #3604
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #3606
- Fix broken links and typo in README by @mk0walsk in #3611
- remove pre-commit workflow (covered by pre-commit app) by @djsaunde in #3618
- Add an int64 path for mlp kernels by @mmathew23 in #3614
- Remove grpo requirement bs=num_generations by @mmathew23 in #3609
- Enable FP8 + RL training for bf16 models by @andrewor14 in #3440
- Fix/save torchao model loading logic by @rolandtannous in #3621
- Fix LlamaModel_fast_forward signature to match HF Transformers (Support inputs_embeds) by @MercuryYen in #3623
- Add 128x128 PerBlock FP8 + RL by @andrewor14 in #3629
- Add trust_remote_code parameter to tokenizer by @Etherll in #3631
- [intel] change windows to remove windows-triton for intel xpu by @leizhenyuan in #3168
Unsloth Zoo Changes
- Bug fixes by @danielhanchen in unslothai/unsloth-zoo#327
- Fix GRPO by @danielhanchen in unslothai/unsloth-zoo#328
- fix gpt oss memory calculation for intel device by @leizhenyuan in unslothai/unsloth-zoo#330
- Bug fixes by @danielhanchen in unslothai/unsloth-zoo#331
- Bug fixes by @danielhanchen in unslothai/unsloth-zoo#332
- fixed unbound local error tokenizer-model from cache by @rolandtannous in unslothai/unsloth-zoo#333
- Now it works on a uv venv by @kittawere in unslothai/unsloth-zoo#336
- Gemma3n fix by @mmathew23 in unslothai/unsloth-zoo#338
- [Intel] remove triton windows for intel by @leizhenyuan in unslothai/unsloth-zoo#243
- FP8 training enhancements by @Datta0 in unslothai/unsloth-zoo#337
- GRPO gradient accumulation steps update and DAPO support by @pluesclues in unslothai/unsloth-zoo#308
- Fix/video collate by @mmathew23 in unslothai/unsloth-zoo#342
- Bug fixes by @danielhanchen in unslothai/unsloth-zoo#344
- FP8, Standby and vLLM updates by @Datta0 in unslothai/unsloth-zoo#340
- Put importance sampling into no grad by @pluesclues in unslothai/unsloth-zoo#343
- Detach hidden states to avoid gradient carry by @pluesclues in unslothai/unsloth-zoo#345
- Bug fixes by @danielhanchen in unslothai/unsloth-zoo#347
- MoE: Cast routing_weights dtype correctly by @mmathew23 in unslothai/unsloth-zoo#349
- return local model in determine_base_model_source with any quantization by @noah1510 in unslothai/unsloth-zoo#334
- Enable FP8 + RL training by @andrewor14 in unslothai/unsloth-zoo#351
- Tiled MLP Implementation by @mmathew23 in unslothai/unsloth-zoo#350
- Fix gradient checkpointing layer caller kwargs by @mmathew23 in unslothai/unsloth-zoo#353
- vLLM weight scale FP8 and standby override by @Datta0 in unslothai/unsloth-zoo#354
- Fix docstring removing regex to support empty parentheses by @noisycat3 in unslothai/unsloth-zoo#360
Unsloth Notebooks Changes
- Feat/qwen3 vl by @Erland366 in unslothai/notebooks#119
- Feat/double footer fix by @Erland366 in unslothai/notebooks#121
- Add GGUF section for Qwen3-VL by @Etherll in unslothai/notebooks#123
- Fix TypeError in unsloth_push_to_hub_gguf() when pushing GGUF model to Hugging Face by @samanta-sc in unslothai/notebooks#125
- fix `'TorchAOConfig' object has no attribute 'base_config'` ...
October Release + Unsloth Docker!
Hey everyone, please update Unsloth to use the latest updates! 🦥
- Unsloth now has its own 🐋 Docker image! Start training with no setup: Read our Guide • Docker image
- We collabed with NVIDIA for Blackwell and DGX Spark support. Read our Blackwell guide and DGX guide.

New model updates
- Qwen3-VL models are all now supported: Blogpost • SFT 8B notebook • GRPO 8B notebook
- IBM Granite-4.0 models are now supported. Granite-4.0 guide • Notebook
- OpenAI showcased our new gpt-oss RL notebook for autonomously solving the 2048 game. Blogpost • Notebook
- Read about our GLM-4.6 chat template fixes and how to run the model here
New features
- Introducing Quantization-Aware Training: We collabed with PyTorch for QAT, recovering as much as 70% accuracy. Read blog

- Unsloth supports OpenEnv to allow for open RL environments. Blog coming soon • Notebook
- New customer support agent notebook to enable real-time analysis & solving of customer interactions. You'll also learn how to train models using data from Google Sheets.
- Python 3.13, PyTorch 2.9, and the latest Hugging Face TRL and Transformers releases are now supported.
- Save to TorchAO supported as well:

```python
from torchao.quantization import Int4WeightOnlyConfig
model.save_pretrained_torchao("model", tokenizer, torchao_config = Int4WeightOnlyConfig())
```

Tip
Update Unsloth via `pip install --upgrade --force-reinstall --no-cache-dir --no-deps unsloth unsloth_zoo`
If you want PyTorch 2.9: `pip install --upgrade unsloth unsloth_zoo`
RL Improvements
- Fixed Standby consuming more VRAM than usual. Auto selects the maximum 80% to 95% of GPU utilization if `import os; os.environ["UNSLOTH_VLLM_STANDBY"] = "1"` is used.
- Fixed GRPO training hangs with better environment timers - works on DGX Spark and all other GPUs.
- Fixes GRPO `RuntimeError: shape '[1, 887, 1, 128]' is invalid for input of size 3633152` for all models
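The Standby flag above is an environment variable, so in script form it would look roughly like the sketch below. The assumption (hedged, not confirmed by this changelog) is that such flags are read at import time, so they should be set before `unsloth` is imported:

```python
# Minimal sketch: enable vLLM standby mode before importing Unsloth.
# Assumption: Unsloth reads this flag at import time, so set it first.
import os

os.environ["UNSLOTH_VLLM_STANDBY"] = "1"

# from unsloth import FastLanguageModel  # import only after setting the flag
```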
RL Environment functions
- New `execute_with_time_limit` function to force functions to execute within a time limit. E.g. with a 2 second time limit, use:

```python
from unsloth import execute_with_time_limit

@execute_with_time_limit(2)
def execute_strategy(strategy, game):
    return _execute_strategy(strategy, game)

try:
    execute_strategy(strategy, game)
except TimeoutError as e:
    print(f"Timed out with error = {str(e)}")
```

- To check if only Python standard modules are used in a function, use `check_python_modules`.
- Use `create_locked_down_function` to create a function without leakage of global variables.
- Use `Benchmarker` i.e. `from unsloth import Benchmarker` to benchmark functions accurately. It approximately wipes the L1 to L3 caches to reduce chances of benchmark cheating.
- Use `launch_openenv` to launch a continuously reloaded OpenEnv environment process (to stop it from shutting down) i.e. `from unsloth import launch_openenv`. It will auto find a port that is not used.
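The "auto find a port that is not used" behaviour can be illustrated with a standard trick: bind to port 0 and let the OS assign a free ephemeral port. This is a generic sketch of the technique, not Unsloth's actual `launch_openenv` implementation:

```python
# Illustrative sketch (not Unsloth's code): find an unused TCP port by
# binding to port 0, which asks the OS to pick a free ephemeral port.
import socket

def find_free_port(host = "127.0.0.1"):
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind((host, 0))          # port 0 => OS assigns a free port
        return s.getsockname()[1]  # the port the OS chose

port = find_free_port()
```

One caveat with this pattern: the socket is closed before the server rebinds the port, so there is a small race window where another process could grab it first.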
Bug fixes
- GPT-OSS BF16: the GPT-OSS router now works with `load_in_4bit = True`, fixing `AttributeError: 'GptOssTopKRouter' object has no attribute 'weight'`
- Mistral training fixed - sentencepiece proto issue fixed (any protobuf version works)
- Fix evaluation i.e. `UNSLOTH_RETURN_LOGITS="1"` works. Fixes #3126 #3071
- Fixes `Output 0 of UnslothFusedLossBackward is a view and is being modified inplace.` for Gemma 3 and `transformers>=4.57.1`
- If you see `ImportError: cannot import name '_Ink' from 'PIL._typing' (/usr/local/lib/python3.12/dist-packages/PIL/_typing.py)` please update and use our new notebooks
Don't forget to also join our Reddit: r/unsloth 🥰
What's Changed
- Fix loading as 8bit by @Etherll in #3384
- Nightly by @danielhanchen in #3392
- Nightly by @danielhanchen in #3394
- Update int8-int4 QAT config to use Int8DynamicActivationIntxWeightConfig by @metascroy in #3391
- Gemma 3 bug fixes by @danielhanchen in #3410
- Transformers Fix v4.57 rename from PretrainedConfig to PreTrainedConfig by @mmathew23 in #3445
- improve qat by @Etherll in #3446
- Fix eval metric issue by @pluesclues in #3420
- [Part2] Reinstate llama.cpp Compatibility and GGUF Conversion with Multiple Quantizations and Automated Ollama Modelfile Creation by @rolandtannous in #3356
- vLLM FP8 quantized support for SFT/GRPO by @Datta0 in #3414
- Fix by @danielhanchen in #3466
- AMD fixes by @danielhanchen in #3467
- Fix transformers 4.57.1 by @danielhanchen in #3473
- GRPO bug fixes by @danielhanchen in #3474
- EOL LF (unix line endings) normalization by @djsaunde in #3478
- Fix out of resources issue for llama3.2 sft on amd gpu by @wangxunx in #3455
- Bug fixes by @danielhanchen in #3483
- Bug fixes by @danielhanchen in #3484
- Patch sleep mode properly for trl by @Datta0 in #3492
- Sleep trl patch by @Datta0 in #3494
- fix cross entropy loss issue for small vocab size on amd gpu by @wangxunx in #3503
- Gemma 3n fix by @mmathew23 in #3499
- enable intel for torch2.8 by @leizhenyuan in #3381
- add code for intel qlora by @leizhenyuan in #3370
- fix for intel memory calculation by @leizhenyuan in #3513
- [intel] enable support 2.9 for intel xpu by @leizhenyuan in #3514
- FP8 training enhancements by @Datta0 in #3496
New Contributors
- @metascroy made their first contribution in #3391
- @djsaunde made their first contribution in #3478
- @wangxunx made their first contribution in #3455
Full Changelog: September-2025-v3...October-2025
gpt-oss Reinforcement Learning + Auto Kernel Notebook
We’re introducing gpt-oss RL support, with the fastest RL inference and lowest VRAM use of any implementation. Blog: https://docs.unsloth.ai/new/gpt-oss-reinforcement-learning
- Unsloth now offers the fastest inference (~3x faster), lowest VRAM (50% less) and most context (8x longer) for gpt-oss RL vs. any implementation - with no accuracy loss.
- Since RL on gpt-oss isn't yet vLLM compatible, we rewrote Transformers inference code to enable faster inference
- gpt-oss-20b GSPO free Colab notebook
- This notebook automatically creates faster matrix multiplication kernels and uses a new Unsloth reward function. We also show how to counteract reward-hacking which is one of RL's biggest challenges.
- We previously released Vision RL with GSPO support
⚠️ Reminder to NOT use Flash Attention 3 for gpt-oss as it'll make your training loss wrong.
- DeepSeek-V3.1-Terminus is here and you can run it locally via our GGUF. Read how our 3-bit GGUF beats Claude-4-Opus (thinking) on Aider Polyglot here
- Magistral 1.2 is here and you can run it locally here or fine-tune it for free by using our Kaggle notebook
- Fine-tuning the new Qwen3 models including Qwen3-VL, Qwen3-Omni and Qwen3-Next should work in Unsloth if you install the latest transformers. The models are big however so ensure you have enough VRAM.
- BERT is now fixed! Feel free to use our BERT fine-tuning notebook
Don't forget to also join our Reddit: r/unsloth 🥰
What's Changed
- Bug fixes by @danielhanchen in #3329
- Fix QAT + LoRA fast path, add tests by @andrewor14 in #3307
- Use gemma3n embedder patch + adjust FORCE_FLOAT32 match logic by @mmathew23 in #3332
- Synthetic Data updates by @mmathew23 in #3333
- Fix loading issues for BERT by @Etherll in #3339
- Bug fixes by @danielhanchen in #3335
- peft_config before model_config by @mmathew23 in #3342
- specify different tokenizer_path/name by @mmathew23 in #3343
- correct python support statement by @laz-001 in #3374
- GPT OSS RL by @danielhanchen in #3362
New Contributors
Full Changelog: September-2025-v2...September-2025-v3