v3.3.5
What's Changed
- [gaudi] Refine rope memory, do not need to keep sin/cos cache per layer by @sywangyi in #3274
- Gaudi: add CI by @baptistecolle in #3160
- [gaudi] Gemma3 sliding window support by @sywangyi in #3280
- xpu lora support by @sywangyi in #3232
- Optimum neuron 0.2.2 by @dacorvo in #3281
- [gaudi] Remove unnecessary reinitialize to HeterogeneousNextTokenChooser to m… by @sywangyi in #3284
- [gaudi] Deepseek v2 mla and add ep to unquantized moe by @sywangyi in #3287
- [gaudi] Fix the CI test errors by @yuanwu2017 in #3286
- Hpu gptq gidx support by @sywangyi in #3297
- Migrate to V2 Pydantic interface by @emmanuel-ferdman in #3262
- Xccl by @sywangyi in #3252
- Multi modality fix by @sywangyi in #3283
- Some GPTQ cases could not be handled by IPEX but could be handled by t… by @sywangyi in #3298
- fix outline import issue by @sywangyi in #3282
- HuggingFaceM4/Idefics3-8B-Llama3 crash fix by @sywangyi in #3267
- Optimum neuron 0.3.0 by @tengomucho in #3308
- Disable Cachix pushes by @danieldk in #3312
- chore: prepare version 3.3.5 by @tengomucho in #3314
- feat: bump flake including transformers and huggingface_hub versions by @drbh in #3313
Full Changelog: v3.3.4...v3.3.5