Releases: facebookresearch/xformers

v0.0.35: Rely on upstream FA3

20 Feb 15:02
@lw

Pre-built binary wheels are available for PyTorch 2.10.0 (and later).

Improved

  • Added support for free-threaded Python.
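
Free-threaded builds can be detected at runtime; a minimal sketch using only the standard library (no xFormers APIs assumed):

```python
import sys
import sysconfig

# Py_GIL_DISABLED is 1 in free-threaded ("nogil") CPython builds (3.13+);
# on regular builds it is 0, and on older versions it is absent (None).
is_free_threaded = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
print(f"Python {sys.version_info.major}.{sys.version_info.minor}, "
      f"free-threaded: {is_free_threaded}")
```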

Removed

  • Stopped bundling pre-built versions of Flash-Attention 3; xFormers now relies on the wheels provided by the PyTorch package indices.
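
Since the Flash-Attention 3 wheels live on the PyTorch indices, installing xFormers from the same index keeps the two in sync. A sketch; the `cu126` tag below is an example and should match your CUDA version:

```
pip install -U xformers --index-url https://download.pytorch.org/whl/cu126
```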

v0.0.34: Stable wheels for PyTorch 2.10+

23 Jan 12:58
@lw

Pre-built binary wheels are available for PyTorch 2.10.0 (and later).

Improved

  • Migrated xFormers to the PyTorch stable API/ABI, which means that binary builds targeting PyTorch 2.10+ will remain compatible with any later PyTorch version.

Removed

  • Removed the optimized SwiGLU fast-path (which was only available on A100 GPUs)
  • Removed most legacy components

v0.0.33.post2: Wheels for PyTorch 2.9.1

03 Dec 14:27
@lw

Release v0.0.33.post2 for PyTorch 2.9.1

v0.0.33.post1

13 Nov 14:25

Fixed wheel upload to PyPI

Support for PyTorch 2.9

12 Nov 13:49

Added

  • CUTLASS fMHA op for Blackwell GPUs
  • Support for the flash-attention package up to 2.8.3
  • Exposed FA3's deterministic mode
  • Forward+backward pass overlap for DeepSeek-style communication/compute overlap

Improved

  • merge_attentions now supports irregular head dimensions

v0.0.32.post2

15 Aug 05:58
5d4b92a

Add ROCm 6.4 build

v0.0.32.post1

14 Aug 12:13
840bcec

wheels/windows timeout (#1309)

  • Try building with `MAX_JOBS=3`
  • Update wheels_build.yml

v0.0.32: Wheels for PyTorch 2.8.0

13 Aug 19:29

Pre-built binary wheels are available for PyTorch 2.8.0.

Added

  • Support for the flash-attention package up to 2.8.2
  • Speed improvements to python -m xformers.profiler.find_slowest

Removed

  • Removed autograd backward pass for merge_attentions as it is easy to use incorrectly.
  • Attention biases are no longer torch.Tensor subclasses. This is no longer
    necessary for torch.compile to work, and it added unnecessary complexity.

v0.0.31.post1: Fixed wheels for Windows

08 Jul 09:36

Removed merge_attentions backward (fairinternal/xformers#1402)


v0.0.31: PyTorch 2.7.1, Flash3 on Windows, and dropping V100 support

25 Jun 09:11

[0.0.31] - 2025-06-25

Pre-built binary wheels are available for PyTorch 2.7.1.

Added

  • xFormers wheels are now Python-version agnostic: the same wheel can be used for Python 3.9, 3.10, ... 3.13
  • Added support for Flash-Attention 3 on Ampere GPUs

Removed

  • We will no longer support V100 or older GPUs, following PyTorch (pytorch/pytorch#147607)
  • Deprecated support for building Flash-Attention 2 as part of xFormers. For Ampere GPUs, we now use Flash-Attention 3 on Windows, while Flash-Attention 2 can still be used through PyTorch on Linux.