
Releases: huggingface/diffusers

Instruct-Pix2Pix, DiT, LoRA

25 Jan 16:10

🪄 Instruct-Pix2Pix

Instruct-Pix2Pix is a Stable Diffusion model fine-tuned for editing images from human instructions. Given an input image and a written instruction that tells the model what to do, the model follows the instruction to edit the image.


The model was released with the paper InstructPix2Pix: Learning to Follow Image Editing Instructions, where more details about the model can be found.

pip install diffusers transformers safetensors accelerate
from PIL import Image, ImageOps
import requests
import torch
from diffusers import StableDiffusionInstructPix2PixPipeline

model_id = "timbrooks/instruct-pix2pix"
pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(model_id, torch_dtype=torch.float16).to("cuda")

url = "https://huggingface.co/datasets/diffusers/diffusers-images-docs/resolve/main/mountain.png"
def download_image(url):
    image = PIL.Image.open(requests.get(url, stream=True).raw)
    image = PIL.ImageOps.exif_transpose(image)
    image = image.convert("RGB")
    return image
image = download_image(url)

prompt = "make the mountains snowy"
edit = pipe(prompt, image=image, num_inference_steps=20, image_guidance_scale=1.5, guidance_scale=7).images[0]
edit.save("snowy_mountains.png")

🤖 DiT

Diffusion Transformers (DiT) is a class-conditional latent diffusion model that replaces the commonly used U-Net backbone with a transformer operating on latent patches. The pretrained model is trained on ImageNet-1K and can generate class-conditional images at 256x256 or 512x512 pixels.


The model was released with the paper Scalable Diffusion Models with Transformers.

import torch
from diffusers import DiTPipeline

model_id = "facebook/DiT-XL-2-256"
pipe = DiTPipeline.from_pretrained(model_id, torch_dtype=torch.float16).to("cuda")

# pick words that exist in ImageNet
words = ["white shark", "umbrella"]
class_ids = pipe.get_label_ids(words)

output = pipe(class_labels=class_ids)
image = output.images[0]  # label 'white shark'

⚡ LoRA

LoRA is a technique for performing parameter-efficient fine-tuning for large models. LoRA works by adding so-called "update matrices" to specific blocks of a pre-trained model. During fine-tuning, only these update matrices are updated while the pre-trained model parameters are kept frozen. This allows us to achieve greater memory efficiency as well as easier portability during fine-tuning.
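
To make the idea concrete, here is a minimal sketch of a LoRA-augmented linear layer. This is illustrative only, not the diffusers implementation; all names are made up:

import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wraps a frozen nn.Linear with a trainable low-rank update: W(x) + scale * B(A(x))."""

    def __init__(self, base: nn.Linear, rank: int = 4, scale: float = 1.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)  # pre-trained weights stay frozen
        self.down = nn.Linear(base.in_features, rank, bias=False)  # A: d_in -> r
        self.up = nn.Linear(rank, base.out_features, bias=False)   # B: r -> d_out
        nn.init.zeros_(self.up.weight)  # updates start at zero, so outputs are initially unchanged
        self.scale = scale

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * self.up(self.down(x))

# only `down` and `up` are trained, so a checkpoint needs to store just these small matrices
layer = LoRALinear(nn.Linear(320, 320), rank=4)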

LoRA was proposed in LoRA: Low-Rank Adaptation of Large Language Models. In the original paper, the authors investigated LoRA for fine-tuning large language models like GPT-3. cloneofsimo was the first to try out LoRA training for Stable Diffusion in the popular lora GitHub repository.

Diffusers now supports LoRA! This means you can now fine-tune a model like Stable Diffusion using consumer GPUs like Tesla T4 or RTX 2080 Ti. LoRA support was added to UNet2DConditionModel and the DreamBooth training script by @patrickvonplaten in #1884.

By using LoRA, the fine-tuned checkpoints are just 3 MB in size. After fine-tuning, you can use the LoRA checkpoints like so:

from diffusers import StableDiffusionPipeline
import torch

model_path = "sayakpaul/sd-model-finetuned-lora-t4"
pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16)
pipe.unet.load_attn_procs(model_path)
pipe.to("cuda")

prompt = "A pokemon with blue eyes."
image = pipe(prompt, num_inference_steps=30, guidance_scale=7.5).images[0]
image.save("pokemon.png")


You can follow the documentation and training examples to learn more about how to use LoRA in diffusers.

📐 Customizable Cross Attention

LoRA leverages a new method to customize the cross attention layers deep in the UNet. This can be useful for other creative approaches such as Prompt-to-Prompt, and it makes it easier to apply optimized attention implementations like xFormers. This new "attention processor" abstraction was created by @patrickvonplaten in #1639 after discussing the design with the community, and we have used it to rewrite our xFormers and attention slicing implementations!
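
As a hedged illustration of the abstraction, the snippet below swaps in an attention processor explicitly. The module path and class name follow the v0.12-era API (diffusers.models.cross_attention) and should be treated as assumptions for other versions:

import torch
from diffusers import StableDiffusionPipeline
from diffusers.models.cross_attention import CrossAttnProcessor

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# replace every attention processor in the UNet with the default implementation;
# a custom object exposing the same __call__ interface could be substituted here
pipe.unet.set_attn_processor(CrossAttnProcessor())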

🌿 Flax => PyTorch

A long-requested feature: prolific community member @camenduru took up the gauntlet in #1900 and created a way to convert Flax model weights to PyTorch. This means that you can train or fine-tune models super fast using Google TPUs, and then convert the weights to PyTorch for everybody to use. Thanks @camenduru!
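
A minimal sketch of the conversion path (the repo id and "flax" revision are illustrative; from_flax=True requires flax to be installed):

from diffusers import StableDiffusionPipeline

# load Flax weights and convert them to PyTorch on the fly
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", revision="flax", from_flax=True
)

# re-save as regular PyTorch weights for everybody to use
pipe.save_pretrained("./stable-diffusion-v1-5-pt")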

🌀 Flax Img2Img

Another community member, @dhruvrnaik, ported the image-to-image pipeline to Flax in #1355! Using a TPU v2-8 (available in Colab's free tier), you can generate 8 images at once in a few seconds!
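
A hedged sketch of batched Flax img2img across all TPU devices follows; the prepare_inputs helper and call arguments mirror the Flax text-to-image pipeline and should be treated as assumptions:

import jax
import jax.numpy as jnp
from flax.jax_utils import replicate
from flax.training.common_utils import shard
from PIL import Image
from diffusers import FlaxStableDiffusionImg2ImgPipeline

pipe, params = FlaxStableDiffusionImg2ImgPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", revision="flax", dtype=jnp.bfloat16
)

num_devices = jax.device_count()  # 8 on a TPU v2-8
init_img = Image.open("sketch.png").convert("RGB").resize((512, 512))
prompt_ids, processed_image = pipe.prepare_inputs(
    prompt=["a fantasy landscape"] * num_devices, image=[init_img] * num_devices
)

# replicate the weights and shard the inputs across devices
params = replicate(params)
prompt_ids, processed_image = shard(prompt_ids), shard(processed_image)
rng = jax.random.split(jax.random.PRNGKey(0), num_devices)

images = pipe(
    prompt_ids=prompt_ids, image=processed_image, params=params,
    prng_seed=rng, strength=0.75, num_inference_steps=50, jit=True
).images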

🎲 DEIS Scheduler

DEIS (Diffusion Exponential Integrator Sampler) is a new fast multistep scheduler that can generate high-quality samples in fewer steps.
The scheduler was introduced in the paper Fast Sampling of Diffusion Models with Exponential Integrator, where more details can be found.

from diffusers import StableDiffusionPipeline, DEISMultistepScheduler
import torch

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16)
pipe.scheduler = DEISMultistepScheduler.from_config(pipe.scheduler.config)
pipe = pipe.to("cuda")

prompt = "a photo of an astronaut riding a horse on mars"
generator = torch.Generator(device="cuda").manual_seed(0)
image = pipe(prompt, generator=generator, num_inference_steps=25).images[0]

  • feat: add log-rho DEIS multistep scheduler by @qsh-zh in #1432

Reproducibility

One can now pass CPU generators to all pipelines even if the pipeline is on GPU. This ensures
much better reproducibility across GPU hardware:

import torch
from diffusers import DDIMPipeline
import numpy as np

model_id = "google/ddpm-cifar10-32"

# load model and scheduler
ddim = DDIMPipeline.from_pretrained(model_id)
ddim.to("cuda")

# create a CPU generator for reproducibility (works even though the pipeline is on GPU)
generator = torch.manual_seed(0)

# run pipeline for just two steps and return numpy tensor
image = ddim(num_inference_steps=2, output_type="np", generator=generator).images
print(np.abs(image).sum())

See: #1902 and https://huggingface.co/docs/diffusers/using-diffusers/reproducibility

Important New Guides

Important Bug Fixes

  • Don't download safetensors if library is not installed: #2057
  • Make sure that save_pretrained(...) doesn't accidentally delete files: #2038
  • Fix CPU offload docs for maximum memory gain: #1968
  • Fix conversion for exotically sorted weight names: #1959
  • Fix intermediate checkpointing for textual inversion, thanks @lstein #2072


v0.11.1: Patch release

20 Dec 16:32

This patch release fixes a bug with num_images_per_prompt in the UnCLIPPipeline.

v0.11.0: Karlo UnCLIP, safetensors, pipeline versions

19 Dec 17:47

🪄 Karlo UnCLIP by Kakao Brain

Karlo is a text-conditional image generation model based on OpenAI's unCLIP architecture, with an improved super-resolution module that upscales from 64px to 256px, recovering high-frequency details in a small number of denoising steps.

This alpha version of Karlo is trained on 115M image-text pairs, including a high-quality subset of COYO-100M, plus CC3M and CC12M.
For more information about the architecture, see the Karlo repository: https://github.com/kakaobrain/karlo

pip install diffusers transformers safetensors accelerate
import torch
from diffusers import UnCLIPPipeline

pipe = UnCLIPPipeline.from_pretrained("kakaobrain/karlo-v1-alpha", torch_dtype=torch.float16)
pipe = pipe.to("cuda")

prompt = "a high-resolution photograph of a big red frog on a green leaf."
image = pipe(prompt).images[0]


🐙 Community pipeline versioning

The community pipelines hosted in diffusers/examples/community will now follow the installed version of the library.

E.g. if you have diffusers==0.9.0 installed, the pipelines from the v0.9.0 branch will be used: https://github.com/huggingface/diffusers/tree/v0.9.0/examples/community

If you've installed diffusers from source, e.g. with pip install git+https://github.com/huggingface/diffusers then the latest versions of the pipelines will be fetched from the main branch.

To change the custom pipeline version, pass the custom_revision argument like so:

pipeline = DiffusionPipeline.from_pretrained(
    "google/ddpm-cifar10-32", custom_pipeline="one_step_unet", custom_revision="0.10.2"
)

🦺 safetensors

Many of the most important checkpoints now have safetensors (https://github.com/huggingface/safetensors) weights available. Upon installing safetensors with:

pip install safetensors

You will see a nice speed-up when loading your model 🚀

Some of the most important checkpoints now have safetensors weights.

Batched generation bug fixes 🐛

We fixed a lot of bugs for batched generation. All pipelines should now correctly process batches of prompts and images 🤗
We also made it much easier to tweak images with reproducible seeds:
https://huggingface.co/docs/diffusers/using-diffusers/reusing_seeds

📝 Changelog

v0.10.2: Patch release

09 Dec 18:19

This patch removes the hard requirement for transformers>=4.25.1, in case external libraries downgrade transformers on startup in a way the user cannot control.

🚨🚨🚨 Note that xformers is not automatically enabled anymore 🚨🚨🚨

The reasons for this are given in #1640 (comment):

We should not automatically enable xformers, for three reasons:

  1. It's not a PyTorch-like API. PyTorch doesn't enable all the fastest options available by default.
  2. We would allocate GPU memory before the user even calls .to("cuda").
  3. This behavior is not consistent with cases where xformers is not installed.

=> This means: if you relied on xformers being enabled automatically, please make sure to add the following now:

import logging

from diffusers.utils.import_utils import is_xformers_available

logger = logging.getLogger(__name__)

unet = ...  # load unet

if is_xformers_available():
    try:
        unet.enable_xformers_memory_efficient_attention()
    except Exception as e:
        logger.warning(
            "Could not enable memory efficient attention. Make sure xformers is installed"
            f" correctly and a GPU is available: {e}"
        )

for the UNet (e.g. in dreambooth) or for the pipeline:

import logging

from diffusers.utils.import_utils import is_xformers_available

logger = logging.getLogger(__name__)

pipe = ...  # load pipeline

if is_xformers_available():
    try:
        pipe.enable_xformers_memory_efficient_attention()
    except Exception as e:
        logger.warning(
            "Could not enable memory efficient attention. Make sure xformers is installed"
            f" correctly and a GPU is available: {e}"
        )

v0.10.1: Patch release

09 Dec 14:12

This patch returns enable_xformers_memory_efficient_attention() to UNet2DConditionModel to restore backward compatibility.

v0.10.0: Depth Guidance and Safer Checkpoints

08 Dec 18:32

🐳 Depth-Guided Stable Diffusion and 2.1 checkpoints

The new depth-guided Stable Diffusion model is fully supported in this release. It is conditioned on monocular depth estimates inferred via MiDaS and can be used for structure-preserving img2img and shape-conditional synthesis.

image

Installing the transformers library from source is required for the MiDaS model:

pip install --upgrade git+https://github.com/huggingface/transformers/
import torch
import requests
from PIL import Image
from diffusers import StableDiffusionDepth2ImgPipeline

pipe = StableDiffusionDepth2ImgPipeline.from_pretrained(
   "stabilityai/stable-diffusion-2-depth",
   torch_dtype=torch.float16,
).to("cuda")

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
init_image = Image.open(requests.get(url, stream=True).raw)

prompt = "two tigers"
n_propmt = "bad, deformed, ugly, bad anotomy"
image = pipe(prompt=prompt, image=init_image, negative_prompt=n_propmt, strength=0.7).images[0]

The updated Stable Diffusion 2.1 checkpoints are also released and fully supported.

🦺 Safe Tensors

We now support SafeTensors: a new simple format for storing tensors safely (as opposed to pickle) that is still fast (zero-copy).

| Format | Safe | Zero-copy | Lazy loading | No file size limit | Layout control | Flexibility | Bfloat16 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| pickle (PyTorch) | ✗ | ✗ | ✗ | ✓ | ✗ | ✓ | ✓ |
| H5 (Tensorflow) | ✓ | ✗ | ✓ | ✓ | ~ | ~ | ✗ |
| SavedModel (Tensorflow) | ✓ | ✗ | ✗ | ✓ | ✗ | ✗ | ✓ |
| MsgPack (flax) | ✓ | ✓ | ✗ | ✓ | ✗ | ✗ | ✓ |
| SafeTensors | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ | ✓ |

More details about the comparison here: https://github.com/huggingface/safetensors#yet-another-format-

pip install safetensors
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-1")
pipe.save_pretrained("./safe-stable-diffusion-2-1", safe_serialization=True)

# you can also push this checkpoint to the HF Hub and load from there
safe_pipe = StableDiffusionPipeline.from_pretrained("./safe-stable-diffusion-2-1")

New Pipelines

🖌️ Paint-by-example

An implementation of Paint by Example: Exemplar-based Image Editing with Diffusion Models by Binxin Yang, Shuyang Gu, Bo Zhang, Ting Zhang, Xuejin Chen, Xiaoyan Sun, Dong Chen, and Fang Wen.


import PIL
import requests
import torch
from io import BytesIO
from diffusers import DiffusionPipeline

def download_image(url):
    response = requests.get(url)
    return PIL.Image.open(BytesIO(response.content)).convert("RGB")

img_url = "https://raw.githubusercontent.com/Fantasy-Studio/Paint-by-Example/main/examples/image/example_1.png"
mask_url = "https://raw.githubusercontent.com/Fantasy-Studio/Paint-by-Example/main/examples/mask/example_1.png"
example_url = "https://raw.githubusercontent.com/Fantasy-Studio/Paint-by-Example/main/examples/reference/example_1.jpg"

init_image = download_image(img_url).resize((512, 512))
mask_image = download_image(mask_url).resize((512, 512))
example_image = download_image(example_url).resize((512, 512))

pipe = DiffusionPipeline.from_pretrained("Fantasy-Studio/Paint-by-Example", torch_dtype=torch.float16)
pipe = pipe.to("cuda")

image = pipe(image=init_image, mask_image=mask_image, example_image=example_image).images[0]

Audio Diffusion and Latent Audio Diffusion

Audio Diffusion leverages the recent advances in image generation using diffusion models by converting audio samples to and from mel spectrogram images.

from IPython.display import Audio, display
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained("teticio/audio-diffusion-ddim-256").to("cuda")

output = pipe()
display(output.images[0])
display(Audio(output.audios[0], rate=pipe.mel.get_sample_rate()))

[Experimental] K-Diffusion pipeline for Stable Diffusion

This pipeline was added to support the latest schedulers from @crowsonkb's k-diffusion.
The purpose of this pipeline is to compare scheduler implementations and updates, so new features from other pipelines are unlikely to be supported!

pip install k-diffusion
from diffusers import StableDiffusionKDiffusionPipeline
import torch

pipe = StableDiffusionKDiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-1-base")
pipe = pipe.to("cuda")

pipe.set_scheduler("sample_heun")
image = pipe("astronaut riding horse", num_inference_steps=25).images[0]

New Schedulers

Heun scheduler inspired by Karras et al.

Implements Algorithm 1 of Karras et al.; the scheduler is ported from @crowsonkb's k-diffusion.

from diffusers import HeunDiscreteScheduler, StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-1")
pipe.scheduler = HeunDiscreteScheduler.from_config(pipe.scheduler.config)

Single step DPM-Solver

The method is described in the original DPM-Solver paper and its improved version, DPM-Solver++; the original implementation is linked from the PR below.

  • Add Singlestep DPM-Solver (singlestep high-order schedulers) by @LuChengTHU in #1442

from diffusers import DPMSolverSinglestepScheduler, StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-1")
pipe.scheduler = DPMSolverSinglestepScheduler.from_config(pipe.scheduler.config)

📝 Changelog


v0.9.0: Stable Diffusion 2

25 Nov 17:17

🎨 Stable Diffusion 2 is here!

Installation

pip install diffusers[torch]==0.9 transformers

Stable Diffusion 2.0 is available in several flavors:

Stable Diffusion 2.0-V at 768x768

New Stable Diffusion model (Stable Diffusion 2.0-v) at 768x768 resolution. It has the same number of parameters in the U-Net as 1.5, but uses OpenCLIP-ViT/H as the text encoder and is trained from scratch. SD 2.0-v is a so-called v-prediction model: instead of predicting the noise ε directly, the network predicts v = α_t·ε − σ_t·x_0.


import torch
from diffusers import DiffusionPipeline, DPMSolverMultistepScheduler

repo_id = "stabilityai/stable-diffusion-2"
pipe = DiffusionPipeline.from_pretrained(repo_id, torch_dtype=torch.float16, revision="fp16")
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
pipe = pipe.to("cuda")

prompt = "High quality photo of an astronaut riding a horse in space"
image = pipe(prompt, guidance_scale=9, num_inference_steps=25).images[0]
image.save("astronaut.png")

Stable Diffusion 2.0-base at 512x512

The above model is finetuned from SD 2.0-base, which was trained as a standard noise-prediction model on 512x512 images and is also made available.


import torch
from diffusers import DiffusionPipeline, DPMSolverMultistepScheduler

repo_id = "stabilityai/stable-diffusion-2-base"
pipe = DiffusionPipeline.from_pretrained(repo_id, torch_dtype=torch.float16, revision="fp16")
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
pipe = pipe.to("cuda")

prompt = "High quality photo of an astronaut riding a horse in space"
image = pipe(prompt, num_inference_steps=25).images[0]
image.save("astronaut.png")

Stable Diffusion 2.0 for Inpainting

This model for text-guided inpainting is finetuned from SD 2.0-base. It follows the mask-generation strategy presented in LAMA; the mask, in combination with the latent VAE representation of the masked image, is used as additional conditioning.


import PIL
import requests
import torch
from io import BytesIO
from diffusers import DiffusionPipeline, DPMSolverMultistepScheduler

def download_image(url):
    response = requests.get(url)
    return PIL.Image.open(BytesIO(response.content)).convert("RGB")

img_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo.png"
mask_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo_mask.png"
init_image = download_image(img_url).resize((512, 512))
mask_image = download_image(mask_url).resize((512, 512))

repo_id = "stabilityai/stable-diffusion-2-inpainting"
pipe = DiffusionPipeline.from_pretrained(repo_id, torch_dtype=torch.float16, revision="fp16")
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
pipe = pipe.to("cuda")

prompt = "Face of a yellow cat, high resolution, sitting on a park bench"
image = pipe(prompt=prompt, image=init_image, mask_image=mask_image, num_inference_steps=25).images[0]
image.save("yellow_cat.png")

Stable Diffusion X4 Upscaler

The model was trained on crops of size 512x512 and is a text-guided latent upscaling diffusion model. In addition to the textual input, it receives a noise_level as an input parameter, which can be used to add noise to the low-resolution input according to a predefined diffusion schedule.


import requests
from PIL import Image
from io import BytesIO
from diffusers import StableDiffusionUpscalePipeline
import torch

model_id = "stabilityai/stable-diffusion-x4-upscaler"
pipeline = StableDiffusionUpscalePipeline.from_pretrained(model_id, revision="fp16", torch_dtype=torch.float16)
pipeline = pipeline.to("cuda")

url = "https://huggingface.co/datasets/hf-internal-testing/diffusers-images/resolve/main/sd2-upscale/low_res_cat.png"
response = requests.get(url)
low_res_img = Image.open(BytesIO(response.content)).convert("RGB")
low_res_img = low_res_img.resize((128, 128))

prompt = "a white cat"
upscaled_image = pipeline(prompt=prompt, image=low_res_img).images[0]
upscaled_image.save("upsampled_cat.png")
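
The noise_level can be passed directly to the pipeline call. A hedged one-line example continuing the snippet above (the value 20 is just illustrative):

# add a controlled amount of noise to the low-res input before upscaling
upscaled_image = pipeline(prompt=prompt, image=low_res_img, noise_level=20).images[0]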

Saving & Loading is fixed for Versatile Diffusion

Previously there was a 🐛 when saving & loading Versatile Diffusion; this is now fixed so that memory-efficient saving & loading work as expected.

📝 Changelog

v0.8.1: Patch release

24 Nov 00:09

This patch release fixes an error with CLIPVisionModelWithProjection imports on a non-git transformers installation.

⚠️ Please upgrade with pip install --upgrade diffusers or pip install diffusers==0.8.1

v0.8.0: Versatile Diffusion - Text, Images and Variations All in One Diffusion Model

23 Nov 18:23

🙆‍♀️ New Models

VersatileDiffusion

VersatileDiffusion, released by SHI-Labs, is a unified multi-flow multimodal diffusion model capable of multiple tasks such as text-to-image generation, image variations, dual-guided (text + image) image generation, and image-to-text.

pip install git+https://github.com/huggingface/transformers

Then you can run:

from diffusers import VersatileDiffusionPipeline
import torch
import requests
from io import BytesIO
from PIL import Image

pipe = VersatileDiffusionPipeline.from_pretrained("shi-labs/versatile-diffusion", torch_dtype=torch.float16)
pipe = pipe.to("cuda")

# initial image
url = "https://huggingface.co/datasets/diffusers/images/resolve/main/benz.jpg"
response = requests.get(url)
image = Image.open(BytesIO(response.content)).convert("RGB")

# prompt
prompt = "a red car"

# text to image
image = pipe.text_to_image(prompt).images[0]

# image variation
image = pipe.image_variation(image).images[0]

# dual-guided generation (text + image)
image = pipe.dual_guided(prompt, image).images[0]

More in-depth details can be found in the documentation.

AltDiffusion

AltDiffusion is a multilingual latent diffusion model that supports text-to-image generation for 9 different languages: English, Chinese, Spanish, French, Japanese, Korean, Arabic, Russian and Italian.
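
A minimal hedged sketch of using it; the checkpoint id "BAAI/AltDiffusion-m9" is an assumption:

import torch
from diffusers import AltDiffusionPipeline

pipe = AltDiffusionPipeline.from_pretrained(
    "BAAI/AltDiffusion-m9", torch_dtype=torch.float16
).to("cuda")

# prompts can be written in any of the nine supported languages
image = pipe("黑暗精灵公主，非常详细，幻想艺术").images[0]
image.save("dark_elf.png")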

Stable Diffusion Image Variations

StableDiffusionImageVariationPipeline by @justinpinkney is a stable diffusion model that takes an image as an input and generates variations of that image. It is conditioned on CLIP image embeddings instead of text.
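
A hedged usage sketch; the checkpoint id below is @justinpinkney's release, and the input path is illustrative:

from PIL import Image
from diffusers import StableDiffusionImageVariationPipeline

pipe = StableDiffusionImageVariationPipeline.from_pretrained(
    "lambdalabs/sd-image-variations-diffusers"
).to("cuda")

init_image = Image.open("input.jpg").convert("RGB")  # any input photo
image = pipe(init_image, guidance_scale=3.0).images[0]
image.save("variation.png")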

Safe Latent Diffusion

Safe Latent Diffusion (SLD), released by ml-research@TUDarmstadt group, is a new practical and sophisticated approach to prevent unsolicited content from being generated by diffusion models. One of the authors of the research contributed their implementation to diffusers.
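
A hedged sketch; the pipeline class name and checkpoint id follow the upstream contribution and should be treated as assumptions:

from diffusers import StableDiffusionPipelineSafe

pipe = StableDiffusionPipelineSafe.from_pretrained("AIML-TUDA/stable-diffusion-safe").to("cuda")

# generation proceeds as usual; SLD steers the denoising away from unsafe concepts
image = pipe(prompt="portrait of a person, photorealistic").images[0]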

VQ-Diffusion with classifier-free sampling

  • VQ-Diffusion classifier-free sampling by @williamberman in #1294

LDM super resolution

LDM super resolution is a latent 4x super-resolution diffusion model released by CompVis.
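
A hedged sketch (the input path and 128x128 size are illustrative):

from PIL import Image
from diffusers import LDMSuperResolutionPipeline

pipe = LDMSuperResolutionPipeline.from_pretrained(
    "CompVis/ldm-super-resolution-4x-openimages"
).to("cuda")

low_res = Image.open("low_res.png").convert("RGB").resize((128, 128))
upscaled = pipe(low_res, num_inference_steps=100, eta=1).images[0]  # 4x: 512x512 output
upscaled.save("upscaled.png")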

CycleDiffusion

CycleDiffusion is a method that uses text-to-image diffusion models for image-to-image editing. It is capable of (see the usage sketch after the PR link below):

  1. Zero-shot image-to-image translation with text-to-image diffusion models such as Stable Diffusion.
  2. Traditional unpaired image-to-image translation with diffusion models trained on two related domains.
  • Add CycleDiffusion pipeline using Stable Diffusion by @ChenWu98 #888
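
A hedged usage sketch; argument names such as source_prompt and source_guidance_scale follow the merged pipeline, and the image path is illustrative:

from PIL import Image
from diffusers import CycleDiffusionPipeline, DDIMScheduler

# CycleDiffusion relies on DDIM inversion, so a DDIM scheduler is required
scheduler = DDIMScheduler.from_pretrained("CompVis/stable-diffusion-v1-4", subfolder="scheduler")
pipe = CycleDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", scheduler=scheduler
).to("cuda")

init_image = Image.open("horse.png").convert("RGB").resize((512, 512))
image = pipe(
    prompt="An astronaut riding an elephant",
    source_prompt="An astronaut riding a horse",  # describes the input image
    image=init_image,
    strength=0.8,
    guidance_scale=2,
    source_guidance_scale=1,
).images[0]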

CLIPSeg + StableDiffusionInpainting

Uses CLIPSeg to automatically generate a segmentation mask, then applies Stable Diffusion inpainting.

K-Diffusion wrapper

The K-Diffusion pipeline is a community pipeline that allows using any sampler from k-diffusion with diffusers models.

🌀 New SOTA Scheduler

DPMSolverMultistepScheduler is the 🧨 diffusers implementation of DPM-Solver++, a state-of-the-art scheduler that was contributed by one of the authors of the paper. This scheduler is able to achieve great quality in as few as 20 steps. It's a drop-in replacement for the default Stable Diffusion scheduler, so you can use it to essentially halve generation times. It works so well that we adopted it for the Stable Diffusion demo Spaces: https://huggingface.co/spaces/stabilityai/stable-diffusion, https://huggingface.co/spaces/runwayml/stable-diffusion-v1-5.

You can use it like this:

from diffusers import DiffusionPipeline, DPMSolverMultistepScheduler

repo_id = "runwayml/stable-diffusion-v1-5"
scheduler = DPMSolverMultistepScheduler.from_pretrained(repo_id, subfolder="scheduler")
stable_diffusion = DiffusionPipeline.from_pretrained(repo_id, scheduler=scheduler)
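
Continuing the snippet above, around 20 inference steps already give good results:

stable_diffusion = stable_diffusion.to("cuda")
image = stable_diffusion(
    "a photo of an astronaut riding a horse on mars", num_inference_steps=20
).images[0]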

🌐 Better scheduler API

The example above also demonstrates how to load schedulers using a new API that is coherent with model loading and therefore more natural and intuitive.

You can load a scheduler using from_pretrained, as demonstrated above, or you can instantiate one from an existing scheduler configuration. This is a way to replace the scheduler of a pipeline that was previously loaded:

from diffusers import DiffusionPipeline, EulerDiscreteScheduler

pipeline = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
pipeline.scheduler = EulerDiscreteScheduler.from_config(pipeline.scheduler.config)

Read more about these changes in the documentation. See also the community pipeline that allows using any of the K-diffusion samplers with diffusers, as mentioned above!

🎉 Performance

We work relentlessly to incorporate performance optimizations and memory-reduction techniques into 🧨 diffusers. These are two of the most noteworthy additions in this release:

  • Enable memory-efficient attention by default if xFormers is installed.
  • Use batched-matmuls when possible.

🎁 Quality of Life improvements

  • Fix/Enable all schedulers for in-painting
  • Easier loading of local pipelines
  • cpu offloading: multi-GPU support

📝 Changelog


v0.7.2: Patch release

05 Nov 21:45

This patch release fixes a bug that broke the Flax Stable Diffusion inference.
Thanks a million for spotting it @camenduru in #1145, and thanks a lot to @pcuenca and @kashif for fixing it in #1149.