
[Half precision] Make sure half-precision is correct #182

Merged: 8 commits merged into main from correct_some_stuff on Aug 16, 2022

Conversation

@patrickvonplaten (Contributor) commented Aug 15, 2022

This PR fixes a couple of things so that stable diffusion gives 1-to-1 the same results as the original implementation.

Changes include:

  • Make sure group norm is always run in fp32 (a minimal sketch follows this list)
  • Change the ordering in which time embeddings are computed -> for some reason this makes a big difference for stable diffusion, but apparently not really for other models
  • Make sure time embeddings are expanded if batch size > 1
  • Adapt the DDIM sampler to be 1-to-1 compatible with stable diffusion
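
As a rough illustration of the first and third bullets, a minimal sketch (FP32GroupNorm and the shapes are illustrative, not the PR's actual diff):

import torch
import torch.nn as nn
import torch.nn.functional as F

class FP32GroupNorm(nn.GroupNorm):
    """Group norm whose statistics are always computed in fp32."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Upcast the input and the affine parameters to fp32, normalize,
        # then cast back so the surrounding half-precision graph is unaffected.
        out = F.group_norm(
            x.float(),
            self.num_groups,
            self.weight.float() if self.weight is not None else None,
            self.bias.float() if self.bias is not None else None,
            self.eps,
        )
        return out.type(x.dtype)

# Expanding a scalar timestep to the batch dimension before embedding it:
timesteps = torch.tensor([999])  # shape (1,)
batch_size = 4
if timesteps.shape[0] != batch_size:
    timesteps = timesteps.expand(batch_size)  # shape (4,)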

Both DDIM and PNDM work with the following script:

#!/usr/bin/env python3
import os

from diffusers import StableDiffusionPipeline, DDIMScheduler
from time import time
from PIL import Image
from einops import rearrange
import numpy as np
import torch
from torch import autocast
from torchvision.utils import make_grid

torch.manual_seed(42)

prompt = "a photograph of an astronaut riding a horse"
#prompt = "a photograph of the eiffel tower on the moon"
#prompt = "an oil painting of a futuristic forest gives"

pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-3-diffusers", use_auth_token=True)  # make sure you're logged in with `huggingface-cli login`
pipe = pipe.to("cuda")  # move the pipeline to the GPU so that autocast("cuda") below actually applies

# uncomment to use DDIM
#scheduler = DDIMScheduler(beta_start=0.00085, beta_end=0.012, beta_schedule="scaled_linear", clip_sample=False, set_alpha_to_one=False)
#pipe.scheduler = scheduler

all_images = []
num_rows = 1
num_columns = 4
for _ in range(num_rows):
    with autocast("cuda"):
        images = pipe(num_columns * [prompt], guidance_scale=7.5, output_type="np")["sample"]  # with output_type="np" the pipeline returns numpy arrays in [0, 1], not PIL images
        all_images.append(torch.from_numpy(images))

# additionally, save as grid
grid = torch.stack(all_images, 0)
grid = rearrange(grid, 'n b h w c -> (n b) h w c')
grid = rearrange(grid, 'n h w c -> n c h w')
grid = make_grid(grid, nrow=num_rows)

# to image
grid = 255. * rearrange(grid, 'c h w -> h w c').cpu().numpy()
image = Image.fromarray(grid.astype(np.uint8))

os.makedirs("./images/diffusers", exist_ok=True)  # make sure the output directory exists
image.save(f"./images/diffusers/{'_'.join(prompt.split())}_{round(time())}.png")


@patrickvonplaten (Contributor, Author):

The script above gives the following results.

For PNDM:
[image: a_photograph_of_an_astronaut_riding_a_horse_1660583008]

And for DDIM:
[image: a_photograph_of_an_astronaut_riding_a_horse_1660582465]

@patrickvonplaten (Contributor, Author) commented Aug 15, 2022

The original code base (https://github.com/CompVis/stable-diffusion) gives, with the following command:

CUDA_VISIBLE_DEVICES="0" python scripts/txt2img.py --prompt "a photograph of an astronaut riding a horse" --n_samples 4 --n_iter 1 --fixed_code --plms

for PNDM:
[image: grid-0006]

and

CUDA_VISIBLE_DEVICES="0" python scripts/txt2img.py --prompt "a photograph of an astronaut riding a horse" --n_samples 4 --n_iter 1 --fixed_code

for DDIM:
[image: grid-0005]

@patil-suraj (Contributor) left a comment:

Thanks a lot for fixing these sneaky bugs! Looks good to me. Left some nits.

Just one question: why do we need the negative_alpha_cumprod?

Comment on lines +53 to +59
text_input = self.tokenizer(
    prompt,
    padding="max_length",
    max_length=self.tokenizer.model_max_length,
    truncation=True,
    return_tensors="pt",
)
@patil-suraj (Contributor):

This is important here to always pad to max_length, as that's how the model was trained.

@patrickvonplaten (Contributor, Author) replied on Aug 15, 2022:

Yes, agreed, but let's make sure not to do this when we create our text-to-image training script (it's definitely cleaner to mask out padding tokens, and it should help the model learn better, as Katherine also stated on Slack).
cc @anton-l
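
For reference, a hypothetical sketch of the masking variant discussed above for such a training script (tokenizer and text_encoder are placeholder names, not actual diffusers code):

# Hypothetical sketch: still pad to max_length, but forward the attention
# mask so padding tokens can be masked out during training.
text_input = tokenizer(
    prompt,
    padding="max_length",
    max_length=tokenizer.model_max_length,
    truncation=True,
    return_tensors="pt",
)
text_embeddings = text_encoder(
    text_input.input_ids,
    attention_mask=text_input.attention_mask,
)[0]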

@@ -75,7 +76,7 @@ def __init__(

        self.alphas = 1.0 - self.betas
        self.alphas_cumprod = np.cumprod(self.alphas, axis=0)
        self.one = np.array(1.0)
        self.negative_alpha_cumprod = np.array(1.0) if do_neg_alpha_one else self.alphas_cumprod[0]
@patil-suraj (Contributor):

I'm not sure I understand why this is needed. Could you maybe add a comment here explaining why we added it, since this is True by default? Does it affect other pipelines?

@patrickvonplaten (Contributor, Author):

Yeah, maybe we need a better name here indeed. Let me know if you have better ideas.
In short, the story is the following:

For every step we need to know the previous step. Only at step t=0 there is no previous step -> so what should we do?
The default here now, and what was used before (no breaking change), is that we just set it to 1. But DDIM in stable diffusion instead uses the highest value that exists in the cumulative products of the alphas (i.e. alphas_cumprod[0]), which is often smaller than one. This can actually make quite a difference in the end.
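
A minimal numeric sketch of the two options (beta values match the DDIMScheduler kwargs in the script above; variable names are illustrative):

import numpy as np

# "scaled_linear" schedule with beta_start=0.00085, beta_end=0.012
betas = np.linspace(0.00085 ** 0.5, 0.012 ** 0.5, 1000) ** 2
alphas_cumprod = np.cumprod(1.0 - betas)

# Option 1 (previous diffusers default): pretend the step before t=0 is fully denoised.
final_alpha_option_1 = np.array(1.0)

# Option 2 (stable diffusion's DDIM): reuse the first, i.e. largest, cumulative
# product, which is slightly smaller than one.
final_alpha_option_2 = alphas_cumprod[0]  # 1 - 0.00085 = 0.99915 for this schedule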

@patrickvonplaten (Contributor, Author):

OK, I changed it to clip_alpha_at_one - I think the naming is a bit better, no? Wdyt?

Member:

set_alpha_to_one, since it's not necessarily clipping it? (Clipping, to me, suggests that we perform the cumulative product and then make sure it's not larger than 1).

@patrickvonplaten (Contributor, Author):

Changing that real quick.
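
For context, this is how the renamed kwarg ends up being passed (matching the commented-out DDIM lines in the script above):

from diffusers import DDIMScheduler

# set_alpha_to_one=False makes the scheduler fall back to alphas_cumprod[0]
# instead of 1.0 at the final step, as stable diffusion's DDIM sampler does.
scheduler = DDIMScheduler(
    beta_start=0.00085,
    beta_end=0.012,
    beta_schedule="scaled_linear",
    clip_sample=False,
    set_alpha_to_one=False,
)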

@anton-l (Member) left a comment:

Great fixes! Will report LMS test results once this is merged :)

@patrickvonplaten (Contributor, Author):

Merging! Now looking into PNDM.

@patrickvonplaten (Contributor, Author):

Adapted tests to run locally only on GPU with autocast - careful: test results might differ depending on the machine.
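
A hedged sketch of what such a GPU-gated autocast test can look like (test name and body are illustrative, not the PR's actual tests):

import pytest
import torch

@pytest.mark.skipif(not torch.cuda.is_available(), reason="autocast fp16 tests need a GPU")
def test_half_precision_output():
    with torch.autocast("cuda"):
        # run the pipeline and compare against reference values; results may
        # differ slightly across GPU types, hence the warning above
        ...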

@patrickvonplaten merged commit 051b346 into main on Aug 16, 2022
@patrickvonplaten deleted the correct_some_stuff branch on August 16, 2022 at 08:42
@patil-suraj mentioned this pull request on Sep 6, 2022
PhaneeshB pushed a commit to nod-ai/diffusers that referenced this pull request on Mar 1, 2023 (…e#182):

* Change tflite test from sharkimporter -> sharkdownloader
* xfail all uint/int tflite sharkdownloader tests
yoonseokjin pushed a commit to yoonseokjin/diffusers that referenced this pull request on Dec 25, 2023:
* [Half precision] Make sure half-precision is correct

* Update src/diffusers/models/unet_2d.py

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py

* correct some tests

* Apply suggestions from code review

Co-authored-by: Suraj Patil <[email protected]>

* finalize

* finish

Co-authored-by: Suraj Patil <[email protected]>