Diffusion扩散模型学习4——Stable Diffusion原理解析-inpaint修复图片为例

[复制链接]
查看937 | 回复6 | 2023-8-16 15:43:41 | 显示全部楼层 |阅读模式
学习前言

Inpaint是Stable Diffusion中的常用方法,一起简单学习一下。

源码下载地址

https://github.com/bubbliiiing/stable-diffusion
喜欢的可以点个star噢。
原理解析

一、先验知识

txt2img的原理如博文
Diffusion扩散模型学习2——Stable Diffusion结构解析-以文本生成图像(文生图,txt2img)为例
img2img的原理如博文
Diffusion扩散模型学习3——Stable Diffusion结构解析-以图像生成图像(图生图,img2img)为例
二、什么是inpaint

Inpaint是一项图片修复技术,可以从图片上去除不必要的物体,让您轻松摆脱照片上的水印、划痕、污渍、标志等瑕疵。
一般来讲,图片的inpaint过程可以理解为两步:
1、找到图片中的需要重绘的部分,比如上述提到的水印、划痕、污渍、标志等
2、去掉水印、划痕、污渍、标志等,自动填充图片应该有的内容。
三、Stable Diffusion中的inpaint

Stable Diffusion中的inpaint的实现方式有两种:
1、开源的inpaint模型

参考链接:inpaint_st.py,该模型经过特定的训练。需要输入符合需求的图片才可以进行inpaint。
需要注意的是,该模型使用的config文件发生了改变,改为v1-inpainting-inference.yaml。其中最显著的区别就是unet_config的in_channels从4变成了9。相比于原来的4,我们增加了4+1(5)个通道的信息。

4+1(5)个通道的信息应该是什么呢?一个是被mask后的图像,对应其中的4;一个是mask的图像,对应其中的1。



  • 1、我们首先把图片中需要inpaint的部分给置为0,获得被mask后的图像,然后利用VAE编码,VAE输出通道为4,假设被mask的图像是[512, 512, 3],此时我们获得了一个[4, 64, 64]的隐含层特征,对应其中的4。
  • 2、然后需要对mask进行下采样,采样到和隐含层特征一样的高宽,即mask的shape为[1, 512, 512],利用下采样获得[1, 64, 64]的mask。本质上,我们获得了隐含层的mask
  • 3、然后我们将 下采样后的被mask的图像隐含层的mask 在通道上做一个堆叠,获得一个[5, 64, 64]的特征,然后将此特征与随机初始化的高斯噪声堆叠,则获得了上述图片中的9通道特征。
此后采样的过程与常规采样方式一样,全部采样完成后,使用VAE解码,获得inpaint后的图像。
可以感受到上述的方式必须基于一个已经训练好的unet模型,这要求训练者需要有足够的算力去完成这一个工作,对大众开发者而言并不友好。因此该方法很少在实际中得到使用。
2、基于base模型inpaint

如果我们必须训练一个inpaint模型才能对当前的模型进行inpaint,那就太麻烦了,有没有什么方法可以不需要训练就能inpaint呢?
诶诶,当然有哈。
Stable Diffusion就是一个生成模型,如果我们可以做到让Stable Diffusion只生成指定区域,并且在生成指定区域的时候参考其它区域,那么它自身便是一个天然的inpaint模型

如何做到这一点呢?我们需要结合img2img方法,我们首先考虑inpaint的两个输入:一个是原图,另外一个是mask图。
在img2img中,存在一个denoise参数,假设我们设置denoise数值为0.8,总步数为20步,那么我们会对输入图片进行0.8x20次的加噪声。如果我们可以在这个加噪声图片的基础上进行重建,那么网络必然会考虑加噪声图(也就对应了原始图片的特征)
在图像重建的20步中,对隐含层特征,我们利用mask将不重建的地方都替换成 原图按照当前步数加噪后的隐含层特征。此时不重建的地方特征都由输入图片决定。然后不替换需要重建的地方进行,利用unet计算噪声进行重建。
具体部分,可看下面的循环与代码,我已经标注出了 替换特征的地方,在这里mask等于1的地方保留原图,mask等于0的地方不断的重建。


  • 将原图x0映射到VAE隐空间,得到img_orig;
  • 初始化随机噪声img(也可以使用img_orig完全加噪后的噪声);
  • 开始循环:

    • 对于每一次时间步,根据时间步生成img_orig对应的噪声特征;
    • 一个是基于上个时间步降噪后得到的img,一个是基于原图得到的img_orig。通过mask将两者融合,                                                  i                                  m                                  g                                  =                                  i                                  m                                  g                                  _                                  o                                  r                                  i                                  g                                  ∗                                  m                                  a                                  s                                  k                                  +                                  (                                  1.0                                  −                                  m                                  a                                  s                                  k                                  )                                  ∗                                  i                                  m                                  g                                          img = img\_orig * mask + (1.0 - mask) * img                           img=img_orig∗mask+(1.0−mask)∗img。即,将原图中的非mask区域和噪声图中的mask区域进行融合,得到新的噪声图。
    • 然后继续去噪声直到结束。

由于该方法不需要训练新模型,并且重建效果也不错,所以该方法比较通用。
  1. for i, step in enumerate(iterator):
  2.     # index是用来取得对应的调节参数的
  3.     index   = total_steps - i - 1
  4.     # 将步数拓展到bs维度
  5.     ts      = torch.full((b,), step, device=device, dtype=torch.long)
  6.     # --------------------------------------------------------------------------------- #
  7.     #   替换特征的地方
  8.     #   用于进行局部的重建,对部分区域的隐向量进行mask。
  9.     #   对传入unet前的隐含层特征,我们利用mask将不重建的地方都替换成 原图加噪后的隐含层特征
  10.     #   self.model.q_sample用于对输入图片进行ts步数的加噪
  11.     # --------------------------------------------------------------------------------- #
  12.     if mask is not None:
  13.         assert x0 is not None
  14.         img_orig = self.model.q_sample(x0, ts)  # TODO: deterministic forward pass?
  15.         img = img_orig * mask + (1. - mask) * img
  16.     # 进行采样
  17.     outs = self.p_sample_ddim(img, cond, ts, index=index, use_original_steps=ddim_use_original_steps,
  18.                                 quantize_denoised=quantize_denoised, temperature=temperature,
  19.                                 noise_dropout=noise_dropout, score_corrector=score_corrector,
  20.                                 corrector_kwargs=corrector_kwargs,
  21.                                 unconditional_guidance_scale=unconditional_guidance_scale,
  22.                                 unconditional_conditioning=unconditional_conditioning)
  23.     img, pred_x0 = outs
  24.     # 回调函数
  25.     if callback: callback(i)
  26.     if img_callback: img_callback(pred_x0, i)
  27.     if index % log_every_t == 0 or index == total_steps - 1:
  28.         intermediates['x_inter'].append(img)
  29.         intermediates['pred_x0'].append(pred_x0)
复制代码
四、inpaint流程

根据通用性,本文主要以上述提到的基于base模型inpaint进行解析。
1、输入图片到隐空间的编码


inpaint技术衍生于图生图技术,所以同样需要指定一张参考的图像,然后在这个参考图像上开始工作。
利用VAE编码器对这张参考图像进行编码,使其进入隐空间,只有进入了隐空间,网络才知道这个图像是什么
此时我们便获得在隐空间的图像,后续会在这个 隐空间加噪后的图像 的基础上进行采样。
2、文本编码


文本编码的思路比较简单,直接使用CLIP的文本编码器进行编码就可以了,在代码中定义了一个FrozenCLIPEmbedder类别,使用了transformers库的CLIPTokenizer和CLIPTextModel。
在前传过程中,我们对输入进来的文本首先利用CLIPTokenizer进行编码,然后使用CLIPTextModel进行特征提取,通过FrozenCLIPEmbedder,我们可以获得一个[batch_size, 77, 768]的特征向量。
  1. class FrozenCLIPEmbedder(AbstractEncoder):
  2.     """Uses the CLIP transformer encoder for text (from huggingface)"""
  3.     LAYERS = [
  4.         "last",
  5.         "pooled",
  6.         "hidden"
  7.     ]
  8.     def __init__(self, version="openai/clip-vit-large-patch14", device="cuda", max_length=77,
  9.                  freeze=True, layer="last", layer_idx=None):  # clip-vit-base-patch32
  10.         super().__init__()
  11.         assert layer in self.LAYERS
  12.         # 定义文本的tokenizer和transformer
  13.         self.tokenizer      = CLIPTokenizer.from_pretrained(version)
  14.         self.transformer    = CLIPTextModel.from_pretrained(version)
  15.         self.device         = device
  16.         self.max_length     = max_length
  17.         # 冻结模型参数
  18.         if freeze:
  19.             self.freeze()
  20.         self.layer = layer
  21.         self.layer_idx = layer_idx
  22.         if layer == "hidden":
  23.             assert layer_idx is not None
  24.             assert 0 <= abs(layer_idx) <= 12
  25.     def freeze(self):
  26.         self.transformer = self.transformer.eval()
  27.         # self.train = disabled_train
  28.         for param in self.parameters():
  29.             param.requires_grad = False
  30.     def forward(self, text):
  31.         # 对输入的图片进行分词并编码,padding直接padding到77的长度。
  32.         batch_encoding  = self.tokenizer(text, truncation=True, max_length=self.max_length, return_length=True,
  33.                                         return_overflowing_tokens=False, padding="max_length", return_tensors="pt")
  34.         # 拿出input_ids然后传入transformer进行特征提取。
  35.         tokens          = batch_encoding["input_ids"].to(self.device)
  36.         outputs         = self.transformer(input_ids=tokens, output_hidden_states=self.layer=="hidden")
  37.         # 取出所有的token
  38.         if self.layer == "last":
  39.             z = outputs.last_hidden_state
  40.         elif self.layer == "pooled":
  41.             z = outputs.pooler_output[:, None, :]
  42.         else:
  43.             z = outputs.hidden_states[self.layer_idx]
  44.         return z
  45.     def encode(self, text):
  46.         return self(text)
复制代码
来源:https://blog.csdn.net/weixin_44791964/article/details/131997973
免责声明:如果侵犯了您的权益,请联系站长,我们会及时删除侵权内容,谢谢合作!
回复

使用道具 举报

FrankJScott | 2025-5-6 21:32:14 | 显示全部楼层

Top Rated Tajir4D Link Alternatif Info

To the guy inquiring about situs situs slot terbaik, game slot game slot, games slot, slot agen toto, situs game slot online terbaik, situs 4d, daftar agen togel resmi, 5 situs, judi online terbaru, situs judi casino online terpercaya,  I highly suggest this new tajir4d details or situs judi casino, situs tembak ikan, menang slot online, judi slot online, situs slot rupiah, situs togel paling aman dan terpercaya, bandar togel online terpercaya, situs slot terpercaya di indonesia, deposit dengan pulsa, antara slot login, on top of this read more about login tajir4d info alongside all situs slot online terbaru, pemain slot online, situs slot game, situs togel 2d, situs web judi, permainan joker slot, slot play online, slots terbaru, toto slot online, menang togel, alongside all this my explanation about login tajir4d info which is also great. Also, have a look at this inquiry on login tajir4d blog and don't forget slot judi, pragmatic menang slot online, website judi terpercaya, nama situs judi slot, judi live casino, web judi online, dan terpercaya, game slot joker gaming, bandar judi online, apa saja game online, on top of this discover more here on tajir4d link alternatif info not to mention situs judi terpercaya di indonesia, permainan slot yang mudah menang, apa itu pragmatic slot, game slot game slot, live main slot,  read review about as well as hk siang, situs yang gampang menang, situs judi slot online terpercaya indonesia, deposit via, web judi slot online,  for good measure. Check more @ Updated TAJIR4D Site d047ad4
回复

使用道具 举报

FrankJScott | 2025-5-7 02:07:49 | 显示全部楼层

Excellent Laser Treatment In My Area Guide

For the guy talking about skin care injections, o shot what is it, co2 treatment, filler face lift, laser face lift treatment, juvederm lip augmentation, microneedling skin tightening, skin tightening threads, cosmetic spa, non invasive treatments for facial rejuvenation,  I highly recommend this top article on laser treatment skin tightening site or best non surgical treatment for hair loss, medical microneedling, non surgical facial, filler rha, led red light facial, skin care therapy, best face skin tightening treatment, skin care services, faces skin, body facials, alongside all this read what he said on melasma treatment Sarasota blog and don't forget prp facial rejuvenation, laser skin rejuvenation treatment, the medical spa, laser face rejuvenation, skin lifting facial, red light therapy for skin care, skin tightening non invasive, med spa skin tightening, facial rejuvenation injections, skin facial, not forgetting sites such as this these details about laser treatment skin tightening info which is also great. Also, have a look at this my review here for pigmentation treatment Sarasota blog not to mention red light light therapy, med spa lip filler, facial treatment, lip flip lip filler, skin rejuvenation treatment laser, be skin care, derma perfect, the best non invasive face lift, skin rejuvenation procedures, best dermal fillers, not to mention this over here about laser facial skin treatment site as well as spa sarasota, lip augmentation, laser skin, dermal fillers on face, prx skin,  more hints for alongside all derma peel facial, treatment cellulite, best skin chemical peel, best lip augmentation, red light for skin therapy,  for good measure. Check more @ Best Tajir4D Link Alternatif Info 043b3fd
回复

使用道具 举报

FrankJScott | 2025-5-20 08:46:54 | 显示全部楼层

Top Rated AI Stock Predictions Analysis Website

To the man inquiring about ai day trading, best ai shares, shares in artificial intelligence, best tools for investing in stocks, best free software for stock market analysis, ai trading software free, ai stock to buy, ai technology shares, stock market analytics tools, stock market for free,  I highly suggest this what google did to me for AI stock predictions analysis blog or best way to invest in ai, stock ai bot, machine learning stocks, ai based stock prediction, stock price ai, ai stock advisor, stock analysis service, stock market study, ai stock symbol, stocks dealing with artificial intelligence, together with this article source for AI stock trading analysis info on top of stock market analysis, best stock analysis tools, ai powered stock trading, share trading tools, best trading ai, best stock analysis software free, stocks with ai, investing in open ai, best stock analysis sites, best app to research stocks, as well as this official statement on AI stock picker analysis site which is also great. Also, have a look at this learn more on AI stock predictions analysis tips which is worth considering with ai shares to buy, ai stock analysis software, best tools for investing in stocks, free stock research, best ai stocks to buy right now, best platform for stock analysis, using ai to invest, stock picking software, buy shares in chat gpt, stock research software, which is worth considering with this top article about AI stock trading analysis tips as well as open ai company stock, best stock analysis app, using ai to pick stocks, stocks and investing, top stock analysis apps,  breaking news for alongside all open ai stock, ai companies in the stock market, stocks for ai, stock prediction software free, best ai stocks,  for good measure. Check more @ Top AI Stock Investing Analysis Guide d40_ae2
回复

使用道具 举报

FrankJScott | 2025-5-26 23:15:09 | 显示全部楼层

Awesome DVL TOTO Blog

In reply to the lady asking about joker slot link alternatif, judi slot online jackpot terbesar, game slot terpercaya, online judi, situs slot online terbaik 2021, data togel slot login, toto link alternatif, slot situs terbaru, joker joker slot, situs togel bonus besar,  I highly recommend this get the facts on DVL TOTO blog or judi tembak ikan online, slot pg soft, agen togel slot terpercaya, link judi slot online, situs judi slot online singapore, ini slot, agen resmi togel terpercaya, slot uang, situs terbaik judi slot, slot online terbaik, together with this best DVL TOTO tips not forgetting sites such as prediksi angka togel, judi bola, slot terbaik, link judi togel terpercaya, situs judi online slot, 4d slot login, situs judi game slot terbaik, situs judi slot resmi, data togel slot login, judi mesin slot online, which is worth considering with this helpful resources for DVL TOTO url which is also great. Also, have a look at this new DVL TOTO blog which is worth considering with web togel terbaik, situs togel bonus besar, link slot yang mudah menang, togel online daftar, situs judi slot online terbaik dan terpercaya, togel slot, sgp slot link alternatif, situs resmi web bandar judi togel, situs judi game slot online, slot game baru, together with this their explanation on DVLTOTO blog not to mention situs judi online terpercaya, slot online terpercaya, slot microgaming, contoh judi slot online, sbo slot,  more info for on top of judi slot terbaik dan terpercaya, permainan slot uang asli, game judi online, situs togel terbesar, bandar judi toto,  for good measure. Check more @ Excellent Gracianna Winery Info e61d047
回复

使用道具 举报

FrankJScott | 2025-5-27 09:17:09 | 显示全部楼层

New DVL TOTO Guide

In reply to the guy talking about situs togel terpercaya indonesia, permainan slot uang asli, situs judi online slot, slot game terlengkap, link slot uang asli, joker online slot, pasaran togel slot, link pragmatic play, cara main togel online, sbo slot,  I highly suggest this top rated DVLTOTO link or bandar togel terbesar indonesia, togel link, main casino online, situs web slot terpercaya, sbobet casino, slot play, togel bonus terbesar, slot online pg, agen togel slot terpercaya, situs togel online resmi, and don't forget this clicking here on DVL TOTO info which is worth considering with situs terbaru slot, game 4d slot login, slot online situs, jenis judi slot, judi slot terpercaya, situs slot online, contoh permainan slot online, situs judi slot terbesar di dunia, slot 88 link alternatif, agen togel singapore, bearing in mind this learn more about DVLTOTO info which is also great. Also, have a look at this cool DVL TOTO details together with game judi online slot, situs judi online terlengkap, slot play, game pg soft terbaik, game yang lain, situs togel pasaran hongkong terpercaya, game judi tembak ikan online, situs togel 24 jam, web togel terbaik, slot togel online, bearing in mind this my website for DVLTOTO forum on top of daftar provider slot online, bandar togel terbesar terpercaya, game slot88 online, situs alternatif, joker123 gaming online,  her comment is here on bearing in mind situs judi indonesia terpercaya, nama situs judi slot online, sbobet alternatif, situs judi slot online 4d, provider slot terbaik,  for good measure. Check more @ Great Laser Facial Skin Treatment Info 0e61d04
回复

使用道具 举报

FrankJScott | 2025-5-28 00:54:41 | 显示全部楼层

Top Rated House Cleaning Service Site

In response to the guy asking about house & home rugs, over rugs, quality rugs, project carpet, carpet design photos, best place for rugs, rugs and rugs, carpet facebook, rug and home, rug shops in warrington,  I highly recommend this top rated house cleaning service forum or mansion carpet, rug size small living room, carpet for backyard, carpets designs and colors, wall design carpet, best hand tufted rugs, a living room rug, carpet for your room, area rug with pattern, stores with area rugs near me, bearing in mind this my latest blog post on house cleaning service link together with easy carpet design, rug design software, handcrafted carpets, rug in carpet room, rug styles for living room, difference between hand knotted and hand tufted, shop carpet design, get carpet made into rug, custom shop rugs, luxury living room rugs, not to mention this this post for house cleaning service forum which is also great. Also, have a look at this recommended house cleaning service blog not forgetting sites such as rug for carpeted living room, great rugs for living room, making custom rugs, rugs image, designer carpet flooring, knotted hand, hand stitched rugs, area area rugs, floor carpet design for living room, best area rugs for bedrooms, not forgetting sites such as this helpful site about house cleaning service info bearing in mind hand rug, rugs advertised on instagram, custom color area rugs, beige and burgundy area rugs, photo carpet,  on front page for not to mention floor with rug, rug in carpeted living room, carpet for rugs, area rugs with, carpets and rugs market,  for good measure. Check more @ Top Rated Tajir4D Site 61d047a
回复

使用道具 举报

您需要登录后才可以回帖 登录 | 立即注册

本版积分规则