Fine-tuning DreamBooth LoRA with PPDiffusers to Generate Chinese Landscape Painting Style [livingbody/Chinese_ShanShui_Style]

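This tutorial walks through the whole workflow in two parts.

1. Preparation
- 1.1 Environment setup
- 1.2 Hugging Face registration and login
2. Training
- 2.1 Upload images
- 2.2 Adjust training parameters
- 2.3 Upload the chosen weights to Hugging Face
- 2.4 Generate another image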

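1. Preparation

1.1 Environment setup

Before starting, set up the required environment by running the command below to install the dependencies. To make sure the installation takes effect, restart the kernel once it finishes. (Note: this only needs to be run once.)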

pip install "paddlenlp>=2.5.2" "ppdiffusers>=0.11.1" safetensors --user

# Run this cell to install the required dependencies!
!pip install "paddlenlp>=2.5.2" safetensors "ppdiffusers>=0.11.1" --user
from IPython.display import clear_output
clear_output() # Clear the long installation output
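
After restarting the kernel, it can be worth confirming which versions were actually installed. The following is a minimal sketch that uses only the Python standard library; the helper name `installed_version` is made up for this illustration and is not part of PPDiffusers or PaddleNLP.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Packages installed by the pip command in this section.
for name in ("paddlenlp", "ppdiffusers", "safetensors"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

If any package is reported as not installed, rerun the install cell and restart the kernel again.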

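1.2 Hugging Face registration and login

This task requires uploading the model to Hugging Face, so you first need to register and log in.

- Register and log in: huggingface.co/join
- Get a login token
- Log in to the Hugging Face Hub from AI Studio

Tip: To make it easier to upload the weights later, log in to the Hugging Face Hub now. See the official documentation for more details.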

!git config --global credential.helper store
from huggingface_hub import login
login()

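2. Training the model and uploading it to Hugging Face

The dataset consists of Chinese landscape paintings.

2.1 Upload images

# Unzip the dataset
!unzip -qoa data/data107231/Chinese_art_dataset.zip -d Chinese_art_dataset

# Create the training folder first; cp fails with several source files if the target folder does not exist
!mkdir -p train_dataset
!cp Chinese_art_dataset/Chinese_art_dataset/style_images/shanshui*  train_dataset/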
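
Before training, it helps to check that the images really landed in the training folder. The sketch below counts image files using the standard library only; the function name `list_training_images` and the set of accepted suffixes are assumptions made for this example.

```python
from pathlib import Path

# Assumed set of image suffixes; adjust it to match your own dataset.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp", ".webp"}


def list_training_images(folder):
    """Return the image files directly inside `folder`, sorted by name."""
    root = Path(folder)
    if not root.is_dir():
        return []
    return sorted(
        p for p in root.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


images = list_training_images("train_dataset")
print(f"Found {len(images)} training images")
```

An empty result usually means the copy step failed or the folder name does not match instance_data_dir.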

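2.2 Adjust training parameters

During training you can try changing the default parameters. Some of them are introduced below, grouped by purpose.

Main parameters:

pretrained_model_name_or_path: the name of the model to train, or a local path to it, for example "runwayml/stable-diffusion-v1-5". See the PaddleNLP documentation for more models.

instance_data_dir: the folder containing the training images. You can upload images to your AI Studio project.

instance_prompt: the prompt text used for training.

resolution: the image resolution used during training; 512 is recommended.

output_dir: the directory where the model is saved during training.

checkpointing_steps: how often, in steps, the model is saved. The default is 100.

learning_rate: the learning rate used for training. LoRA training needs a larger learning rate, so 1e-4 is used here instead of 2e-6.

max_train_steps: the maximum number of training steps. The default is 500.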
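
The interaction between checkpointing_steps and max_train_steps can be sketched as follows. The `checkpoint-{step}` folder naming is taken from the training log shown in this tutorial; the helper `expected_checkpoints` itself is hypothetical and only illustrates the arithmetic.

```python
from pathlib import Path


def expected_checkpoints(output_dir, max_train_steps, checkpointing_steps):
    """List the checkpoint folders a run would write, one every `checkpointing_steps` steps."""
    steps = range(checkpointing_steps, max_train_steps + 1, checkpointing_steps)
    return [Path(output_dir) / f"checkpoint-{step}" for step in steps]


# Values used by the training command in this tutorial.
for path in expected_checkpoints("lora_outputs", max_train_steps=800, checkpointing_steps=100):
    print(path)
```

With 800 steps and a checkpoint every 100 steps, this yields eight folders, from checkpoint-100 to checkpoint-800.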

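Optional parameters:

train_batch_size: the batch size used for training. Increase it if your GPU has plenty of memory. The default is 4.

gradient_accumulation_steps: the number of gradient accumulation steps. If GPU memory is limited but you still want to simulate a large batch, increase this value. The default is 1.

seed: the random seed. Setting it makes training results reproducible.

lora_rank: the rank of the LoRA layers. The default is 4, which yields a final model of about 3.5 MB. You can change this value, for example to 32, 64, 128, or 256; larger ranks produce larger weight files.

lr_scheduler: the learning rate schedule, such as "linear", "constant", or "cosine".

lr_warmup_steps: the number of steps needed to warm up to the maximum learning rate before the decay begins.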
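
To make lr_scheduler and lr_warmup_steps more concrete, here is a simplified sketch of how the learning rate could evolve under each option. It illustrates the general shape of these schedules under the assumption of a linear warmup; it is not the PPDiffusers implementation, and `lr_at_step` is a name invented for this example.

```python
import math


def lr_at_step(step, base_lr, total_steps, warmup_steps=0, schedule="constant"):
    """Learning rate at a 0-indexed `step`: linear warmup, then the chosen schedule."""
    if warmup_steps > 0 and step < warmup_steps:
        return base_lr * step / warmup_steps
    if schedule == "constant":
        return base_lr
    # Fraction of the post-warmup phase that has elapsed, clamped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(max(progress, 0.0), 1.0)
    if schedule == "linear":
        return base_lr * (1.0 - progress)
    if schedule == "cosine":
        return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
    raise ValueError(f"unknown schedule: {schedule}")


# Settings from this tutorial: constant schedule, no warmup, 800 steps.
for step in (0, 400, 799):
    print(step, lr_at_step(step, base_lr=1e-4, total_steps=800))
```

With the "constant" schedule and zero warmup steps, the learning rate stays at 1e-4 throughout, which agrees with the lr=0.0001 shown in the training progress bar.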

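Parameters used for evaluation during training:

num_validation_images: how many images to generate at each evaluation. The default is 4.

validation_prompt: the prompt text used to evaluate how training is going.

validation_steps: how often, in steps, the model is evaluated. The training progress bar shows the current step.

Tip: every validation_steps steps, the generated images are saved to {your output directory}/validation_images/{step}.jpg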
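
To review the validation images in training order, the saved files can be collected by step number. This is a small sketch based on the path pattern given in the tip above; the function name `validation_images` is made up for illustration.

```python
from pathlib import Path


def validation_images(output_dir):
    """Return (step, path) pairs for saved validation images, ordered by step."""
    folder = Path(output_dir) / "validation_images"
    if not folder.is_dir():
        return []
    found = []
    for path in folder.glob("*.jpg"):
        # Keep only files named after a step number, such as 100.jpg.
        if path.stem.isdigit():
            found.append((int(path.stem), path))
    return sorted(found)


for step, path in validation_images("lora_outputs"):
    print(f"step {step}: {path}")
```

Sorting by the integer step avoids the string ordering problem where 1000.jpg would come before 200.jpg.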

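Parameters for uploading weights:

push_to_hub: whether to upload the model to the Hugging Face Hub. The default is False.

hub_token: the token used to upload to the Hugging Face Hub. If you have already logged in, you can leave it empty.

hub_model_id: the name of the model repo on the Hugging Face Hub. If it is None, the name of output_dir is used as the repo name.

Since we have already logged in, push_to_hub could be enabled so that the final trained model is uploaded to huggingface.co automatically. The command below leaves push_to_hub set to False; the weights are uploaded manually in section 2.3 instead.

With push_to_hub enabled, the weights are uploaded automatically once the program finishes, to huggingface.co/{your username}/{… , for example: huggingface.co/junnyu/lora…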
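
The fallback described for hub_model_id can be pictured with a few lines of Python. This sketch only mirrors the description above and is an assumption about the behavior, not code from the training script; `resolve_repo_name` is a hypothetical helper.

```python
from pathlib import Path


def resolve_repo_name(hub_model_id, output_dir):
    """Use `hub_model_id` when given, otherwise the last component of `output_dir`."""
    if hub_model_id:
        return hub_model_id
    return Path(output_dir).resolve().name


print(resolve_repo_name(None, "lora_outputs"))
print(resolve_repo_name("Chinese_ShanShui_Style", "lora_outputs"))
```

In practice this means a descriptive output_dir name doubles as a sensible repo name when hub_model_id is left unset.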

!python train_dreambooth_lora.py \
    --pretrained_model_name_or_path="runwayml/stable-diffusion-v1-5"  \
    --instance_data_dir="train_dataset" \
    --output_dir="lora_outputs" \
    --instance_prompt="Chinese_ShanShui_Style" \
    --resolution=512 \
    --train_batch_size=2 \
    --gradient_accumulation_steps=1 \
    --checkpointing_steps=100 \
    --learning_rate=1e-4 \
    --lr_scheduler="constant" \
    --lr_warmup_steps=0 \
    --max_train_steps=800 \
    --seed=0 \
    --lora_rank=4 \
    --push_to_hub=False \
    --validation_prompt="A little black cat is playing in the woods with Chinese_ShanShui_Style" \
    --validation_steps=100 \
    --num_validation_images=4

W0323 16:10:06.002939  5675 gpu_resources.cc:85] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.2, Runtime API Version: 11.2
W0323 16:10:06.007860  5675 gpu_resources.cc:115] device: 0, cuDNN Version: 8.2.
Downloading model weights, please wait..........
[2023-03-23 16:10:08,262] [ WARNING] - You are using a model of type clip_text_model to instantiate a model of type . This is not supported for all configurations of models and can yield errors.
Train Steps:  12%|█       | 100/800 [00:57<06:29,  1.80it/s, epoch=0016, lr=0.0001, step_loss=0.261]
 Saved lora weights to lora_outputs/checkpoint-100
100%|███████████████████████████████████████████| 601/601 [00:00<00:00, 171kB/s]
100%|███████████████████████████████████████████| 342/342 [00:00<00:00, 113kB/s]
Train Steps:  25%|█▊     | 200/800 [02:16<05:41,  1.76it/s, epoch=0033, lr=0.0001, step_loss=0.0311]
 Saved lora weights to lora_outputs/checkpoint-200
Train Steps:  38%|███     | 300/800 [03:35<04:38,  1.80it/s, epoch=0049, lr=0.0001, step_loss=0.113]
 Saved lora weights to lora_outputs/checkpoint-300
Train Steps:  50%|████    | 400/800 [04:53<03:44,  1.78it/s, epoch=0066, lr=0.0001, step_loss=0.118]
 Saved lora weights to lora_outputs/checkpoint-400
Train Steps:  62%|█████   | 500/800 [06:11<02:50,  1.76it/s, epoch=0083, lr=0.0001, step_loss=0.167]
 Saved lora weights to lora_outputs/checkpoint-500
Train Steps:  75%|██████▊  | 600/800 [07:30<01:52,  1.78it/s, epoch=0099, lr=0.0001, step_loss=0.11]
 Saved lora weights to lora_outputs/checkpoint-600
Train Steps:  88%|█████▎| 700/800 [08:49<00:56,  1.78it/s, epoch=0116, lr=0.0001, step_loss=0.00746]
 Saved lora weights to lora_outputs/checkpoint-700
Train Steps: 100%|███████| 800/800 [10:08<00:00,  1.74it/s, epoch=0133, lr=0.0001, step_loss=0.0411]
 Saved lora weights to lora_outputs/checkpoint-800
Model weights saved in lora_outputs/paddle_lora_weights.pdparams
Train Steps: 100%|███████| 800/800 [11:05<00:00,  1.20it/s, epoch=0133, lr=0.0001, step_loss=0.0411]
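
The progress bar reports an epoch counter next to the step counter. The relationship between the two can be sketched as below. The epoch values in the log are consistent with six optimizer steps per epoch, which at a batch size of 2 would correspond to roughly a dozen training images; that image count is an inference rather than something the tutorial states, and the formula for the epoch index is likewise an assumption.

```python
import math


def steps_per_epoch(num_images, train_batch_size, gradient_accumulation_steps=1):
    """Optimizer steps needed to see every image once."""
    batches = math.ceil(num_images / train_batch_size)
    return math.ceil(batches / gradient_accumulation_steps)


def epoch_index(step, steps_in_epoch):
    """0-indexed epoch that a 1-indexed optimizer step falls into."""
    return (step - 1) // steps_in_epoch


# Hypothetical dataset size; the tutorial does not state the number of images.
per_epoch = steps_per_epoch(num_images=12, train_batch_size=2)
for step in (100, 400, 800):
    print(step, epoch_index(step, per_epoch))
```

Raising gradient_accumulation_steps reduces the number of optimizer steps per epoch, so the same max_train_steps then covers more passes over the data.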

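2.3 Upload the chosen weights to Hugging Face

Parameters:

upload_dir: the folder to upload.

repo_name: the name of the repo to upload to. The files end up at huggingface.co/{your username}/{… for example: huggingface.co/junnyu/lora….

pretrained_model_name_or_path: the base model used to train these weights.

prompt: the prompt text to use together with these weights.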

from utils import upload_lora_folder
upload_dir                    = "lora_outputs"                   # Folder to upload
repo_name                     = "Chinese_ShanShui_Style"         # Name of the repo to upload to
pretrained_model_name_or_path = "runwayml/stable-diffusion-v1-5" # Base model used to train these weights
prompt                        = "Chinese_ShanShui_Style"         # Prompt text to use with these weights
upload_lora_folder(
    upload_dir=upload_dir,
    repo_name=repo_name,
    pretrained_model_name_or_path=pretrained_model_name_or_path,
    prompt=prompt,
)

Pushing to livingbody/Chinese_ShanShui_Style
Upload 1 LFS files:   0%|          | 0/1 [00:00<?, ?it/s]
paddle_lora_weights.pdparams:   0%|          | 0.00/3.23M [00:00<?, ?B/s]
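
Before calling the upload helper, a quick check that the weights file exists can save a failed upload. The file name below is the one reported in the training log and in the upload progress output; the helper `find_lora_weights` is a hypothetical name used only in this sketch.

```python
from pathlib import Path

WEIGHTS_FILE = "paddle_lora_weights.pdparams"  # file name reported in the training log


def find_lora_weights(upload_dir):
    """Return the path of the LoRA weights file in `upload_dir`, or None if absent."""
    candidate = Path(upload_dir) / WEIGHTS_FILE
    return candidate if candidate.is_file() else None


weights = find_lora_weights("lora_outputs")
if weights is None:
    print("No LoRA weights found; run training first.")
else:
    print(f"{weights} ({weights.stat().st_size / 1e6:.2f} MB)")
```

The reported size also gives a quick sanity check against the few megabytes expected for a rank-4 LoRA.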

2.4 Generate another image

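from ppdiffusers import DiffusionPipeline, DPMSolverMultistepScheduler
import paddle

# Load the base model that the LoRA weights were trained on.
pipe = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
# Replace the default scheduler with the DPM-Solver multistep scheduler.
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
# Load the LoRA attention weights from the training output directory.
pipe.unet.load_attn_procs("lora_outputs/", from_hf_hub=True)
# Include the instance prompt used during training so the style is applied.
prompt = "2 man are walking in the woods with Chinese_ShanShui_Style"
image = pipe(prompt, num_inference_steps=25).images[0]
image.save("demo.png")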

[2023-03-23 17:07:14,171] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/model_index.json
[2023-03-23 17:07:14,176] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/vae/model_state.pdparams
[2023-03-23 17:07:14,179] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/vae/config.json
[2023-03-23 17:07:14,870] [    INFO] - Found /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/safety_checker/config.json
[2023-03-23 17:07:14,875] [    INFO] - loading configuration file /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/safety_checker/config.json
[2023-03-23 17:07:14,878] [    INFO] - Model config CLIPVisionConfig {
  "architectures": [
    "StableDiffusionSafetyChecker"
  ],
  "attention_dropout": 0.0,
  "dropout": 0.0,
  "hidden_act": "quick_gelu",
  "hidden_size": 1024,
  "image_size": 224,
  "initializer_factor": 1.0,
  "initializer_range": 0.02,
  "intermediate_size": 4096,
  "layer_norm_eps": 1e-05,
  "model_type": "clip_vision_model",
  "num_attention_heads": 16,
  "num_channels": 3,
  "num_hidden_layers": 24,
  "paddlenlp_version": null,
  "patch_size": 14,
  "projection_dim": 768,
  "return_dict": true
}
[2023-03-23 17:07:14,987] [    INFO] - Found /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/safety_checker/model_state.pdparams
[2023-03-23 17:07:17,520] [    INFO] - All model checkpoint weights were used when initializing StableDiffusionSafetyChecker.
[2023-03-23 17:07:17,525] [    INFO] - All the weights of StableDiffusionSafetyChecker were initialized from the model checkpoint at runwayml/stable-diffusion-v1-5/safety_checker.
If your task is similar to the task the model of the checkpoint was trained on, you can already use StableDiffusionSafetyChecker for predictions without further training.
[2023-03-23 17:07:17,531] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/tokenizer/vocab.json
[2023-03-23 17:07:17,533] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/tokenizer/merges.txt
[2023-03-23 17:07:17,536] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/tokenizer/added_tokens.json
[2023-03-23 17:07:17,538] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/tokenizer/special_tokens_map.json
[2023-03-23 17:07:17,541] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/tokenizer/tokenizer_config.json
[2023-03-23 17:07:17,724] [    INFO] - Found /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/text_encoder/config.json
[2023-03-23 17:07:17,728] [    INFO] - loading configuration file /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/text_encoder/config.json
[2023-03-23 17:07:17,731] [    INFO] - Model config CLIPTextConfig {
  "_name_or_path": "openai/clip-vit-large-patch14",
  "architectures": [
    "CLIPTextModel"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 0,
  "dropout": 0.0,
  "eos_token_id": 2,
  "hidden_act": "quick_gelu",
  "hidden_size": 768,
  "initializer_factor": 1.0,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-05,
  "max_position_embeddings": 77,
  "model_type": "clip_text_model",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 1,
  "paddlenlp_version": null,
  "projection_dim": 512,
  "return_dict": true,
  "torch_dtype": "float32",
  "transformers_version": "4.21.0.dev0",
  "vocab_size": 49408
}
[2023-03-23 17:07:17,891] [    INFO] - Found /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/text_encoder/model_state.pdparams
[2023-03-23 17:07:18,926] [    INFO] - All model checkpoint weights were used when initializing CLIPTextModel.
[2023-03-23 17:07:18,930] [    INFO] - All the weights of CLIPTextModel were initialized from the model checkpoint at runwayml/stable-diffusion-v1-5/text_encoder.
If your task is similar to the task the model of the checkpoint was trained on, you can already use CLIPTextModel for predictions without further training.
[2023-03-23 17:07:18,936] [    INFO] - Found /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/feature_extractor/preprocessor_config.json
[2023-03-23 17:07:18,940] [    INFO] - loading configuration file https://bj.bcebos.com/paddlenlp/models/community/runwayml/stable-diffusion-v1-5/feature_extractor/preprocessor_config.json from cache at /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/feature_extractor/preprocessor_config.json
[2023-03-23 17:07:18,943] [    INFO] - size should be a dictionary on of the following set of keys: ({'width', 'height'}, {'shortest_edge'}, {'shortest_edge', 'longest_edge'}), got 224. Converted to {'shortest_edge': 224}.
[2023-03-23 17:07:18,946] [    INFO] - crop_size should be a dictionary on of the following set of keys: ({'width', 'height'}, {'shortest_edge'}, {'shortest_edge', 'longest_edge'}), got 224. Converted to {'height': 224, 'width': 224}.
[2023-03-23 17:07:18,949] [    INFO] - Image processor CLIPFeatureExtractor {
  "crop_size": {
    "height": 224,
    "width": 224
  },
  "do_center_crop": true,
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "feature_extractor_type": "CLIPFeatureExtractor",
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "CLIPFeatureExtractor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "shortest_edge": 224
  }
}
[2023-03-23 17:07:18,951] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/unet/model_state.pdparams
[2023-03-23 17:07:18,954] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/unet/config.json
[2023-03-23 17:07:28,517] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/scheduler/scheduler_config.json
  0%|          | 0/25 [00:00<?, ?it/s]
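
Since the style is tied to the instance prompt, it can be convenient to build prompts from a plain subject description plus the style token. This is a minimal sketch; `styled_prompt`, the second subject, and the output file names are made up for illustration, and the pipeline call is left as a comment because it needs the model loaded in this section.

```python
STYLE_TOKEN = "Chinese_ShanShui_Style"  # the instance_prompt used for training


def styled_prompt(subject, style_token=STYLE_TOKEN):
    """Append the trained style token to a plain subject description."""
    return f"{subject} with {style_token}"


subjects = ["A little black cat is playing in the woods", "A boat on a misty river"]
for index, subject in enumerate(subjects):
    prompt = styled_prompt(subject)
    print(f"demo_{index}.png <- {prompt}")
    # With `pipe` from this section loaded, each image could be produced with:
    # pipe(prompt, num_inference_steps=25).images[0].save(f"demo_{index}.png")
```

Keeping the token in one constant makes it harder to mistype it, which would silently produce images without the learned style.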

The code is available at: aistudio.baidu.com/aistudio/pr…
