Yuuki Sakuna (結城さくな) - Pony LoRA

Disclaimer

Do not use this LoRA to generate AI images and post them under her fan art hashtag on Twitter/X. If you have made a substantial contribution of your own to the image, that is fine.

Version 3 description

reduced dim/alpha to 16/12
set scale weight norms to 1.5 (to prevent overfitting)
changed the scheduler
new dataset + captions
her smile is now less likely to turn into :3 instead of :)
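
As a rough illustration of the dim/alpha change: in kohya-style LoRA training the learned update is multiplied by alpha / dim, so 16/12 scales it by 0.75, compared with 1.0 for the 16/16 setting of the earlier versions. The helper name below is made up for this sketch.

```python
def lora_scale(dim: int, alpha: float) -> float:
    """Multiplier applied to the LoRA update (alpha / dim)."""
    if dim <= 0:
        raise ValueError("dim must be positive")
    return alpha / dim

v3_scale = lora_scale(16, 12)  # Version 3 setting
v2_scale = lora_scale(16, 16)  # Version 1 and 2 setting
print(v3_scale, v2_scale)
```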

Trigger Word (V3 only)

default costume

yuuki sakuna, long hair, blush, animal ears, pink hair, medium breasts, hair ornament, maid headdress, bow, maid, animal ear fluff, puffy sleeves, apron, white dress, bowtie, collarbone, hairclip, hair bow, detached collar, black footwear, shoes, black choker, white thighhighs

any costume

yuuki sakuna, long hair, blush, cat ears, pink hair, medium breasts, hair ornament, bowtie, hairclip, hair bow
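
A minimal sketch of how the "any costume" tags might be combined with your own outfit tags before pasting the result into a prompt box. The function name and the example outfit tags are hypothetical and not part of this model card; the helper only joins tag strings and drops duplicates.

```python
def build_prompt(base_tags: str, extra_tags: str = "") -> str:
    """Join comma-separated tag strings, dropping duplicates but keeping order."""
    seen = set()
    result = []
    for tag in (base_tags + "," + extra_tags).split(","):
        tag = tag.strip()
        if tag and tag not in seen:
            seen.add(tag)
            result.append(tag)
    return ", ".join(result)

# The "any costume" trigger tags listed above.
ANY_COSTUME = (
    "yuuki sakuna, long hair, blush, cat ears, pink hair, "
    "medium breasts, hair ornament, bowtie, hairclip, hair bow"
)

# "school uniform, hair bow" is an illustrative outfit, not from the model card.
prompt = build_prompt(ANY_COSTUME, "school uniform, hair bow")
print(prompt)
```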

Limitations

some costumes cannot be changed properly (due to the lack of data)
full-body images are still inaccurate (but much improved)

Version 2 description

uses an experimental optimizer (full-body results seem improved, but the camera angle still does not vary unless you prompt for it)
the bow still remains even when the outfit is changed

Version 1 description

I like cat-eared girls, so I trained this LoRA (I don't know anything about her past :<)

Trigger Word

trigger word

yuuki sakuna

any costume (still not flexible enough)

yuuki sakuna, long hair, animal ears, pink hair, blush, cat ears, pink eyes, two side up, ahoge, colored inner hair, two-tone hair

debut costume (good, but some components are still missing)

yuuki sakuna, long hair, hair ornament, bow, animal ears, pink hair, blush, cat ears, maid headdress, hair bow, frills, hairclip, pink eyes, pink bow, blue bow, maid, puffy sleeves, two side up, cat hair ornament, ahoge, heart hair ornament, puffy short sleeves, clothing cutout, pink dress, blue bow, colored inner hair, two-tone hair, cleavage, breasts

for full-body images, adding the following tags helps (the shoes are still not correct)

shoes, black footwear, white thighhighs

Limitations

Cannot change costumes properly (some debut costume components are still left over)
full-body prompts may not be effective
Version 1 is still slightly underfitted (like medium-rare pork)
Version 2 improves some small details, but the dataset is still not varied enough (the images are imbalanced)

Training Details (Version 3)

LoRA size

reduced dimension to 8 with sv_fro=0.95
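
To give a feel for what a Frobenius-norm target such as sv_fro=0.95 means when resizing, here is a minimal sketch: keep the smallest number of singular values whose combined Frobenius norm reaches 95% of the total, capped at the target dimension. This follows the general idea only. The function name and the example singular values are made up, and the actual resize tool may differ in details.

```python
def rank_for_sv_fro(singular_values, target=0.95, max_rank=8):
    """Smallest rank whose kept singular values retain `target` of the Frobenius norm."""
    squares = [s * s for s in singular_values]
    total = sum(squares)
    if total == 0:
        return 1
    kept = 0.0
    for rank, sq in enumerate(squares, start=1):
        kept += sq
        # The Frobenius norm ratio is sqrt(kept / total), so compare squared values.
        if kept / total >= target ** 2:
            return min(rank, max_rank)
    return min(len(squares), max_rank)

# Made-up singular values of one LoRA layer, sorted in descending order.
example = [4.0, 2.0, 1.0, 0.5]
print(rank_for_sv_fro(example))
```

With this kind of rule, layers whose energy sits in a few singular values end up below dimension 8, while flatter layers are cut at the cap.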

dataset

42 images

parameters

resolution = 1024
batch size = 2
dim,alpha = 16,12 (for training)
mix/save precision = bf16/bf16
optimizer = AdEMAMix (weight_decay=0.025, betas=0.9,0.999,0.9999)
UNet LR = 2e-4
TE LR = 1e-4
scheduler = cosine_with_min_lr (min_lr_ratio 0.67)
huber snr with c = 0.85
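
A rough sketch of what a cosine schedule with a minimum learning rate does with the values above: the rate starts at the base value and decays along a half cosine to min_lr_ratio times the base value. The function is written from the usual formula rather than taken from the trainer. It assumes no warmup (the list does not mention one), and the 1575 total steps come from the steps section of this version.

```python
import math

def cosine_with_min_lr(step, total_steps, base_lr, min_lr_ratio, warmup_steps=0):
    """Learning rate at `step` for a half-cosine decay with a floor."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(1.0, progress)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    # Blend between the floor (min_lr_ratio) and the full rate.
    return base_lr * (min_lr_ratio + (1.0 - min_lr_ratio) * cosine)

# UNet LR = 2e-4, min_lr_ratio = 0.67, total steps = 1575.
for step in (0, 787, 1575):
    print(step, cosine_with_min_lr(step, 1575, 2e-4, 0.67))
```

Under these assumptions the UNet rate never drops below roughly 1.34e-4, so the schedule is much flatter than a plain cosine decay to zero.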

steps

epochs = 5
total steps = 1575
repeat = 15 (one concept only)
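
The step count can be checked from the dataset size, repeats, batch size and epochs. This is a small sketch that assumes no gradient accumulation and that every batch is full; the function name is made up.

```python
import math

def total_steps(images, repeats, batch_size, epochs):
    """Optimizer steps for a single-concept dataset."""
    steps_per_epoch = math.ceil(images * repeats / batch_size)
    return steps_per_epoch * epochs

# Version 3: 42 images, repeat 15, batch size 2, 5 epochs.
print(total_steps(42, 15, 2, 5))
```

The same arithmetic with 38 images and 10 epochs gives the 2850 steps listed for Versions 1 and 2.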

tools

kohya-ss GUI v24.3.0 (Forked by me)
torch 2.5.0 cu124
RTX 3060 12 GB + xformers + gradient_checkpointing

weight

UNet average weights : 0.0149531283161857
TE1 average weights : 0.011002991641968642
TE2 average weights : 0.009832777519477531

Training Details (Version 2)

LoRA size

reduced dimension to 8 with dynamic alpha

dataset

38 images (mostly half-body)

parameters

resolution = 1024
batch size = 2
dim,alpha = 16,16 (for training)
mix/save precision = bf16/bf16
optimizer = AdEMAMix (32-bit, consumes more VRAM)
UNet LR = 2e-4
TE LR = 1e-4
scheduler = inverse_sqrt with warmup 100 steps
l2 loss only
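
For readers unfamiliar with the optimizer named in the list above, here is a scalar sketch of one AdEMAMix update as I understand it from the published description: Adam plus a second, much slower gradient average that is mixed in with weight alpha. The betas and weight decay are the Version 3 values from this card. alpha=5.0 and eps are assumptions, the warmup schedules for alpha and the third beta are left out, and the real optimizer used in training may differ.

```python
import math

def ademamix_step(theta, grad, state, lr, betas=(0.9, 0.999, 0.9999),
                  alpha=5.0, weight_decay=0.0, eps=1e-8):
    """One AdEMAMix-style update on a single scalar parameter."""
    b1, b2, b3 = betas
    state["t"] += 1
    t = state["t"]
    state["m1"] = b1 * state["m1"] + (1 - b1) * grad          # fast average
    state["m2"] = b3 * state["m2"] + (1 - b3) * grad          # slow average
    state["v"] = b2 * state["v"] + (1 - b2) * grad * grad     # second moment
    m1_hat = state["m1"] / (1 - b1 ** t)                      # bias correction
    v_hat = state["v"] / (1 - b2 ** t)
    update = (m1_hat + alpha * state["m2"]) / (math.sqrt(v_hat) + eps)
    # Decoupled weight decay, applied alongside the gradient update.
    return theta - lr * (update + weight_decay * theta)

# Toy problem: minimise theta**2 for a few steps.
state = {"t": 0, "m1": 0.0, "m2": 0.0, "v": 0.0}
theta = 1.0
for _ in range(3):
    grad = 2.0 * theta
    theta = ademamix_step(theta, grad, state, lr=2e-4, weight_decay=0.025)
print(theta)
```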

steps

epochs = 10
total steps = 2850
repeat = 15 (one concept only)

tools

kohya-ss GUI v24.2.0
torch 2.5.0 cu124
RTX 3060 12 GB + xformers + gradient_checkpointing

weight

UNet weight average strength = 0.015634962041489377
Text Encoder (1) weight average strength Clip_L = 0.011193290141749815
Text Encoder (2) weight average strength Clip_G = 0.010691167576002698

Training Details (Version 1)

dataset

38 images (mostly half-body)

parameters

resolution = 1024
batch size = 2
dim,alpha = 16,16 (not resized, to preserve quality; I will resize once the LoRA is good enough :P)
mix/save precision = bf16/fp16 (changed by accident)
optimizer = AdEMAMix8bit
UNet LR = 1e-4
TE LR = 5e-05
scheduler = inverse_sqrt with warmup 100 steps
huber snr with c = 0.85
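
The inverse_sqrt scheduler with a 100-step warmup can be sketched as follows: the rate ramps up linearly for 100 steps, then decays in proportion to 1/sqrt(step). This sketch assumes the decay timescale equals the warmup length, which is not stated in the list above; the function name is made up.

```python
import math

def inverse_sqrt_lr(step, base_lr, warmup_steps=100):
    """Linear warmup followed by inverse square root decay."""
    timescale = max(1, warmup_steps)
    if step < warmup_steps:
        return base_lr * step / timescale
    return base_lr / math.sqrt(max(step, timescale) / timescale)

# UNet LR = 1e-4 over the 2850 steps of this version.
for step in (50, 100, 400, 2850):
    print(step, inverse_sqrt_lr(step, 1e-4))
```

Under this assumption the rate halves by step 400 and keeps shrinking slowly afterwards, so most of the 2850 steps run well below the nominal learning rate.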

steps

epochs = 10
total steps = 2850
repeat = 15 (one concept only)
full_bf16 training

tools

kohya-ss GUI v24.2.0
torch 2.5.0 cu124
RTX 3060 12 GB + xformers + gradient_checkpointing

weight

UNet weight average strength = 0.008335085569112463
Text Encoder (1) weight average strength Clip_L = 0.0073367764333498705
Text Encoder (2) weight average strength Clip_G = 0.005826970830639767

description ref from Gtonero

*This LoRA is for studying LoRA training with new techniques, so do not use it to harm the VTuber (and please support her too).

Model type	LORA
Base model	Pony
Release date	2024-11-05
Trigger words	yuuki sakuna
