mutton_klein

Model description

this is the only use case i found for klein 4b base. other loras, like text generation, always collapsed the model and produced body horror instead

and even in this case the result isn't really that good: you get a style similar to the artist's, but you still have to do a lot of manual segmentation and color curve / hue / saturation fixing if you want the exact colors. the output image is low resolution (only 1M pixels), and sometimes there is pixel drift
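
the hue / saturation part of that manual cleanup can be scripted. a minimal sketch, assuming the image is a list of rows of 8-bit RGB tuples and that a segmentation mask already exists; `adjust_hue_sat` and `adjust_region` are hypothetical helper names, not part of any tool mentioned here:

```python
import colorsys


def adjust_hue_sat(pixel, hue_shift=0.0, sat_scale=1.0):
    """Shift hue (as a fraction of a full turn) and scale saturation of one 8-bit RGB pixel."""
    r, g, b = (c / 255.0 for c in pixel)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    h = (h + hue_shift) % 1.0                 # hue wraps around the color wheel
    s = min(1.0, max(0.0, s * sat_scale))     # keep saturation inside [0, 1]
    r, g, b = colorsys.hsv_to_rgb(h, s, v)
    return tuple(round(c * 255) for c in (r, g, b))


def adjust_region(image, mask, hue_shift=0.0, sat_scale=1.0):
    """Apply the adjustment only where the mask is True; other pixels are left untouched."""
    return [
        [adjust_hue_sat(px, hue_shift, sat_scale) if m else px
         for px, m in zip(row, mask_row)]
        for row, mask_row in zip(image, mask)
    ]
```

in practice an image editor or an array library would be faster; this only shows what a per-segment hue / sat fix does.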

also, the lora rank is 64, so each artist needs a lora of ~200MB, which isn't very lean. i tried a rank 32 lora, but that didn't seem to have enough parameters to pick up a complex coloring style
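
the size trade-off follows from how lora works: for a linear layer of shape (d_out, d_in), a rank r adapter adds two matrices with r * (d_in + d_out) parameters in total, so file size grows linearly with rank. a rough sketch of that arithmetic; the layer shapes below are hypothetical placeholders, not the real klein 4b architecture, and 2 bytes per parameter (fp16 / bf16) is an assumption:

```python
def lora_param_count(layer_shapes, rank):
    """Total LoRA parameters: each (d_out, d_in) layer gets A (rank x d_in) and B (d_out x rank)."""
    return sum(rank * (d_in + d_out) for d_out, d_in in layer_shapes)


def lora_size_mb(layer_shapes, rank, bytes_per_param=2):
    """Approximate adapter size in MB, ignoring file metadata."""
    return lora_param_count(layer_shapes, rank) * bytes_per_param / 1e6


# Hypothetical layer shapes for illustration only (NOT the real model).
HYPOTHETICAL_LAYERS = [(3072, 3072)] * 4 + [(12288, 3072), (3072, 12288)]

for rank in (32, 64):
    print(rank, lora_size_mb(HYPOTHETICAL_LAYERS, rank))
```

with the same target layers and dtype, halving the rank halves the file, so a rank 32 version of a ~200MB lora would be roughly 100MB.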

what this model can be used for:

  • coarse coloring: sketch in some colors, then have the model fill in the rest

  • video2video: modify a few frames and propagate them with ebsynth, or run the model frame by frame for better detail but worse temporal consistency (not every video is suitable, and it only changes the coloring, not the line art itself; refer to the examples)

  • style switching, manga coloring, ...; these still require manual editing to get a high quality image

how the data was generated:

  1. gather images; as few as 10 is sufficient. make sure they share the same coloring style

  2. generate line art + coarse coloring for each image; there are already ML models out there for this

  3. generate more input variants: line art on base color, line art on synthetic strokes, etc. ask any AI model to vibe code those

  4. augment with random flip / crop / zoom / etc. so the model doesn't lose its editing capability (this is important if you only have a few images in the dataset)
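
the augmentation step can be sketched like this. it assumes each training example is a pair (generated input, original colored image) and that both images must get the exact same random flip / crop / zoom so they stay aligned; that pairing is my reading of the steps above, and `augment_pair` and `min_scale` are hypothetical names. images are plain lists of rows of pixels to keep it dependency free:

```python
import random


def hflip(img):
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def crop(img, top, left, h, w):
    """Cut out an h x w window starting at (top, left)."""
    return [row[left:left + w] for row in img[top:top + h]]


def resize_nearest(img, out_h, out_w):
    """Nearest-neighbor resize, used here to zoom a crop back to full size."""
    in_h, in_w = len(img), len(img[0])
    return [
        [img[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


def augment_pair(source, target, rng, min_scale=0.6):
    """Apply one shared random flip + crop + zoom to both images of a pair."""
    h, w = len(source), len(source[0])
    assert (h, w) == (len(target), len(target[0])), "pair must share a size"

    # Draw the random parameters once so both images get the same transform.
    flip = rng.random() < 0.5
    scale = rng.uniform(min_scale, 1.0)
    ch, cw = max(1, int(h * scale)), max(1, int(w * scale))
    top = rng.randint(0, h - ch)
    left = rng.randint(0, w - cw)

    def apply(img):
        if flip:
            img = hflip(img)
        img = crop(img, top, left, ch, cw)
        return resize_nearest(img, h, w)

    return apply(source), apply(target)
```

calling `augment_pair(src, tgt, random.Random(0))` several times per pair turns a dataset of ~10 images into many distinct but still aligned examples.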
