RDBT | Anima

Model description

About the newly uploaded INT8 version.

ComfyUI just added native, out-of-the-box support for int8 models and hardware int8. Hardware int8 requires an Nvidia GPU with SM 7.5 (aka. RTX 3000+).

This is the checkpoint I used to test whether it works, and it does.

With torch.compile, it is 70% faster than bf16 on my RTX 4000 GPU. Some people report it can be even faster on RTX 3000 cards, because RTX 3000 is better at int8 ops than at float ops.
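A quick way to check a card against that requirement is to compare its CUDA compute capability with the SM 7.5 threshold quoted above. This is only a minimal sketch: the helper name `supports_hardware_int8` is made up for illustration and is not part of ComfyUI, and whether ComfyUI applies further checks of its own is not covered here.

```python
# SM 7.5 threshold, taken from the text above.
MIN_INT8_CAPABILITY = (7, 5)


def supports_hardware_int8(capability):
    """Return True if a (major, minor) compute capability meets the threshold.

    Hypothetical helper for illustration; not a ComfyUI function.
    """
    return tuple(capability) >= MIN_INT8_CAPABILITY


if __name__ == "__main__":
    try:
        import torch  # optional, only needed to query the local GPU
    except ImportError:
        torch = None

    if torch is not None and torch.cuda.is_available():
        cap = torch.cuda.get_device_capability()  # e.g. (8, 9)
        print(cap, supports_hardware_int8(cap))
    else:
        print("No CUDA device visible; cannot query compute capability.")
```

Cards below the threshold fall outside the hardware int8 path described above.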

Quality degradation is minimal to none, unless you're comparing each pixel individually with a magnifying glass.

Right now (6/27/2026 0:00 UTC), you need the LATEST ComfyUI (i.e. the master branch). There are still some performance issues when loading LoRAs.

If you are looking for the int8 Anima base: here.


RDBT [Anima]

Finetuned + distilled.

I use it as a clean starting point to stack more style LoRAs.

See this page for the update log.

For advanced users: the RDBT model is natively trained as a LoRA. See this page for the original LoRA, which is updated more frequently.

This model is based on:

  • prefix with ym: AnimaYume (hf link) (civitai link). Has the latest dataset and 1536px training. Check its model page for more info.

  • prefix with b or p: Anima pretrained (hf link)


Sharing merges using this model is not allowed. If someone is selling this model as their own, I'm happy to list them here so everyone knows.

Known model thieves: NukeA.I (behind paywall on tensorart).

I wrote a story about it. It also contains a guide for trainers about "how to bake special trigger word into your model".


Usage:

Settings:

CFG scale: 1~3. This model has been guidance distilled. You can disable CFG (CFG 1) and run the model 2x faster. Cover images were generated without CFG for demonstration.

Steps: 16~24. (This is NOT a turbo model. Low step counts are not supported and the image will not be fully denoised. Add the turbo LoRA at 0.5x if you need 8 steps.)
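The "2x faster" figure follows from how classifier-free guidance is usually computed: each step needs one conditional and one unconditional model call, and at CFG 1 the combined result equals the conditional prediction, so the second call can be dropped. The sketch below illustrates that arithmetic with plain Python lists. The function names are hypothetical, and it assumes the sampler actually skips the unconditional pass when the scale is exactly 1.

```python
def cfg_combine(cond, uncond, scale):
    """Standard classifier-free guidance: uncond + scale * (cond - uncond)."""
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]


def model_calls(steps, cfg_scale):
    """Model forward passes needed for one image.

    Assumption: the sampler skips the unconditional pass when cfg_scale == 1.
    """
    passes_per_step = 1 if cfg_scale == 1 else 2
    return steps * passes_per_step


if __name__ == "__main__":
    cond, uncond = [1.0, 2.0], [0.5, 0.0]
    # At scale 1 the unconditional prediction cancels out.
    print(cfg_combine(cond, uncond, 1))
    # 20 steps: half as many forward passes without CFG.
    print(model_calls(20, 1), model_calls(20, 3))
```

With the recommended 16~24 steps, that works out to 16~24 forward passes at CFG 1 versus 32~48 with CFG enabled.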

Prompt

Always specify a style, or use a style LoRA. Otherwise, you will get a random/mixed style.

This is a feature, not a bug. This model is a general finetune. It does not provide an overfitted default style. I use it as a clean starting point to stack more style LoRAs. I can stack whatever I want and get exactly what I stacked.

Quality tags:

It's recommended to omit all quality tags, or keep just "masterpiece" if you're not confident. Omitting those redundant tokens allows the LLM to pay more attention to the other words.

Quality tags were reinforced during distillation, so they have no noticeable effect. The same goes for negative tags. If you use CFG, there is no need to dump "score_1, blurry, worst quality, jpeg artifacts, extra arms,... x100 words" in your negative prompt. Those things have been distilled out.


Training settings

~10k images finetuning -> guidance distillation

All captions are natural language (NL), generated by Google Gemini.

Optimizer: AdamW, constant lr 0.00002, weight decay 0.1, batch size 16.

LoRA rank/alpha 24.

Timesteps shift 3.

Blocks 0-2 and the adaLN linear layers are skipped.
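For trainers who want to reproduce something similar, the settings above can be written down roughly as follows. This is a sketch under several assumptions, not the author's actual training script: "rank/alpha 24" is read as rank 24 and alpha 24, the shift function is the common flow-matching form `shift * t / (1 + (shift - 1) * t)`, and the module names matched by `gets_lora` (`blocks.N.` and `adaln`) are hypothetical and depend on the trainer and model code.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    # Values copied from the settings listed above.
    optimizer: str = "adamw"
    learning_rate: float = 2e-5   # constant schedule
    weight_decay: float = 0.1
    batch_size: int = 16
    lora_rank: int = 24
    lora_alpha: int = 24          # assumption: alpha equals rank
    timestep_shift: float = 3.0

    @property
    def lora_scale(self):
        """Common LoRA convention: update is scaled by alpha / rank."""
        return self.lora_alpha / self.lora_rank


def shift_timestep(t, shift):
    """Assumed flow-matching timestep shift for t in [0, 1]."""
    return shift * t / (1 + (shift - 1) * t)


# Hypothetical naming scheme: "blocks.<index>.<submodule>".
_SKIPPED_BLOCK = re.compile(r"(^|\.)blocks\.[0-2]\.")


def gets_lora(module_name):
    """Return False for blocks 0-2 and adaLN layers, True otherwise."""
    if "adaln" in module_name.lower():
        return False
    return _SKIPPED_BLOCK.search(module_name) is None


if __name__ == "__main__":
    cfg = TrainConfig()
    print(cfg.lora_scale, shift_timestep(0.5, cfg.timestep_shift))
    print(gets_lora("blocks.1.attn.q_proj"), gets_lora("blocks.10.attn.q_proj"))
```

With rank and alpha equal, the LoRA scale is 1, and a shift of 3 moves the midpoint timestep 0.5 to 0.75, which biases training toward noisier timesteps.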
