Anima Gemma3 Batch Caption & LoRA Training Workflow
詳細
ファイルをダウンロード (1)
モデル説明
This ComfyUI workflow is designed for Anima-Base LoRA batch captioning and automated training. Instead of manually opening every image, writing captions one by one, preparing text files, configuring sd-scripts, and starting training manually, this workflow turns the entire process into a structured batch pipeline.
The workflow starts with AnimaBatchFolderLoader, which scans an image folder such as /workspace/train and builds a task list from common image formats like PNG, JPG, JPEG, WEBP, and BMP. AnimaLoadCaptionImageBatch then loads the images as a batch and records image metadata for later caption saving and training preparation.
The captioning stage uses llama.cpp with a Gemma3 multimodal model. In this workflow, the model route loads gemma-3-12b-it-heretic-Q6_K.gguf together with its multimodal projection file. The caption prompt is designed for Anima-Base LoRA training, requiring concise hybrid captions that combine booru-style tags with one short natural-language phrase. It also avoids common caption pollution such as artist names, copyrighted character names, score tags, rating tags, watermark, logo, UI text, quality tags, markdown, JSON, and long explanations.
After Gemma3 generates captions, AnimaSaveLlamaCppCaptions writes the results back into .txt caption files. AnimaCaptionPrepare then prepares the caption structure, inserts trigger words based on folder names, and standardizes the training-friendly caption format. This makes the workflow suitable for character LoRA, clothing LoRA, and style LoRA datasets.
The training stage uses AnimaBatchTrainConfig and AnimaBatchStartTrain. The included configuration points to anima-base-v1.0.safetensors as the base model, outputs LoRA files into the Anima LoRA folder, uses 1024x1024 training resolution, bf16 precision, AdamW8bit optimizer, LoRA rank 32, alpha 32, learning rate 2e-5, target steps around 3001, and Anima-specific sd-scripts arguments such as networks.lora_anima, network_train_unet_only, sigmoid timestep sampling, discrete_flow_shift, gradient checkpointing, latent caching, and Qwen/VAE model paths.
Finally, AnimaBatchCheckStatus checks the training status by job ID. This makes the workflow feel like a mini LoRA factory inside ComfyUI: put the image folders in place, let Gemma3 generate captions, prepare clean training data, start Anima LoRA training, and monitor the process.
Main features:
- Batch image folder scanning
- Automatic image loading and task-list generation
- Gemma3 multimodal captioning through llama.cpp
- Hybrid booru-style caption prompt for Anima-Base LoRA
- Automatic caption saving to .txt files
- Trigger-word insertion from folder name
- Caption cleaning and preparation
- Anima-Base LoRA training configuration
- sd-scripts wrapper integration
- bf16 + AdamW8bit training setup
- Rank 32 / alpha 32 LoRA configuration
- Target-step training mode
- Training job launch and status checking
- Suitable for character, clothing, and style LoRA datasets
Suggested workflow:
Prepare your dataset inside the training folder first. Put each LoRA concept into its own first-level folder, because the folder name can be used as the trigger word. Keep your image dataset clean and remove low-quality, duplicated, or heavily broken images before training.
Run the batch loader to scan the dataset, then let Gemma3 generate captions for every image. Check the generated captions before training. Good captions should describe visible content clearly, preserve important character or clothing traits, and avoid useless metadata. After caption preparation, start the Anima LoRA training stage and monitor the job status.
This workflow is designed for creators who want to reduce repetitive dataset preparation work. It is especially useful when you need to produce multiple Anima LoRAs, test character datasets quickly, prepare clothing LoRAs, or build a reusable batch training pipeline for AI image production.
🎥 YouTube Video Tutorial
Want to know what this workflow actually does and how to start fast?
This video explains what the tool is, how to launch the workflow, and shares my core design logic for batch captioning and automated Anima LoRA training.
👉 YouTube Tutorial:
Before you begin, I recommend watching the video thoroughly — getting the full context helps you understand the tool faster and avoid common detours.
📺 Bilibili Updates (Mainland China & Asia-Pacific)
If you’re in the Asia-Pacific region, you can watch the video below to see the workflow demonstration and creative breakdown.
📺 Bilibili Video: https://www.bilibili.com/video/BV11eLm6JExi/
☕ Support Me on Ko-fi
If you find my content helpful and want to support future creations, you can buy me a coffee ☕.
Every bit of support helps me keep creating — just like a spark that can ignite a blazing flame.
👉 Ko-fi: https://ko-fi.com/aiksk
💼 Business Contact
For collaboration or inquiries, please contact aiksk95 on WeChat.
🎥 YouTube 视频教程
想了解这个工作流到底是怎样的工具,以及如何快速启动?
视频主要介绍工具定位、批量打标逻辑、Caption 规范、Anima LoRA 训练配置和我的构筑思路。
👉 YouTube 教程: https://youtu.be/QkqNLPGV6_o
开始前建议尽量完整地观看视频 —— 把握整体思路会更快上手,也能少走常见弯路。
📺 Bilibili 更新(中国大陆及南亚太地区)
如果你在中国大陆或南亚太地区,可以通过下方视频查看该工作流的实测效果与构思讲解。
📺 B站视频: https://www.bilibili.com/video/BV11eLm6JExi/
我会在 夸克网盘 持续更新模型资源:
👉 https://pan.quark.cn/s/20c6f6f8d87b
这些资源主要面向本地用户,方便进行创作与学习。

