Animated Character in Real Photo
Model description
Prompt format:
animated character in real photo, drawn in { flat shading | realistic } anime style, a young anime-style { girl | woman | boy | man }, { your prompt }
Other tags:
Framing:
- full length, medium shot, medium close shot, close-up, wide angle
Quality:
- overexposed, blurry, lowres, film grain, vignette, jpeg artifacts
Recommended strength: 0.6 to 1.0
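The prompt format above can be assembled programmatically. This is a minimal sketch, not part of the model or any library: the function name `build_prompt` and the choice to append an optional framing tag at the end of the prompt are assumptions made for illustration.

```python
# Hypothetical helper for assembling prompts in the format described above.
TRIGGER = "animated character in real photo"
STYLES = ("flat shading", "realistic")
SUBJECTS = ("girl", "woman", "boy", "man")
FRAMING = ("full length", "medium shot", "medium close shot", "close-up", "wide angle")


def build_prompt(style, subject, description, framing=None):
    """Return a prompt string following the model card's prompt format."""
    if style not in STYLES:
        raise ValueError(f"style must be one of {STYLES}")
    if subject not in SUBJECTS:
        raise ValueError(f"subject must be one of {SUBJECTS}")
    parts = [
        TRIGGER,
        f"drawn in {style} anime style",
        f"a young anime-style {subject}",
        description.strip(),
    ]
    if framing is not None:
        if framing not in FRAMING:
            raise ValueError(f"framing must be one of {FRAMING}")
        # Assumption: the framing tag is simply appended as the last tag.
        parts.append(framing)
    return ", ".join(parts)


print(build_prompt("realistic", "woman", "waiting at a bus stop in the rain",
                   framing="medium shot"))
```

Validating the style and subject against fixed tuples keeps typos from silently producing a prompt the model was never trained on.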
I would like to add images depicting 3D anime characters in real environments and situations. If you've read this far and are willing to offer access or know a good source, please leave a comment or DM me. :)
I'm uploading two versions, trained for 3000 and 1500 steps, because both are good and behave differently enough to be worth keeping.
This is just a test run ahead of the release of the Z-Image-Base model. I'd like to train on Chroma as well.
Training Details:
Trained with ai-toolkit (commit <2d30dc5d>) on a single RTX 4090.
Batch size 10, resolution 512. DOP (differential output preservation) enabled, with "photo" as the preservation target.
Full training config will be uploaded in the training data section.
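For a rough sense of how much training the two uploads represent, the step counts can be converted into passes over the 135-image dataset described in the Dataset section. This is a back-of-the-envelope sketch that assumes each step consumes exactly one batch of 10 images, with no gradient accumulation or dataset repeats; the actual config may differ.

```python
# Rough estimate only: assumes one optimizer step == one batch of 10 images.
DATASET_SIZE = 135  # images, from the Dataset section
BATCH_SIZE = 10     # from the training details


def approx_epochs(steps, dataset_size=DATASET_SIZE, batch_size=BATCH_SIZE):
    """Approximate number of full passes over the dataset for a step count."""
    return steps * batch_size / dataset_size


for steps in (1500, 3000):
    print(f"{steps} steps ~ {approx_epochs(steps):.0f} passes over the dataset")
```

Under these assumptions, the 1500-step model has seen each image roughly 111 times and the 3000-step model roughly 222 times.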
Dataset:
Trained on a manually collected and curated dataset of 135 images depicting the concept "anime in real life."
About half of the images came from Danbooru, where they were tagged "photo background" and "anime in real life".
The other half came from www . joyreactor . cc under the tag "Тульпа" ("Tulpa") or from a Yandex image search for "anime in real life".
Captioning:
Images in the dataset were captioned using JoyCaption beta, then the captions were cleaned up manually.
Images in the dataset where the character is depicted with realistic / semi-realistic lighting were tagged with "drawn in realistic anime style". (54 images)
Images in the dataset where the character appears to have very flat lighting or a distinct thick cartoon outline were tagged with "drawn in flat lighting anime style". (74 images)
Since this is a subjective tag, it may not be applied consistently, which is something to improve on.
The model may have a bias towards a 2D Japanese illustration style or characters that appear superimposed on the background since I had a hard time finding super high quality images.
The majority of images with watermarks were left as-is and were tagged accordingly. To the best of my knowledge, no character names or franchise names are included in the captions.
Since many images in the dataset appear to be taken somewhere in Russia, the tag "It appears to be somewhere in Russia." was added to any images where this appeared to be the case. The model may still show a bias towards these background settings.
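The tagging rules described in this section can be sketched as a small caption post-processing step. This is only an illustration: the function `tag_caption`, the position of each tag within the caption, and the `style` keys are assumptions, and the actual captions were edited by hand.

```python
# Hypothetical sketch of the caption tagging rules; tag placement is assumed.
STYLE_TAGS = {
    "realistic": "drawn in realistic anime style",
    "flat": "drawn in flat lighting anime style",
}
LOCATION_TAG = "It appears to be somewhere in Russia."


def tag_caption(caption, style=None, in_russia=False):
    """Add the style tag and the location sentence to a cleaned-up caption."""
    text = caption.strip()
    if style is not None:
        # Assumption: the style tag is placed in front of the caption.
        text = f"{STYLE_TAGS[style]}, {text}"
    if in_russia:
        # Assumption: the location sentence is appended at the end.
        text = f"{text} {LOCATION_TAG}"
    return text


print(tag_caption("a girl stands on a balcony.", style="flat", in_russia=True))
```

Since these tags were part of the training captions, including or omitting the same phrases in a prompt is the most direct way to steer the style and to test the background bias.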
99% of the images in the dataset are SFW.



