ChromaYume NoobAI-XL (NAI-XL)

Overview

This model is based on the NOOBAI XL-VPred 1.0 architecture, with some structural modifications.
- For versions 1.0 to 3.0: The model was trained on the Danbooru2024 dataset along with Yande Full and e621, using NOOBAI XL-VPred 1.0 and Illustrious XL 1.0 as teacher models during training.
- In version 2.0: I reused the earlier data and added more than 50k real-life character images from various sources on the internet.
- In version 3.0: I refactored the dataset and added more dataset labels using ChatGPT o3-mini, followed by a manual recheck.
- In version 4.0: The model was trained on the danbooru2024, danbooru_newest-all, e621, e621_newest, gelbooru_full, and yande_full datasets, as well as a custom dataset that I collected, labeled in natural language with GPT-4.5, and then verified manually.

For version 1.0: This version focuses on balancing multiple art styles (selected through trigger prompts) with good anatomy.
For version 2.0: This version focuses heavily on improving anatomy and enables the creation of more realistic characters (through the use of trigger prompts). Note that this version may reduce image quality across multiple art styles.
For version 3.0: This version can generate images in multiple styles (similar to version 1.0) while also creating more realistic characters (more lifelike than version 2.0) with improved anatomy. However, the output depends heavily on the prompt, so you need a precise, descriptive prompt to get the desired image.
For version 4.0: To accommodate the large amount of training data, I reconstructed the model with some modifications and trained all parts of it, including CLIP, VAE, and UNet. These changes let the model reproduce image styles more accurately and improve character anatomy. I also fixed the issues that occurred in versions 2.0 and 3.0.

I personally reconstructed this model, so I’d greatly appreciate any feedback. Your insights won’t just motivate me but will also help me better understand its strengths and weaknesses, allowing me to refine it in the future.
This is a V-prediction model (not an epsilon-prediction model), so it requires specific parameter configurations. Please refer to the user guide here.
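
For readers unfamiliar with the distinction, here is a minimal sketch of the standard v-prediction parameterization, v = alpha * eps - sigma * x0, under a variance-preserving schedule (alpha^2 + sigma^2 = 1). It uses toy scalar values and hypothetical helper names; it is not this model's training or sampling code. It only illustrates why a sampler must know which quantity the network outputs.

```python
import math

def add_noise(x0, eps, alpha, sigma):
    """Forward process: mix the clean sample x0 with noise eps."""
    return alpha * x0 + sigma * eps

def v_target(x0, eps, alpha, sigma):
    """Quantity a v-prediction network is trained to output."""
    return alpha * eps - sigma * x0

def x0_from_v(x_t, v, alpha, sigma):
    """Recover the clean sample from a v prediction."""
    return alpha * x_t - sigma * v

def eps_from_v(x_t, v, alpha, sigma):
    """Recover the noise from a v prediction."""
    return sigma * x_t + alpha * v

# Toy values (assumptions for illustration only).
angle = 0.3 * math.pi / 2
alpha, sigma = math.cos(angle), math.sin(angle)  # alpha^2 + sigma^2 = 1
x0, eps = 0.8, -1.5

x_t = add_noise(x0, eps, alpha, sigma)
v = v_target(x0, eps, alpha, sigma)

print("recovered x0: ", x0_from_v(x_t, v, alpha, sigma))
print("recovered eps:", eps_from_v(x_t, v, alpha, sigma))
```

If a sampler treats v as if it were eps, the reconstruction of x0 is wrong at every step, which is why the prediction type has to be set correctly in the UI or pipeline being used.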

Currently, the model is not available for use via Civitai Generation. You can visit the following website to use it:

For versions 2.0 and 3.0: Add these prompts to generate realistic images

Positive prompt: realistic, cosplay, real life, photorealistic
Negative prompt: illustration, blur, film grain, noise, sketch, comic, cartoon, toon, oil painting (medium), flat color, outline, 3D, 2.5D, 2D, unrealistic, game engine style, anime coloring, smooth skin

Recommended settings:

Negative prompt: bad quality, worst quality, worst detail, sketch, censor, simple background, transparent background
CFG: 4-6
Clip skip: 2
Steps: 20-30
Sampler: Euler a

Contributed by @Ligmanese
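
The prompts and settings above can be collected in one place. The sketch below is a UI-agnostic illustration: the function and constant names are hypothetical, and it assumes the realistic-mode prompts are appended to the general negative prompt rather than replacing it.

```python
# Values copied from the settings above; the names are hypothetical.
GENERAL_NEGATIVE = (
    "bad quality, worst quality, worst detail, sketch, censor, "
    "simple background, transparent background"
)
REALISTIC_POSITIVE = "realistic, cosplay, real life, photorealistic"
REALISTIC_NEGATIVE = (
    "illustration, blur, film grain, noise, sketch, comic, cartoon, toon, "
    "oil painting (medium), flat color, outline, 3D, 2.5D, 2D, unrealistic, "
    "game engine style, anime coloring, smooth skin"
)
SETTINGS = {
    "cfg_scale": (4, 6),  # recommended range
    "clip_skip": 2,
    "steps": (20, 30),    # recommended range
    "sampler": "Euler a",
}

def merge_tags(*tag_strings):
    """Join comma-separated tag strings, dropping duplicates but keeping order."""
    merged = []
    for tag_string in tag_strings:
        for tag in tag_string.split(","):
            tag = tag.strip()
            if tag and tag not in merged:
                merged.append(tag)
    return ", ".join(merged)

def build_prompts(subject_tags, realistic=False):
    """Return (positive, negative) prompt strings.

    realistic=True adds the version 2.0 / 3.0 trigger prompts
    (assumption: they are combined with the general negative prompt).
    """
    if realistic:
        positive = merge_tags(subject_tags, REALISTIC_POSITIVE)
        negative = merge_tags(GENERAL_NEGATIVE, REALISTIC_NEGATIVE)
    else:
        positive = merge_tags(subject_tags)
        negative = merge_tags(GENERAL_NEGATIVE)
    return positive, negative

positive, negative = build_prompts("1girl, solo", realistic=True)
print(positive)
print(negative)
```

Deduplicating matters slightly here because "sketch" appears in both negative lists.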

Note:

I did not use any post-processing or LoRA to enhance the example images. They were generated solely using these settings and prompts with my base model.
For comparison and independent evaluation, I used prompts from various sources and authors to generate these example images.

Thanks to narugo1992 and Nyanko for sharing such valuable data and Laxhar Lab for providing an amazing model!
Thanks to @Sennke for creating the noobReal model! It gave me more ideas for improving ChromaYume version 2.0.