Version 3 is out!
This one focuses on blindingly fast generation, photographic realism and dramatic impact. It incorporates a bit of Big Lust and a mix of self-trained LoRAs, but mostly I have been tinkering with the CLIP text encoders, aiming for good results at very low step counts.
The model is extremely resource-efficient and can generate very nice images, with a sweet spot at exactly 3 LCM steps, although such low step counts can occasionally produce some anatomical horrors. Switching to 20 or more LCM steps improves quality for complex prompts and is still very fast.
Like the previous version, it combines pretty well with LoRAs and embeddings from different flavors of Stable Diffusion XL, Pony and Illustrious.
The first 5 examples use no LoRA or embedding, just to show the capabilities of the model. The first 4 were generated with just 3 steps (the barbarian image was generated with 20 steps to avoid strangeness with the sword grip).
For resolutions above 1280x1280, you should use a hi-res fix, for example a first generation at 1024x1024 refined at 1536x1536.
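As a rough illustration of the hi-res fix advice above, here is a small Python sketch that plans the generation passes for a requested size. The function name plan_passes, the 8-pixel rounding and the rule of shrinking the longer side to 1024 are my own assumptions for the example, not something built into the model or any particular UI; only the 1280 limit and the 1024x1024 to 1536x1536 example come from the note above.

```python
def plan_passes(width, height, limit=1280, base_long_side=1024, multiple=8):
    """Return the list of (width, height) passes for a requested size.

    Hypothetical helper: one pass if the size is within the limit,
    otherwise a smaller first pass followed by a refine pass at full size.
    """
    longest = max(width, height)
    if longest <= limit:
        # Small enough to generate directly in a single pass.
        return [(width, height)]

    def snap(value):
        # Scale so the longer side becomes base_long_side, then round
        # to the nearest multiple (assumed here to be 8 pixels).
        scaled = value * base_long_side / longest
        return max(multiple, int(round(scaled / multiple)) * multiple)

    first_pass = (snap(width), snap(height))
    return [first_pass, (width, height)]


if __name__ == "__main__":
    for size in [(1024, 1024), (1536, 1536), (1920, 1080)]:
        print(size, "->", plan_passes(*size))
```

The first tuple is the size to generate at, and the second (when present) is the size to refine at with your UI's hi-res fix or an img2img pass.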
As before, beware of the NSFW capabilities of the model, and prompt carefully.
Please experiment, have fun, give thumbs up if you like it, and don't hesitate to give feedback!
OPTIMAL SETTINGS: This version is optimized for LCM sampling only.
I recommend:
CLIP Skip: 1, for maximum realism.
Ultrafast: LCM sampler / CFG 0.9-3 / 3 steps / noise shift 1-1.2
Improved quality: LCM sampler / CFG 0.9-3 / 20+ steps / noise shift 1-1.2
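If you script your generations, the recommended settings above can be written down as plain data and sanity-checked before a run. This is only a sketch: the names LcmPreset, PRESETS and check_settings are hypothetical and not part of any UI or library, and the values are copied from the recommendations above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LcmPreset:
    sampler: str
    steps: int                # recommended step count
    steps_is_minimum: bool    # True when the recommendation is "N or more"
    cfg_range: tuple          # (min, max) CFG scale
    noise_shift_range: tuple  # (min, max) noise shift
    clip_skip: int


# Values taken from the recommendations in this description.
PRESETS = {
    "ultrafast": LcmPreset("LCM", 3, False, (0.9, 3.0), (1.0, 1.2), 1),
    "improved_quality": LcmPreset("LCM", 20, True, (0.9, 3.0), (1.0, 1.2), 1),
}


def check_settings(preset_name, cfg, noise_shift, steps=None):
    """Return a list of warnings; an empty list means everything is in range."""
    preset = PRESETS[preset_name]
    steps = preset.steps if steps is None else steps
    warnings = []

    lo, hi = preset.cfg_range
    if not lo <= cfg <= hi:
        warnings.append(f"CFG {cfg} is outside the recommended {lo}-{hi}")

    lo, hi = preset.noise_shift_range
    if not lo <= noise_shift <= hi:
        warnings.append(f"noise shift {noise_shift} is outside {lo}-{hi}")

    if preset.steps_is_minimum and steps < preset.steps:
        warnings.append(f"{steps} steps is below the recommended {preset.steps}+")
    elif not preset.steps_is_minimum and steps != preset.steps:
        warnings.append(f"{steps} steps differs from the recommended {preset.steps}")

    return warnings


if __name__ == "__main__":
    print(check_settings("ultrafast", cfg=1.5, noise_shift=1.0))
    print(check_settings("improved_quality", cfg=7.0, noise_shift=1.1, steps=10))
```

The warnings are advisory only; nothing stops you from experimenting outside these ranges.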