Read Description

Note: Clarity XL is currently in BETA.

Fine-Tuning is still in progress.

Like photorealism? Check out my latest SDXL fine-tune: NatViS.

Changelog

8/26/24 ClarityXL v2.0 Lightning 8step

Released an 8-step Lightning version of ClarityXL v2.0, by request. Make sure you read About this version for more info.
- Note: Use a lower CFG (1.5 - 2.5) if colors look washed out. I made the mistake of setting it too high in the sample images.
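
A rough sketch of the settings from the note above, assuming a diffusers-style pipeline whose call accepts `num_inference_steps` and `guidance_scale`. The helper name `lightning_settings`, the default of 2.0, and the clamping behavior are illustrative assumptions, not part of the model:

```python
# Hypothetical helper: builds sampler settings for the 8-step Lightning version.
# The 1.5 - 2.5 CFG range comes from the changelog note; clamping to that
# range is an assumption made for this sketch.

LIGHTNING_STEPS = 8
CFG_MIN, CFG_MAX = 1.5, 2.5


def lightning_settings(cfg: float = 2.0) -> dict:
    """Return call kwargs, clamping CFG into the suggested 1.5 - 2.5 range."""
    clamped = max(CFG_MIN, min(CFG_MAX, cfg))
    return {"num_inference_steps": LIGHTNING_STEPS, "guidance_scale": clamped}


if __name__ == "__main__":
    # A high CFG such as 7.0 is pulled down to the top of the range.
    print(lightning_settings(7.0))
```

The returned dict could be unpacked into a pipeline call, for example `pipe(prompt, **lightning_settings())`, where `pipe` is whatever SDXL pipeline object you have loaded.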

————

8/12/24 ClarityXL v2.0

Released v2.0 of ClarityXL. Read About this version to see what's new.

Buy me a coffee ❤

https://ko-fi.com/ndimensional

I’ve never been a fan of e-begging, but SDXL fine-tunes at this scale are becoming expensive to train. So I will begrudgingly ask: if you like what I do and would like to support my models, consider donating on Ko-Fi 💗
I will begin posting updates, answering questions, taking feedback, and releasing early access (NOT EXCLUSIVE) models to supporters.

All donations will be used to fund the creation of new Stable Diffusion fine-tunes and open-source AI tools.

About

Continuing from the original Clarity model for SD1.5, Clarity XL is an attempt to recreate and expand the original model's capabilities within the more complex architecture of SDXL.

Differences between Clarity SD1.5 and Clarity XL

Currently, Clarity XL focuses purely on photorealism. This is intentional: it builds the foundation that future releases will expand upon. That's not to say Clarity XL will ever be a general-purpose model. It will always have a bias towards photorealism. Future releases will add more complex photorealistic/cinematic scene capabilities.

Improvements

Emphasis on authentic (non touched-up) photorealism.
Higher Image Fidelity.
Prompt Adherence: How well the model follows your prompt.
- Excluding concepts that the model was not trained on
Improved skin textures
Overall improvements to aesthetics.
Video Game / Movie character recognition.
- Including world spaces, landscapes, settings, etc.
Prompt how you want: Accepts natural language prompts, comma-delimited lists, or a hybrid of the two. In addition, prompts can be as short or as long as you'd like.

Limitations

Complex scenes, such as firing lightning bolts from a hand, erupting in a cacophony of bright blue sparkling arcs.
Multi-medium generation: The model is currently grounded in photorealism and cinematography.

Model Details

Base Model: Stable Diffusion XL v1.0
- Since Clarity XL v1 is a mid-training epoch, I merged the epoch with an unreleased fine-tune update for LomoXL. I used a modified version of the DARE merging method to preserve the original weight matrix of the base epoch. This will not be needed in later releases.
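
For readers unfamiliar with DARE (drop and rescale), here is a minimal sketch of the plain, unmodified method on toy data. It is not the modified variant used for Clarity XL, whose details are not described here. The function name `dare_merge`, the flat-list weight representation, and the `drop_rate` and `seed` parameters are all assumptions for illustration:

```python
import random


def dare_merge(base, tuned_a, tuned_b, drop_rate=0.5, seed=0):
    """Merge two fine-tunes of the same base with plain DARE.

    Each argument is a flat list of floats of equal length, standing in for
    one weight tensor. This is the textbook method, not the author's
    modified variant.
    """
    if not 0.0 <= drop_rate < 1.0:
        raise ValueError("drop_rate must be in [0, 1)")
    rng = random.Random(seed)
    scale = 1.0 / (1.0 - drop_rate)
    merged = []
    for w0, wa, wb in zip(base, tuned_a, tuned_b):
        total_delta = 0.0
        for w in (wa, wb):
            delta = w - w0                    # what this fine-tune changed
            if rng.random() >= drop_rate:     # keep with probability 1 - drop_rate
                total_delta += delta * scale  # rescale so the expected delta is unchanged
        merged.append(w0 + total_delta)
    return merged
```

With `drop_rate=0.0` nothing is dropped and the result is simply the base plus both deltas. Higher drop rates sparsify each delta before adding it back, which is the part of the method intended to reduce interference between the merged models.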
Data: Quality was a priority when creating the dataset. All image-caption pairs were cleansed through multiple iterations to ensure only high quality data was used for tuning.
- Captions: Captions were written by my MLLM captioning system, verified via GroundingDINO + a Reasoning Engine + NLP
  - Captions were written in a natural language format, though SDXL's text encoders make it possible to write prompts in multiple prompting styles.
VAE: sdxl-vae-fp16-fix
Aspect Ratio: Based on the training data, any of the typical SDXL aspect ratios will work.
- 1344x768 (16:9) — Cinematic Film Stills
- 1536x640 (21:9) — Ultrawide Cinematic Film Stills
- 1152x896 (4:3) — Fullscreen
- 1216x832 (3:2) — Mobile landscape
- 1024x1024 (1:1) — Square
- 1024x704 (16:11)
- 768x1344 (9:16) — Tall (Instagram stories / snapchat)
- 896x1152 (3:4)
- 832x1216 (2:3) — Mobile Portrait
- 704x1024 (11:16)
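
If you start from an arbitrary target size, a small helper can pick the nearest resolution from the list above. This is only a sketch: the name `closest_resolution` and the choice to compare aspect ratios on a log scale are assumptions, not something the model requires.

```python
import math

# Resolutions from the list above, as (width, height).
RESOLUTIONS = [
    (1344, 768), (1536, 640), (1152, 896), (1216, 832), (1024, 1024),
    (1024, 704), (768, 1344), (896, 1152), (832, 1216), (704, 1024),
]


def closest_resolution(width: int, height: int) -> tuple:
    """Return the listed resolution whose aspect ratio is nearest to width/height."""
    # Comparing log ratios treats landscape and portrait symmetrically.
    target = math.log(width / height)
    return min(RESOLUTIONS, key=lambda wh: abs(math.log(wh[0] / wh[1]) - target))


if __name__ == "__main__":
    print(closest_resolution(1920, 1080))  # a 16:9 desktop wallpaper size
```

For example, a 1920x1080 target maps to 1344x768, and the generated image can then be upscaled to the final size.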