GGUF: HyperFlux Unchained 16-Steps (Flux Unchained + ByteDance HyperSD LoRA)
Details
Download Files
Model description
Warning: This is the 16-step model. The faster 8-step model can be found HERE
[Note: Unzip the download to get the GGUF file. Civitai doesn't support the GGUF format natively, hence this workaround.]
A merge of FluxUnchained and the Hyper-SD LoRA from ByteDance, converted to GGUF. As a result, it can generate artistic NSFW images in 16 steps while consuming very little VRAM. The Q4_0 model consumes around 6.5 GB of VRAM and takes around 3 minutes to generate a 1024x1024 image at 16 steps on my 1080 Ti. [See https://github.com/lllyasviel/stable-diffusion-webui-forge/discussions/1050 to learn more about Forge UI's GGUF support and where to download the VAE, clip_l, and t5xxl models.]
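Beyond Forge UI and ComfyUI, recent versions of diffusers can also load GGUF checkpoints directly. Here is a minimal sketch, assuming a diffusers build with GGUF support; the local file name is illustrative, and the text encoders and VAE are pulled from the base Flux.1 Dev repo:

```python
import torch
from diffusers import FluxPipeline, FluxTransformer2DModel, GGUFQuantizationConfig

# Load the quantized transformer from the GGUF file (file name is illustrative).
transformer = FluxTransformer2DModel.from_single_file(
    "hyperflux-unchained-16steps-Q4_0.gguf",
    quantization_config=GGUFQuantizationConfig(compute_dtype=torch.bfloat16),
    torch_dtype=torch.bfloat16,
)

# VAE, clip_l and t5xxl come from the base Flux.1 Dev repo.
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()  # keep VRAM use low on smaller cards

image = pipe(
    "an artistic portrait, dramatic lighting",
    num_inference_steps=16,  # this merge is tuned for 16 steps
    height=1024,
    width=1024,
).images[0]
image.save("hyperflux_16step.png")
```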
You can also combine it with other LoRAs to get the effect you want.
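With the diffusers sketch above, for example, stacking an extra LoRA takes two calls; the repo id, adapter name, and weight below are placeholders:

```python
# Stack an additional style LoRA on top of the merged model
# (repo id, adapter name and weight are placeholders).
pipe.load_lora_weights("some-user/some-style-lora", adapter_name="style")
pipe.set_adapters(["style"], adapter_weights=[0.8])
```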
Advantages
Quality similar to the original FluxUnchained DEV model, while requiring only ~16 steps.
Generally better quality and expressivity than the 8-step HyperFlux Unchained.
For the same seed, the output is quite similar to the original DEV model's, so you can use this model for quick seed and prompt searches, then do the final generation with the dev model (see the sketch after this list).
Sometimes you might even get better results than the DEV model due to serendipity :).
Disadvantage: requires 16 steps
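To illustrate the seed-reuse workflow mentioned above, here is a sketch continuing the earlier diffusers example: `pipe` is the quantized 16-step pipeline, and `dev_pipe` is assumed to be a second FluxPipeline loaded from the original FluxUnchained DEV weights:

```python
import torch

prompt = "an artistic portrait, dramatic lighting"
seed = 1234

# Cheap exploration with the quantized 16-step model.
preview = pipe(
    prompt,
    num_inference_steps=16,
    generator=torch.Generator("cuda").manual_seed(seed),
).images[0]

# Once a seed/prompt combination looks good, re-render with the full
# DEV model (dev_pipe is assumed, loaded from the original checkpoint).
final = dev_pipe(
    prompt,
    num_inference_steps=30,  # a typical step count for the dev model
    generator=torch.Generator("cuda").manual_seed(seed),
).images[0]
```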
Which model should I download?
[Current situation: With the updated Forge UI and ComfyUI (GGUF node), I can run Q8_0 on my 11 GB 1080 Ti.]
Download the one that fits in your VRAM. The additional inference cost is quite small if the model fits in the GPU. Size order is Q4_0 < Q4_1 < Q5_0 < Q5_1 < Q8_0.
Q4_0 and Q4_1 should fit in 8 GB VRAM
Q5_0 and Q5_1 should fit in 11 GB VRAM
Q8_0 if you have more!
Note: With CPU offloading, you can run a model even if it doesn't fit in your VRAM, as sketched below.
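In the diffusers sketch above, this corresponds to the offloading helpers; the sequential variant is the slowest but needs the least VRAM:

```python
# Moderate offload: whole sub-models move between CPU and GPU as needed.
pipe.enable_model_cpu_offload()

# Aggressive offload: streams individual layers to the GPU; slowest option,
# but lets the model run even when it is far larger than your VRAM.
# pipe.enable_sequential_cpu_offload()
```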
All the license terms associated with Flux.1 Dev apply.
PS: Credit goes to ByteDance for the Hyper-SD Flux 16-step LoRA, which can be found at https://huggingface.co/ByteDance/Hyper-SD/tree/main
