FLUX.2 [klein] AIO

μ„ΈλΆ€ 정보

λͺ¨λΈ μ„€λͺ…

πŸš€ FLUX.2 [klein] 4B AIO | Sub-Second Image Generation

Ultra-Fast β€’ 4-6 Steps β€’ Text-to-Image + Image Editing β€’ All-in-One β€’ Apache 2.0


✨ What is FLUX.2 [klein] 4B AIO?

FLUX.2 [klein] 4B AIO is an All-in-One repackage of Black Forest Labs' newest compact image generation model. This version includes VAE, Text Encoder (Qwen3) and UNet in a single file – just load and go!

"Klein" means "small" in German – but this model is anything but limited. It delivers exceptional performance in Text-to-Image, Image Editing and Multi-Reference Generation, typically reserved for much larger models.


πŸ”„ UPDATE

⚑ Flux2-klein-4B-AIO-NVFP4

Fast generation with blackwell in just a few seconds β€” even at 4 steps, and scales nicely with more steps πŸš€

πŸ•’ Performance

  • Prompt executed in 2.08 seconds

  • β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ 4 / 4 steps

  • ~2.65 it/s

βœ… Extremely fast
βœ… Stable
βœ… Great quality for a distilled setup


πŸ–ΌοΈ Example Generation

Prompt:

Anime, powerful anime illustration with vibrant dark fantasy colors, one adult woman inspired by Jouryuu, tall imposing presence, long hair flowing dramatically, intense anime eyes, wearing an ornate battle-inspired dress referencing official visuals, heavy fabric and strong silhouette, standing confidently in a ruined temple environment, low-angle camera enhancing dominance, dramatic backlighting with red and violet tones, strong shadows, intense cel-shading, bold anime lineart, intimidating yet elegant presence, correct anatomy, no text, no watermark.

πŸ’‘ This setup is optimized for speed, making it ideal for quick iterations, testing ideas, or just having fun generating without long wait times.

Have fun generating β€” and as always, thanks for all the feedback and support πŸ™Œβœ¨


πŸ“¦ Available Versions

🟑 FP8-AIO (~7.7 GB) – Recommended for most users

  • Precision: FP8

  • UNet: FP8

  • Text Encoder: FP8

  • VAE: BF16

  • Best for: Most users, quick tests, everyday use, lowest VRAM

πŸ”΅ FP16-AIO (~15 GB) – For older GPUs

  • Precision: FP16

  • UNet: FP16

  • Text Encoder: FP16

  • VAE: BF16

  • Best for: Older GPUs (GTX 10xx, RTX 20xx), broadest compatibility

🟒 BF16-AIO (~15 GB) – Maximum quality

  • Precision: BF16

  • UNet: BF16

  • Text Encoder: BF16

  • VAE: BF16

  • Best for: RTX 30xx/40xx/50xx, professional/commercial work

πŸ”΄ NVFP4-AIO (~5.7 GB) – Maximum speed ⚑

  • Precision: NVFP4

  • UNet: NVFP4

  • Text Encoder: FP4-mix

  • VAE: BF16

  • Best for: RTX 50xx only, ultra-fast generations, Blackwell GPUs, low VRAM + extreme performance


🎯 Key Features

  • ⚑ 4-6 Step Generation – Sub-second inference on modern hardware

  • πŸ“¦ All-in-One – No separate VAE/Text Encoder download needed

  • 🎨 Unified Architecture – T2I, I2I Editing & Multi-Reference in one model

  • πŸ“ 1024Γ—1024 native – Optimized for this resolution

  • πŸ’Ύ Low VRAM – Runs on consumer GPUs with ease

  • πŸ“œ Apache 2.0 – Fully open for commercial use!

  • πŸ”§ LoRA-compatible – Base version ideal for fine-tuning


βš™οΈ Recommended Settings

  • Steps: 4-6 (step-distilled, more steps β‰  better)

  • CFG: 1.0 ⚠️ CRITICAL!

  • Sampler: euler

  • Scheduler: simple (or "normal")

  • Resolution: 1024Γ—1024 (native)

⚠️ CRITICAL: CFG Must Be 1.0!

This is a distilled model optimized for CFG 1.0. Higher CFG values will produce worse results!

βœ… CFG 1.0 = Correct
❌ CFG 3.5+ = Wrong, will look bad

Additional Notes

  • 4-6 Steps are optimal! The model was step-distilled for fast inference

  • No negative prompts needed – works but not required

  • Natural language prompts – Just describe what you want to see


πŸ“₯ Installation (ComfyUI)

Quick Start

  1. Download your preferred version (FP8/FP16/BF16)

  2. Place in ComfyUI/models/checkpoints/

  3. Load with the "Load Checkpoint" node

  4. Generate!

Folder Structure

ComfyUI/
└── models/
    └── checkpoints/
        └── flux-2-klein-4b-bf16-aio.safetensors  (or fp16/fp8)

🎨 Example Prompts

Photorealistic

A professional photograph of a barista making latte art in a cozy 
coffee shop, morning light streaming through windows, shallow depth 
of field, shot on Sony A7III

Digital Art

A majestic dragon perched on a crystal mountain peak, aurora borealis 
in the background, fantasy digital painting, highly detailed scales, 
dramatic lighting

Product Photography

Minimalist product photo of a luxury perfume bottle on white marble, 
studio lighting, reflection, commercial photography

πŸ’» Capabilities

βœ… What FLUX.2 [klein] 4B can do:

  • Text-to-Image (T2I) – High-quality image generation from text

  • Image-to-Image (I2I) – Single-reference editing

  • Multi-Reference – Multiple input images for controlled transformations

  • Text Rendering – Improved text rendering in images

  • Photorealistic – Professional photo quality

  • Artistic Styles – Diverse artistic styles

⚠️ Limitations:

  • Optimized for 1024Γ—1024 (other resolutions possible but not optimal)

  • 4B model – less detail than larger models for complex scenes

  • Distilled version – less output diversity than base models


πŸ”§ Technical Details

  • Parameters: 4 Billion

  • Architecture: Rectified Flow Transformer

  • Text Encoder: Qwen3-based

  • Inference Steps: 4-6 (step-distilled)

  • Native Resolution: 1024Γ—1024

  • Precision: BF16 / FP16 / FP8

  • License: Apache 2.0


πŸ†š Comparison: 4B vs 9B

FLUX.2 [klein] 4B

  • Parameters: 4B

  • VRAM: ~8-13 GB

  • GPU: RTX 3090/4070+

  • Quality: Very Good

  • License: Apache 2.0 βœ…

  • Commercial Use: Yes!

FLUX.2 [klein] 9B

  • Parameters: 9B

  • VRAM: ~29 GB

  • GPU: RTX 4090+

  • Quality: Excellent

  • License: Non-Commercial ❌

  • Commercial Use: No

β†’ 4B is perfect for: Consumer hardware, commercial projects, fast iterations


❓ FAQ

Q: Do I need separate VAE/Text Encoder files?

No! AIO = All-in-One. Everything is included in a single file.

Q: Can I use this for commercial projects?

Yes! The 4B version is licensed under Apache 2.0.

Q: Why only 4-6 steps?

The model was step-distilled. More steps won't improve quality.

Q: Why must CFG be 1.0?

This is a distilled model optimized for CFG 1.0. Higher values will degrade output quality.

Q: FP8 vs BF16 – What's the difference?

FP8 is smaller and faster, BF16 has slightly better quality. For most applications FP8 is sufficient.

Q: Does this work with LoRAs?

Yes! Especially the Base version (non-distilled) is ideal for LoRA training.

Q: What's the difference to the 9B version?

9B has better quality but is non-commercial only. 4B is Apache 2.0!


πŸ› Troubleshooting

Images look "washed out" or oversaturated

  • Check CFG – must be 1.0 for distilled model!

  • Use 4-6 steps

Poor text rendering

  • Be more specific in your prompt

  • Use simple, short text

  • Place text requirements at the beginning of the prompt

Colors look off

  • Try BF16 version instead of FP8

  • Ensure your monitor is properly calibrated


πŸ™ Credits

Original Model: Black Forest Labs Architecture: Rectified Flow Transformer Text Encoder: Qwen3 AIO Repackage: SeeSee21

Official Links:


πŸ“‹ Changelog

v1.1 – January 2026 ⚑

  • πŸ†• Added NVFP4 AIO variant (RTX 50xx / Blackwell)

  • ⚑ Ultra-fast inference (optimized for 4 steps)

  • 🧠 Extremely low VRAM usage

  • 🎯 Designed for maximum speed while keeping good image quality

v1.0 (January 2026)

  • Initial Release

  • BF16, FP16 and FP8 versions

  • All-in-One with VAE + Text Encoder + UNet


License: Apache 2.0 – Free for personal AND commercial use! πŸŽ‰


The fastest open-source image generation model for ComfyUI! ⚑

Download and start creating! πŸš€

이 λͺ¨λΈλ‘œ λ§Œλ“  이미지