FLUX.2 [klein] AIO

Details

Model Description

🚀 FLUX.2 [klein] 4B AIO | Sub-Second Image Generation

Ultra-Fast • 4-6 Steps • Text-to-Image + Image Editing • All-in-One • Apache 2.0


✨ What is FLUX.2 [klein] 4B AIO?

FLUX.2 [klein] 4B AIO is an All-in-One repackage of Black Forest Labs' newest compact image generation model. This version bundles the VAE, the Text Encoder (Qwen3) and the UNet in a single file – just load and go!

"Klein" means "small" in German – but this model is anything but limited. It handles Text-to-Image, Image Editing and Multi-Reference Generation, capabilities typically reserved for much larger models.


🔄 UPDATE

⚡ Flux2-klein-4B-AIO-NVFP4

Fast generation on Blackwell GPUs in just a few seconds, even at 4 steps, and it scales nicely with more steps 🚀

🕒 Performance

  • Prompt executed in 2.08 seconds

  • 4 / 4 steps completed

  • ~2.65 it/s

✅ Extremely fast
✅ Stable
✅ Great quality for a distilled setup
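
As a rough sanity check on the figures above, the reported iteration rate accounts for most of the wall-clock time. The sketch below only redoes that arithmetic. Attributing the remainder to text encoding, VAE decoding and other overhead is an assumption, not something the log states.

```python
# Figures copied from the run reported above.
total_seconds = 2.08    # "Prompt executed in 2.08 seconds"
steps = 4               # 4 / 4 steps
its_per_second = 2.65   # ~2.65 it/s

# Time spent in the sampling loop itself.
sampling_seconds = steps / its_per_second

# Assumption: whatever is left is non-sampling work
# (text encoding, VAE decode, I/O).
overhead_seconds = total_seconds - sampling_seconds

print(f"sampling: {sampling_seconds:.2f} s")
print(f"other:    {overhead_seconds:.2f} s")
```

With these inputs, roughly 1.5 of the 2.08 seconds is sampling, so adding steps should raise the total by about 0.38 s per step.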


🖼️ Example Generation

Prompt:

Anime, powerful anime illustration with vibrant dark fantasy colors, one adult woman inspired by Jouryuu, tall imposing presence, long hair flowing dramatically, intense anime eyes, wearing an ornate battle-inspired dress referencing official visuals, heavy fabric and strong silhouette, standing confidently in a ruined temple environment, low-angle camera enhancing dominance, dramatic backlighting with red and violet tones, strong shadows, intense cel-shading, bold anime lineart, intimidating yet elegant presence, correct anatomy, no text, no watermark.

💡 This setup is optimized for speed, making it ideal for quick iterations, testing ideas, or just having fun generating without long wait times.

Have fun generating, and as always, thanks for all the feedback and support 🙌✨


📦 Available Versions

🟡 FP8-AIO (~7.7 GB) – Recommended for most users

  • Precision: FP8

  • UNet: FP8

  • Text Encoder: FP8

  • VAE: BF16

  • Best for: Most users, quick tests, everyday use, low VRAM

🔵 FP16-AIO (~15 GB) – For older GPUs

  • Precision: FP16

  • UNet: FP16

  • Text Encoder: FP16

  • VAE: BF16

  • Best for: Older GPUs (GTX 10xx, RTX 20xx), broadest compatibility

🟢 BF16-AIO (~15 GB) – Maximum quality

  • Precision: BF16

  • UNet: BF16

  • Text Encoder: BF16

  • VAE: BF16

  • Best for: RTX 30xx/40xx/50xx, professional/commercial work

🔴 NVFP4-AIO (~5.7 GB) – Maximum speed ⚡

  • Precision: NVFP4

  • UNet: NVFP4

  • Text Encoder: FP4-mix

  • VAE: BF16

  • Best for: RTX 50xx (Blackwell) GPUs only; ultra-fast generation with low VRAM use
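
The "Best for" notes above can be condensed into a small lookup. This is only a sketch: `pick_variant` is a hypothetical helper, and matching on substrings of the GPU's marketing name is a simplification that will not cover every card.

```python
def pick_variant(gpu_name: str, prefer_quality: bool = False) -> str:
    """Return the AIO variant the list above suggests for a GPU name.

    Hypothetical helper; the substring matching is an assumption made
    for illustration, not part of the model or of ComfyUI.
    """
    name = gpu_name.upper()
    # RTX 50xx (Blackwell) is the only family listed for NVFP4.
    if "RTX 50" in name:
        return "BF16-AIO" if prefer_quality else "NVFP4-AIO"
    # RTX 30xx/40xx: BF16 for maximum quality, FP8 as the general default.
    if "RTX 30" in name or "RTX 40" in name:
        return "BF16-AIO" if prefer_quality else "FP8-AIO"
    # GTX 10xx and RTX 20xx are listed under FP16.
    if "GTX 10" in name or "RTX 20" in name:
        return "FP16-AIO"
    # Anything else: fall back to the variant recommended for most users.
    return "FP8-AIO"


print(pick_variant("NVIDIA GeForce RTX 4070"))
print(pick_variant("NVIDIA GeForce RTX 5090"))
```

If in doubt, FP8-AIO remains the default recommendation.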


🎯 Key Features

  • ⚡ 4-6 Step Generation – Sub-second inference on modern hardware

  • 📦 All-in-One – No separate VAE/Text Encoder download needed

  • 🎨 Unified Architecture – T2I, I2I Editing & Multi-Reference in one model

  • 📐 1024×1024 native – Optimized for this resolution

  • 💾 Low VRAM – Runs on consumer GPUs with ease

  • 📜 Apache 2.0 – Fully open for commercial use!

  • 🔧 LoRA-compatible – Base version ideal for fine-tuning


⚙️ Recommended Settings

  • Steps: 4-6 (step-distilled, more steps ≠ better)

  • CFG: 1.0 ⚠️ CRITICAL!

  • Sampler: euler

  • Scheduler: simple (or "normal")

  • Resolution: 1024×1024 (native)

⚠️ CRITICAL: CFG Must Be 1.0!

This is a distilled model optimized for CFG 1.0. Higher CFG values will produce worse results!

✅ CFG 1.0 = Correct
❌ CFG 3.5+ = Wrong, will look bad

Additional Notes

  • 4-6 steps are optimal! The model was step-distilled for fast inference

  • Negative prompts are optional – they work but are not required

  • Natural language prompts – Just describe what you want to see
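
The recommended settings above can be written down as a small check to run before queuing a generation. This is a minimal sketch, not part of ComfyUI: `check_settings` is a hypothetical helper, and the field names (`steps`, `cfg`, `sampler_name`, `scheduler`) are assumed to mirror the sampler node's inputs.

```python
# Recommended values taken from the settings list above.
RECOMMENDED = {
    "steps": (4, 6),                    # inclusive range
    "cfg": 1.0,
    "sampler_name": "euler",
    "scheduler": ("simple", "normal"),
}


def check_settings(steps, cfg, sampler_name, scheduler):
    """Return warnings for settings that differ from the recommendations."""
    warnings = []
    low, high = RECOMMENDED["steps"]
    if not low <= steps <= high:
        warnings.append(f"steps={steps}: recommended range is {low}-{high}")
    if cfg != RECOMMENDED["cfg"]:
        warnings.append(f"cfg={cfg}: the distilled model expects 1.0")
    if sampler_name != RECOMMENDED["sampler_name"]:
        warnings.append(f"sampler={sampler_name}: recommended is euler")
    if scheduler not in RECOMMENDED["scheduler"]:
        warnings.append(f"scheduler={scheduler}: recommended is simple or normal")
    return warnings


# A typical SD-style configuration triggers two warnings (steps and cfg).
for message in check_settings(steps=20, cfg=3.5, sampler_name="euler", scheduler="simple"):
    print(message)
```

An empty list means the settings match the recommendations.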


📥 Installation (ComfyUI)

Quick Start

  1. Download your preferred version (FP8/FP16/BF16/NVFP4)

  2. Place in ComfyUI/models/checkpoints/

  3. Load with the "Load Checkpoint" node

  4. Generate!

Folder Structure

ComfyUI/
└── models/
    └── checkpoints/
        └── flux-2-klein-4b-bf16-aio.safetensors  (or fp16/fp8)
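
To confirm the file landed in the right place, a short script can list what sits in the checkpoints folder. This is a sketch under the folder layout shown above: `find_checkpoints` is a hypothetical helper, and `"ComfyUI"` is a placeholder for your actual install path.

```python
from pathlib import Path


def find_checkpoints(comfy_root):
    """List .safetensors files in <comfy_root>/models/checkpoints, sorted by name."""
    folder = Path(comfy_root) / "models" / "checkpoints"
    if not folder.is_dir():
        # Folder missing: wrong root path or ComfyUI not installed there.
        return []
    return sorted(p.name for p in folder.glob("*.safetensors"))


if __name__ == "__main__":
    # Replace "ComfyUI" with the path to your own installation.
    print(find_checkpoints("ComfyUI"))
```

If the list is empty or does not contain the AIO file, the "Load Checkpoint" node will not offer it.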

🎨 Example Prompts

Photorealistic

A professional photograph of a barista making latte art in a cozy 
coffee shop, morning light streaming through windows, shallow depth 
of field, shot on Sony A7III

Digital Art

A majestic dragon perched on a crystal mountain peak, aurora borealis 
in the background, fantasy digital painting, highly detailed scales, 
dramatic lighting

Product Photography

Minimalist product photo of a luxury perfume bottle on white marble, 
studio lighting, reflection, commercial photography

💻 Capabilities

✅ What FLUX.2 [klein] 4B can do:

  • Text-to-Image (T2I) – High-quality image generation from text

  • Image-to-Image (I2I) – Single-reference editing

  • Multi-Reference – Multiple input images for controlled transformations

  • Text Rendering – Improved text rendering in images

  • Photorealistic – Professional photo quality

  • Artistic Styles – Diverse artistic styles

⚠️ Limitations:

  • Optimized for 1024×1024 (other resolutions possible but not optimal)

  • 4B model – less detail than larger models for complex scenes

  • Distilled version – less output diversity than base models


🔧 Technical Details

  • Parameters: 4 Billion

  • Architecture: Rectified Flow Transformer

  • Text Encoder: Qwen3-based

  • Inference Steps: 4-6 (step-distilled)

  • Native Resolution: 1024×1024

  • Precision: BF16 / FP16 / FP8 / NVFP4

  • License: Apache 2.0


🆚 Comparison: 4B vs 9B

FLUX.2 [klein] 4B

  • Parameters: 4B

  • VRAM: ~8-13 GB

  • GPU: RTX 3090/4070+

  • Quality: Very Good

  • License: Apache 2.0 ✅

  • Commercial Use: Yes!

FLUX.2 [klein] 9B

  • Parameters: 9B

  • VRAM: ~29 GB

  • GPU: RTX 4090+

  • Quality: Excellent

  • License: Non-Commercial ❌

  • Commercial Use: No

→ 4B is perfect for: Consumer hardware, commercial projects, fast iterations


โ“ FAQ

Q: Do I need separate VAE/Text Encoder files?

No! AIO = All-in-One. Everything is included in a single file.

Q: Can I use this for commercial projects?

Yes! The 4B version is licensed under Apache 2.0.

Q: Why only 4-6 steps?

The model was step-distilled. More steps won't improve quality.

Q: Why must CFG be 1.0?

This is a distilled model optimized for CFG 1.0. Higher values will degrade output quality.

Q: FP8 vs BF16 – What's the difference?

FP8 is smaller and faster; BF16 gives slightly better quality. For most applications, FP8 is sufficient.

Q: Does this work with LoRAs?

Yes! The Base (non-distilled) version in particular is ideal for LoRA training.

Q: How does this differ from the 9B version?

9B has better quality but is non-commercial only. 4B is Apache 2.0!


๐Ÿ› Troubleshooting

Images look "washed out" or oversaturated

  • Check CFG – it must be 1.0 for the distilled model!

  • Use 4-6 steps

Poor text rendering

  • Be more specific in your prompt

  • Use simple, short text

  • Place text requirements at the beginning of the prompt

Colors look off

  • Try the BF16 version instead of FP8

  • Ensure your monitor is properly calibrated


🙏 Credits

Original Model: Black Forest Labs
Architecture: Rectified Flow Transformer
Text Encoder: Qwen3
AIO Repackage: SeeSee21

Official Links:


📋 Changelog

v1.1 (January 2026) ⚡

  • 🆕 Added NVFP4 AIO variant (RTX 50xx / Blackwell)

  • ⚡ Ultra-fast inference (optimized for 4 steps)

  • 🧠 Extremely low VRAM usage

  • 🎯 Designed for maximum speed while keeping good image quality

v1.0 (January 2026)

  • Initial Release

  • BF16, FP16 and FP8 versions

  • All-in-One with VAE + Text Encoder + UNet


License: Apache 2.0 – Free for personal AND commercial use! 🎉


The fastest open-source image generation model for ComfyUI! ⚡

Download and start creating! 🚀

ใ“ใฎใƒขใƒ‡ใƒซใง็”Ÿๆˆใ•ใ‚ŒใŸ็”ปๅƒ