HunyuanImage-2.1_fp8_e4m3fn

# HunyuanImage-2.1

### An Efficient Diffusion Model for High-Resolution (2K) Text-to-Image Generation

---

## Performance on RTX 5090

> With the quantized encoder and the quantized base model, HunyuanImage-2.1
> typically uses between 26 GB and 30 GB of VRAM on an NVIDIA RTX 5090, with an
> average inference time of about 16 seconds. Both figures vary with resolution,
> batch size, and prompt complexity.

⚠ Important Note:

The refiner is not yet implemented and is not ready for use in ComfyUI.

Currently, only the base and distilled models are supported.

---

### Workflow Notes

- Model: HunyuanImage-2.1

- Mode: Quantized Encoder + Quantized Base Model

- VRAM Usage: ~26–30 GB on RTX 5090

- Resolution Tested: 2K (2048×2048)

- Frameworks: ComfyUI & Diffusers

- Optimisations: Works with Patch Sage Attention + Lazycache / TeaCache ✅

- Refiner: ❌ Not implemented yet, not available in ComfyUI
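
Before loading the workflow, it can help to check whether a GPU has enough total memory for the upper end of the range listed above. The following is a minimal sketch, not part of this repository: the helper names (`parse_total_mib`, `gpus_with_headroom`, `query_total_mib`) and the 30 GiB threshold are assumptions for illustration. It treats the reported "30 GB" as 30 GiB and compares against total memory, not free memory, so other processes already using the GPU are not accounted for.

```python
import shutil
import subprocess

# Assumption: treat the upper end of the reported 26-30 GB range as 30 GiB.
REQUIRED_MIB = 30 * 1024


def parse_total_mib(smi_output: str) -> list[int]:
    """Parse one integer (MiB) per GPU from nvidia-smi CSV output."""
    return [int(line.strip()) for line in smi_output.splitlines() if line.strip()]


def gpus_with_headroom(totals_mib: list[int], required_mib: int = REQUIRED_MIB) -> list[int]:
    """Return the indices of GPUs whose total memory meets the requirement."""
    return [i for i, total in enumerate(totals_mib) if total >= required_mib]


def query_total_mib() -> list[int]:
    """Ask nvidia-smi for total memory per GPU; return [] if it is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return []
    return parse_total_mib(result.stdout)


if __name__ == "__main__":
    totals = query_total_mib()
    usable = gpus_with_headroom(totals)
    if usable:
        print(f"GPUs with at least {REQUIRED_MIB} MiB total memory: {usable}")
    else:
        print("No GPU meets the assumed 30 GiB threshold (or nvidia-smi was not found).")
```

A card that falls slightly below the threshold may still work at lower resolution or batch size, since the figures above were measured at 2K.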

---