HunyuanImage-2.1_fp8_e4m3fn

# HunyuanImage-2.1

### An Efficient Diffusion Model for High-Resolution (2K) Text-to-Image Generation

---

## Performance on RTX 5090

> When using HunyuanImage-2.1 with the quantized encoder + quantized base model,
> VRAM usage on an NVIDIA RTX 5090 typically ranges between 26 GB and 30 GB, with an average
> inference time of 16 seconds, depending on resolution, batch size, and prompt complexity.
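As a rough way to relate the reported range to other cards, the sketch below compares a GPU's total memory against the 26–30 GB figure quoted above. The function names are illustrative and not part of any library, and the range was measured for this configuration on an RTX 5090 (a 32 GB card), so treat the result as a first guess rather than a guarantee: usable memory is normally a little below the nominal capacity, and offloading or different settings can change the picture.

```python
# Reported VRAM range (GB) for the quantized encoder + quantized base model
# on an RTX 5090, taken from the figures above.
REPORTED_VRAM_GB = (26.0, 30.0)


def vram_headroom_gb(total_gb: float, reported=REPORTED_VRAM_GB) -> tuple[float, float]:
    """Return (worst_case, best_case) headroom in GB for a card with total_gb of VRAM."""
    low, high = reported
    return (total_gb - high, total_gb - low)


def describe_fit(total_gb: float) -> str:
    """Classify a card as 'fits', 'borderline', or 'exceeds' against the reported range."""
    worst, best = vram_headroom_gb(total_gb)
    if worst >= 0:
        return "fits"        # even the upper end of the range is covered
    if best >= 0:
        return "borderline"  # only the lower end of the range is covered
    return "exceeds"         # the lower end already exceeds total memory


if __name__ == "__main__":
    for gb in (24, 28, 32):
        print(f"{gb} GB card: {describe_fit(gb)}")
```

If PyTorch is installed, the total memory of the first GPU can be read with `torch.cuda.get_device_properties(0).total_memory` (in bytes) and divided by `1024**3` before being passed to `describe_fit`.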

Important Note:

The refiner is not yet implemented and is not ready for use in ComfyUI.

Currently, only the base and distilled models are supported.

[Example_Workflow](https://huggingface.co/drbaph/HunyuanImage-2.1_fp8/resolve/main/example_workflow.json?download=true)
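Before loading the workflow, it can help to see which node types it expects, since missing custom nodes are a common cause of load errors. The sketch below counts node types in a workflow file saved locally from the link above. It assumes one of the two layouts ComfyUI commonly exports (a UI export with a top-level `nodes` list, or an API export keyed by node id with `class_type` entries); which of these the example file uses has not been checked here, and the helper names are illustrative.

```python
import json
import sys
from collections import Counter


def count_node_types(workflow: dict) -> Counter:
    """Count node types in a ComfyUI workflow dict.

    Assumed layouts:
    - UI export:  {"nodes": [{"type": "..."}, ...], ...}
    - API export: {"<node id>": {"class_type": "...", "inputs": {...}}, ...}
    """
    nodes = workflow.get("nodes")
    if isinstance(nodes, list):
        # UI export: each node is a dict with a "type" field.
        return Counter(
            node.get("type", "<unknown>") for node in nodes if isinstance(node, dict)
        )
    # API export: top-level values are node dicts with a "class_type" field.
    return Counter(
        node["class_type"]
        for node in workflow.values()
        if isinstance(node, dict) and "class_type" in node
    )


def load_workflow(path: str) -> dict:
    """Read a workflow JSON file from disk."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python list_nodes.py example_workflow.json
    for node_type, count in count_node_types(load_workflow(sys.argv[1])).most_common():
        print(f"{count:3d}  {node_type}")
```

Any node type in the output that the local ComfyUI install does not recognise points to a custom node pack that needs installing first.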

---

### Workflow Notes

- Model: HunyuanImage-2.1

- Mode: Quantized Encoder + Quantized Base Model

- VRAM Usage: ~26–30 GB on RTX 5090

- Resolution Tested: 2K (2048×2048)

- Frameworks: ComfyUI & Diffusers

- Optimisations: Works with Patch Sage Attention + Lazycache / TeaCache ✅

- Refiner: ❌ Not implemented yet, not available in ComfyUI

- License: [tencent-hunyuan-community](https://github.com/Tencent-Hunyuan/HunyuanImage-2.1/blob/master/LICENSE)

---
