image_z_image_SDXL_Refiner

This ComfyUI workflow generates or processes images with the ZImage Base model (Phase 1), refines them with an SDXL-based model (Phase 2), applies automatic face and skin enhancement, and finishes with two-stage upscaling.

Installation

Required Custom Nodes

Install via ComfyUI Manager:

ComfyUI-Easy-Use - https://github.com/yolain/ComfyUI-Easy-Use
ComfyUI-Levelpixel - https://github.com/levelpixel/ComfyUI-Levelpixel
ComfyUI-Impact-Pack - https://github.com/ltdrdata/ComfyUI-Impact-Pack
ComfyUI-Impact-Subpack - https://github.com/ltdrdata/ComfyUI-Impact-Subpack
ComfyUI KJ Nodes - https://github.com/kijai/ComfyUI-KJNodes
ComfyUI LoRA Manager - https://github.com/willmiao/ComfyUI-Lora-Manager
ComfyUI-Custom-Scripts (pythongosssss) - https://github.com/pythongosssss/ComfyUI-Custom-Scripts
ComfyUI Comfyroll Custom Nodes - https://github.com/Suzie1/ComfyUI_Comfyroll_CustomNodes
rgthree-comfy - https://github.com/rgthree/rgthree-comfy (provides the Fast Groups Bypasser node)

Required Model Files

Detection Models (ComfyUI/models/ultralytics/):

bbox/face_yolov8n_v2.pt - Face detection
segm/skin_yolov8n-seg_800.pt - Skin segmentation

SAM Model (ComfyUI/models/sams/):

sam_vit_b_01ec64.pth - Segment Anything Model

Checkpoints (ComfyUI/models/checkpoints/ or ComfyUI/models/unet/):

ZImage Base checkpoint (for Phase 1)
SDXL Refiner or any SDXL-based checkpoint (for Phase 2)

Upscale Models (ComfyUI/models/upscale_models/):

RealESRGAN_x4plus.pth or similar

The detection and SAM models download automatically on first use; they can also be obtained manually via https://github.com/ltdrdata/ComfyUI-Impact-Pack
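To confirm the files listed above are in place before the first run, a small check script can help. This is a minimal sketch: the relative paths come from the lists above, while the function name and the "ComfyUI" root folder are assumptions to adapt to your own installation. The checkpoints are left out because their file names are not specified here.

```python
from pathlib import Path

# Relative paths taken from the lists above. Swap the upscaler entry if you
# use a different model than RealESRGAN_x4plus.pth.
REQUIRED_FILES = [
    "models/ultralytics/bbox/face_yolov8n_v2.pt",
    "models/ultralytics/segm/skin_yolov8n-seg_800.pt",
    "models/sams/sam_vit_b_01ec64.pth",
    "models/upscale_models/RealESRGAN_x4plus.pth",
]


def missing_model_files(comfy_root):
    """Return the required files that are not present under comfy_root."""
    root = Path(comfy_root)
    return [rel for rel in REQUIRED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    # "ComfyUI" is a placeholder root; point it at your own installation.
    for rel in missing_model_files("ComfyUI"):
        print(f"missing: {rel}")
```

An empty result means all four files were found. Anything printed is a file to download or move.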

Workflow Structure

Input Selection

Uses CR Latent Input Switch to choose between:

Input 1: Uploaded image (LoadImage → VAEEncode)
Input 2: Empty latent for generation from scratch (default: 896x1152)

Phase 1: ZImage Base Generation

Processes selected input using ZImage Base checkpoint
Default settings: 50 steps, CFG 5, uni_pc_bh2 sampler, ddim_uniform scheduler
Supports LoRAs via first Lora Loader
Output goes to Phase 2

Phase 2: SDXL Refiner

Refines Phase 1 output using SDXL-based checkpoint
KSamplerAdvanced settings: 50 steps, CFG 1.9, start step 40
Sampler: dpmpp_3m_sde_gpu, Scheduler: beta57
Supports multiple LoRAs via "Phase 2 Lora Loader"
Trigger words managed by TriggerWord Toggle node
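The Phase 2 step settings are easier to reason about as arithmetic. With 50 total steps and a start step of 40, the refiner samples only the tail of the schedule. The sketch below assumes the end step is left at or beyond the total step count (the text does not state it); the function name is illustrative.

```python
def refiner_step_window(total_steps, start_at_step, end_at_step=None):
    """Return (steps_run, fraction_of_schedule) for a partial sampling pass."""
    # Assumption: an unset or oversized end step means "run to the last step".
    if end_at_step is None or end_at_step > total_steps:
        end_at_step = total_steps
    steps_run = max(0, end_at_step - start_at_step)
    return steps_run, steps_run / total_steps


steps_run, fraction = refiner_step_window(total_steps=50, start_at_step=40)
print(steps_run, fraction)  # 10 0.2
```

Lowering the start step gives the SDXL model more of the schedule and therefore more influence over the final image. Raising it keeps the result closer to the Phase 1 output.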

Detailing System

Automatic face detection: YOLOv8n v2 (bbox/face_yolov8n_v2.pt)
Skin segmentation: YOLOv8n-seg (segm/skin_yolov8n-seg_800.pt)
SAM model for precise mask generation
FaceDetailer settings: 25 steps, CFG 6, denoise 0.25, bbox_threshold 0.3

Upscaling

Two-stage progressive upscaling
Uses RealESRGAN or similar models
Each stage independently controlled via Fast Groups Bypasser
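To estimate output sizes before enabling both stages, the resolution after each pass can be computed ahead of time. This is a rough sketch: the 2x-per-stage factors are hypothetical, since the real factors depend on the upscale model you load and on any resize nodes in the workflow.

```python
def upscaled_sizes(width, height, stage_factors):
    """Return the image size after each enabled upscale stage."""
    sizes = []
    for factor in stage_factors:
        width, height = round(width * factor), round(height * factor)
        sizes.append((width, height))
    return sizes


# Hypothetical 2x factor per stage, starting from the default 896x1152 latent size.
print(upscaled_sizes(896, 1152, [2, 2]))  # [(1792, 2304), (3584, 4608)]
# Second stage bypassed: pass only one factor.
print(upscaled_sizes(896, 1152, [2]))  # [(1792, 2304)]
```

Pixel count grows with the square of the factor, which is why disabling one stage is an effective way to cut memory use and run time.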

Model Compatibility

Phase 1: Requires ZImage Base checkpoint

Phase 2: Accepts any SDXL-architecture checkpoint:

Official SDXL Refiner
SDXL base checkpoints
Pony-based models
Illustrious-based models
Other SDXL derivatives

Important: Different SDXL variants may require different sampler/scheduler settings. The workflow uses dpmpp_3m_sde_gpu with beta57 scheduler for Phase 2, and uni_pc_bh2 with ddim_uniform for Phase 1. For Pony or Illustrious models, you may need to adjust:

Scheduler (try karras, normal, simple)
Sampler (try euler_a, dpmpp_2m)
CFG scale and step counts

Usage

Basic Setup

Set output folder: Enter name in "Save Subdirectory Name"
Choose input: CR Latent Input Switch - 1 for uploaded image, 2 for generation
Load models: ZImage Base for Phase 1, SDXL model for Phase 2
Set prompts: Phase 1 prompts for generation, Phase 2 prompts for refinement
Configure LoRAs: Load in respective Lora Loader nodes, toggle trigger words

Fast Groups Bypasser

Control workflow sections:

Phase 1 - ZImage Base Generation: Main generation (keep enabled)
Phase 2 - SDXL Refiner: Refinement pass (keep enabled)
Model Unload, Clear Cache and VRAM: Enable if low VRAM (default: disabled)
Detailer Bridge: Prepares for face/skin enhancement (keep enabled)
Upscale 1: First upscale pass (disable to skip)
Upscale 2: Second upscale pass (disable to skip)
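The group toggles above can be modeled as a simple on/off table, which is handy for writing down presets. This is an illustrative sketch, not how the Fast Groups Bypasser is implemented: the group names are copied from the list, and the default state of the two upscale groups is an assumption because the text does not state it.

```python
# Group names copied from the list above. True for "Upscale 1" and
# "Upscale 2" is an assumption; the text gives no default for them.
DEFAULT_GROUPS = {
    "Phase 1 - ZImage Base Generation": True,
    "Phase 2 - SDXL Refiner": True,
    "Model Unload, Clear Cache and VRAM": False,
    "Detailer Bridge": True,
    "Upscale 1": True,
    "Upscale 2": True,
}


def active_groups(overrides=None):
    """Return the names of enabled groups, in the order listed above."""
    overrides = overrides or {}
    unknown = set(overrides) - set(DEFAULT_GROUPS)
    if unknown:
        raise KeyError(f"unknown group(s): {sorted(unknown)}")
    states = {**DEFAULT_GROUPS, **overrides}
    return [name for name, enabled in states.items() if enabled]


# Low-VRAM preset: unload models between phases and skip the second upscale.
low_vram = active_groups(
    {"Model Unload, Clear Cache and VRAM": True, "Upscale 2": False}
)
print(low_vram)
```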

Output Files

All images save to: ComfyUI/output/[subdirectory]/

Includes complete metadata: prompts, seeds, steps, CFG, models, LoRAs, all workflow settings.
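To read that metadata back without opening ComfyUI, the text chunks of a saved PNG can be parsed directly. This sketch assumes the save node writes PNG tEXt chunks the way ComfyUI's stock SaveImage node does (typically under keys such as "prompt" and "workflow"); if this workflow's save node stores metadata differently, the result will simply be empty. The function name is illustrative, and only the standard library is used.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_text_chunks(data):
    """Return {keyword: text} for every tEXt chunk found in PNG bytes."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    chunks = {}
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        # Each chunk: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if chunk_type == b"tEXt":
            # tEXt layout: keyword, a NUL separator, then Latin-1 text.
            keyword, _, text = body.partition(b"\x00")
            chunks[keyword.decode("latin-1")] = text.decode("latin-1")
        if chunk_type == b"IEND":
            break
        pos += 12 + length  # skip length, type, data, and CRC
    return chunks
```

To inspect a saved image, read its bytes with open(path, "rb").read() and pass them to read_png_text_chunks. The values are usually JSON strings, so json.loads can turn them into dictionaries. The CRC is skipped rather than verified, which keeps the sketch short.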

Default Settings

Phase 1 (ZImage Base):

Steps: 50
CFG: 5
Sampler: uni_pc_bh2
Scheduler: ddim_uniform

Phase 2 (SDXL Refiner):

Steps: 50
CFG: 1.9
Start step: 40
Sampler: dpmpp_3m_sde_gpu
Scheduler: beta57

FaceDetailer:

Steps: 25
CFG: 6
Denoise: 0.25
bbox_threshold: 0.3

Empty Latent: 896 x 1152

Troubleshooting

Out of VRAM errors:

Enable Model Unload/Clear Cache group via Fast Groups Bypasser
Disable one or both upscale stages
Lower step counts

Face detailer not activating:

Lower bbox_threshold (default is 0.3, try 0.2)
Ensure faces are clearly visible and adequately sized
Verify detector model files downloaded correctly
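Why lowering bbox_threshold helps can be shown with a toy filter. The confidence values below are hypothetical, and the exact comparison the detector node uses is an assumption; the point is only that detections scoring under the threshold are dropped before detailing.

```python
def faces_passing_threshold(confidences, bbox_threshold):
    """Keep detections whose confidence meets the threshold."""
    return [c for c in confidences if c >= bbox_threshold]


# Hypothetical detector confidences for three faces in one image.
confidences = [0.85, 0.27, 0.22]
print(len(faces_passing_threshold(confidences, 0.3)))  # 1
print(len(faces_passing_threshold(confidences, 0.2)))  # 3
```

In this example, small or partly hidden faces that score between 0.2 and 0.3 are detailed only once the threshold is lowered. The tradeoff is a higher chance of false detections.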

Using non-standard SDXL models (Pony, Illustrious):

Adjust Phase 2 sampler/scheduler settings
Common alternatives: euler_a sampler with karras scheduler
Test different CFG values
Check model card for recommended settings

Technical Details

Optimized for: RTX 4090 with 24GB VRAM

Processing flow:

Select input (uploaded image or empty latent)
Phase 1: ZImage Base generation/processing
Phase 2: SDXL refinement
Face and skin region detection
Targeted detail enhancement
Progressive dual upscaling
Save with complete metadata

Execution time: 40 seconds to 3 minutes, depending on hardware, settings, and enabled stages. With the current settings on an RTX 4090, a run takes about 40 seconds.

Model type	Workflow
Base model	ZImageBase
Published	2026-02-03

Zimage Base with SDXL Detailer and Refiner w/LoRA Manager
