ZImage Base with SDXL Detailer and Refiner w/LoRA Manager

image_z_image_SDXL_Refiner

This ComfyUI workflow generates or processes images with the ZImage Base model (Phase 1), refines them with an SDXL-based model (Phase 2), applies automatic face and skin enhancement, and finishes with two-stage upscaling.

Installation

Required Custom Nodes

Install via ComfyUI Manager:

  1. ComfyUI-Easy-Use - https://github.com/yolain/ComfyUI-Easy-Use

  2. ComfyUI-Levelpixel - https://github.com/levelpixel/ComfyUI-Levelpixel

  3. ComfyUI-Impact-Pack - https://github.com/ltdrdata/ComfyUI-Impact-Pack

  4. ComfyUI-Impact-Subpack - https://github.com/ltdrdata/ComfyUI-Impact-Subpack

  5. ComfyUI KJ Nodes - https://github.com/kijai/ComfyUI-KJNodes

  6. ComfyUI LoRA Manager - https://github.com/cubiq/ComfyUI_Lora_Manager

  7. ComfyUI-pysssss - https://github.com/pythongosssss/ComfyUI-Custom-Scripts

  8. ComfyUI Comfyroll Custom Nodes - https://github.com/Suzie1/ComfyUI_Comfyroll_CustomNodes

Required Model Files

Detection Models (ComfyUI/models/ultralytics/):

  • bbox/face_yolov8n_v2.pt - Face detection

  • segm/skin_yolov8n-seg_800.pt - Skin segmentation

SAM Model (ComfyUI/models/sams/):

  • sam_vit_b_01ec64.pth - Segment Anything Model

Checkpoints (ComfyUI/models/checkpoints/ or ComfyUI/models/unet/):

  • ZImage Base checkpoint (for Phase 1)

  • SDXL Refiner or any SDXL-based checkpoint (for Phase 2)

Upscale Models (ComfyUI/models/upscale_models/):

  • RealESRGAN_x4plus.pth or similar

The detection models and the SAM model download automatically on first use. They can also be obtained manually from: https://github.com/ltdrdata/ComfyUI-Impact-Pack
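
Before loading the workflow, it can help to confirm that the files listed above are in place. The following is a minimal sketch, assuming the models folder is at ComfyUI/models relative to the current directory. The function name find_missing_models is hypothetical, and checkpoints are left out because their filenames depend on which models you choose.

```python
from pathlib import Path

# Relative paths taken from the "Required Model Files" list.
REQUIRED_FILES = [
    "ultralytics/bbox/face_yolov8n_v2.pt",
    "ultralytics/segm/skin_yolov8n-seg_800.pt",
    "sams/sam_vit_b_01ec64.pth",
    "upscale_models/RealESRGAN_x4plus.pth",
]


def find_missing_models(models_dir):
    """Return the required files that do not exist under models_dir."""
    root = Path(models_dir)
    return [rel for rel in REQUIRED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    # "ComfyUI/models" is an assumed install location; adjust it to your setup.
    for rel in find_missing_models("ComfyUI/models"):
        print(f"missing: {rel}")
```

An empty result means all four files were found. If you use a different upscale model, edit the last entry accordingly.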

Workflow Structure

Input Selection

Uses CR Latent Input Switch to choose between:

  • Input 1: Uploaded image (LoadImage → VAEEncode)

  • Input 2: Empty latent for generation from scratch (default: 896x1152)
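
To relate the default empty latent size to what the sampler actually works on, here is a small sketch. The downscale factor of 8 is an assumption (the spatial reduction commonly used by SDXL-family VAEs), not a value read from the workflow, and latent_size is a hypothetical helper.

```python
def latent_size(width, height, downscale=8):
    """Return (latent_width, latent_height) for a pixel resolution.

    downscale=8 is an assumption: the spatial reduction factor commonly
    used by SDXL-family VAEs.
    """
    if width % downscale or height % downscale:
        raise ValueError("width and height must be multiples of the downscale factor")
    return width // downscale, height // downscale


print(latent_size(896, 1152))  # (112, 144)
print(896 * 1152)              # 1032192 pixels, roughly 1.03 megapixels
```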

Phase 1: ZImage Base Generation

  • Processes selected input using ZImage Base checkpoint

  • Default settings: 50 steps, CFG 5, uni_pc_bh2 sampler, ddim_uniform scheduler

  • Supports LoRAs via first Lora Loader

  • Output goes to Phase 2

Phase 2: SDXL Refiner

  • Refines Phase 1 output using SDXL-based checkpoint

  • KSamplerAdvanced settings: 50 steps, CFG 1.9, start step 40

  • Sampler: dpmpp_3m_sde_gpu, Scheduler: beta57

  • Supports multiple LoRAs via "Phase 2 Lora Loader"

  • Trigger words managed by TriggerWord Toggle node
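
The Phase 2 numbers above imply that the refiner only handles the tail of the schedule. The sketch below works through that arithmetic, assuming the KSamplerAdvanced end step is left at or above the total step count (the workflow description does not state it). The helper refiner_steps is hypothetical.

```python
def refiner_steps(total_steps, start_at_step, end_at_step=None):
    """Return (steps_run, fraction_of_schedule) for a partial sampling pass."""
    # Assumption: when no end step is given, sampling runs to the last step.
    end = total_steps if end_at_step is None else min(end_at_step, total_steps)
    steps_run = max(0, end - start_at_step)
    return steps_run, steps_run / total_steps


steps_run, fraction = refiner_steps(total_steps=50, start_at_step=40)
print(steps_run, fraction)  # 10 0.2
```

Under that assumption, Phase 2 runs 10 of the 50 scheduled steps. Lowering the start step gives the SDXL model more influence over the result, while raising it keeps the output closer to Phase 1.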

Detailing System

  • Automatic face detection: YOLOv8n v2 (bbox/face_yolov8n_v2.pt)

  • Skin segmentation: YOLOv8n-seg (segm/skin_yolov8n-seg_800.pt)

  • SAM model for precise mask generation

  • FaceDetailer settings: 25 steps, CFG 6, denoise 0.25, bbox_threshold 0.3
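
The bbox_threshold setting acts as a confidence cutoff on detections. This is a conceptual sketch only: the detection data is invented for illustration, and the exact comparison used inside the detailer node is an assumption.

```python
def filter_detections(detections, bbox_threshold=0.3):
    """Keep detections whose confidence is at or above the threshold."""
    return [d for d in detections if d["confidence"] >= bbox_threshold]


# Hypothetical detector output, invented for illustration.
detections = [
    {"label": "face", "confidence": 0.82},
    {"label": "face", "confidence": 0.27},
    {"label": "face", "confidence": 0.12},
]

print(len(filter_detections(detections, 0.3)))  # 1
print(len(filter_detections(detections, 0.2)))  # 2
```

A small or partially hidden face tends to score lower, which is why lowering the threshold can bring it into the detailing pass, at the cost of more false positives.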

Upscaling

  • Two-stage progressive upscaling

  • Uses RealESRGAN or similar models

  • Each stage independently controlled via Fast Groups Bypasser
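
To estimate the final resolution with one or both stages enabled, here is a rough sketch. The per-stage factor of 2 is hypothetical, since the actual scale of each stage is set inside the workflow's nodes. The helper upscaled_size is also hypothetical.

```python
def upscaled_size(width, height, stages):
    """Apply each enabled (factor, enabled) stage in order and return the final size."""
    for factor, enabled in stages:
        if enabled:
            width, height = round(width * factor), round(height * factor)
    return width, height


# Factors are hypothetical; check the upscale nodes for the real values.
print(upscaled_size(896, 1152, [(2, True), (2, True)]))   # (3584, 4608)
print(upscaled_size(896, 1152, [(2, True), (2, False)]))  # (1792, 2304)
```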

Model Compatibility

Phase 1: Requires ZImage Base checkpoint

Phase 2: Accepts any SDXL-architecture checkpoint:

  • Official SDXL Refiner

  • SDXL base checkpoints

  • Pony-based models

  • Illustrious-based models

  • Other SDXL derivatives

Important: Different SDXL variants may require different sampler/scheduler settings. By default, the workflow uses dpmpp_3m_sde_gpu with the beta57 scheduler for Phase 2, and uni_pc_bh2 with ddim_uniform for Phase 1. For Pony or Illustrious models, you may need to adjust:

  • Scheduler (try karras, normal, simple)

  • Sampler (try euler_a, listed as euler_ancestral in ComfyUI, or dpmpp_2m)

  • CFG scale and step counts
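
One way to keep track of per-model adjustments is a small table of overrides. The sketch below is illustrative: the default entry repeats the documented Phase 2 values, while the alternative entry is just one combination of the suggestions above and not a tested recommendation. The preset names and the phase2_settings helper are hypothetical.

```python
# "default" mirrors the documented Phase 2 settings. The alternative is a
# starting point only; euler_ancestral is ComfyUI's name for "euler_a".
PHASE2_PRESETS = {
    "default": {
        "sampler": "dpmpp_3m_sde_gpu",
        "scheduler": "beta57",
        "cfg": 1.9,
        "steps": 50,
    },
    "pony_or_illustrious_start": {
        "sampler": "euler_ancestral",
        "scheduler": "karras",
    },
}


def phase2_settings(preset="default"):
    """Return the default settings with the chosen preset's overrides applied."""
    settings = dict(PHASE2_PRESETS["default"])
    settings.update(PHASE2_PRESETS[preset])
    return settings


print(phase2_settings("pony_or_illustrious_start"))
```

CFG and step count are left at the defaults in the alternative preset on purpose; tune them per the model card.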

Usage

Basic Setup

  1. Set output folder: Enter name in "Save Subdirectory Name"

  2. Choose input: CR Latent Input Switch - 1 for uploaded image, 2 for generation

  3. Load models: ZImage Base for Phase 1, SDXL model for Phase 2

  4. Set prompts: Phase 1 prompts for generation, Phase 2 prompts for refinement

  5. Configure LoRAs: Load in respective Lora Loader nodes, toggle trigger words

Fast Groups Bypasser

Control workflow sections:

  • Phase 1 - ZImage Base Generation: Main generation (keep enabled)

  • Phase 2 - SDXL Refiner: Refinement pass (keep enabled)

  • Model Unload, Clear Cache and VRAM: Enable if low VRAM (default: disabled)

  • Detailer Bridge: Prepares for face/skin enhancement (keep enabled)

  • Upscale 1: First upscale pass (disable to skip)

  • Upscale 2: Second upscale pass (disable to skip)
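
The group toggles above can be thought of as a simple on/off map. The sketch below models that, assuming both upscale groups are enabled by default (the list only says they can be disabled). It does not talk to ComfyUI; active_groups is a hypothetical helper for reasoning about which stages will run.

```python
# Group names follow the list above; True means the group runs.
# Assumption: both upscale groups start enabled.
DEFAULT_GROUPS = {
    "Phase 1 - ZImage Base Generation": True,
    "Phase 2 - SDXL Refiner": True,
    "Model Unload, Clear Cache and VRAM": False,
    "Detailer Bridge": True,
    "Upscale 1": True,
    "Upscale 2": True,
}


def active_groups(overrides=None):
    """Return the names of groups that will run, in workflow order."""
    overrides = overrides or {}
    unknown = set(overrides) - set(DEFAULT_GROUPS)
    if unknown:
        raise KeyError(f"unknown group(s): {sorted(unknown)}")
    groups = {**DEFAULT_GROUPS, **overrides}
    return [name for name, enabled in groups.items() if enabled]


# Example: a low-VRAM run that unloads models and skips the second upscale.
print(active_groups({"Model Unload, Clear Cache and VRAM": True, "Upscale 2": False}))
```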

Output Files

All images save to: ComfyUI/output/[subdirectory]/

Saved images include complete metadata: prompts, seeds, steps, CFG, models, LoRAs, and all workflow settings.
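
To inspect that metadata outside ComfyUI, the text chunks of a saved PNG can be read directly. This is a minimal sketch, assuming the save node writes metadata as uncompressed PNG tEXt chunks (compressed or international text chunks are ignored). The function name and the example file path are hypothetical, and the chunk keys depend on the save node used.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_text_chunks(data):
    """Return a dict of keyword -> text for every tEXt chunk in PNG bytes."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    chunks = {}
    pos = len(PNG_SIGNATURE)
    # Each chunk is: 4-byte length, 4-byte type, data, 4-byte CRC.
    while pos + 8 <= len(data):
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if ctype == b"tEXt":
            # tEXt data is "keyword", a null byte, then Latin-1 text.
            key, _, value = body.partition(b"\x00")
            chunks[key.decode("latin-1")] = value.decode("latin-1")
        elif ctype == b"IEND":
            break
        pos += 8 + length + 4  # header + data + CRC (CRC is not verified here)
    return chunks


if __name__ == "__main__":
    import sys
    # Usage: python read_meta.py ComfyUI/output/<subdirectory>/<image>.png
    if len(sys.argv) > 1:
        with open(sys.argv[1], "rb") as handle:
            for key, text in read_png_text_chunks(handle.read()).items():
                print(f"{key}: {text[:200]}")
```

Only the first 200 characters of each entry are printed, since embedded workflow data can be long.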

Default Settings

Phase 1 (ZImage Base):

  • Steps: 50

  • CFG: 5

  • Sampler: uni_pc_bh2

  • Scheduler: ddim_uniform

Phase 2 (SDXL Refiner):

  • Steps: 50

  • CFG: 1.9

  • Start step: 40

  • Sampler: dpmpp_3m_sde_gpu

  • Scheduler: beta57

FaceDetailer:

  • Steps: 25

  • CFG: 6

  • Denoise: 0.25

  • bbox_threshold: 0.3

Empty Latent: 896 x 1152

Troubleshooting

Out of VRAM errors:

  • Enable Model Unload/Clear Cache group via Fast Groups Bypasser

  • Disable one or both upscale stages

  • Lower step counts

Face detailer not activating:

  • Lower bbox_threshold (default is 0.3, try 0.2)

  • Ensure faces are clearly visible and adequately sized

  • Verify detector model files downloaded correctly

Using non-standard SDXL models (Pony, Illustrious):

  • Adjust Phase 2 sampler/scheduler settings

  • Common alternative: the euler_a sampler (euler_ancestral in ComfyUI) with the karras scheduler

  • Test different CFG values

  • Check model card for recommended settings

Technical Details

Optimized for: RTX 4090 with 24GB VRAM

Processing flow:

  1. Select input (uploaded image or empty latent)

  2. Phase 1: ZImage Base generation/processing

  3. Phase 2: SDXL refinement

  4. Face and skin region detection

  5. Targeted detail enhancement

  6. Progressive dual upscaling

  7. Save with complete metadata

Execution time: about 40 seconds with the current settings on an RTX 4090, and up to 3 minutes depending on hardware, settings, and enabled stages.
