Yozora-XL Rectified-Flow

Model description

Yozora-XL: A Rectified Flow SDXL Model

Yozora-XL is a rectified-flow model based on Chenkin 0.2 RF, fine-tuned with the Aozora training script, which now supports flow-based SDXL architectures. Aozora enables full or partial fine-tuning on 12GB consumer GPUs such as the RTX 3060. The training script is available on GitHub at [Aozora] for community use, but it requires some general technical understanding to set up and use.

  • Never merged

  • No internally merged LoRAs

Version 0.1

The initial release (v0.1 alpha) is a proof of concept demonstrating the trainer's rectified-flow support. It was trained to validate flow-based fine-tuning on consumer hardware rather than to achieve final quality. Future versions will use 50k+ images and extended training schedules. Even at this stage, the model provides decent colors and improved lighting in some scenes, with stable performance across a wide range of CFG values and without offset noise.

Training Settings

  • Base Model: ChenkinNoob-XL-V0.2 RF

  • Max Train Steps: 91,567

  • Batch Size: 1 with 16 Gradient Accumulation Steps

  • Learning Rate: 2e-5 (graph shown below; the LR spiked halfway through the run due to an unforeseen issue)

  • Shift: 2.0

  • Optimizer: Raven

  • Mixed Precision: bfloat16

  • VRAM Usage: ~11.8GB

  • Timestep Mode: Uniform

  • UNet Training: ~92% of parameters

  • Dropout: 15% unconditional

  • Loss: Semantic Loss (applied at 0.2x)
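As a rough illustration of how several of the settings above fit together, here is a minimal sketch of building one rectified-flow training sample. It is not taken from the Aozora trainer: all function names are hypothetical, the interpolation and velocity convention (x_t = (1 - t) * x0 + t * noise, target = noise - x0) is the common SD3-style formulation and is assumed here, and the Shift: 2.0 setting is left out for brevity.

```python
import random


def effective_batch_size(batch_size, grad_accum_steps):
    """Samples contributing to one optimizer update."""
    return batch_size * grad_accum_steps


def sample_timestep(rng):
    """Uniform timestep mode: t drawn uniformly from [0, 1)."""
    return rng.random()


def rectified_flow_pair(x0, noise, t):
    """Return the noisy input x_t and the velocity target.

    Assumed convention: t = 0 is the clean sample, t = 1 is pure noise.
    """
    x_t = [(1.0 - t) * a + t * n for a, n in zip(x0, noise)]
    velocity = [n - a for a, n in zip(x0, noise)]
    return x_t, velocity


def maybe_drop_caption(caption, rng, p_uncond=0.15):
    """Replace the caption with an empty one 15% of the time (unconditional dropout)."""
    return "" if rng.random() < p_uncond else caption


if __name__ == "__main__":
    rng = random.Random(0)
    print("effective batch:", effective_batch_size(1, 16))
    t = sample_timestep(rng)
    x_t, v = rectified_flow_pair([1.0, 2.0], [0.0, 4.0], t)
    print("t:", t, "x_t:", x_t, "target:", v)
```

With Batch Size 1 and 16 gradient accumulation steps, each optimizer update sees 16 samples. Some implementations use the opposite sign or time direction for the velocity target, so treat the convention here as an assumption.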

Training Graphs

  • This model was trained with a semantic-aware loss that approximates expensive perceptual metrics (e.g., LPIPS) with analytical importance maps. Rather than running auxiliary networks per step, it combines color saliency (bilateral-filtered LAB deviation) with structural edges (Sobel filtering) to weight the diffusion training loss spatially. This prioritizes semantically important regions, such as subjects and fine details, without the computational overhead of network-based evaluation.
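The idea of a spatial importance map can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the trainer's actual loss: the function names are hypothetical, the color saliency term is a simplified stand-in (distance from the image's mean RGB color instead of bilateral-filtered LAB deviation), the 50/50 mix of the two maps is arbitrary, and treating 0.2 as a strength multiplier on the map is a guess at how the 0.2x factor is applied. Resizing the map to latent resolution is also omitted.

```python
import numpy as np


def sobel_edges(gray):
    """Gradient magnitude from 3x3 Sobel kernels (edge-padded input)."""
    p = np.pad(gray, 1, mode="edge")
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]) - (
        p[:-2, :-2] + 2 * p[1:-1, :-2] + p[2:, :-2]
    )
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]) - (
        p[:-2, :-2] + 2 * p[:-2, 1:-1] + p[:-2, 2:]
    )
    return np.hypot(gx, gy)


def normalize(m):
    """Scale a map to [0, 1]; a constant map becomes all zeros."""
    span = m.max() - m.min()
    return (m - m.min()) / span if span > 0 else np.zeros_like(m)


def importance_map(image, strength=0.2):
    """image: HxWx3 float array in [0, 1]. Returns HxW weights in [1, 1 + strength]."""
    gray = image.mean(axis=2)
    # Simplified saliency stand-in: distance of each pixel from the mean color.
    saliency = np.linalg.norm(image - image.mean(axis=(0, 1)), axis=2)
    combined = 0.5 * normalize(saliency) + 0.5 * normalize(sobel_edges(gray))
    return 1.0 + strength * combined


def weighted_mse(pred, target, weights):
    """Per-pixel squared error averaged over channels, then weighted spatially."""
    per_pixel = ((pred - target) ** 2).mean(axis=2)
    return float((weights * per_pixel).sum() / weights.sum())


if __name__ == "__main__":
    img = np.zeros((8, 8, 3))
    img[:, 4:] = 1.0  # left half black, right half white
    w = importance_map(img)
    print(w[0])  # weights are highest next to the vertical edge
```

Because both maps are computed analytically from the target image, no auxiliary network has to run per step, which is the cost saving the description refers to.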

Quick Start

  • Sampler: Euler | CFG: 6 | Steps: 25 | Shift: 3

  • Positive: masterpiece, best quality, aesthetic

  • Negative: worst quality, low quality, bad anatomy

  • Use the ModelSamplingSD3 node in ComfyUI, or Advanced Model Sampling with SD3 in ReForge

Recommended Settings

  • Positive Prompt: masterpiece, best quality, aesthetic

  • Negative Prompt: worst quality, low quality, bad anatomy, low resolution

  • Sampler: Euler (Quality may vary with others)

  • Scheduler: Normal/Simple/SGM Uniform

  • Steps: 20-50

  • CFG Scale: 4-8

  • Shift: 3-8

  • Resolution: 1024x1024 (up to 1152x1152), or other aspect ratios within this range
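To give a feel for what the Shift setting does at inference time, here is a small sketch of the time shift commonly used for SD3-style flow sampling, sigma' = shift * sigma / (1 + (shift - 1) * sigma). The function names are hypothetical and the evenly spaced base schedule is an assumption; the actual sigmas depend on the scheduler chosen in your UI.

```python
def shift_sigma(sigma, shift):
    """Remap a noise level in [0, 1]; shift > 1 pushes values toward 1 (more noise)."""
    return shift * sigma / (1.0 + (shift - 1.0) * sigma)


def shifted_schedule(steps, shift):
    """steps + 1 noise levels from 1.0 (pure noise) down to 0.0 (clean image)."""
    base = [1.0 - i / steps for i in range(steps + 1)]
    return [shift_sigma(s, shift) for s in base]


if __name__ == "__main__":
    for shift in (1.0, 3.0, 8.0):
        sigmas = shifted_schedule(25, shift)
        # Show the first few levels to compare how quickly noise is removed.
        print(shift, [round(s, 3) for s in sigmas[:4]])
```

Higher shift values keep the sampler at high noise levels for more of the steps, which generally favors overall composition over fine detail. The endpoints stay at 1.0 and 0.0 regardless of shift.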

This is the workflow I use, in JSON format if needed:
[YozoraComfyuiWorkflow]

Note: This model requires SD3 flow loading to work, so you will need the ModelSamplingSD3 node. You can copy the workflow from any of the preview images as an example.

For A1111/ReForge, enable the Advanced Model Sampling extension and use RF-specific samplers (Euler Comfy, etc.). For ADetailer compatibility, add advanced_model_sampling_script to your builtin scripts list.

License

This model follows the license of its base, ChenkinNoob-XL RF. Review and comply with those terms.
