Wan 2.2 I2V: HD/FHD resolution, but much faster
Model description
This workflow significantly eases the "speed vs. quality" trade-off, letting users with low-end hardware generate HD videos nearly twice as fast!
How it works
The principle is dead simple: we run the high-noise model at a very low resolution, then upscale the latent before injecting it into the low-noise sampler.
Because the original image is also reinjected at the low-noise sampling step, using a new Wan wrapper node, visual details are preserved.
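To get a feel for why the low-resolution first pass saves so much time, here is a minimal sketch of the arithmetic only. It does not model any ComfyUI node; the function names are hypothetical, and the 256*384 starting size with x2 and x1.5 factors is just one illustrative chain that ends at 768*1152.

```python
def stage_resolutions(width, height, factors):
    """Return the (width, height) reached after each upscale factor in turn."""
    stages = [(width, height)]
    for factor in factors:
        # Each factor is applied to the result of the previous stage.
        width, height = round(width * factor), round(height * factor)
        stages.append((width, height))
    return stages


def pixel_fraction(low, target):
    """Fraction of the target frame's pixels present in the low-res frame."""
    return (low[0] * low[1]) / (target[0] * target[1])


stages = stage_resolutions(256, 384, [2, 1.5])
print(stages)  # [(256, 384), (512, 768), (768, 1152)]
print(pixel_fraction(stages[0], stages[-1]))
```

In this example the first pass works on one ninth of the pixels of the final frame, which is where most of the time is saved.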
Limitations
The motion does lose a little subtlety, but the speed gain is well worth it in most cases.
Not tested on T2V; it probably won't work.
Getting Started
Replace the models with your own, or follow the links below to download them.
Install the required custom nodes listed below if they are missing from your installation.
Load an image and write your prompt.
Click Run.
Custom Nodes
Required
Optional
· ComfyUI-GIMM-VFI (for interpolation)
Models Used
Wan 2.2 14B I2V, Quantized:
Lightx2v LoRAs:
Fun LoRA:
Speed benchmark
Settings: 65 frames, Q5_K_M I2V models, 4 steps on the high-noise sampler with the lightx2v 1030 LoRA, 4 steps on the low-noise sampler with the lightx2v 1022 and Fun HPS2.1 LoRAs, euler sampler and beta scheduler on both samplers.
Hardware: RTX 3060 with 12GB VRAM and 32GB RAM.
768*1152px (2:3)
768*1152, no upscale: 20′46″
256*384 x2 then x1.5: 11′16″ (-46%)
256*384 x1.5 then x2: 10′48″ (-48%)
720*1280px (9:16)
720*1280, no upscale: 23′19″
288*512 x2.5: 15′57″ (-32%)
288*512 x2 then x1.25: 11′57″ (-49%) <- this is the showcased video
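The percentages above are the change in generation time relative to the "no upscale" run at the same target resolution. A small sketch to reproduce them from the listed timings (the helper names are hypothetical, and the timings are assumed to be in minutes′seconds″ format):

```python
import re


def to_seconds(timing):
    """Convert a timing written as minutes′seconds″ (e.g. 20′46″) to seconds."""
    minutes, seconds = re.fullmatch(r"(\d+)′(\d+)″", timing.strip()).groups()
    return int(minutes) * 60 + int(seconds)


def change_percent(baseline, timing):
    """Rounded percentage change of `timing` relative to `baseline`."""
    base = to_seconds(baseline)
    return round(100 * (to_seconds(timing) - base) / base)


print(change_percent("20′46″", "11′16″"))  # -46
print(change_percent("23′19″", "11′57″"))  # -49
```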
Target resolution vs. hardware requirements
HD: >= 12GB VRAM (tested)
FHD: >= 16GB VRAM? (not tested, feedback appreciated)
Recommended initial sampling resolutions and step settings are included in the workflow.
Edit: the last sampler's scheduler is set to linear_quadratic by default, but it should be beta.

