Wan 2.2 - SVI Pro 2.0 - I2V for 12GB VRAM (Different LoRAs Per Stage) (Optimized for Speed)


Model description

WAN 2.2 / SVI Pro 2 / I2V for 12GB VRAM

Modified version of [SVI Pro 2.0 for Low VRAM (8GB)] and [Wan2.2 SVI Pro Example KJ].

  • 7-stage sample setup, with each stage having its own LoRAs, combined with Sage Attention CUDA for faster speeds.

  • Can save each stage clip if needed.

  • Final Output w/ Upscaler + RIFE for smooth 60FPS.

  • Fast Group Bypasser - for quick access.

### Required Models & LoRAs

GGUF Main Models:

* [DaSiWa-Wan 2.2 I2V] or

* [Smooth Mix Version] or

* [Enhanced NSFW Camera Prompt Adherence]

> Note: Use a suitable quantization (e.g., Q4 or Q5) based on your available VRAM. I highly recommend the DaSiWa-Wan high/low models, as the Lightning LoRAs are BAKED in, leaving only the SVI LoRAs to be loaded.
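If you are not sure how much VRAM you actually have to compare against the GGUF file sizes, a quick optional check with the embedded Python prints your GPU name and total VRAM. This is just a convenience I'm adding here, not part of the workflow:

.\python_embeded\python.exe -c "import torch; p = torch.cuda.get_device_properties(0); print(p.name, round(p.total_memory / 1024**3, 1), 'GB VRAM')"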

SVI PRO LoRAs (Wan2.2-I2V-A14B):

Both required:

* [SVI PRO - HIGH (Rank 128)]

* [SVI PRO - LOW (Rank 128)]

Text Encoders:

[WAN UMT5] or

[NSFW WAN UMT5]

VAE:

[Wan 2.1 VAE]

The following is for speed boosts on NVIDIA cards - if it's already working then skip this!

Patch Sage Attention node (sageattn_qk_int8_pv_fp16_cuda) + Model Patch Torch Settings node (for faster execution times):

Prompt executed in 136.56 seconds <- Sage Attention = Disabled / FP16 Accumulation = Disabled / Allow Compile = False

Prompt executed in 104.38 seconds <- Sage Attention = Enabled / FP16 Accumulation = Enabled / Allow Compile = False

Prompt executed in 96.26 seconds <- Sage Attention = Enabled / FP16 Accumulation = Enabled / Allow Compile = True

With this setup you can save a massive 40+ seconds per stage!
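Quick back-of-the-envelope math on the timings above (these are from my runs; your numbers will differ by GPU, resolution, and clip length): 136.56 s - 96.26 s is roughly 40.3 s saved per stage, about a 1.42x speedup, which across all 7 stages adds up to somewhere around 280 seconds per full run.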

If Sage Attention is not working or is crashing ComfyUI, do the following (or press CTRL+B to bypass the nodes, but I highly recommend getting it working for the massive speed boost):

  • The following is for ComfyUI_windows_portable; do not do it this way if you are using a different setup!

  • Step 1 — Check your PyTorch + CUDA version

Open CMD in your ComfyUI Portable folder (SAME directory as run_nvidia_gpu.bat) and run the following command:

.\python_embeded\python.exe -c "import torch; print(torch.__version__, torch.version.cuda)"

output = 2.9.1+cu130 13.0

Check the embedded Python version:

.\python_embeded\python.exe -V

output = Python 3.13.9

Which means:

Python: 3.13 (embedded)

PyTorch: 2.9.1

CUDA: 13.0
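Optionally, you can print all three values in one go with a single command (same folder, same embedded Python; just a convenience):

.\python_embeded\python.exe -c "import sys, torch; print('Python', sys.version.split()[0]); print('PyTorch', torch.__version__); print('CUDA', torch.version.cuda)"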

Warning! If you are unsure how to proceed with the following steps, paste your error message into Grok/ChatGPT for a more detailed analysis.

Pick the wheel that matches your Python + PyTorch + CUDA output from Step 1.

That means the correct SageAttention wheel for your setup would be something like this:

sageattention-2.2.0.post3+cu130torch2.9.0-cp313-cp313-win_amd64.whl

Download the correct wheel for your setup from:

[List of Wheels]

It matches Python 3.13 (cp313-cp313), PyTorch 2.9.x, and CUDA 13.0.

The slight difference in patch version (2.9.1 vs 2.9.0) is fine — this wheel works with PyTorch 2.9.x.
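If you would rather not eyeball the tags, here is a small, purely optional helper of my own (not part of the original guide). Save it as, say, check_wheel_tags.py next to run_nvidia_gpu.bat and run it with .\python_embeded\python.exe check_wheel_tags.py:

import sys, torch
# Prints the tags to look for in the SageAttention wheel filename.
# Note: the exact naming on the release page may differ slightly (e.g. post-release suffixes).
py_tag = f"cp{sys.version_info.major}{sys.version_info.minor}"                   # e.g. cp313
torch_tag = "torch" + ".".join(torch.__version__.split("+")[0].split(".")[:2])   # e.g. torch2.9
cuda_tag = "cu" + (torch.version.cuda or "?").replace(".", "")                   # e.g. cu130
print("Look for a wheel containing:", cuda_tag, torch_tag + ".x,", f"{py_tag}-{py_tag},", "win_amd64")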

  • Step 2 — Install the wheel (make sure the file is in \ComfyUI_windows_portable, the same directory as run_nvidia_gpu.bat)

Open CMD in your ComfyUI Portable folder and run with the correct wheel file (example below):

.\python_embeded\python.exe -m pip install "sageattention-2.2.0.post3+cu130torch2.9.0-cp313-cp313-win_amd64.whl"
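Optionally, before moving on you can confirm that pip registered the package (Step 3 is the real test, this is just a quick sanity check):

.\python_embeded\python.exe -m pip show sageattention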

  • Step 3 — How to check if it works:

Open CMD in your ComfyUI Portable folder and run:

.\python_embeded\python.exe -c "import sageattention; print('SageAttention import successful!'); print(dir(sageattention))"

You should see:

SageAttention import successful!

['__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__path__', '__spec__', '_fused', '_qattn_sm80', '_qattn_sm89', '_qattn_sm90', 'core', 'quant', 'sageattn', 'sageattn_qk_int8_pv_fp16_cuda', 'sageattn_qk_int8_pv_fp16_triton', 'sageattn_qk_int8_pv_fp8_cuda', 'sageattn_qk_int8_pv_fp8_cuda_sm90', 'sageattn_varlen', 'triton']

  • Step 4 — Confirm whether the Triton attention mode is available:

Open CMD in your ComfyUI Portable folder and run:

.\python_embeded\python.exe -c "import sageattention; print('SageAttention import successful!'); print('Triton mode available:' , hasattr(sageattention, 'sageattn_qk_int8_pv_fp16_triton'))"

You should see:

SageAttention import successful!

Triton mode available: True

If you get any Triton errors, run this command:

.\python_embeded\python.exe -m pip install triton
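You can then confirm the Triton install the same way (optional):

.\python_embeded\python.exe -c "import triton; print('Triton', triton.__version__)"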

  • Step 5 — Now you should be able to use "sageattn_qk_int8_pv_fp16_cuda" with the Patch Sage Attention + Model Patch Torch Settings nodes properly.
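As one last optional sanity check you can try running the kernel itself on dummy tensors. This is my own rough sketch, not part of the original guide, and the exact sageattn signature / expected tensor layout can vary between SageAttention versions, so check the SageAttention README if it errors. Save it as, say, sage_smoke_test.py next to run_nvidia_gpu.bat and run it with .\python_embeded\python.exe sage_smoke_test.py:

import torch
from sageattention import sageattn
# Dummy attention inputs - assumed layout (batch, heads, seq_len, head_dim), fp16 on the GPU.
q = torch.randn(1, 8, 256, 64, dtype=torch.float16, device="cuda")
k = torch.randn(1, 8, 256, 64, dtype=torch.float16, device="cuda")
v = torch.randn(1, 8, 256, 64, dtype=torch.float16, device="cuda")
out = sageattn(q, k, v)  # should dispatch to the CUDA kernel on supported GPUs
print("SageAttention kernel ran OK, output shape:", tuple(out.shape))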
