WAN 2.2 5b WhiteRabbit InterpLoop

Details

Download Files

Model description

喜欢中文的你看这边:英文看完后就是中文版

WAN 2.2 5b WhiteRabbit Interp-Loop

This ready-to-run ComfyUI workflow turns one image into a short looping video with WAN 2.2 5b. Then, it cleans the loop seam so it feels natural. Optionally, you can also boost the frame rate and upscale with ESRGAN.

In other words, this is an Image to Video workflow that creates loops with WAN 2.2 5b!

Why is this so complicated?!

WAN 2.2 5b does not fully support injecting frames after the first. If you try to inject a last frame, it will create a looping animation but the last 4 frames will be "dirty" with a strange "flash" at the end of the loop.

This workflow leverages custom nodes I designed to overcome this limitation. We trim out the dirty frames and then interpolate over the seam.

Model Setup (WAN 2.2 5b)

Install these into the usual ComfyUI folders. FP16 = best quality. FP8 = faster and lighter, with some trade-offs.

Diffusion modelmodels/diffusion_models/
- FP16: wan2.2_ti2v_5B_fp16.safetensors
- FP8: Wan2_2-TI2V-5B_fp8_e5m2_scaled_KJ.safetensors

Text encodermodels/text_encoders/
- FP16: umt5_xxl_fp16.safetensors
- FP8: umt5_xxl_fp8_e4m3fn_scaled.safetensors

VAEmodels/vae/
- wan2.2_vae.safetensors

Optional LoRAmodels/lora/
- Recommended: Live Wallpaper Style

Tip: keep subfolders like models/vae/wan2.2/ so your growing collection stays tidy.

How It Works

- Seam prep: we take the very last and first frames and generate new in-betweens that bridge them smoothly. Only those new frames get appended — no duplicate of frame 1.
- Full-clip interpolation (optional): multiply in-betweens across the whole video, then resample to any FPS you want.
- Upscale (optional): add an upscaler pass before full-clip interpolation using an ESRGAN model of your choice.
- Output: saved to your ComfyUI/output/ folder, filename prefix LoopVid.

Controls You’ll Care About

Defaults are set for “safe on most GPUs.” Tweak if you have more VRAM.

Full-Clip Interpolation
- Roll & Multiply: add more in-betweens everywhere (e.g., ×3).
- Reample Framerate: convert to an exact FPS (e.g., 60). Great after Multiply, but you could use it on its own.

Other handy knobs
- Duration: WAN cost climbs past ~3s (2.2 is tuned up to ~5s).
- Working Size: long edge in pixels (shape comes from your input image).
- Steps: ~30 is WAN 2.2’s sweet spot.
- CFG: WAN default is 5, I have it bumped a little higher. Higher = more “prompt strength,” sometimes more motion.
- Schedule Shift: motion vs. stability. Higher = more motion.
- Upscale: choose model/target size; reduce tile/batch if you hit OOM.

You can find more detailed information on all these settings in the workflow itself.

Using Vision Models for Prompts (optional but handy)

If writing movement prompts feels daunting, you can use a vision model to get a great starting point. You have a few options.

Free Cloud Options

Google's Gemini or OpenAI's ChatGPT are free and will get the job done for most people.

- Upload your image and paste the prompt below.
- Copy the model’s description and paste it into this workflow’s Prompt field.

...however, these services are not exactly private and might censor lewd/NSFW requests. That's why you might prefer to explore the other two options.

Paid Cloud Options

There are many services that offer cloud model access which is a more reliable way to get uncensored access to models.

You could pay for credits on OpenRouter for example. Personally, I prefer Featherless because they charge a flat monthly fee which keeps my costs predictable, and they have a strict no log policy. If you decide you want to give them a try, you could always use my referral link which helps me out!

If you decide to go the API/Paid Cloud route, you might find my app, CloudInterrogator, useful. It's designed to make prompting cloud vision models as easy as possible and it's fully free and open source!

Local Inference Option

I know a lot of people on CivitAI are local-or-nothing types. For you, there is Ollama.

Here's the best guide I could find on setting it up. You will want to look at Google's Gemma-3 family of models and look at which size is appropriate for your card.

If you use Ollama, there's nothing stopping you from using CloudInterrogator as your access point since Ollama creates OpenAI compatible endpoints, or you could customize this workflow with Ollama nodes for ComfyUI. I don't recommend doing the latter unless you can set it to lock the prompt.

Many workflows for WAN build Gemma3/Ollama nodes into the workflow. I decided not to do that, because I think 99% of people are going to be well serviced by Gemini or ChatGPT.

Suggested prompt:

Analyze the content of this video frame and write a concise, single-paragraph description of your predictions around what movement takes place throughout the video sequence that follows.

Your description should include the details of the character and scene as a whole but only as they related to the movement that occurs in the scene. In addition, make note of the movements of the particles, blinking of the eyes if any, movement of the hair... this is a moment captured in time, and you are describing these few seconds encompassed by the image. Everything that can move, does move - even minute details of the scene.

Do not describe ‘pauses’. Don't minimize the motion with words like ‘slight’ or ‘subtle’. Do not use metaphorical language. Your description must be direct and decisive. Use simple, common language. Be specific, and describe how each detail in the scene moves, but do not be verbose; each word in your description must have purpose. Use the present tense, as if your predictions are coming true as you type them.

You will deliver one paragraph without any additional information and without any special characters that format this response, avoid ‘The image sequence depicts the character’ and describe what happens, without saying ‘the video ....’"

You might also have good luck with the suggested prompt from AmazingSeek's workflow depending on the model you use or what you're looking for!

Tips & Troubleshooting

WAN Framerate: WAN 2.2 is 24 fps. On WAN 2.1, if you decide to try it, set fps to 12 instead. There is a slider for this near the model loader node. The workflow auto-calculates what to do with your framerate (for multiplication and resampling) based on this number.

Seam looks off? Try switching between Simple/Fancy seam interpolation; increase the auto-crop search in Fancy; or re-render with a slightly different prompt/CFG.

Out-of-memory (OOM)?
- Lower tile size (x and y) in the WanVideo Decode node.
- Lower Upscale tile size and/or batch size.
- Reduce Working Size or Duration.
- Enable “Use Tiled Encoder”.

AttributeError: type object 'CompiledKernel' has no attribute 'launch_enter_hook'

I'm not sure what causes this, though my assumption is it has something to do with the WAN Video Nodes. This should fix it for you:

1. Open "🧩 Manager"

2. Click "Install PIP Packages"

3. Install these two, leave the quotes out: "SageAttention", "Triton-Windows".

3.1 Obviously Triton-Windows is only for Windows users. If you get this error on Linux, I would guess the package for Triton is just "Triton".

If this doesn't fix it for you, it may be that your ComfyUI Python environment is messed up for some reason or the version of Comfy you're using doesn't work with the Manager "Install PIP Packages" module. In that case, you might find this advice from the comments section helpful:

From alex223:
"i spent almost a day, but made it work. this thing helped, but also, for some reason my embedded python missed include and libs folder, I copied them from standalone version - that was essential for triton to work. Maybe my comment will help someone."

If you're still having problems, you can leave a comment. I don't mind trying to help people troubleshoot but I don't think the issue is with my workflow or with WhiteRabbit (my custom nodes).

Acknowledgements

- It occurred to me that interpolating over a loop seam might be a good solution to the "dirty frames" problem when I was first experimenting, but it was this workflow by AmazingSeek that really made me decide to go for it.
- It appears that Ekafalain should get some credit here, too, for their seamless loop workflow on which AmazingSeek's is based.
- While I didn't end up using any of their ideas directly, I want to shout out Caravel for their excellent, multi-step process you can have a look at over here that seems to primarily target WAN 2.2 14b. The level of documentation in this workflow alone is laudable.
- My recommended vision prompt is built off of NRDX'. You can find the original workflow it's from over on his patreon. This is the guy who is training LiveWallpaper LoRA for various WAN models, too!

P.S. 💖

If this workflow helps you, I’d love to see what you make! I put a lot of hard work into making it, including designing custom nodes to bring it all together and trying to document as much as possible so it is maximally useful to you.

Links
- Have a look at the WhiteRabbit repository for node documentation and atomic workflows if you want a better idea of how to build with the custom nodes here or tweak this workflow.
- My Website & Socials: See my art, poetry, and other dev updates at artificialsweetener.ai
- Buy Me a Coffee: You can help fuel more projects like this at my Ko-fi page

This workflow is dedicated to my beloved Cubby 🥰
- Find her artwork all over the internet
- She has many excellent LoRA on CivitAI for you to explore :3


WAN 2.2 5b WhiteRabbit 插值循环

这个开箱即用的 ComfyUI 工作流可将一张图片转换为使用 WAN 2.2 5b 生成的短循环视频。随后,它会清理循环衔接处的“接缝”,让过渡更自然。可选地,你还可以提升帧率并用 ESRGAN 进行放大。

换句话说,这是一个利用 WAN 2.2 5b 生成循环效果的“图像转视频”工作流!

为什么会这么复杂?!

WAN 2.2 5b 并不完全支持在首帧之后继续注入帧。如果你尝试注入最后一帧,它虽会生成循环动画,但最后 4 帧会出现“脏帧”,在循环结束处出现奇怪的“闪烁”。

此工作流通过我设计的自定义节点来规避这一限制。我们先裁掉脏帧,然后对接缝进行插帧插值。工作流内同时提供了“简单版”和“进阶版”的裁剪/插值流程,并配有切换开关,便于你分别试用。

模型设置(WAN 2.2 5b)

按常规 ComfyUI 目录安装这些文件。FP16 = 质量最佳;FP8 = 更快更省显存,但有一定取舍。

扩散模型models/diffusion_models/

文本编码器models/text_encoders/

VAEmodels/vae/

可选 LoRAmodels/lora/

提示:使用诸如 models/vae/wan2.2/ 这类子文件夹,便于管理不断增长的模型集合。

工作原理

  • 接缝准备:取最后一帧与第一帧,生成新的过渡中间帧以实现平滑衔接。只会追加这些新帧——不会重复追加第 1 帧。

  • 全片插值(可选):在整段视频中增加倍数级的中间帧,然后重采样到任意 FPS。

  • 放大(可选):在全片插值之前加入一次放大流程,使用你选择的 ESRGAN 模型。

  • 输出:保存到你的 ComfyUI/output/ 文件夹,文件名前缀为 LoopVid。

你会关心的控制项

默认设置为“对多数 GPU 安全”。如果你显存更充裕,可以适当调高。

全片插值

  • 滚动倍增 ("Roll & Multiply"):在全片范围增加更多中间帧(例如 ×3)。

  • 重采样帧率 ("Resample Framerate"):转换到精确的 FPS(例如 60)。在倍增后使用效果更佳,但也可单独使用。

其他实用旋钮

  • 时长 ("Duration"):超过 ~3 秒成本上涨(2.2 调校到 ~5 秒)。

  • 工作尺寸 ("Working Size"):以长边像素为准(纵横比来自输入图)。

  • 步数 ("Steps"):~30 是 WAN 2.2 的甜点区。

  • CFG:WAN 默认 5,这里略微上调。数值越高=“提示强度”更高,有时也会带来更多运动。

  • 日程偏移(Schedule Shift):运动 vs 稳定。数值越高=运动更强。

  • 放大 ("Upscale"):选择模型/目标尺寸;如遇 OOM,降低 tile/batch。

关于这些设置的更多细节,可在工作流中直接查看。

使用视觉模型来生成提示(可选但好用)

如果编写“运动提示”让你犯难,可以借助视觉模型获得一个很好的起点。你有多种选择。

免费云端方案

Google 的 Gemini 或 OpenAI 的 ChatGPT 是免费的,对多数人来说足够用了。

  • 上传你的图片并粘贴下方提示词。

  • 复制模型给出的描述,将其粘贴到本工作流的 Prompt 字段。

……不过,这些服务的私密性并不理想,并且可能会审查低俗/NSFW 类请求。这也是你或许想尝试其他两种方案的原因。

付费云端方案

有很多服务提供云端模型访问,这是获取未审查模型的更可靠方式。

例如,你可以在 OpenRouter 购买点数。就我个人而言更偏好 Featherless,因为它按月固定收费、成本可预期,而且有严格的“无日志”政策。如果你想试试,也可以使用我的推荐链接来支持我!

如果你选择 API/付费云路线,我的应用 CloudInterrogator 可能会对你有用。它旨在尽可能简化云端视觉模型的提示流程,而且完全免费开源!

本地推理方案

我知道 CivitAI 上有不少“只用本地”的用户。你可以选择 Ollama

这里有我能找到的最佳安装指南。你可以关注 Google 的 Gemma-3 模型家族,并选择与你显卡匹配的规模。

如果使用 Ollama,你完全可以把 CloudInterrogator 当作访问入口,因为 Ollama 提供 OpenAI 兼容的端点;或者你也可以为 ComfyUI 加上 Ollama 节点来定制本工作流。除非你能把提示锁定,否则我并不推荐后者。

许多 WAN 工作流会把 Gemma3/Ollama 节点直接内置进去。我选择不这样做,因为我认为 99% 的人用 Gemini 或 ChatGPT 就已经足够。

建议的提示词

分析该视频帧的内容,用一个简洁的单段落描述你对随后的整段视频序列中将发生哪些运动的预测。

你的描述应覆盖角色与场景的整体细节,但只限于与场景中“运动”相关的部分。另外,请记录粒子的运动、如果有的话眼睛的眨动、头发的摆动……这是一个被时间定格的瞬间,你要描述的是这张图像所涵盖的这几秒内发生的事。凡是可能运动的,都在运动——包括场景中微小的细节。

不要描述“停顿”。不要用“轻微”“细微”这类词来弱化运动。不要使用隐喻性语言。你的描述必须直接而明确。使用简单、常用的语言。要具体,说明场景中每个细节是如何运动的,但不要冗长;你写下的每个词都要有用处。使用现在时,好像你的预测在你输入时正在成真。

你将输出一个段落,不包含任何额外信息,也不要使用会改变格式的特殊字符;避免用“图像序列描绘了角色……”之类的说法,直接描述发生了什么,不要说“视频……”。

根据你所用的模型或目标,你也许会发现 AmazingSeek 工作流提供的提示词同样好用!

技巧与故障排查

WAN 帧率:WAN 2.2 为 24 fps。若尝试 WAN 2.1,请将 fps 设为 12。模型加载节点附近有对应滑块。工作流会基于该数值自动计算帧率相关流程(倍增与重采样)。

接缝看起来不对?试试在“简单/进阶”接缝插值之间切换;在进阶模式中增加自动裁剪搜索范围;或用略微不同的提示/CFG 重新渲染。

显存不足?

  • 在 WanVideo Decode 节点降低 tile 尺寸(x 和 y)。

  • 降低放大(Upscale)的 tile 尺寸和/或批大小。

  • 减小工作尺寸或时长。

  • 启用“Use Tiled Encoder”。

致谢

  • 最初试验时,我想到在循环接缝处做插值可能解决“脏帧”问题,但真正让我决定上手的是 AmazingSeek这个工作流

  • 看起来 Ekafalain 也应在此获得一些认可,AmazingSeek 的无缝循环工作流是基于其成果之上的。

  • 虽然我最终没有直接采用他们的想法,但仍想致敬 Caravel——他们面向 WAN 2.2 14b 的多步流程非常出色,你可以在这里查看,文档水准就值得称赞。

  • 我推荐的视觉提示是基于 NRDX 的版本改写而来。你可以在他 Patreon 上找到原始工作流。他也是为多种 WAN 模型训练 LiveWallpaper LoRA 的那位!

附言 💖

如果这个工作流对你有帮助,我很想看看你的作品!我为此投入了大量精力,包括设计自定义节点把一切串起来,并尽量详细地撰写文档,以便它对你尽可能有用。

链接

  • 若想更好地了解如何用这些自定义节点搭建,或如何微调本工作流,请查看 WhiteRabbit 仓库中的节点文档与原子工作流。

  • 个人网站与社交:在 artificialsweetener.ai 查看我的艺术、诗歌及开发动态

  • 请我喝咖啡:在我的 Ko-fi 页面支持更多类似项目

本工作流献给我挚爱的 Cubby 🥰

  • 你可以在全网各处看到她的作品

  • 她在 CivitAI 上也有许多优秀的 LoRA 供你探索 :3

Images made by this model

No Images Found.