(NSFW) Dead-Simple MMAudio + RIFE Interpolation Setup for WAN 2.2 I2V 14B

Changelog

Version 1.0.1: The RIFE group output was accidentally set to 8fps. Changed it to 24fps.

Version 1.0: Initial release

A TRIBUTE TO GOONERS EVERYWHERE

Your WAN 2.2 video looks awesome. But where's the sound? We've moved from images to videos, and WAN 2.2 is incredible for video. The missing piece? AUDIO!

This is my first article ever, so I'm sorry if I made any mistakes. Please leave a comment if I've made an error or if you need any help. For your reference, I'm running:

  • ComfyUI 0.3.68

  • Torch 2.9

  • CUDA 13

  • Python 3.13.9

  • Sage Attention 2.2

  • NVIDIA RTX 5070 Ti (16 GB VRAM)

And here are the custom nodes (3 in total):

Onto the workflow...

------------------------------------

This workflow handles two jobs:

  1. Fix WAN 2.2’s native 16fps output by interpolating it to 24fps with RIFE.

  2. Generate synced audio with MMAudio using the final 24fps video.

The setup is plug-and-play. Drop in your WAN video → interpolate → feed it into MMAudio → get synced output. The included notes explain the reasoning for FPS, step settings, and seed behavior.

What this workflow covers:

  1. RIFE interpolation from 16 → 24 fps.

  2. MMAudio sampler with recommended settings (50 steps, CFG 4.5).

  3. Automatic audio + video combine at 24fps.

  4. Optional re-interpolation afterward if you want 30fps+ output.

    1. To do this, load your finished 24fps video into the 'Step 1: Rife Interpolation' group, then set 'source_fps' to 24 and 'target_fps' to 30.
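
If you want to sanity-check the FPS math before running anything, here is a minimal Python sketch. It is not part of the workflow and does not call RIFE. It only works out the frame-rate ratio and roughly how many frames you should expect after interpolation, assuming the clip length stays the same. The function name `interpolation_plan` and the 80-frame clip are made up for illustration, and the actual RIFE node may land a frame or two off depending on how it handles the last frame.

```python
from fractions import Fraction


def interpolation_plan(frame_count, source_fps, target_fps):
    """Estimate the result of an fps change that keeps the clip duration fixed.

    Hypothetical helper for illustration only; it does not run RIFE.
    """
    ratio = Fraction(target_fps, source_fps)      # e.g. 24/16 = 3/2
    duration = Fraction(frame_count, source_fps)  # clip length in seconds
    target_frames = round(duration * target_fps)  # approximate output frame count
    return {
        "ratio": ratio,
        "duration_s": float(duration),
        "target_frames": target_frames,
    }


# Step 1 of the workflow: 16fps -> 24fps (80 frames is an assumed clip length)
first_pass = interpolation_plan(80, 16, 24)
print(first_pass["ratio"], first_pass["duration_s"], first_pass["target_frames"])

# Optional second pass: 24fps -> 30fps on the finished video
second_pass = interpolation_plan(first_pass["target_frames"], 24, 30)
print(second_pass["ratio"], second_pass["duration_s"], second_pass["target_frames"])
```

The point to notice is that the duration stays the same across both passes, which is why MMAudio's audio (generated against the 24fps video) should still line up if you re-interpolate to 30fps afterward.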

Required MMAudio files

Download all of these into:

ComfyUI/models/mmaudio

MMAudio NSFW Model (fine-tuned from the base model)

https://huggingface.co/phazei/NSFW_MMaudio/resolve/main/mmaudio_large_44k_nsfw_gold_8.5k_final_fp16.safetensors?download=true

MMAudio VAE (fp16)

https://huggingface.co/Kijai/MMAudio_safetensors/resolve/5984623e6b436818c6ff287ef6eec93e3e05aa3f/mmaudio_vae_44k_fp16.safetensors

MMAudio Synchformer (fp16)

https://huggingface.co/Kijai/MMAudio_safetensors/resolve/main/mmaudio_synchformer_fp16.safetensors

MMAudio CLIP Encoder (fp16)

https://huggingface.co/Kijai/MMAudio_safetensors/resolve/main/apple_DFN5B-CLIP-ViT-H-14-384_fp16.safetensors
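
If you'd rather not click through four links, the downloads can be scripted. The sketch below uses only the Python standard library and the four URLs listed above. It assumes you run it from the folder that contains your `ComfyUI` directory and that the files are publicly downloadable without a login. The helper names (`filename_from_url`, `download_all`) are my own and are not part of ComfyUI or MMAudio.

```python
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlretrieve

# The four files listed above, copied verbatim
MMAUDIO_URLS = [
    "https://huggingface.co/phazei/NSFW_MMaudio/resolve/main/mmaudio_large_44k_nsfw_gold_8.5k_final_fp16.safetensors?download=true",
    "https://huggingface.co/Kijai/MMAudio_safetensors/resolve/5984623e6b436818c6ff287ef6eec93e3e05aa3f/mmaudio_vae_44k_fp16.safetensors",
    "https://huggingface.co/Kijai/MMAudio_safetensors/resolve/main/mmaudio_synchformer_fp16.safetensors",
    "https://huggingface.co/Kijai/MMAudio_safetensors/resolve/main/apple_DFN5B-CLIP-ViT-H-14-384_fp16.safetensors",
]


def filename_from_url(url):
    """Return the file name at the end of the URL path, ignoring '?download=true'."""
    return Path(urlparse(url).path).name


def download_all(comfy_root="ComfyUI"):
    """Download every file into <comfy_root>/models/mmaudio, skipping existing ones."""
    target = Path(comfy_root) / "models" / "mmaudio"
    target.mkdir(parents=True, exist_ok=True)
    for url in MMAUDIO_URLS:
        dest = target / filename_from_url(url)
        if dest.exists():
            print(f"Already present, skipping: {dest.name}")
            continue
        print(f"Downloading {dest.name} ...")
        urlretrieve(url, dest)
    return target
```

To use it, call `download_all()` (or `download_all("path/to/ComfyUI")` if your install lives somewhere else). These are large files, so expect the download to take a while.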

Bonus

Once you've created a good MMAudio track, there are some further steps you can take depending on what you'd like to create.

1. Import your audio/video into a video editor (CapCut, Shotcut, etc.) and layer some music underneath. I've done this with a few of my videos, adding a 'radio' filter so the music sounds tinny, as if it's playing in the background.

2. Layer other audio tracks alongside the NSFW audio track. KaptainSisay did something like that very elegantly here (https://civitai.com/images/110700679).
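
I did the music layering by hand in an editor, but if you prefer the command line, something similar can probably be done with ffmpeg. To be clear, this is a swapped-in approach and not what I actually used. The sketch below only builds the ffmpeg command as a Python list. It assumes ffmpeg is installed, that the video already carries the MMAudio track, and that a band-pass (ffmpeg's `highpass` and `lowpass` filters) is a fair stand-in for an editor's 'radio' filter. The file names, cutoff frequencies, and volume are placeholders to tweak.

```python
def build_mix_command(video_in, music_in, video_out, music_volume=0.2):
    """Build an ffmpeg command that mixes quiet, tinny music under the video's audio.

    Hypothetical helper: the 300 Hz / 3000 Hz cutoffs are a guess at a 'radio' sound.
    """
    filter_graph = (
        # Thin out the music and turn it down
        f"[1:a]highpass=f=300,lowpass=f=3000,volume={music_volume}[music];"
        # Mix it with the video's existing audio, ending when the video's audio ends
        "[0:a][music]amix=inputs=2:duration=first[mixed]"
    )
    return [
        "ffmpeg",
        "-i", video_in,
        "-i", music_in,
        "-filter_complex", filter_graph,
        "-map", "0:v",      # keep the original video stream
        "-map", "[mixed]",  # use the mixed audio
        "-c:v", "copy",     # do not re-encode the video
        video_out,
    ]


cmd = build_mix_command("wan_mmaudio_24fps.mp4", "music.mp3", "final.mp4")
print(" ".join(cmd))
```

To actually run it, pass the list to `subprocess.run(cmd, check=True)`. Note that `amix` rescales input volumes by default, so the overall loudness may need adjusting afterward.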
