大摆锤 (Big Swing) Dance - FramePack

Model description

Based on FramePack.

The 1.5 version, trained exclusively on a single video, exhibits improved motion dynamics and more coherent action sequences compared to version 1.0. However, this approach has led to overfitting in certain areas, such as unnatural limb proportions, which I am currently addressing. For optimal results, I suggest generating content with a duration of 7.5 seconds and a resolution of 448x752.
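As a rough aid for setting up a run, the sketch below turns the recommended duration into a frame count and reports the orientation of the recommended resolution. The 30 fps output rate is an assumption, not something stated above; replace ASSUMED_FPS with whatever your own workflow writes. The function names are hypothetical helpers made up for this example.

```python
# Recommended v1.5 settings taken from the description above.
DURATION_S = 7.5
WIDTH, HEIGHT = 448, 752

# Assumption: the workflow writes video at 30 fps. Adjust to your setup.
ASSUMED_FPS = 30


def frame_count(duration_s: float, fps: int) -> int:
    """Number of frames needed for a clip of the given length."""
    return round(duration_s * fps)


def orientation(width: int, height: int) -> str:
    """Classify a resolution as portrait, landscape, or square."""
    if height > width:
        return "portrait"
    if width > height:
        return "landscape"
    return "square"


if __name__ == "__main__":
    frames = frame_count(DURATION_S, ASSUMED_FPS)
    print(f"{DURATION_S} s at {ASSUMED_FPS} fps = {frames} frames")
    print(f"{WIDTH}x{HEIGHT} is {orientation(WIDTH, HEIGHT)}")
```

The recommended 448x752 resolution is a portrait frame, which fits a standing pose such as the cowboy-shot first image.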

To generate the first image, you can use the prompt tags "cowboy shot, hands on hips".


v1.0

This FramePack LoRA was trained with Musubi Tuner on 13 videos to generate the big swing dance.

You can download this video and drag it into ComfyUI to see the workflow and parameters.

Training took approximately 24 hours on an RTX 4090. I highly recommend using a GPU with more than 24 GB of VRAM for training.

I have only tested this at BF16 precision and have not evaluated it at FP8 precision.

Thanks to 青龙圣者 for answering some questions about training parameters.

Even without using F1, FramePack shows a marked increase in motion amplitude when this LoRA is applied.

This makes me wonder whether a single LoRA could significantly amplify motion amplitude in general.

On the RTX 4080 32GB, generation takes an average of about 1 minute per second of video.
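To make that rate concrete, here is a minimal sketch that estimates wall-clock time from clip length. It assumes the cost scales linearly with duration, which is only an approximation, and the figure was reported for the hardware named above, so other GPUs and settings will differ. The function name and its default are hypothetical.

```python
def estimated_minutes(video_seconds: float, minutes_per_second: float = 1.0) -> float:
    """Estimate generation time, assuming cost scales linearly with clip length.

    minutes_per_second defaults to the ~1 min/s average reported above;
    pass your own measured rate for other hardware.
    """
    if video_seconds < 0 or minutes_per_second < 0:
        raise ValueError("inputs must be non-negative")
    return video_seconds * minutes_per_second


if __name__ == "__main__":
    # A 7.5 s clip (the duration suggested for v1.5) at the reported rate.
    print(f"7.5 s clip: about {estimated_minutes(7.5):.1f} minutes")
```

Measure one short clip on your own machine and pass that rate as minutes_per_second before planning longer runs.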
