Wan2.2+14B+Sage + TorchCompile + LLM AutoPrompt Workflow
Model description
This ComfyUI workflow is a sophisticated pipeline for generating video from a single image (Image-to-Video). It leverages the power of the Wan2.2 14B model for the core video synthesis. The workflow is enhanced with several advanced features for performance and creative control.
A key feature is the integration of an LLM AutoPrompt node. This allows for the automatic generation of detailed and dynamic prompts to guide the video creation process. The workflow also incorporates Sage Attention and Torch Compile, which are advanced optimization techniques. Sage Attention provides more efficient and stable attention mechanisms, particularly beneficial for high-resolution video generation, while Torch Compile significantly speeds up the model's execution by compiling the PyTorch code into a more optimized representation.
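The effect of Torch Compile can be illustrated with a minimal, self-contained sketch. A toy `torch.nn.Linear` stands in for the Wan2.2 model here; the workflow's TorchCompile node wraps the diffusion model in the same way:

```python
import torch

# Toy module standing in for the Wan2.2 diffusion model.
model = torch.nn.Linear(8, 8)

# torch.compile traces and optimizes the module; compilation happens
# lazily on the first call, after which the optimized graph is reused.
compiled = torch.compile(model)

x = torch.randn(2, 8)
out = compiled(x)  # first call is slow (compiles), subsequent calls are fast
print(out.shape)
```

The same trade-off noted later in this page applies here: the first invocation pays a one-time compilation cost in exchange for faster execution on every subsequent call.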
The workflow is structured to first take a user-provided image. It then uses an LLM to generate a descriptive prompt based on the image content. This generated prompt, along with the initial image, is then fed into the Wan2.2 model to produce the final video output.
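The data flow above can be sketched as plain Python. The two functions below are hypothetical stand-ins for the LLM AutoPrompt node and the Wan2.2 sampler/video-combine stage (the frame count of 81 is illustrative, not a workflow guarantee):

```python
def describe_image(image_path: str) -> str:
    """Stand-in for the LLM AutoPrompt node: derive a prompt from the image."""
    return f"A cinematic shot of the subject in {image_path}, slowly panning."

def generate_video(image_path: str, prompt: str) -> dict:
    """Stand-in for the Wan2.2 14B sampler plus the video output node."""
    return {"source_image": image_path, "prompt": prompt, "frames": 81}

image = "input.png"
prompt = describe_image(image)         # step 1: LLM generates a descriptive prompt
video = generate_video(image, prompt)  # step 2: image + prompt -> final video
print(video["frames"])
```

In the real workflow these steps are wired as ComfyUI nodes rather than function calls, but the ordering is the same: the prompt is generated first, then fed alongside the image into the video model.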
Usage Recommendations
To effectively use this workflow, follow these recommendations:
Input Image: Begin by loading your desired starting image into the designated "LoadImage" node. The workflow is designed to animate this static image, so a clear and well-defined subject will yield the best results.
LLM AutoPrompt: The LLM AutoPrompt node is configured to automatically generate a text prompt that will influence the video's narrative and action. You can customize the behavior of the LLM by modifying the system_msg input to guide the style and content of the generated prompts. For more direct control, you can bypass the LLM and input your own descriptive prompt.
Model and Performance Settings: The workflow is pre-configured to use the Wan2.2 14B model. Ensure that you have the correct model files downloaded and placed in your ComfyUI models/unet directory. Sage Attention and Torch Compile are enabled by default to optimize performance. For most users, the default settings will provide a good balance of speed and quality. If you encounter issues, you can try disabling these nodes, but expect a significant increase in generation time.
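As a concrete sketch of the expected file layout (the install path is hypothetical; adjust COMFYUI_DIR to wherever your ComfyUI lives):

```shell
# Hypothetical ComfyUI location; override COMFYUI_DIR for your setup.
COMFYUI_DIR="${COMFYUI_DIR:-./ComfyUI}"

# The Wan2.2 14B weights (e.g. a .gguf file when using ComfyUI-GGUF)
# belong in the unet model directory:
mkdir -p "$COMFYUI_DIR/models/unet"
ls -d "$COMFYUI_DIR/models/unet"
```

After placing the files, restart ComfyUI (or refresh the node's model list) so the loader node can see them.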
Output: The final output is a video file. You can adjust the video's dimensions, frame rate, and other parameters in the "VHS_VideoCombine" node to suit your needs.
This workflow is ideal for users looking to create high-quality video content from static images with the aid of automated and creative prompting, while also taking advantage of advanced performance optimizations.
Uncommon custom nodes used in this workflow:
https://github.com/city96/ComfyUI-GGUF
https://github.com/pollockjj/ComfyUI-MultiGPU
+ optionally https://github.com/gokayfem/ComfyUI_VLM_nodes
