CarConsistency-Wan2.2-I2V-ConsistencyLoRA1
Model description
The showcase samples use both the high and low LoRA files together with the lightning-low LoRA.
Hello everyone, long time no see. I apologize for not releasing more models recently; I spent the last month researching new capabilities for Wan2.2-I2V LoRAs. I finally have some results to share and would like to introduce a series of Wan2.2-I2V LoRAs that I personally call the ConsistencyLoRA series. A LoRA in this series takes an input image and, using the Wan2.2-I2V model, directly generates a video that stays highly consistent with that image.
CarConsistency is the first model in this series. Its goal is to generate a highly consistent video of a vehicle directly from an input image (preferably on a white background) and a prompt (e.g., "the car is speeding on the moon/water/ice field, floating in the space..."). If the input image is an F1 car, change "The car" in the prompt to "The F1 car". In the dozen or so images I have personally tested, CarConsistency maintains a high degree of vehicle consistency, preserving details such as the advertisements on a Ferrari SF25 race car, the Chinese characters on the license plates of an SU7 Ultra and a Fang Cheng Bao SUV, and the decorative patterns on the vehicles. I recommend adding the 'lightning-low' model when generating: it is faster and gives more stable quality.
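As a small illustration of the prompting advice above, the sketch below builds prompts of the form "The car is ..." and swaps the subject for an F1 car. The helper name `build_prompt` and the way the example scenes are split into separate prompts are my own assumptions for illustration; they are not part of the model or its workflow.

```python
def build_prompt(scene: str, subject: str = "car") -> str:
    """Build a prompt such as 'The car is speeding on the moon.'

    `subject` is the noun used for the vehicle; use "F1 car" when the
    input image shows an F1 car, as the description recommends.
    """
    # Normalize the scene so the prompt ends with exactly one period.
    scene = scene.strip().rstrip(".")
    return f"The {subject} is {scene}."


# Scenes taken from the example prompt in the description; treating them
# as separate alternatives is an assumption made for this sketch.
SCENES = [
    "speeding on the moon",
    "speeding on the water",
    "speeding on the ice field",
    "floating in the space",
]

prompts = [build_prompt(scene) for scene in SCENES]
f1_prompt = build_prompt("speeding on the ice field", subject="F1 car")
```

Pair each prompt with the vehicle image, preferably one on a white background, in your usual Wan2.2-I2V workflow.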
The purpose of the ConsistencyLoRA series is to broaden the commercial application scenarios for I2V (image-to-video) models. ConsistencyLoRA was trained before the release of Wan Fun VACE and Wan Animate. Compared to them, it has two drawbacks. First, the generated video begins with leading frames carried over from the input image; these can be removed by frame trimming (I have uploaded a script, CutFrame.ipynb, that does this directly). Second, the generated output can sometimes be blurry.
However, ConsistencyLoRA also has its advantages:
1. Ease of use and accessibility: Because it is based on the Wan I2V workflow, it is simple and convenient and has a low VRAM threshold. Other I2V-based LoRAs are also compatible with it. Since it is trained for specific tasks, it is also quite stable on those tasks.
2. Rapid generation via prompts: Generation is quick and controlled by prompts. For clothing consistency, for example, you can use prompts to generate models of different ethnicities, skin tones, and body types all wearing the specified clothing.
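The frame trimming mentioned above could look roughly like the sketch below. This is not the contents of CutFrame.ipynb, which I have not reproduced here; it is a minimal illustration under the assumption that a fixed number of leading frames, `n_skip`, should be dropped. The function names and the value of `n_skip` are hypothetical, and the right value has to be found by inspecting your own outputs.

```python
from typing import List, Sequence, TypeVar

Frame = TypeVar("Frame")


def trim_leading_frames(frames: Sequence[Frame], n_skip: int) -> List[Frame]:
    """Return the frames with the first `n_skip` entries dropped."""
    if n_skip < 0:
        raise ValueError("n_skip must be non-negative")
    return list(frames[n_skip:])


def ffmpeg_trim_command(src: str, dst: str, n_skip: int) -> List[str]:
    """Build (but do not run) an ffmpeg command that drops leading frames.

    The `select` filter keeps frames whose index n is >= n_skip, and
    `setpts=PTS-STARTPTS` shifts timestamps so the output starts at zero.
    `-an` drops any audio track, since the audio is not trimmed here.
    """
    if n_skip < 0:
        raise ValueError("n_skip must be non-negative")
    video_filter = f"select='gte(n,{n_skip})',setpts=PTS-STARTPTS"
    return ["ffmpeg", "-i", src, "-vf", video_filter, "-an", dst]


# Hypothetical example: frames are stand-in integers, n_skip=3 is a guess.
kept = trim_leading_frames(list(range(10)), n_skip=3)
cmd = ffmpeg_trim_command("input.mp4", "trimmed.mp4", n_skip=3)
```

If ffmpeg is installed, the command list can be passed to `subprocess.run(cmd, check=True)`; the in-memory variant suits workflows that already hold the decoded frames as a list or array.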
I have handled the entire process on my own, from the LoRA concept and dataset processing to training and hyperparameter tuning. Because of the VRAM limit of a 4090 GPU with 24 GB, I can currently only train with a [360, 360] latent, so the model is still at the prototype stage. If the results are not ideal, I ask for your understanding and feedback, and I will do my best to improve it.
Thank you for reading this far. Commercial use of this model requires a license (I'm hoping to at least cover the electricity costs for training, lol). If you can support my experiments with compute that has more VRAM (to try a larger latent and address the blurriness issue), or if you are interested in a commercial collaboration to train a LoRA for a specific product, please DM me on Civitai or contact me on QQ: 338728644. If you would like to donate, see https://ko-fi.com/ghostshell . Thank you.
