Retro 90's Anime / Golden Boy Style Lora LTX2

Model description

Trigger: GoldenBoyStyle

I reused the dataset from the LoRA I trained for the Wan version of this style: around 80 videos and 368 images or so. The Wan 2.2 version is better, but that LoRA went through lots of revisions and time to get where it is today.

Since the dataset was in 16fps, I had to set AI Toolkit to train at 16fps, and I had to use "Audio Normalize"; otherwise the voices all became high-pitched. That unfortunately made the first 2k steps unusable, but I found the best results around 4.5k steps, though 2.5k steps also looked nice. To fit the frame buckets, I also padded some of the clips by 1-3 frames at most (bucket sizes: 17, 25, 33, 41, 49, 57, 65). I trained at 512 and 768 resolutions.
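The padding step can be sketched roughly as follows. This is only an illustration: the bucket sizes and the 3-frame limit come from the description above, but the function names are hypothetical and repeating the last frame is an assumption, since the exact padding method isn't stated. It is not part of AI Toolkit.

```python
from bisect import bisect_left

# Frame buckets listed in the description above.
BUCKETS = [17, 25, 33, 41, 49, 57, 65]
# Assumption: never pad a clip by more than 3 frames.
MAX_PAD = 3


def frames_to_pad(n_frames, buckets=BUCKETS, max_pad=MAX_PAD):
    """Return how many frames are needed to reach the next bucket.

    Returns None if the clip is longer than the largest bucket or if
    reaching the next bucket would need more than max_pad frames.
    """
    i = bisect_left(buckets, n_frames)  # index of first bucket >= n_frames
    if i == len(buckets):
        return None
    pad = buckets[i] - n_frames
    return pad if pad <= max_pad else None


def pad_clip(frames, buckets=BUCKETS, max_pad=MAX_PAD):
    """Pad a list of frames up to the next bucket by repeating the last frame.

    Repeating the last frame is an assumption made for this sketch.
    """
    pad = frames_to_pad(len(frames), buckets, max_pad)
    if pad is None:
        raise ValueError(f"clip of {len(frames)} frames does not fit a bucket")
    return frames + [frames[-1]] * pad
```

For example, a 47-frame clip would get 2 extra frames to land in the 49-frame bucket, while a 44-frame clip would be rejected because it needs 5 frames and would have to be trimmed or re-cut instead.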

The audio is trained from the Japanese dub (I wish I had cut these clips from the English dub audio way back when I first made them). But you can prompt in English just fine (English actually sounds better). My theory is that because I didn't use character tagging, all the women's voices blended into one, which is also why they sound high-pitched. If you get high-pitched voices, try re-rolling the seed or modifying the prompt.

I'm really excited for the next LTX version (2.1+), because there are limits to how far we can stretch this base model.

You will probably get the best results from long, very detailed prompts (writing them with an LLM works best); otherwise the style may not trigger. This LoRA was made to learn how to do anime style in LTX2. I think that to get truly good results we need a large video dataset at 25fps to train on. This is kind of a patchwork job, getting a Wan dataset to work on LTX2, which isn't the best way to do things.

Use my example workflow if you want to generate videos like mine. I recommend doing landscape videos, as LTX2 is better at those. You can try earlier or later checkpoints here.
