Singularity-LTX-2.3_OmniCine_V1
🚀 Singularity LTX-2.3 OmniCine V1 (Official Release)
Try It Online: Experience the full potential of this model via the optimized workflow on RunningHub: 👉
https://www.runninghub.ai/post/2062051326342815746?inviteCode=sdhs0trb
This is not just a standard fine-tune; it is a fundamental restructuring of the LTX-Video (2.3) generation logic.
I am thrilled to present the official release of LTX2.3 Singularity to the Civitai community! This optimization framework focuses on Image-to-Video (I2V), First & Last Frame Control, and Reference-to-Video generation. Although it has so far been trained for only nearly 100,000 steps (counted by gradient accumulation), its improvements in physical consistency, dynamic motion, and cinematic expression have already far exceeded expectations.
🌟 Key Improvements & Features
🦴 Limbs & Anatomy Evolution: Specifically optimized to fix the common degradation of fingers and toes, drastically reducing anatomy warping and artifacts during fast movements.
🎬 Injecting Shot Continuity: Text prompts now directly control precise, timeline-based shot and camera cuts (in 0-5s logical segments), replacing erratic, randomized framing.
🗣️ Elimination of "AI Stiffness": Significantly enhanced facial expressiveness during speech, deeply optimized lip-syncing, and natively eliminated the rigid, burned-in subtitles frequently generated by the base model.
⚖️ Physical Consistency: Improved the structural integrity of characters and environments during high-speed actions, suppressing chaotic "twisting/morphing" and aligning motions with real-world physics.
🎨 Flawless Anime Compatibility: Integrated a high-quality Anime training dataset, allowing the model to seamlessly adapt across diverse styles including 2D anime, 3D CGI, and hyper-realism.
🌪️ Extreme Dynamic Range: Delivers stellar performance in high-action sequences like running and combat sports. Simultaneously, visual effects for cyberpunk themes, transformations, magic casting, and monster rendering have been massively amplified.
🖼️ Revolutionary Reference Image Control: Upgraded the "Reference-to-Video" capability. No longer bound to rigid first-frame constraints, the model intelligently extracts character features and artistic styles from the reference image, generating entirely new angles and compositions based on your prompts.
📊 Current Limitation Note: While the model meets the vast majority of movement demands, slight motion blur may still occur during extreme, highly complex actions. This is currently being addressed through optimized post-processing workflows. Stay tuned!
⚙️ Generation & Usage Guide
To get the absolute best results from this LoRA, please follow these recommendations:
Recommended Base Model: ltx-2.3-22b-distilled-1.1_transformer_only_fp8_scaled.safetensors
ComfyUI Workflow: Available in the files/post section. Highly recommended to use in First & Last Frame Mode for ultimate scene control.
LoRA Weight: Recommended to start at 0.8 - 1.0 and adjust based on your specific prompt intensity.
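When tuning the LoRA weight, it can help to render the same prompt at a few evenly spaced strengths inside the recommended range. The snippet below is only a minimal sketch: the function name `lora_weight_sweep` and the 0.05 step are assumptions for illustration, not part of the official workflow, and how each value is fed into ComfyUI is left to your own setup.

```python
from typing import List


def lora_weight_sweep(low: float = 0.8, high: float = 1.0, step: float = 0.05) -> List[float]:
    """Return evenly spaced LoRA weights from low to high, inclusive.

    Hypothetical helper: the 0.8-1.0 defaults come from the recommendation
    above, while the 0.05 step is an arbitrary illustrative choice.
    """
    if step <= 0 or high < low:
        raise ValueError("step must be positive and high must be >= low")
    # Round the count to avoid floating-point drift such as 3.9999999.
    count = int(round((high - low) / step)) + 1
    return [round(low + i * step, 2) for i in range(count)]


weights = lora_weight_sweep()
print(weights)
```

Rendering one short clip per weight and comparing them side by side is usually enough to see where motion becomes exaggerated or the style starts to fade.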
📝 Exclusive: Singularity Prompting Framework
This model follows a strict prompt structure to unlock its full cinematic potential. Please adhere closely to the "Cinematic Timeline Structure" below.
💡 Core Rule: Keep visual descriptions, timestamps, actions, and dialogue strictly formatted in English as shown below.
📐 Prompt Template Structure
[Scene & Style]: Core visual description in one sentence (e.g., Cinematic wuxia style, dim lighting, Anime, 3D).
[Action Timeline]: 0-X seconds, [action / emotional description].
[Camera Timeline]: 0-X seconds, [camera movement / composition parameters].
[Environment]: Lighting source, contrast, and color grading details.
[Dialogue]: 0-X seconds, [Character] says: "[Dialogue text]".
[Audio & Technical]: Background sounds, film grain, subtitle exclusion commands, etc.
🎬 Example Prompt
Cinematic wuxia style, indoor dim lighting, mysterious mood. 0-10 seconds, young man in ancient white robes looks down with a confused expression. 0-10 seconds, tight close-up, static camera with slight handheld movement. Dark stone background, warm candlelight bokeh. 0-10 seconds, man says: "What on earth is this? I've never heard of it before.". Voice: low and confused, Pace: slow. Precise lip-sync, film grain, cinematic bokeh, no subtitles.
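The template can also be assembled programmatically, which keeps the section order and timestamp format consistent across many prompts. The following is a minimal sketch under stated assumptions: the `Timed` class and `build_prompt` function are hypothetical helpers invented for this illustration and are not shipped with the model or the workflow. The sample values are taken from the example prompt above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Timed:
    """Hypothetical helper: one timeline entry covering start-end seconds."""
    start: int
    end: int
    text: str

    def render(self) -> str:
        # Follows the "0-X seconds, [description]" pattern from the template.
        return f"{self.start}-{self.end} seconds, {self.text}"


def build_prompt(scene_style: str, actions: List[Timed], cameras: List[Timed],
                 environment: str, dialogue: List[Timed],
                 audio_technical: str) -> str:
    """Join the six template sections in order, one sentence per entry."""
    parts = [scene_style]
    parts += [a.render() for a in actions]
    parts += [c.render() for c in cameras]
    parts.append(environment)
    parts += [d.render() for d in dialogue]
    parts.append(audio_technical)
    # End every part with a period so the sections stay clearly separated.
    return " ".join(p if p.endswith(".") else p + "." for p in parts)


prompt = build_prompt(
    scene_style="Cinematic wuxia style, indoor dim lighting, mysterious mood",
    actions=[Timed(0, 10, "young man in ancient white robes looks down "
                          "with a confused expression")],
    cameras=[Timed(0, 10, "tight close-up, static camera with slight "
                          "handheld movement")],
    environment="Dark stone background, warm candlelight bokeh",
    dialogue=[Timed(0, 10, 'man says: "What on earth is this? '
                           'I\'ve never heard of it before."')],
    audio_technical="Voice: low and confused, Pace: slow. Precise lip-sync, "
                    "film grain, cinematic bokeh, no subtitles.",
)
print(prompt)
```

Because each section is a separate argument, adding a second shot is a matter of appending another `Timed` entry (for example 0-5 and 5-10 seconds) to the action and camera lists.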
🛠️ Dev Log (Behind the Scenes)
LTX2.3 is an architecture with immense latent potential, but I believe it requires more structured guidance to truly understand complex motion.
In this fine-tuning run, I abandoned brute-force stacking of action datasets. Instead, I shifted towards high-quality dialogue scenes and clean, easily digestible action sequences for the model to fit. I also deliberately reduced the ratio of real-world video footage, because real-world clips often carry heavy native motion blur. Combined with LTX2.3's high latent compression ratio, that blur makes the model easily lose temporal attention during high-velocity sequences, causing character consistency to collapse. Filtering out this noise has massively reinforced character feature retention.
What's Next? This run highlighted a few minor limitations that I plan to iterate on in the next version. However, given the intensity of this development cycle, I need to take a quick break before diving back in.
❤️ Support the Project: If you enjoy utilizing this model, please leave a 5-star review, drop a ❤️, and post your generations below! Your feedback and buzz directly shape the training set for the next phase. Enjoy the visual revolution of Singularity!
If you have questions, feedback, or want to collaborate on AI video workflows, feel free to reach out:
WeChat (微信): aigctyd
QQ Group (QQ社群): 1058747239
Email: [email protected]
I'm actively looking for community feedback to refine the full version of this LoRA. Let's push the boundaries of LTX2.3 together!
