Gemma3-12B-Abliterated-fp8

Details

Model Description

V1.0a Experimental!!

Important, please read!

Gemma-3-12B-Heretic-X (Sikaworld High-Fidelity Edition)

This is the ultra-dynamic, fully uncensored text encoder for LTX-2, based on the experimental Heretic-X fine-tune by LastRef.

While the standard abliterated version removes the "refusal" mechanism, Heretic-X was actively steered with a custom dataset to be proactively descriptive and uninhibited. In LTX-2 video generation, this translates to significantly stronger motion vectors, helping to "unfreeze" static videos and generate more intense dynamics in complex scenes.

This edition applies the Sikaworld High-Fidelity Quantization method to tame the aggressive nature of Heretic-X, ensuring that the increased dynamics do not come at the cost of facial symmetry or anatomical coherence.

🚀 Key Features

  • Aggressive Uncensoring (Heretic-X): Unlike standard abliteration (which just deletes the refusal direction), this model uses modified weights (attn.o_proj, mlp.down_proj) derived from x-rated dataset training. It delivers a "louder" and more confident signal to the video transformer, which is often the cure for "frozen" I2V generations.

  • High-Fidelity Layer Protection (The Stabilizer): Aggressive fine-tunes often lead to "melting" faces in video. This version therefore uses a Mixed Precision Strategy: the critical input layers (0-1) and the final output layers (44-47), as well as all LayerNorms and biases, are kept in BF16. This acts as a safety rail, keeping facial features symmetric while allowing the body and background to move dynamically.

  • True Standalone (.safetensors): Includes the embedded spiece_model tensor. It works as a single-file plug-and-play solution in ComfyUI (LTX-2) without requiring external tokenizer.model files or complex folder structures.

  • Surgical Extraction: Stripped of the 20GB+ Vision-Tower weights (which LTX-2 does not use) to save VRAM and loading time, while retaining the full 48-layer text intelligence of the 24GB BF16 source.
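The extraction and mixed-precision scheme described above could be sketched roughly as follows. This is a minimal illustration, not the actual conversion script: the tensor-name prefixes (`vision_tower.`, `multi_modal_projector.`) and the `.layers.N.` naming pattern are assumptions based on common Gemma-3 checkpoint layouts.

```python
import re

# Layers kept in BF16 per the strategy above: input layers 0-1
# and final output layers 44-47 of the 48-layer stack.
PROTECTED_LAYERS = {0, 1, 44, 45, 46, 47}

def keep_tensor(name: str) -> bool:
    """Drop vision-tower weights (unused by LTX-2); keep the text stack.
    Prefixes are hypothetical, based on typical Gemma-3 state dicts."""
    return not name.startswith(("vision_tower.", "multi_modal_projector."))

def target_dtype(name: str) -> str:
    """BF16 for protected layers, all norms, and biases; FP8 elsewhere."""
    if name.endswith(".bias") or "norm" in name:
        return "bfloat16"
    match = re.search(r"\.layers\.(\d+)\.", name)
    if match and int(match.group(1)) in PROTECTED_LAYERS:
        return "bfloat16"
    return "float8_e4m3fn"
```

A real converter would iterate over the source shards, apply `keep_tensor` as a filter, and cast each surviving tensor to `target_dtype` before saving a single `.safetensors` file.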

🛠 Usage in ComfyUI

  1. Place the .safetensors file in your ComfyUI/models/text_encoders/ folder.

  2. In your LTX-2 workflow (DualCLIPLoader), select this model.

  3. Recommended Dtype: Set weight_dtype to fp8_e4m3fn (the critical layers remain BF16 automatically).

  4. Prompting Tip: This model responds very well to "action verbs" placed at the very beginning of the prompt. It needs a lower CFG scale than standard models to produce motion.

📊 Technical Background

Why Heretic-X for Video?
LTX-2 (especially the Dev version) often suffers from "motion collapse" (frozen video) when the text embedding is too neutral. Heretic-X provides a higher variance in its embeddings.

Why this Quantization?
Standard FP8 conversions of Heretic models often result in "weird" artifacts because the aggressive weights clip during quantization. By protecting the last 4 layers (44-47) in BF16, we ensure that the final instructions sent to the Video Transformer retain their high-precision spatial alignment, preventing the "uncanny valley" effect often seen in dynamic clips.

Credits

  • Base Model: Google Gemma 3

  • Heretic Fine-tune: LastRef

  • Optimization & Architecture Fixes: Sikaworld

v1.0

Gemma-3-12B-it-Abliterated (Sikaworld High-Fidelity Edition)

This is a specialized, fully uncensored (abliterated) text encoder for the LTX-2 audiovisual model.

While standard FP8 conversions often lead to "frozen" videos, facial drifting, or anatomical asymmetry in Image-to-Video (I2V) workflows, this version was surgically optimized to preserve the intelligence and stability of the original model.

🚀 Key Features

  • Uncensored Freedom: Based on the abliteration technique by Maxime Labonne. This model follows complex or "sensitive" prompts without refusals, ensuring a strong vector signal for high-motion video generation.

  • High-Fidelity Layer Protection: Unlike radical FP8 quants, this version uses a Mixed Precision Strategy. Critical input layers (0-1) and final output layers (44-47), as well as all LayerNorms and Biases, are kept in BF16. This specifically fixes the "face shifting" and "asymmetry" issues common in LTX-2.

  • True Standalone (.safetensors): Includes the embedded spiece_model tensor. It works as a single-file plug-and-play solution in ComfyUI without requiring external tokenizer.model files.

  • FP32 Sourced: Converted directly from the original 47GB FP32 shards to ensure maximum rounding precision during the FP8/BF16 hybrid conversion.
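A toy illustration of why sourcing directly from FP32 matters: BF16 keeps the full FP32 exponent range, but an intermediate stop in FP16 (with its narrow exponent range) would flush small weights to zero before they ever reach the BF16/FP8 target. The sketch below simulates BF16 by truncating the low mantissa bits, a simplification of the real round-to-nearest conversion.

```python
import struct

def to_bf16(x: float) -> float:
    """Truncate an FP32 value to BF16 by dropping the low 16 bits
    (a simplification; real conversion rounds to nearest)."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return struct.unpack(">f", struct.pack(">I", bits & 0xFFFF0000))[0]

def to_fp16(x: float) -> float:
    """Round an FP32 value through IEEE half precision."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

x = 1e-8                        # representable in BF16 (wide exponent range)
direct = to_bf16(x)             # nonzero: BF16 preserves the exponent
via_fp16 = to_bf16(to_fp16(x))  # 0.0: FP16's narrow range flushed it first
```

Converting straight from the FP32 shards thus avoids an extra rounding (or flushing) step that a lower-precision intermediate would introduce.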

🛠 Usage in ComfyUI

  1. Place the .safetensors file in your ComfyUI/models/text_encoders/ folder.

  2. In your LTX-2 workflow, use the DualCLIPLoader or the specific LTXV Text Encoder Loader.

  3. Tip: For best motion results, leave the negative prompt empty and focus your positive prompt on actions and dynamics.

📊 Technical Background

Standard 8-bit quantization often "muffles" the subtle signals needed for temporal consistency in video models. By protecting the "navigation" layers (the beginning and end of the 48-layer stack) in BF16, this encoder provides a much "louder" and more stable movement command to the LTX-2 Transformer.

Credits

  • Abliteration: mlabonne

  • Optimization & Quantization: Sikaworld

