Gemma3-12B-Abliterated-fp8

Details

Model Description

V1.0a Experimental!!

Important, please read!

Gemma-3-12B-Heretic-X (Sikaworld High-Fidelity Edition)

This is the ultra-dynamic, fully uncensored text encoder for LTX-2, based on the experimental Heretic-X fine-tune by LastRef.

While the standard abliterated version removes the "refusal" mechanism, Heretic-X was actively steered with a custom dataset to be proactively descriptive and uninhibited. In LTX-2 video generation, this translates to significantly stronger motion vectors, helping to "unfreeze" static videos and generate more intense dynamics in complex scenes.

This edition applies the Sikaworld High-Fidelity Quantization method to tame the aggressive nature of Heretic-X, ensuring that the increased dynamics do not come at the cost of facial symmetry or anatomical coherence.

🚀 Key Features

  • Aggressive Uncensoring (Heretic-X): Unlike standard abliteration (which just deletes the refusal direction), this model uses modified weights (attn.o_proj, mlp.down_proj) derived from x-rated dataset training. It delivers a "louder" and more confident signal to the video transformer, which is often the cure for "frozen" I2V generations.

  • High-Fidelity Layer Protection (The Stabilizer): Aggressive fine-tunes often lead to "melting" faces in video. This version therefore uses a Mixed Precision Strategy: the critical input layers (0-1) and the final output layers (44-47), as well as all LayerNorms and biases, are kept in BF16. This acts as a safety rail, keeping facial features symmetric while allowing the body and background to move dynamically.

  • True Standalone (.safetensors): Includes the embedded spiece_model tensor. It works as a single-file plug-and-play solution in ComfyUI (LTX-2) without requiring external tokenizer.model files or complex folder structures.

  • Surgical Extraction: Stripped of the 20GB+ Vision-Tower weights (which LTX-2 does not use) to save VRAM and loading time, while retaining the full 48-layer text intelligence of the 24GB BF16 source.
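The extraction and mixed-precision scheme described above could be sketched roughly as follows. This is a minimal illustration, not the actual conversion script: the tensor-name prefixes (`vision_tower.`, `multi_modal_projector.`) and the `.layers.N.` naming pattern are assumptions based on common Gemma-3 checkpoint layouts.

```python
import re

# Layers kept in BF16 per the strategy above: input layers 0-1
# and final output layers 44-47 of the 48-layer stack.
PROTECTED_LAYERS = {0, 1, 44, 45, 46, 47}

def keep_tensor(name: str) -> bool:
    """Drop vision-tower weights (unused by LTX-2); keep the text stack.
    Prefixes are hypothetical, based on typical Gemma-3 state dicts."""
    return not name.startswith(("vision_tower.", "multi_modal_projector."))

def target_dtype(name: str) -> str:
    """BF16 for protected layers, all norms, and biases; FP8 elsewhere."""
    if name.endswith(".bias") or "norm" in name:
        return "bfloat16"
    match = re.search(r"\.layers\.(\d+)\.", name)
    if match and int(match.group(1)) in PROTECTED_LAYERS:
        return "bfloat16"
    return "float8_e4m3fn"
```

A real converter would iterate over the source shards, apply `keep_tensor` as a filter, and cast each surviving tensor to `target_dtype` before saving a single `.safetensors` file.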

🛠 Usage in ComfyUI

  1. Place the .safetensors file in your ComfyUI/models/text_encoders/ folder.

  2. In your LTX-2 workflow (DualCLIPLoader), select this model.

  3. Recommended Dtype: Set weight_dtype to fp8_e4m3fn (the critical layers remain BF16 automatically).

  4. Prompting Tip: This model responds very well to "action verbs" placed at the very beginning of the prompt. It needs a lower CFG scale than standard models to produce motion.

📊 Technical Background

Why Heretic-X for Video?
LTX-2 (especially the Dev version) often suffers from "motion collapse" (frozen video) when the text embedding is too neutral. Heretic-X provides a higher variance in its embeddings.

Why this Quantization?
Standard FP8 conversions of Heretic models often result in "weird" artifacts because the aggressive weights clip during quantization. By protecting the last 4 layers (44-47) in BF16, we ensure that the final instructions sent to the Video Transformer retain their high-precision spatial alignment, preventing the "uncanny valley" effect often seen in dynamic clips.

Credits

  • Base Model: Google Gemma 3

  • Heretic Fine-tune: LastRef

  • Optimization & Architecture Fixes: Sikaworld

v1.0

Gemma-3-12B-it-Abliterated (Sikaworld High-Fidelity Edition)

This is a specialized, fully uncensored (abliterated) text encoder for the LTX-2 audiovisual model.

While standard FP8 conversions often lead to "frozen" videos, facial drifting, or anatomical asymmetry in Image-to-Video (I2V) workflows, this version was surgically optimized to preserve the intelligence and stability of the original model.

🚀 Key Features

  • Uncensored Freedom: Based on the abliteration technique by Maxime Labonne. This model follows complex or "sensitive" prompts without refusals, ensuring a strong vector signal for high-motion video generation.

  • High-Fidelity Layer Protection: Unlike radical FP8 quants, this version uses a Mixed Precision Strategy. Critical input layers (0-1) and final output layers (44-47), as well as all LayerNorms and Biases, are kept in BF16. This specifically fixes the "face shifting" and "asymmetry" issues common in LTX-2.

  • True Standalone (.safetensors): Includes the embedded spiece_model tensor. It works as a single-file plug-and-play solution in ComfyUI without requiring external tokenizer.model files.

  • FP32 Sourced: Converted directly from the original 47GB FP32 shards to ensure maximum rounding precision during the FP8/BF16 hybrid conversion.
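A toy illustration of why sourcing directly from FP32 matters: BF16 keeps the full FP32 exponent range, but an intermediate stop in FP16 (with its narrow exponent range) would flush small weights to zero before they ever reach the BF16/FP8 target. The sketch below simulates BF16 by truncating the low mantissa bits, a simplification of the real round-to-nearest conversion.

```python
import struct

def to_bf16(x: float) -> float:
    """Truncate an FP32 value to BF16 by dropping the low 16 bits
    (a simplification; real conversion rounds to nearest)."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return struct.unpack(">f", struct.pack(">I", bits & 0xFFFF0000))[0]

def to_fp16(x: float) -> float:
    """Round an FP32 value through IEEE half precision."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

x = 1e-8                        # representable in BF16 (wide exponent range)
direct = to_bf16(x)             # nonzero: BF16 preserves the exponent
via_fp16 = to_bf16(to_fp16(x))  # 0.0: FP16's narrow range flushed it first
```

Converting straight from the FP32 shards thus avoids an extra rounding (or flushing) step that a lower-precision intermediate would introduce.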

🛠 Usage in ComfyUI

  1. Place the .safetensors file in your ComfyUI/models/text_encoders/ folder.

  2. In your LTX-2 workflow, use the DualCLIPLoader or the specific LTXV Text Encoder Loader.

  3. Tip: For best motion results, leave the negative prompt empty and focus your positive prompt on actions and dynamics.

📊 Technical Background

Standard 8-bit quantization often "muffles" the subtle signals needed for temporal consistency in video models. By protecting the "navigation" layers (the beginning and end of the 48-layer stack) in BF16, this encoder provides a much "louder" and more stable movement command to the LTX-2 Transformer.

Credits

  • Abliteration: mlabonne

  • Optimization & Quantization: Sikaworld

