Slime Girl Concept

About this version

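Bring slime to life in Hunyuan T2V generations!

Trained with https://github.com/tdrussell/diffusion-pipe

The training data is a small mix of:

Images used in other versions of this model card
Keyframe images extracted from several videos
Short video clips of roughly 40 frames each

Training configuration: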

dataset.toml

# Aspect ratio bucketing settings
enable_ar_bucket = true
min_ar = 0.5
max_ar = 2.0
num_ar_buckets = 7

[[directory]] # images
# Path to the directory containing the images and their caption files
path = '/mnt/d/huanvideo/training_data/images'
num_repeats = 5
resolutions = [1024]
frame_buckets = [1] # use 1 frame for images

[[directory]] # videos
# Path to the directory containing the videos and their caption files
path = '/mnt/d/huanvideo/training_data/videos'
num_repeats = 5
resolutions = [256] # set the video resolution to 256 (e.g. 244p)
frame_buckets = [33, 49, 81] # frame buckets for videos
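
The bucketing values above can be illustrated with a short Python sketch. It is not taken from diffusion-pipe; it assumes that aspect ratio buckets are spaced evenly in log space between min_ar and max_ar, and that a clip is assigned to the largest frame bucket it can fill, with the middle of the clip kept (as video_clip_mode = 'single_middle' in config.toml suggests). All function names are hypothetical.

```python
import math


def log_spaced_buckets(min_ar, max_ar, n):
    """Return n aspect ratios spaced evenly in log space (assumed spacing)."""
    if n == 1:
        return [min_ar]
    step = (math.log(max_ar) - math.log(min_ar)) / (n - 1)
    return [math.exp(math.log(min_ar) + i * step) for i in range(n)]


def nearest_ar_bucket(width, height, buckets):
    """Pick the bucket closest to width/height, measured in log space."""
    ar = width / height
    return min(buckets, key=lambda b: abs(math.log(b) - math.log(ar)))


def pick_frame_bucket(num_frames, frame_buckets):
    """Largest bucket that the clip can fill, or None if the clip is too short."""
    fitting = [b for b in frame_buckets if b <= num_frames]
    return max(fitting) if fitting else None


def middle_clip_range(num_frames, bucket):
    """Half-open frame range [start, end) of the centered sub-clip."""
    start = (num_frames - bucket) // 2
    return start, start + bucket


# Values from dataset.toml: min_ar = 0.5, max_ar = 2.0, num_ar_buckets = 7
ar_buckets = log_spaced_buckets(0.5, 2.0, 7)
print([round(b, 3) for b in ar_buckets])

# A clip of about 40 frames against frame_buckets = [33, 49, 81]
print(pick_frame_bucket(40, [33, 49, 81]))  # 33
print(middle_clip_range(40, 33))            # (3, 36)
```

Under these assumptions the seven aspect ratios come out at roughly 0.5, 0.63, 0.79, 1.0, 1.26, 1.59 and 2.0, and the 40-frame clips described above would all land in the 33-frame bucket, so the 49 and 81 buckets would only be used by longer clips.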

config.toml

# Output directory and dataset configuration file
output_dir = '/mnt/d/huanvideo/training_output'
dataset = 'dataset.toml'

# Training settings
epochs = 50
micro_batch_size_per_gpu = 1
pipeline_stages = 1
gradient_accumulation_steps = 4
gradient_clipping = 1.0
warmup_steps = 100

# Evaluation settings
eval_every_n_epochs = 5
eval_before_first_step = true
eval_micro_batch_size_per_gpu = 1
eval_gradient_accumulation_steps = 1

# Miscellaneous settings
save_every_n_epochs = 15
checkpoint_every_n_minutes = 30
activation_checkpointing = true
partition_method = 'parameters'
save_dtype = 'bfloat16'
caching_batch_size = 1
steps_per_print = 1
video_clip_mode = 'single_middle'

[model]
type = 'hunyuan-video'

transformer_path = '/mnt/d/huanvideo/models/diffusion_models/hunyuan_video_720_cfgdistill_fp8_e4m3fn.safetensors'
vae_path = '/mnt/d/huanvideo/models/vae/hunyuan_video_vae_bf16.safetensors'
llm_path = '/mnt/d/huanvideo/models/llm'
clip_path = '/mnt/d/huanvideo/models/clip'

dtype = 'bfloat16'
transformer_dtype = 'float8'
timestep_sample_method = 'logit_normal'

[adapter]
type = 'lora'
rank = 32
dtype = 'bfloat16'

[optimizer]
type = 'adamw_optimi'
lr = 5e-5
betas = [0.9, 0.99]
weight_decay = 0.02
eps = 1e-8
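
As a rough guide to what these settings imply, the Python sketch below works out the effective batch size and an approximate optimizer step count. The GPU count (1), the dataset size (60 items) and the linear warmup shape are assumptions made for illustration; none of them are stated on this page, and the function names are hypothetical.

```python
import math


def effective_batch_size(micro_batch_size_per_gpu, gradient_accumulation_steps, num_gpus=1):
    """Samples contributing to each optimizer step (num_gpus is an assumption)."""
    return micro_batch_size_per_gpu * gradient_accumulation_steps * num_gpus


def optimizer_steps(num_examples, num_repeats, epochs, batch_size):
    """Approximate total optimizer steps, ignoring per-bucket rounding."""
    steps_per_epoch = math.ceil(num_examples * num_repeats / batch_size)
    return steps_per_epoch * epochs


def linear_warmup_lr(step, base_lr, warmup_steps):
    """Learning rate under an assumed linear ramp over warmup_steps."""
    if step >= warmup_steps:
        return base_lr
    return base_lr * (step + 1) / warmup_steps


# micro_batch_size_per_gpu = 1, gradient_accumulation_steps = 4
batch = effective_batch_size(1, 4)
print(batch)  # 4

# Hypothetical dataset of 60 items with num_repeats = 5 and epochs = 50
print(optimizer_steps(60, 5, 50, batch))  # 3750

# lr = 5e-5, warmup_steps = 100: rate halfway through the assumed warmup
print(linear_warmup_lr(49, 5e-5, 100))
```

With a single GPU, each optimizer step would see 4 samples, and the 100 warmup steps would cover only a small fraction of a run of this length.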

Model type	LoRA
Base model	Hunyuan Video
Published	1/16/2025
Trained words	slime girl, see-through body, blue skin, translucent skin

Model description

Slime girl concept