Here's a human-readable version of the provided configuration, organized into clear sections with explanations:

Network Configuration

U-Net Learning Rate: 0.0005
(Sets the learning rate for the U-Net, the denoising network that produces the image.)
Text Encoder Learning Rate: 0.00005
(Controls how fast the text encoder, which processes prompts, learns.)
Network Dimension: 64
(Defines the size of the LoRA network layers, larger than the previous config for potentially more capacity.)
Network Alpha: 16
(Scales the LoRA update by alpha divided by the network dimension, so a lower alpha damps the effect of the learned weights.)
Network Module: LoRA (networks.lora)
(Uses LoRA for efficient fine-tuning of the model.)
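To make the relationship between Network Dimension and Network Alpha concrete: in common LoRA implementations the low-rank update is multiplied by alpha / dimension before it is added to the base weights. The sketch below only computes that factor for the values listed above. The function name lora_scale is illustrative and is not part of any trainer's API.

```python
def lora_scale(network_alpha: float, network_dim: int) -> float:
    """Return the multiplier applied to the LoRA update (alpha / dim)."""
    if network_dim <= 0:
        raise ValueError("network_dim must be positive")
    return network_alpha / network_dim


# Values from this configuration: alpha 16, dimension 64.
scale = lora_scale(network_alpha=16, network_dim=64)
print(scale)  # 0.25
```

With these settings the learned update is applied at a quarter of its raw strength, which is one reason a higher learning rate is often paired with a low alpha.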

Optimizer Settings

Learning Rate: 0.0005
(The base learning rate for the optimizer.)
Learning Rate Scheduler: Cosine with Restarts
(Reduces the learning rate using a cosine curve, restarting 3 times to refine learning.)
Learning Rate Warmup Steps: 0
(No warmup period for the learning rate.)
Optimizer Type: Adafactor
(A memory-efficient optimizer for training large models.)
Optimizer Arguments:
- Scale Parameter: Disabled
- Relative Step: Disabled
- Warmup Init: Disabled
  (With these three options off, Adafactor uses the fixed learning rate given above instead of deriving its own step size.)
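The shape of a cosine schedule with restarts can be sketched in a few lines. This is one common hard-restart formulation, not necessarily the exact scheduler the trainer uses. The cycle count of 3 is read from the description above, and total_steps = 900 is a made-up value for illustration.

```python
import math


def cosine_with_restarts_lr(step: int, total_steps: int,
                            base_lr: float = 0.0005,
                            num_cycles: int = 3,
                            warmup_steps: int = 0) -> float:
    """Learning rate at `step` for a cosine schedule with hard restarts."""
    if step < warmup_steps:
        # Linear warmup (unused here because warmup_steps is 0).
        return base_lr * step / max(1, warmup_steps)
    span = max(1, total_steps - warmup_steps)
    done = step - warmup_steps
    if done >= span:
        return 0.0
    # Position inside the current cycle, in [0, 1); integer math keeps
    # the restart points exact.
    cycle_pos = ((num_cycles * done) % span) / span
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * cycle_pos))


TOTAL_STEPS = 900  # hypothetical; the real value depends on dataset size
for s in (0, 150, 299, 300, 600, 899):
    print(s, cosine_with_restarts_lr(s, TOTAL_STEPS))
```

The rate starts at the base value, decays toward zero within each cycle, then jumps back to the base value at the start of the next cycle.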

Training Settings

Maximum Training Steps: 0
(Training duration is determined by epochs, not a fixed number of steps.)
Maximum Training Epochs: 38
(The model will train for 38 passes over the dataset, more than the previous config.)
Save Model Every N Epochs: 1
(Saves the model after every epoch.)
Sample Generation Every N Epochs: 1
(Generates sample images after every epoch.)
Sample Prompts File: /workspace/training/1ff5c6a9-10c9-41f3-abd6-9e9cf7caef08/text/sample_prompts.txt
(Uses prompts from this file to generate samples during training.)
Sample Sampler: Euler_a
(Uses the Euler Ancestral sampler for generating images.)
Training Batch Size: 4
(Processes 4 images per batch during training.)
Noise Offset: 0.1
(Adds a small random per-channel offset to the training noise, which helps the model learn very dark and very bright images.)
Clip Skip: 1
(Uses the output of the text encoder's final layer; a value of 1 skips nothing, while 2 would use the second-to-last layer.)
Weighted Captions: Disabled
(All captions are treated equally, without weighting.)
Maximum Token Length: 225
(Supports text prompts up to 225 tokens long.)
Low RAM Mode: Disabled
(The low-RAM workaround is off, so models are loaded through system RAM in the normal way.)
Data Loader Workers: 8
(Employs 8 parallel workers to load data, improving efficiency.)
Persistent Data Loader Workers: Enabled
(Keeps data loader workers alive between epochs instead of restarting them, which saves startup time.)
Save Precision: Bfloat16 (bf16)
(Saves the model weights in bfloat16 format to reduce file size.)
Mixed Precision Training: Bfloat16 (bf16)
(Uses bfloat16 for calculations to balance speed and precision.)
Output Directory: /workspace/training/1ff5c6a9-10c9-41f3-abd6-9e9cf7caef08/model
(Where the trained models are saved.)
Logging Directory: /workspace/training/1ff5c6a9-10c9-41f3-abd6-9e9cf7caef08/logs
(Where training logs are stored.)
Output Model Name: rugbycoach--khmer--midjourney
(The name of the saved model, suggesting a focus on rugby coaches with a Khmer context, possibly inspired by MidJourney-style outputs.)
Save Training State: Disabled
(Only saves model weights, not the full training state.)
Xformers: Enabled
(Uses Xformers for faster attention computations.)
SDPA (Scaled Dot-Product Attention): Enabled
(Uses PyTorch's built-in scaled dot-product attention, a memory-efficient attention implementation.)
No Half VAE: Enabled
(Disables half-precision for the Variational Autoencoder to maintain quality.)
Gradient Checkpointing: Enabled
(Saves memory by discarding intermediate activations and recomputing them during backpropagation, at the cost of extra compute.)
Gradient Accumulation Steps: 1
(Updates the weights after every batch; no gradient accumulation.)

Advanced Training Settings

Multi-Resolution Noise Iterations: 6
(Adds extra training noise generated at up to 6 progressively lower resolutions, often called pyramid noise, so the model also learns large-scale structure and contrast.)
Multi-Resolution Noise Discount: 0.3
(Each successive lower-resolution noise layer is scaled by a further factor of 0.3, so coarser layers contribute less.)
Minimum SNR Gamma: 5.0
(Applies Min-SNR loss weighting with gamma = 5: the loss weight is capped at high signal-to-noise timesteps so that low-noise steps do not dominate training.)
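The two weighting rules in this section reduce to short formulas. The sketch below assumes the usual pyramid-noise formulation (noise level i weighted by discount ** i) and the Min-SNR weight for noise-prediction training, min(SNR, gamma) / SNR. Both function names are illustrative, and the trainer's own implementation may differ in details such as random scaling between levels or final normalization.

```python
def multires_level_weights(iterations: int = 6, discount: float = 0.3) -> list:
    """Weight of each noise level: discount ** i for i = 0 .. iterations - 1.

    Level 0 is the full-resolution noise; higher levels are coarser.
    """
    return [discount ** i for i in range(iterations)]


def min_snr_weight(snr: float, gamma: float = 5.0) -> float:
    """Min-SNR loss weight for noise (epsilon) prediction."""
    if snr <= 0:
        raise ValueError("snr must be positive")
    return min(snr, gamma) / snr


print(multires_level_weights())   # weights shrink quickly with each level
print(min_snr_weight(20.0))       # high SNR (little noise): down-weighted
print(min_snr_weight(1.0))        # SNR below gamma: weight stays 1.0
```

With a discount of 0.3 the third level already contributes less than a tenth of the full-resolution noise, so most of the six iterations have only a subtle effect.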

Model Settings

Pretrained Model Path: /model_cache/@civitai/889818/889818.safetensors
(Uses the same pre-trained model as the previous config for fine-tuning.)
V2 Model: Disabled
(The base model is not a Stable Diffusion 2.x model.)

Saving Settings

Save Model Format: Safetensors
(Saves the model in the efficient Safetensors format.)

DreamBooth Settings

Prior Loss Weight: 1.0
(Weight applied to the prior-preservation loss on regularization images in DreamBooth-style training; it matters only when regularization images are provided.)

Dataset Settings

Cache Latents: Enabled
(Precomputes and caches latent representations of images to speed up training.)

Comparison to Previous Config

This configuration is similar to the previous one but includes key differences:

Network Dimension: Increased from 32 to 64, potentially allowing for more complex fine-tuning.
Training Epochs: Increased from 33 to 38, indicating longer training.
Output Name: Changed to rugbycoach--khmer--midjourney, suggesting a specific focus on rugby coaches with a Khmer cultural context, possibly aiming for MidJourney-like visual style.
Workspace Path: Uses a different workspace directory (1ff5c6a9-10c9-41f3-abd6-9e9cf7caef08 vs. 7dd6c905-0edb-4cd6-bb4c-39fa6c179726).

This setup fine-tunes a Stable Diffusion model with LoRA using DreamBooth-style training. It is tuned for efficiency (bfloat16, Xformers, SDPA, gradient checkpointing, cached latents) and for output quality (noise offset, multi-resolution noise, Min-SNR weighting). Judging by its name, the model is likely meant to generate images of rugby coaches in a Khmer context with a MidJourney-inspired look.

Here's a human-readable version of the provided configuration, organized into clear sections with explanations, followed by the sample prompts:

Dataset Configuration

Subsets:
- Number of Repeats: 4
  (Each image in this subset is seen 4 times per epoch, which increases its influence on training.)
- Image Directory: /workspace/training/7dd6c905-0edb-4cd6-bb4c-39fa6c179726/img
  (The folder containing the training images.)
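The repeat count feeds directly into how many optimizer steps one epoch takes. The sketch below is a rough estimate that combines the repeat count of 4 with the batch size of 4, accumulation of 1, and 38 epochs quoted earlier in this document. The image count of 50 is a made-up value because the dataset size is not given, and bucketing can raise the real step count slightly since each bucket is batched separately.

```python
import math


def steps_per_epoch(num_images: int, num_repeats: int = 4,
                    batch_size: int = 4, grad_accum_steps: int = 1) -> int:
    """Approximate optimizer steps per epoch, ignoring bucketing effects."""
    samples = num_images * num_repeats
    batches = math.ceil(samples / batch_size)
    return math.ceil(batches / grad_accum_steps)


NUM_IMAGES = 50  # hypothetical; the dataset size is not stated
EPOCHS = 38      # epoch count quoted earlier in this document

per_epoch = steps_per_epoch(NUM_IMAGES)
print(per_epoch, per_epoch * EPOCHS)  # 50 1900
```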

General Settings

Resolution: 1024
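(Target training resolution. With bucketing enabled, images are assigned to buckets with an area of roughly 1024x1024 pixels rather than being forced to a square.)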
Shuffle Caption: Enabled
(Randomly shuffles the comma-separated tags in each caption to improve generalization, while respecting the Keep Tokens setting.)
Keep Tokens: 3
(Keeps the first 3 comma-separated tags of each caption fixed in place during shuffling.)
Flip Augmentation: Enabled
(Applies horizontal flipping to images during training to increase dataset variety.)
Caption Extension: .txt
(Captions for images are stored in text files with a .txt extension.)
Enable Bucket: Enabled
(Groups images into buckets by aspect ratio so that non-square images can be trained without cropping them to a square.)
Bucket Resolution Steps: 64
(Buckets are created in increments of 64 pixels to match image resolutions.)
Bucket No Upscale: Enabled
(Prevents upscaling of images to fit bucket resolutions, preserving original sizes.)
Minimum Bucket Resolution: 256
(No bucket side is shorter than 256 pixels.)
Maximum Bucket Resolution: 2048
(No bucket side is longer than 2048 pixels.)
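Shuffle Caption and Keep Tokens work together, which a small sketch makes easier to see. It assumes captions are comma-separated tags and that the first keep_tokens tags stay fixed while the rest are shuffled. The function name and the example caption are hypothetical; only the trigger word chweeee1 comes from the model page.

```python
import random


def shuffle_caption(caption: str, keep_tokens: int = 3, rng=None) -> str:
    """Shuffle comma-separated tags, leaving the first `keep_tokens` in place."""
    if rng is None:
        rng = random.Random()
    tags = [t.strip() for t in caption.split(",") if t.strip()]
    fixed, rest = tags[:keep_tokens], tags[keep_tokens:]
    rng.shuffle(rest)  # in-place shuffle of the movable tags only
    return ", ".join(fixed + rest)


# Hypothetical caption for illustration.
caption = "chweeee1, father, rugby coach, outdoors, smiling, whistle"
print(shuffle_caption(caption, keep_tokens=3, rng=random.Random(0)))
```

Keeping the trigger word and the main subject tags at the front means they are always present in the same position, while the shuffled remainder discourages the model from depending on tag order.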

Sample Prompts

These are the prompts used to generate sample outputs during training (likely from the file referenced in the previous configuration). They describe the concepts the model is being trained to generate:

Prompt: "father, rugby coach"
(Generates images of a father who is a rugby coach.)
Prompt: "grandfather, rugby coach"
(Generates images of a grandfather who is a rugby coach.)
Prompt: "father and grandfather went to mosque together"
(Generates images of a father and grandfather together at a mosque.)

This configuration sets up a dataset for training a model (likely Stable Diffusion with LoRA) using images from a specified directory, with captions stored in .txt files. The settings emphasize data augmentation (flipping, caption shuffling) and efficient resolution handling (bucketing) to optimize training. The sample prompts suggest the model is being fine-tuned to generate images related to family members (father and grandfather) with a focus on rugby coaching and a specific cultural context (visiting a mosque).

Model type	LORA
Base model	Illustrious
Published	2025-06-04
Trigger words	chweeee1

rugbycoach--khmer--midjourney