rugbycoach--khmer--midjourney

Details

Model description

The training configuration is summarized below in readable form, organized into sections with an explanation of each setting, followed by the sample prompts:


Dataset Configuration

  • Subsets:

    • Number of Repeats: 4
      (Each image in this subset is repeated 4 times per epoch, which increases its weight in training.)

    • Image Directory: /workspace/training/7dd6c905-0edb-4cd6-bb4c-39fa6c179726/img
      (The folder containing the training images.)
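The effect of the repeat count can be sketched with simple arithmetic. This is a minimal illustration, assuming repeats simply multiply the number of samples seen per epoch; the image count of 25 is hypothetical, since the source does not say how many images the directory holds.

```python
def samples_per_epoch(num_images: int, num_repeats: int) -> int:
    """Number of training samples drawn from one subset in a single epoch,
    assuming every image is presented num_repeats times."""
    return num_images * num_repeats


# Hypothetical dataset size; the real number of images is not given in the source.
print(samples_per_epoch(num_images=25, num_repeats=4))
```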


General Settings

  • Resolution: 1024
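    (The target training resolution is 1024x1024 pixels. With bucketing enabled, images with other aspect ratios are trained at bucket sizes of roughly the same area.)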

  • Shuffle Caption: Enabled
    (Randomly shuffles the comma-separated tags in each caption to improve generalization, while respecting the keep_tokens setting.)

  • Keep Tokens: 3
    (Keeps the first 3 comma-separated tags of each caption fixed at the front, so they are exempt from shuffling.)

  • Flip Augmentation: Enabled
    (Applies horizontal flipping to images during training to increase dataset variety.)

  • Caption Extension: .txt
    (Captions for images are stored in text files with a .txt extension.)

  • Enable Bucket: Enabled
    (Groups images into buckets by resolution and aspect ratio, so each batch contains images of the same size.)

  • Bucket Resolution Steps: 64
    (Bucket widths and heights are multiples of 64 pixels.)

  • Bucket No Upscale: Enabled
    (Images smaller than the target resolution are not enlarged to fit a bucket; they are trained at close to their original size.)

  • Minimum Bucket Resolution: 256
    (The shortest allowed side of a bucket is 256 pixels.)

  • Maximum Bucket Resolution: 2048
    (The longest allowed side of a bucket is 2048 pixels.)
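
How the bucket settings above interact can be illustrated with a simplified sketch. It is not the trainer's actual algorithm; the function name pick_bucket and the rounding strategy are assumptions made for illustration. It shrinks an image only when its area exceeds the target area, then rounds each side down to a multiple of the step and clamps it to the allowed range.

```python
import math


def pick_bucket(width, height, resolution=1024, step=64,
                min_side=256, max_side=2048):
    """Simplified bucket choice (illustrative, not the trainer's real code)."""
    max_area = resolution * resolution
    # Never scale up: the factor is capped at 1.0 (the "no upscale" behavior).
    scale = min(1.0, math.sqrt(max_area / (width * height)))
    scaled_w, scaled_h = width * scale, height * scale
    # Round each side down to a multiple of `step`, then clamp to the limits.
    # Note: clamping to min_side is a simplification for very small images.
    bucket_w = max(min_side, min(max_side, int(scaled_w // step) * step))
    bucket_h = max(min_side, min(max_side, int(scaled_h // step) * step))
    return bucket_w, bucket_h


print(pick_bucket(1024, 1024))  # square image at the target size
print(pick_bucket(2048, 1024))  # wide image, scaled down to fit the area
print(pick_bucket(800, 600))    # small image, kept near its original size
```

In this sketch a 2048x1024 image lands in a 1408x704 bucket, which stays within the 1024x1024 pixel budget, while an 800x600 image is only trimmed to 768x576 rather than enlarged.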


Sample Prompts

These prompts are used to generate sample images during training (likely taken from a prompt file referenced in an earlier part of the configuration). They describe the concepts the model is being trained to generate:

  1. Prompt: "father, rugby coach"
    (Generates images of a father who is a rugby coach.)

  2. Prompt: "grandfather, rugby coach"
    (Generates images of a grandfather who is a rugby coach.)

  3. Prompt: "father and grandfather went to mosque together"
    (Generates images of a father and grandfather together at a mosque.)


This configuration sets up a dataset for training a model (likely Stable Diffusion with LoRA) using images from a specified directory, with captions stored in .txt files. The settings emphasize data augmentation (flipping, caption shuffling) and efficient resolution handling (bucketing) to optimize training. The sample prompts suggest the model is being fine-tuned to generate images related to family members (father and grandfather) with a focus on rugby coaching and a specific cultural context (visiting a mosque).
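
The caption shuffling mentioned above can be sketched as follows. This is a minimal illustration, assuming captions are comma-separated tags and that the first keep_tokens tags stay in place; the function name shuffle_caption and the example caption are hypothetical, not taken from the training data.

```python
import random


def shuffle_caption(caption, keep_tokens=3, rng=None):
    """Shuffle comma-separated tags, leaving the first keep_tokens tags fixed."""
    rng = rng or random.Random()
    tags = [tag.strip() for tag in caption.split(",") if tag.strip()]
    fixed, rest = tags[:keep_tokens], tags[keep_tokens:]
    rng.shuffle(rest)  # only the tags after the fixed prefix are reordered
    return ", ".join(fixed + rest)


# Hypothetical caption for illustration only.
example = "father, rugby coach, khmer, whistle, stadium, smiling"
print(shuffle_caption(example, keep_tokens=3, rng=random.Random(0)))
```

Keeping the leading tags fixed is a common way to hold trigger words at the front of the caption while the remaining tags vary in order between training steps.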
