DnD Rogue woman with a horse [Flux] [Concept]

This LoRA was inspired by several sources, mostly Diablo IV and the DnD movie.

Inspirations and main idea

While playing Diablo IV, I wanted to recreate some characters from the game, in particular their style. This LoRA was inspired by boho and earthy themes as well as dark fantasy, mostly Dungeons and Dragons (Baldur's Gate 2, the DnD movie) and the Diablo aesthetic. Because of that, I tried to make a photorealistic image of a character that could fit both universes at once, Diablo and DnD. I don't aim to recreate the actual style and atmosphere of the titles mentioned above, so this can be seen as my take on cosplay, or perhaps a scene from a movie dedicated to one of these titles.

Dataset preparation

My dataset is based on 14 original images that I uploaded from my MidJourney account and used for further generation. First, all images were augmented with a horizontal flip. Then I used a more advanced technique to create color augmentations and variations of the images.

To do so, I used ControlNet (canny) with the XLabs sampler and XLabs ControlNet depth v3 (this one: XLabs-AI/flux-controlnet-canny). I used the same checkpoint as in training, Atomix FLUX Unet (v.1.0). This allowed me to produce more color variations and expand the dataset.

I used a LoRA tagging workflow with the Florence 2 tagger and resized the images to 512x672 (WxH).

The final dataset consisted of 14x2x2=56 images, including flip and color augmentation.
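The dataset arithmetic above can be sketched in a few lines of Python. This is only an illustration of how 14 originals become 56 images. The file-name pattern and the variant labels (`orig`, `hflip`, `base`, `recolor`) are hypothetical, and the factor of two for color is assumed to mean one ControlNet recolor per image next to the unchanged version.

```python
from itertools import product


def expand_dataset(originals, flips=("orig", "hflip"), colors=("base", "recolor")):
    """Return one name per (image, flip variant, color variant) combination."""
    return [
        f"{name}_{flip}_{color}"
        for name, flip, color in product(originals, flips, colors)
    ]


# Hypothetical names standing in for the 14 original images.
originals = [f"img_{i:02d}" for i in range(1, 15)]
variants = expand_dataset(originals)

print(len(originals), "originals ->", len(variants), "training images")
```

Each original contributes four entries (two flip states times two color states), which gives the 14x2x2=56 total.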

Training workflow

Now to the training workflow. I used the official workflow from Kijai (kijai/ComfyUI-FluxTrainer on GitHub), which is based on the Kohya scripts. I trained the LoRA on 56 images for 1000 steps. I found that the best results were at 1000 and 400 steps; all other checkpoints were less prominent but had potential. Based on my observations, these values correspond to 19 and 9 epochs respectively. Since I had varying success at other step counts, I may upload other checkpoints in the future.
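As a rough cross-check of how step counts relate to epochs, here is a minimal sketch of the usual conversion. The batch size and repeat count are not stated in this description, so the defaults of 1 are assumptions, and `steps_to_epochs` is a hypothetical helper, not part of ComfyUI-FluxTrainer or the Kohya scripts. The epoch numbers reported above were read from the trainer, which may count differently, so this naive estimate need not match them exactly.

```python
def steps_to_epochs(steps, num_images, batch_size=1, repeats=1):
    """Estimate how many passes over the dataset a step count represents."""
    if min(steps, num_images, batch_size, repeats) <= 0:
        raise ValueError("all arguments must be positive")
    samples_seen = steps * batch_size          # images processed so far
    samples_per_epoch = num_images * repeats   # images in one full pass
    return samples_seen / samples_per_epoch


# Assumed settings: 56 images, batch size 1, no repeats.
for steps in (400, 1000):
    print(steps, "steps ~", round(steps_to_epochs(steps, num_images=56), 2), "epochs")
```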

As for the checkpoint, I used Atomix FLUX Unet (v.1.0) for training. As for the training parameters, I used the fp8 training format without offloading and with gradient checkpointing.

LoRA deployment and testing

Now to deployment of the model. I tested it (and am still testing it to check for issues) using the same Unet and text encoders I used during training:

clip-L from the FLUX.1-dev repository on Hugging Face (black-forest-labs/FLUX.1-dev)
T5xxl fp8 encoder (FLUX.1 T5 Text Encoder)

So far, I got the best results with the following parameters:

LoRA model weight: 1.0
LoRA CLIP weight: 1.0
Steps: 15
CFG: 1.5
Sampler: Euler
Scheduler: simple

Since the LoRA was trained with tags from the initial training images, you may use the tags section from the example prompt instead of trigger words:

"A photo-realistic shoot from a front camera angle about a young woman dressed in traditional clothing stands confidently beside a horse in a forest setting, holding a bow and arrow. on the middle of the image, a 20-year-old dark-skinned woman with an afro hairstyle appears to be standing, looking directly at the viewer with a serious expression. she is wearing a long, flowing dress with intricate gold embroidery and a red shawl draped over her shoulders. her hair is styled in an intricate updo, and she is adorned with jewelry including earrings, a necklace, and a sword. on her right side, a brown horse with a black mane and a white heart-shaped mark is standing beside her. the background is blurred, with trees and greenery, and the lighting is soft and natural, creating a serene atmosphere.

A full body portrait, standing pose, photorealistic, African, fantasy, woman, archer, D&D character, full body portrait, photorealistic, draped cloak, bow, Greg Rutkowski visual style, belt, pouches, cinematic portrait, bun hair, dark skin, detailed, curly hair, horse, wilderness, forest, muddy road, worn boots, styled ornaments, tribal clothing, detailed clothing textures, depth of view, natural sunlight, elegant tribal jewelry, earrings, beaded necklaces, broche or pendant, D&D, fantasy superhero, rogue"
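To reuse the tags section with a different scene description, the prompt can be assembled programmatically. The sketch below is one possible way to do it, not the workflow used for the example images. `build_prompt` is a hypothetical helper, and the optional `dedupe` flag is an addition of this sketch (the example prompt repeats a couple of tags, and it is not known whether that was intentional).

```python
def build_prompt(description, tags, dedupe=False):
    """Join a free-text description and a tag list into one prompt string."""
    cleaned = [tag.strip() for tag in tags if tag.strip()]
    if dedupe:
        # dict.fromkeys keeps the first occurrence of each tag, in order.
        cleaned = list(dict.fromkeys(cleaned))
    return f"{description.strip()}\n\n{', '.join(cleaned)}"


# A shortened tag list taken from the example prompt above.
tags = ["photorealistic", "fantasy", "woman", "archer", "D&D character",
        "photorealistic", "draped cloak", "bow", "horse", "forest", "rogue"]

print(build_prompt("A young woman stands beside a horse in a forest.", tags, dedupe=True))
```

The description and tags are separated by a blank line, mirroring the layout of the example prompt.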

Credits

Thanks to the developers of the models and ComfyUI nodes mentioned above for inspiration in prompting and workflows. All credit for the models and workflows used goes to their respective authors (AlexLai, kijai). Thanks also to the authors of other awesome nodes, models and tools not mentioned here that were essential to creating this image.

Disclaimer on content

Since this model is at an early beta stage, it can generate content that is not suitable for all audiences if prompted to do so and used alongside a checkpoint (e.g. a de-distilled one). The LoRA does not depict a real person and serves testing purposes only.

Disclaimer on fair usage of training data

The training data (14 images) was created on a personal MidJourney account and is not intended to replicate or imitate the Midjourney model or its outputs. Transformative work, including augmentation, ControlNet, distilling, filtering and merging, was done to make the model's output resemble the original images less. The resulting model is intended for research purposes and has a non-commercial license to distribute, create or recreate any content. All credit goes to the authors of the original Midjourney model.

License

The LoRA inherits the license from Atomix Flux (used in training workflow as Unet):

FLUX.1 [dev] Non-Commercial License.

The FLUX.1 [dev] Model is licensed by Black Forest Labs. Inc. under the FLUX.1 [dev] Non-Commercial License. Copyright Black Forest Labs. Inc.

IN NO EVENT SHALL BLACK FOREST LABS, INC. BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH USE OF THIS MODEL.

Model type	LORA
Base model	Flux.1 D
Published	2025-01-15
