Qwen Breast Type Selector (WIP)

Details

Model description

This is a hub of multiple different breast types. Why?

Accuracy: The more varied images you add, the more you dilute your LoRA and the longer training takes. Qwen already knows what breasts are but doesn't know what areolas look like. Training the breasts as a single entity makes the LoRA highly compatible and keeps the results consistent. You can combine it with my innie vagina LoRA, so you can choose both your vagina and breast type. Because this LoRA focuses only on the breasts, plus a small amount of the surrounding area to guide their position and direction, it works well with posing and side angles without changing the body size. That helps if you have a LoRA trained on a character and want to maintain the likeness.

Training speed: This is a big one. This LoRA took only 1.2 hours to train, compared to 8 hours or even 1.5 days.

What's the issue now? The training resolution. High-resolution images like 4K are hard to come by, and I don't have the hardware to train that high, so 512 and 1024 are the only options. That isn't high enough to capture fine anatomical detail such as visible blood veins or proper areolas, so the areolas will look blurry. Upscaling or hires fix might help. Consumer hardware just isn't powerful enough unless you have 128GB of RAM to make up for the lack of VRAM; sadly, 64GB of RAM and a 5090 isn't enough.

The trigger word is b00b135

(This word produced unforeseen consequences in my training run, so be careful. Later steps fixed it, but it might still be apparent at lower strength.)

I found out the hard way that Qwen doesn't like it when you caption your LoRA without a trigger word. If you don't want your character wearing those ugly sunglasses, then don't add them to your dataset. Qwen works better as a trigger-word-only model when training. If your character has a shade of green you like or that is unique, caption it with an existing word, or mask it and create your own trigger word.
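
A minimal sketch of trigger-word-only captioning, assuming your trainer reads a .txt caption file with the same name next to each image (check your trainer, this layout is an assumption). The function name write_trigger_captions and the list of image extensions are made up for this example.

```python
from pathlib import Path

# Assumed set of image types in the dataset folder; adjust to your own files.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp"}


def write_trigger_captions(dataset_dir, trigger_word):
    """Write a caption file holding only the trigger word next to each image.

    Hypothetical helper: assumes the trainer pairs image.png with image.txt.
    Returns the list of caption paths that were written.
    """
    written = []
    # sorted() builds the full list first, so new .txt files are not re-scanned.
    for image_path in sorted(Path(dataset_dir).iterdir()):
        if image_path.suffix.lower() not in IMAGE_EXTENSIONS:
            continue  # skip notes, masks in other formats, old captions, etc.
        caption_path = image_path.with_suffix(".txt")
        caption_path.write_text(trigger_word + "\n", encoding="utf-8")
        written.append(caption_path)
    return written
```

Calling write_trigger_captions("my_dataset", "b00b135") would overwrite any existing captions in that folder, so keep a backup if you already wrote detailed ones.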

What should I do if I want to train it?

Use mask layering. What is that? You separate a breast into different layers: the nipples as layer 1, the areolas as layer 2, and the breast itself as layer 3. You would need to duplicate the image 3 times, create 3 different masks, and mask the layers one by one. This gives you 3 separate LoRAs. It is the most accurate way to do this, but it's very inefficient. Why do it? Qwen doesn't know what an areola is, and adding more words will confuse the AI further. If you say breast without a LoRA, it will include the areola, but it will look horrible. By separating the layers and creating your own trigger word for each one, you bypass any of Qwen's badly captioned or poorly produced training data. Your goal is for the model to learn the areola from your dataset, not from what Qwen already knows.
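
The three-mask idea can be sketched roughly like this, assuming you have already painted a label map where each pixel holds a layer number (0 for background). The label numbers, the LAYER_IDS names and the split_layers helper are all assumptions for illustration, and plain lists are used only to keep the example dependency-free; in practice you would load and save real mask images with an image library.

```python
def split_layers(label_map, layer_ids):
    """Return one binary mask per layer (255 = train on this pixel, 0 = ignore).

    Hypothetical helper: label_map is a list of rows of integer layer ids.
    """
    masks = {}
    for name, layer_id in layer_ids.items():
        masks[name] = [
            [255 if pixel == layer_id else 0 for pixel in row]
            for row in label_map
        ]
    return masks


# Assumed numbering, following the layer order described above.
LAYER_IDS = {"nipple": 1, "areola": 2, "breast": 3}

# Tiny 5x5 stand-in for a painted label map.
label_map = [
    [0, 3, 3, 3, 0],
    [3, 2, 2, 2, 3],
    [3, 2, 1, 2, 3],
    [3, 2, 2, 2, 3],
    [0, 3, 3, 3, 0],
]

masks = split_layers(label_map, LAYER_IDS)
```

Each mask would then go with its own copy of the image and its own trigger word, which is where the 3 separate LoRAs come from.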

Since this LoRA is just 1 set of the same breasts, I can include the areolas. This is the best way to train a LoRA without artifacts. The only thing wrong is that the images weren't high enough resolution. The boundary of the mask matters: I only masked the key areas that define the breasts, without including much of the person's likeness. Her skin tone and some moles might show, but that's about it.

Here's a thought: any word you use that isn't in Qwen's knowledge base trains everything in your image on your dataset, regardless of what that word means. Using an existing word will attempt to overwrite it instead. I tried it that way, and even after 6000 steps it didn't change much, since the word I used was connected to two people kissing. Using a trigger word fixed this.

Put your detailed descriptions in the prompt in ComfyUI or another WebUI, not in the captions of your training dataset. The model itself will learn each feature that makes up your character, as long as the image isn't too complicated.

For example, if you train on a real person and caption with the word male, your LoRA will use Qwen's knowledge of male instead of your character. The shape of his body will be lost, but his face will remain mostly intact, since AI in general is very good at faces. A trigger word instead stands for your character as a whole. Think of the prompt you write in the WebUI as the definition of your trigger word (like a dictionary entry).
