LoRA Dataset Caption Aid with Ollama Vision Uncensored

Model description

This workflow gives you a starting point when writing captions for a character LoRA dataset. You feed it an image, and it analyzes the image and returns a well-rounded description aimed at character LoRA training that should include the relevant details needed for proper captioning.

I found a good Ollama model with vision capability that isn't censored and gave it a detailed prompt on how to craft a proper caption for a character LoRA. The Ollama model is about 7 GB and should fit on almost any GPU if run on its own.

I also added a "hint" box so you can give the vision model some context in exceptional situations, such as an extreme close-up, where the image analysis can go wrong if the AI isn't given at least minimal context. Leave that box empty in most cases.
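For anyone curious how an optional hint might be combined with the captioning prompt outside the workflow, here is a minimal Python sketch. It is not the workflow's actual implementation: `BASE_PROMPT`, `build_prompt`, `build_request`, and the model name `your-vision-model` are all hypothetical placeholders, and the real prompt used in the workflow is not reproduced here. The payload fields (`model`, `prompt`, `images`, `stream`) follow the shape of Ollama's `/api/generate` request, where images are passed as base64 strings.

```python
import base64
import json

# Hypothetical instruction text; the workflow's real prompt is much more detailed.
BASE_PROMPT = "Describe this image as a caption for a character LoRA dataset."


def build_prompt(hint: str = "") -> str:
    """Return the base prompt, with the user's hint appended only if one was given."""
    hint = hint.strip()
    if not hint:
        # Empty hint box: send the base prompt unchanged (the usual case).
        return BASE_PROMPT
    return f"{BASE_PROMPT}\n\nContext hint from the user: {hint}"


def build_request(image_bytes: bytes, hint: str = "",
                  model: str = "your-vision-model") -> dict:
    """Build a JSON-serializable payload shaped like an Ollama /api/generate request."""
    return {
        "model": model,  # placeholder name, replace with the vision model you pulled
        "prompt": build_prompt(hint),
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,  # ask for one complete response instead of chunks
    }


if __name__ == "__main__":
    # Stand-in bytes; in practice, read the image file in binary mode.
    payload = build_request(b"fake image bytes",
                            hint="extreme close-up of the character's left eye")
    print(json.dumps(payload, indent=2))
```

The payload could then be posted to a local Ollama server (by default at `http://localhost:11434/api/generate`). Keeping the hint optional means the model sees the same prompt for every ordinary image and only gets extra context when it is needed.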

I have also added a few markdown notes with tips for the best possible captioning.
