Illustrious AND Anima (SDXL) Workflow (T2I/I2I) with NSFW detailing

Details

Model description

Updated for 2.0: SDXL and Anima (Flow) models. It is still a bit of a mess, but sections have been moved into subgraphs, and GetNode/SetNode are used to cut down on the clutter. Some areas do not support these nodes, so they still use long reroutes.

This is a major upgrade: since Anima is a UNet model, you can now toggle between a checkpoint (SDXL/Illustrious/etc.) and a UNet model. The Turbo LoRA is also toggled when switching models. Unfortunately, the full LoRA stack is not, so be careful.

The image input sections have not changed much from the 1.x series, other than that the Size and Orient block is now in a subgraph.

There is also extensive use of Spectrum forecasting to speed up rendering for both SDXL and Anima. By default, the first-round slow render uses 30 steps and each refining/detailer pass uses 20 steps.

Another big addition is the optional LLM node. It can convert a set of tags to natural-language input or enhance natural-language input you already have. See the notes for details. It currently uses Ollama with dolphin-mistral.

The detailers have been upgraded to use SAM3 for everything except the NSFW detailers. For faces and people, the yolo8 and SAM3 results are compared, and the one with the most segments is selected and sent for processing.
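The detector comparison for faces and people can be pictured with a minimal Python sketch. This is not the workflow's actual node logic: the `Detection` class, the `pick_detection` function, and the tie-breaking rule (ties go to the first argument) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Detection:
    """Hypothetical container for one detector's output."""
    source: str                                   # e.g. "yolo8" or "sam3"
    segments: list = field(default_factory=list)  # one entry per segment found


def pick_detection(a: Detection, b: Detection) -> Detection:
    """Return the detection with more segments.

    Assumption: on a tie, the first argument wins.
    """
    return b if len(b.segments) > len(a.segments) else a


yolo = Detection("yolo8", ["face_1", "face_2"])
sam = Detection("sam3", ["face_1", "face_2", "face_3"])
chosen = pick_detection(yolo, sam)  # sam3 found more segments, so it is chosen
```

In the workflow itself this choice happens between nodes; the sketch only shows the selection rule described above.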

On the output, you can now choose between the AuraSR block and Upscale by Model. The save-file block is selected automatically so that the right model name (Checkpoint or UNET) is used in the filename.

As a side note, Anima seems to produce better hands in general, so the first-pass hand detailer and the second pass are typically not needed to fix hands before running the second hand detailer.

Almost everything else from the 1.x notes still applies.

NOTE: The preview images were made with and without detailers and LLM conversion, and by toggling between Anima and Illustrious. When you toggle models, the model-specific General tags for the positive and negative prompts are swapped as well.

1.x series notes:

My workflow for SDXL models. It is a bit messy but functional. It has the following high-level features:

  1. Text to Image and Image to Image from multiple sources

  2. I2I can be from random Danbooru posts, image file, image folder, URL, or video with optional autotagging.

  3. Wildcard support

  4. Randomized orientation and image ratio

  5. Prompt list to create a series of images from a file

  6. Batch or single image generation

  7. Multiple detailers with support for detailing of background characters and faces and NSFW areas

  8. Uses a mix of LCM and standard samplers to speed up generation

General Notes:

The default sampler is euler_ancestral or lcm for the fast render (turbo) and dpmpp_2m for everything else. This is mostly for determinism: when trying out different tag or prompt changes, you see the difference from the prompt and not from the sampler. In normal use, you can change it to whatever you prefer. For the detailers, there is a single set of settings each for the slow and fast sections, which changes the settings for all detailers (hands, face, person, NSFW).

For the person and face detailers, there is also an automatic tiered approach that reduces the size of people or faces that are background figures. This cuts down on time and makes sure that background characters do not become super detailed and look odd. The largest segment is found, and the remaining segments are then split into those that are 50% smaller and those that are 75% smaller than the largest segment. The max size for the detailer is also tiered, so the smaller segments are size limited.
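The tiering can be sketched in Python as follows. This is an illustration under stated assumptions, not the workflow's actual node graph: segment size is taken to be pixel area, "50% smaller" and "75% smaller" are read as at most 50% and at most 25% of the largest area, and the function name `tier_segments` and the per-tier max sizes (1024, 768, 512) are hypothetical placeholders.

```python
def tier_segments(areas, max_sizes=(1024, 768, 512)):
    """Group segment areas into three tiers relative to the largest one.

    areas:     positive segment areas in pixels (assumed measure of size)
    max_sizes: hypothetical detailer max size per tier, largest tier first
    Returns a list of (area, tier, max_size) tuples in input order.
    """
    if not areas:
        return []
    largest = max(areas)
    result = []
    for area in areas:
        ratio = area / largest
        if ratio > 0.5:
            tier = 0   # foreground: more than half the largest segment
        elif ratio > 0.25:
            tier = 1   # at least 50% smaller than the largest segment
        else:
            tier = 2   # at least 75% smaller than the largest segment
        result.append((area, tier, max_sizes[tier]))
    return result


tiers = tier_segments([10000, 4000, 2000])
# tiers == [(10000, 0, 1024), (4000, 1, 768), (2000, 2, 512)]
```

Smaller tiers get a smaller max size, which is what keeps background figures from being rendered with as much detail as the main subject.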

Additional resources:

https://huggingface.co/ai-forever/Real-ESRGAN

https://openmodeldb.info/models/4x-Remacri

https://huggingface.co/Comfy-Org/sam3.1
