Astaroth
Model description
ver.2
I created Astaroth as a semi-realistic model, but when I looked at the output it felt incomplete. Version 2 is my attempt to bring it closer to a photographic look.
Prompt fidelity is slightly improved. Composition and pose dynamism are slightly reduced. Body structure and fingers may also be slightly less stable. In some images the photographic rendering is pushed too far. Depending on the image, adding "oily skin" to the negative prompt is recommended.
All sample images are LCM outputs at 512 x 768 pixels, with neither HiRes.Fix nor Adetailer applied. I leave them out only to demonstrate the model itself; in actual use it is better to enable them. Needing them is an unavoidable limitation of SD1.5.
If you use an upscaler, I recommend an ESRGAN-based one with denoise strength below 0.3 (e.g. 0.18; at one point I used 0.09 a lot). In principle, Latent upscalers are more in line with the characteristics of a generative model, but they appear to perform worse. Methods other than Latent do not reference the generation model: the output image is enlarged by the upscaler's own built-in model, so at higher strength any model risks producing similar-looking pictures. Some models are tuned with that in mind, but this one is not, so keep the strength low when using it.
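The upscaling advice above can be sketched as a small helper that computes the target size and rejects a denoise strength outside the recommended range. This is only an illustration: the function name `upscale_plan`, the 2x scale factor in the usage lines, and the returned dictionary layout are assumptions, not settings taken from any particular UI.

```python
MAX_DENOISE = 0.3  # the text recommends staying below this value


def upscale_plan(width: int, height: int, scale: float, denoise: float) -> dict:
    """Return target dimensions for an ESRGAN-style upscale pass.

    Hypothetical helper: it only checks the denoise bound described in the
    text and multiplies the base resolution by the chosen scale factor.
    """
    if not 0.0 <= denoise < MAX_DENOISE:
        raise ValueError(
            f"denoise strength {denoise} is outside the recommended range [0, {MAX_DENOISE})"
        )
    return {
        "width": round(width * scale),
        "height": round(height * scale),
        "denoise": denoise,
    }


# Base resolution of the sample images, with the example strength from the text.
plan = upscale_plan(512, 768, scale=2.0, denoise=0.18)
print(plan)
```

A stronger setting such as 0.5 would raise `ValueError` here, mirroring the warning that higher strengths let the upscaler's own model dominate the picture.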
ver.1
It is difficult to explain what this model is. Simply put, it is a modified version of basilisk.fp16.safetensors, a model I created two years ago and rediscovered in storage. It was originally a failed attempt based on an old recipe.
Since it was based on a recipe from a much older generation, there were numerous issues with image quality (low resolution and blurriness) and basic quality (such as body structure). To address these issues, I reinforced the core structure with NAI2, thoroughly enhanced photo expression and content, and combined the result with the high-resolution LoRA models I've been making recently. Some might say that the original model is no longer relevant.
As for photo models using NAI2, I have already experimented with several using Beyond, but merging anime models into the high-level layers tends to lower the characters' apparent ages. That makes them difficult to publish on CIVITAI, so they have been shelved. There are also models built on other people's photo-based models that seem to have improved fairly well, but they have problems with high-resolution output (though they work fine at standard resolution), which leaves them in a delicate situation.
While this model supports high-resolution output, I would not recommend it, as it only produces uninteresting images. The model also has rather distinctive characteristics, and depending on the prompt it may not generate any images at all. From the creator's perspective, the results are acceptable. Surprisingly, the body structure is relatively well-defined (though it does break on occasion), and the fingers are rendered quite realistically, which is somewhat puzzling.
The DPM++SDE sampler is the most recommended. Although I usually use 20 steps as my standard, photo-realistic rendering with DPM++SDE needs 24. The CFG scale is 6 to 7. This sampler provides the most stable body structure.
With Euler a, 20 steps are sufficient for photographic representation, but the result has a semi-realistic tone and does not match DPM++SDE in reproducing details such as faces at medium distance. Composition is more stable with DPM++SDE but somewhat less interesting, so Euler a is worth trying.
For photographic representation, DPM++3MSDE is superior, but the CFG scale needs to be reduced to around 3.5. The step count is around 28.
DDIM also produces quite good results. It is slightly inferior to DPM++SDE in detail stability, but its expressive power in composition and other aspects is at a high level. The number of steps is 30, and the CFG scale is around 5.5, though I am not certain of that value.
Most of the functionality checks were performed using LCM. It may be slightly inferior at reproducing fingers, but it produces adequate images in 7 steps.
All samplers may retain a slightly semi-realistic tone overall.
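For quick reference, the sampler notes above can be collected into a small lookup table. This is a minimal sketch: the `SamplerPreset` class and `describe` helper are illustrative names, the keys simply reuse the sampler labels from the text, and `None` marks samplers for which no CFG value is given.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class SamplerPreset:
    steps: int
    # (low, high) CFG range; None where the description gives no CFG value.
    cfg_range: Optional[Tuple[float, float]]


# Values transcribed from the notes above; nothing here is measured or new.
PRESETS = {
    "DPM++SDE": SamplerPreset(steps=24, cfg_range=(6.0, 7.0)),
    "Euler a": SamplerPreset(steps=20, cfg_range=None),
    "DPM++3MSDE": SamplerPreset(steps=28, cfg_range=(3.5, 3.5)),
    # The author marks the DDIM CFG value as uncertain.
    "DDIM": SamplerPreset(steps=30, cfg_range=(5.5, 5.5)),
    "LCM": SamplerPreset(steps=7, cfg_range=None),
}


def describe(name: str) -> str:
    """Format one preset as a single human-readable line."""
    preset = PRESETS[name]
    if preset.cfg_range is None:
        cfg = "CFG not specified"
    elif preset.cfg_range[0] == preset.cfg_range[1]:
        cfg = f"CFG about {preset.cfg_range[0]}"
    else:
        low, high = preset.cfg_range
        cfg = f"CFG {low}-{high}"
    return f"{name}: {preset.steps} steps, {cfg}"


for sampler in PRESETS:
    print(describe(sampler))
```

The step counts for DPM++3MSDE and the CFG values for DPM++3MSDE and DDIM are approximate in the original notes, so treat them as starting points rather than fixed settings.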
The sample images are low-resolution outputs at 512 x 768 pixels (which has become my standard recently). HiRes.Fix and Adetailer are not used. Negative prompts are used for CIVITAI as a precaution, but they are not mandatory.
Note that Astaroth is an angel (demon) holding a poisonous snake. There is a much longer story about the basilisk, but that will be for another occasion.