NeoSD

Model description

Overview

This full fine-tune (FT) aims to fundamentally improve the SD1.5 model. The goals include multi-character scenes, pose diversity, stable body structure, and adding information to the model.

The base model is an anime-style model that incorporates elements of NAI2. I am working toward version 1 through repeated small-scale FTs of several thousand images each. The plan is to create several base models as raw material, then improve the training method while ultimately merging and adjusting them.

High-resolution output works to some extent, but I do not recommend it.

Although it is not noted on each one, all samples are low-resolution LCM outputs.

Note: Since this is SD1.5, put what you want to generate at the start of the prompt. In many cases, quality tags only get in the way.

I now have five FT models to use as materials. I'll stop producing FT materials for now and combine these five with existing materials to explore new models.

Qwen's output isn't particularly interesting, but it is stable and rarely breaks down, so I plan to use 0.3 (which may need expanding) as a base and supplement it with NSFW elements from materials such as 0.4.

Combining these with existing models should yield something like TeatimeDream Neo.

ver.0.32L

I tried using a LoRA to compensate for the unstable parts of 0.32. Anime images are now relatively stable, but because this single LoRA packs in many character elements, NSFW elements show up a little more often. I have been using this LoRA for a while. Because it was built from anime drawings and captions crawled from CIVITAI, its NSFW elements were too strong. I adjusted its layers before using it, but then applied it a little too strongly as a correction. Even so, some images still do not look like anime drawings.

It's not a big problem. Ideally, the adjustment would be done with multiple LoRAs, but this version produces some interesting images.

ver.0.32

While checking the data used for ver.0.31, I discovered that some of the caption data was missing entirely.

For some images the file extension, or rather the actual format of the referenced file, was wrong. I believe I corrected that along with other minor character-encoding issues, but some areas now work well while others do not work at all. In addition, convergence is slower than last time. I expect it would settle after around 150 epochs, but this release is taken from epoch 90.

Disappointingly, the basic issues have not changed much and I cannot say the quality has improved, but this version does correct the errors in the previous data.
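The kind of check that catches these problems can be sketched as follows. This is only an illustration, not the script I actually used: it assumes each image sits next to a same-named `.txt` caption (a common trainer layout, but an assumption here), and the function names `sniff_format` and `audit_dataset` are made up for the example.

```python
from pathlib import Path

IMAGE_EXTS = {"png", "jpg", "webp"}
EXT_ALIASES = {"jpeg": "jpg"}  # treat .jpeg and .jpg as the same format


def sniff_format(path):
    """Guess the real image format from the file's leading bytes."""
    with open(path, "rb") as fh:
        head = fh.read(12)
    if head.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if head.startswith(b"\xff\xd8\xff"):
        return "jpg"
    if head[:4] == b"RIFF" and head[8:12] == b"WEBP":
        return "webp"
    return None  # unknown or damaged file


def audit_dataset(root):
    """Return (file name, problem) pairs for one flat dataset folder.

    Assumed layout (hypothetical): image.png + image.txt side by side.
    """
    problems = []
    for image in sorted(Path(root).iterdir()):
        ext = image.suffix.lower().lstrip(".")
        ext = EXT_ALIASES.get(ext, ext)
        if ext not in IMAGE_EXTS:
            continue  # skip captions and anything that is not an image
        if sniff_format(image) != ext:
            problems.append((image.name, "format_mismatch"))
        caption = image.with_suffix(".txt")
        if not caption.exists():
            problems.append((image.name, "caption_missing"))
            continue
        try:
            text = caption.read_text(encoding="utf-8")
        except UnicodeDecodeError:
            problems.append((image.name, "caption_not_utf8"))
            continue
        if not text.strip():
            problems.append((image.name, "caption_empty"))
    return problems
```

Running `audit_dataset("path/to/dataset")` before training lists every image whose content does not match its extension or whose caption is missing, empty, or not valid UTF-8.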

ver.0.31

Last time, I mentioned that the version 0.3 series, which primarily uses the output of Qwen-image, would be the base model. However, since 0.3 had an extremely small number of image resources (Qwen-image's images barely change even when the seed is changed), I added more resources and reworked the base model to create version 0.31. The Qwen images were stable but a bit boring, so I tried to add some variety.

Version 0.3 actually converged faster than any of my previous models, but adding more resources made 0.31 less stable than expected. The body structure and fingers have become quite unstable.

More unexpectedly, the image style is unstable. I intended to produce stable anime images, but sometimes the images end up looking semi-realistic. Try removing prompts like masterpiece and best quality (in some cases, it may be better to add them). This may be due to remaining issues with the base model or captions.

As such, versions 0.32 and 0.33 may follow.

That said, as base-model material, I think 0.31 can produce images not seen in previous SD1.5 models. However, it is unadjusted straight after FT, so I do not recommend using it on its own.

As usual, the samples are raw 512x768 LCM outputs. Mid-distance faces would normally be fixed with HiRes.Fix or Adetailer, but no such processing has been applied.


ver.0.5

This model produces large, dynamic movements. Convergence wasn't bad, but the images weren't stable, so I ended up training it for 100 epochs.

ver.0.4

This version uses different materials and more images than before. Approximately 10,000 images were used, and training ran for 60 epochs.

Training converged slowly, which hurt body structure and fine details, but it produces beautiful images when it gets things right. It uses materials from a series similar to those of 0.1 and 0.2, so it produces similar images.

It has clear strengths and weaknesses in how it responds to each prompt, and may have some quirks. Since it is primarily intended as merge material, I'll consider how best to use it when merging.

ver.0.3

This is based on the output of Qwen-image. There are earlier versions, but they have a Qwen-like feel that's almost laughable, even down to the SFW elements. ver.0.3 itself was regenerated without those elements, so the Qwen feel is somewhat diminished. This time, problems in my Qwen environment affected the VAE, resulting in worse finger accuracy and color reproduction. Even so, I think it is decent new material for SD1.5.

ver.0.1+0.2K

A simple tweak didn't make it look very cute, so I added a cutesy LoRA (which I don't usually use because it has strong side effects). When it works, the output can be used as is, but the fingers and other parts tend to break down easily. Would it be better to apply it only to the face in Adetailer? (Would it have been okay to just release the LoRA?)

ver.0.1+0.2

Merge example. This combines the composition of ver.0.1 with the characters and painting style of ver.0.2, with my usual LoRAs applied lightly. I focused on the details of mid-distance faces and the background. I only polished up some rough edges, but I think it's good enough for everyday use.
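For readers new to merging, the basic idea can be sketched as a weighted average of two checkpoints, optionally with a different ratio for some groups of layers. This is a generic illustration and not my actual recipe: the function name `merge_weights`, the key prefixes, and the ratios are all hypothetical, and weights are shown as flat lists of floats instead of real tensors.

```python
def merge_weights(model_a, model_b, default_alpha=0.5, prefix_alphas=None):
    """Blend two checkpoints as (1 - alpha) * A + alpha * B per weight.

    model_a / model_b: dict mapping a weight name to a flat list of floats.
    prefix_alphas: optional dict mapping a key prefix to its own alpha, so
    some layer groups can lean toward A and others toward B.
    """
    prefix_alphas = prefix_alphas or {}
    if set(model_a) != set(model_b):
        raise KeyError("both models must contain the same weight names")
    merged = {}
    for key, a_vals in model_a.items():
        b_vals = model_b[key]
        if len(a_vals) != len(b_vals):
            raise ValueError(f"shape mismatch for {key}")
        alpha = default_alpha
        for prefix, value in prefix_alphas.items():
            if key.startswith(prefix):
                alpha = value  # a later matching prefix overrides an earlier one
        merged[key] = [(1.0 - alpha) * a + alpha * b
                       for a, b in zip(a_vals, b_vals)]
    return merged


# Toy example with made-up layer names: keep "down." layers from A,
# take "up." layers entirely from B.
a = {"down.w": [0.0, 2.0], "up.w": [4.0]}
b = {"down.w": [2.0, 4.0], "up.w": [0.0]}
blended = merge_weights(a, b, default_alpha=0.0, prefix_alphas={"up.": 1.0})
```

Which layer groups influence composition and which influence style is something to find by experiment. The sketch only shows the mechanics of a per-group ratio.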

ver.0.2_38

This version is made from a completely different material series than ver.0.1 (though there are many similar images). I think this version is more stable for characters and anime-style illustration, but its variety of poses is inferior to ver.0.1.


ver.0.1_41

While ver.0.1 worked reasonably well, I felt that 100 epochs was excessive, so I reworked this version with 41 epochs, revising the materials and changing the captions. In exchange for fewer epochs, I increased the amount of material 1.5-fold (to approximately 4,500 images). I also attempted to unify the anime art style. The details are a bit sloppy, and the fingers are a bit unstable. The facial details can be easily corrected with HiRes.Fix or LoRA, so it shouldn't be a problem. Does it need a few more epochs? If anything, the body structure seems to become unstable as the number of epochs increases.
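As a rough comparison of the two runs, the total number of image passes can be worked out from the figures above. The 3,000-image count for ver.0.1 is inferred from 4,500 / 1.5 and is not stated anywhere, and since batch size is not given, this counts image passes and not optimizer steps.

```python
def image_passes(num_images, epochs):
    """Total number of times any training image is shown to the model."""
    return num_images * epochs


# ver.0.1: 100 epochs over an inferred ~3,000 images (4,500 / 1.5).
passes_v01 = image_passes(3000, 100)
# ver.0.1_41: 41 epochs over approximately 4,500 images.
passes_v01_41 = image_passes(4500, 41)

ratio = passes_v01_41 / passes_v01
print(passes_v01, passes_v01_41, round(ratio, 3))
```

Under these assumptions the reworked run sees roughly 60% as many image passes as the original, which is consistent with wondering whether a few more epochs would help.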


ver.0.1

This is the output of an anime-style model, fully fine-tuned for 100 epochs. This is my second full FT model.

It feels more stable than my first attempt, but the overall finish isn't quite there yet. It would probably be better to adjust it by merging, but I'll try FT alone for a while.

Looking back, I wonder why EtudeFT was so difficult. Perhaps it was a problem with the base model.
