2D Gold Fish | High-Res Anime XL
Model description
This model is a small finetune and LoRA merge of Seele-NoobAI-SDXL v2.1. Both this model and Seele are based on NoobAI 1.0 v-pred.
The goal of this model is to produce high-quality, flat 2D anime images and backgrounds.
The main factor that differentiates this model from others is the ability to natively generate images as large as 2048x2048. Every image posted above was natively generated: no upscaling, no post-editing, no inpainting at all.
Why use this model?
It can create true 2D anime in a flat, anime-screencap style
It can produce illustration-grade backgrounds
It can generate images up to 2048x2048, so there's no need to upscale if you don't want to
It's NoobAI-based and v-pred, which means you gain access to all of NoobAI's knowledge plus the benefits of v-prediction
NoobAI/Illustrious LoRAs seem to work for the most part (I haven't tested many)
Where does it suck?
Hands and feet are hit-and-miss. When you generate at 2048x2048 or any unrecommended resolution, things break more often than not. Due to the 2D style bias, some artist tags and LoRAs will be affected. Some background types are particularly rough, such as cities and indoor environments.
How to use?
The prompt follows the formatting from Seele. Start your prompt with Danbooru tags describing your image. At the end of your prompt, put:
masterpiece,best quality,absurdres,highres,high resolution,
As for the negative, all you need is:
worst quality
Here are a few negative-prompt recommendations I feel will help your image.
bad perspective: This helps with the backgrounds a bit.
too many fingers,bad hands: These and other hand-related tags can help by placing emphasis on the hands when you're trying to correct them.
ringed eyes: Sometimes the eyes are generated with a ring around the pupil. This tag in the negative will help prevent that.
flat color: Due to the material used for training, the 2D style can become overly simplified. If you want to enhance the 2D look, place this tag in the negative.
minimalism: Like flat color, this also enhances the 2D style, but much, much more. Using it together with flat color will bring out the gradient style of the training material that was overshadowed by the predominantly flat 2D material.
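If you queue a lot of generations, the prompt recipe above can be captured in a small helper. This is a hypothetical sketch (the function and parameter names are mine); only the quality suffix and the "worst quality" negative come straight from this guide:

```python
# Hypothetical helper that assembles prompts following this model's recipe.
# The tag strings are from the guide above; the helper itself is my own.

QUALITY_SUFFIX = "masterpiece,best quality,absurdres,highres,high resolution"

def build_prompts(danbooru_tags, extra_negatives=()):
    """Return (positive, negative) prompt strings.

    danbooru_tags:   the Danbooru tags describing your image
    extra_negatives: optional tags such as "ringed eyes" or "flat color"
    """
    positive = ",".join(danbooru_tags) + "," + QUALITY_SUFFIX + ","
    negative = ",".join(["worst quality", *extra_negatives])
    return positive, negative

pos, neg = build_prompts(["1girl", "tree", "grass", "moss"],
                         extra_negatives=["ringed eyes", "flat color"])
print(pos)  # → 1girl,tree,grass,moss,masterpiece,best quality,absurdres,highres,high resolution,
print(neg)  # → worst quality,ringed eyes,flat color
```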
How to prompt in this model?
This is a v-pred, NoobAI-based model, meaning that if you don't prompt for something, it most likely won't show up. You have to be very deliberate with your Danbooru tagging. Please study my example images and become very familiar with Danbooru tags. Tags that contradict each other or don't make sense together will produce bad results.
This is particularly noticeable with backgrounds. For example, if you just prompt "forest" alongside a cute 1girl like Sango from Inuyasha, the forest background will generate poorly, because the forest tag is simply too broad a term. Instead, prompt for things like "tree,grass,rock,moss,leaf,tree shade" and any other forest-related terms. We need the model to put emphasis on the background as well as your waifu Sango!!
Parameters:
Sampler: Euler Ancestral CFG++, or Euler/Euler Ancestral
Schedule type: DDIM
Steps: 48 [44-60] (anything lower will decrease the quality)
CFG Scale: 1 (for CFG++) or 5 (normal)
VAE: SDXL Anime VAE Dec-only B3 (Built in)
For any additional information, please go here.
Recommended Resolutions
1568x2048, 1408x2048, 1728x2048, 1024x2048
There are more you can use, but these are the ones I stick to the most. These particular resolutions give me the fewest bad hands/feet. Yes, you can also use the default SDXL resolutions (or any others), but why use this model if you're not taking advantage of the higher-quality results?
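As a quick sanity check before queueing a batch, the recommendations above can be collected into a small validator. This is a sketch under my own naming; the resolutions, step range, and CFG values are the ones listed in this guide:

```python
# Sketch of a settings check based on the recommendations above.
# The values come from this guide; the helper itself is hypothetical.

RECOMMENDED_RESOLUTIONS = {
    (1568, 2048), (1408, 2048), (1728, 2048), (1024, 2048),
}

def check_settings(width, height, steps, cfg_scale, cfg_pp=False):
    """Return a list of warnings for settings outside the recommendations."""
    warnings = []
    if (width, height) not in RECOMMENDED_RESOLUTIONS:
        warnings.append(f"{width}x{height} is not a recommended resolution; "
                        "hands/feet may break more often")
    if not 44 <= steps <= 60:
        warnings.append(f"{steps} steps is outside the 44-60 range; "
                        "lower values decrease quality")
    expected_cfg = 1 if cfg_pp else 5  # CFG++ wants 1, normal sampling wants 5
    if cfg_scale != expected_cfg:
        warnings.append(f"CFG {cfg_scale} differs from the recommended {expected_cfg}")
    return warnings

print(check_settings(1568, 2048, 48, 1, cfg_pp=True))  # → []
print(check_settings(1024, 1024, 20, 7))               # three warnings
```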
This goes without saying, but using such a high resolution with such a high step count will greatly increase your generation time. It takes me roughly 25 seconds to generate an image at 1568x2048 on a 5090. You'll just have to decide whether the image quality is worth the trade-off in time.
Why make this model?
I originally had no intention of making or releasing another checkpoint, but after being introduced to Seele I began to get intrigued. It started with a LoRA using the data from the MeMax material. I liked the results but felt the backgrounds were too poor, so I trained a LoRA using the Worldly material, but the backgrounds barely improved.
At this point I began to wonder why I couldn't affect the backgrounds to the degree I could with character styles, and then it hit me: why not try finetuning the model itself? I'd never done an actual finetune, but I figured it couldn't be much different from LoRAs.
After many days of iteration, I finally had a small finetune. I used pretty much all my MeMax data, along with background material from both Worldly and CoMix.
The results were... not very noticeable.
So after scratching my head, I decided to train new LoRAs on top of the finetune, using basically the same material as the MeMax and Worldly LoRAs. The difference was night and day. Not only did the anime style look better, but so did the backgrounds, by a large margin.
At that point I knew I had struck some kind of gold, and thus began my journey of baking. I've made so many changes and improvements to the MeMax and Worldly datasets thanks to this journey. What you see above is the culmination of many weeks of training, testing, and failing. It's not perfect, but I'm more than thrilled I managed to get it to the state it's in.
Final Comments
A big thanks to waw1w1 for putting together the amazing Seele model. I wouldn't have even bothered if it didn't exist. This model follows the same license as everything it's based on. Use it responsibly.