JPEG Repair
Details
Download Files
Model description
This is a proof of concept LORA that aims at replacing jpeg artifacts (blocking/grid, ringing) with predicted original textures for real photographs, specialized in skin textures.
Dataset
The training images consists of 750 handpicked high quality photos of (female) body parts of various distances (from extreme closeups to fullbody), angles, lightings, etc from a private dataset. The images were manually resized/cropped (Lanczos) to 768 (0.6MP), 1024, 1280 and 1536 (2.36MP) resolution buckets from 5-30 Megapixels originals to create target images, and then compressed with ffmpeg -q:v 2 and 6 to create control images. Most images have a 4:3 or 3:2 aspect ratio, aspect ratios are maintained after resizing.
Training
This LORA was trained on the bf16 unquantized Kontext model weights. It was trained with adamw8bit, LR 2e-04 and cosine with restarts (2.66 cycles) with a total of 3500 steps, including 500 steps warmups, then continue with 1250 steps, both with wavelet loss, which did a good job to get rid of the ringing and blocking artifacts (MSE loss could only remove some blocking and had very limited effect on ringing at 1250 steps during my tests), then another 1250 steps with MSE loss which helped bringing extremely fine pixel level textures and improved texture clearity (wavelet gave a slightly blurred result for pixel level textures). I then finished it up with an extra 500 steps with only the -q:v 6 samples, I thought that helped with adding textures. The entire training was only focused on low noise late timesteps loss.
How to use
I strongly recommand using the full, unquantized (bf16) base model with this LORA, see the limitations section below.
There is no trigger word. Simply add the LORA to a normal Kontext workflow and hit run.
I get good results with 25 steps euler/simple.
Note that many photo viewers/editors blur and smooth out the pixels when you zoom in, including browsers. If you don't see a difference in the repaired image, try switching to a viewer that doesn't do this. For macOS, Finder preview does blur the pixels when you zoom in, the Preview app doesn't. Nothing in ComfyUI allows you to zoom to the degree that you can see pixels clearly so external viewers are better for this.
Roadmap
Add 2048 resolution samples to the dataset
Captioning
Train/experiment/release a 2x model that you first scale the input image by 2x, which will make the artifacts 2x bigger and maybe easier for the model to produce high quality results
Add samples with scaled artifacts to the dataset
Add samples with misaligned 8x8 JPEG blocks to the dataset
Limitations
AI generated images - This LORA was trained ONLY on real photographs and is intended to restore real photos. It may not know how to restore AI generated images. I have not tested this.
Quantization - During my testing, the Kontext base model couldn't even reliably reproduce jpeg artifacts from the input image when it's quantized to fp8, everything just became a blurred mess when you zoom in to check the pixels. I had to train and run this LORA with the unquantized model to get meaningful results.
Resizing - Resizing the input image is not recommanded because that changes the size of the JPEG block patterns and this LORA is only trained to process the 8x8 blocks.
Upscaling - This is NOT an upscaler and does not upscale images. It does add textures but only when the JPEG artifacts suggest that there are original textures that are lost. It will NOT add textures out of thin air to completely smooth skin/surfaces that has no artifacts and will not automatically convert plastic skins to a real looking one.
Old, wrinkled, male skin - Not tested and not trained on old skin.
Faces - It can restore faces, but it's not specifically trained for this due to civitai terms and conditions. It will remove blocking/ringing artifacts for faces like usual, but YMMV for adding face skin textures. I may release a version trained also with faces outside of civitai.
Noisy images - Will work for images with lots of (jpeg compressed) chroma and luma noise, but not reliably. Training data does contain some noisy images, but only a small portion.
Motion blur - The dataset does not contain any image with serious motion blur, I find that it doesn't produce bad results for motion blur, but it's also not trained for this.
Shallow depth of field - Many images in the dataset contain shallow depth of field (they're captured with real cameras/lenses) so it seems like there's a bias to create the blurred shallow depth of field effect when the model is not sure what to do. I'll plan to reduce this bias.
Very dark skin tones - I do not have training data for this, and I also did not test this. It may or may not work.
Lots of pubic hair - Training data are mostly no or little pubic hair, it should still work, but it's not specifically trained for this.
Prompt adherence - It is intened soly for restoring compressed JPEG images, regular prompts may still work with this LORA, but expect it to have reduced quality.
This is a proof of concept LORA. It does surprisingly well, but it's a proof of concept LORA.


