Clarity TagFlow

The intelligent, local-first image tagging and AI dataset curation powerhouse — now rewritten in Rust.

The original source code is available on GitHub.

Clarity TagFlow is a modern desktop application designed to streamline the process of tagging images for Machine Learning datasets, Stable Diffusion training, and digital asset management. Built with a strict focus on privacy, speed, and optimization, it runs state-of-the-art AI models entirely on your local machine.

🚀 Project Status: The Rust rewrite is here! The Grand Blueprint became reality — Clarity TagFlow is now a fully native application. No JVM, no 2 GB heap, no garbage collection pauses. Just one small, blazing-fast binary built from ~22,000 lines of pure Rust.

💬 We Need Your Feedback! Do you miss an old feature that hasn't been ported yet? Is there something you don't like, or an improvement you're dying to see? Leave a comment and let us know!


🛠️ System Requirements

  • OS: Windows (Inno Setup installer), macOS (.dmg, Apple Silicon), Linux (.tar.gz)

  • Runtime: None! Native binary — no Java, no JVM, nothing to install first

  • Install size: A few tens of MB (down from 300–400 MB)

  • Startup: Near-instant, with a skippable animated splash screen

  • Optional: VLC for video playback (the app runs fine without it and will politely offer an install link)


🌟 Key Features

🤖 Local AI Powerhouse

  • Privacy First: No images or data are ever uploaded to the cloud. All processing happens 100% offline.

  • Multi-Model Support: Seamlessly switch between JoyTag, PixAI v0.9, and the WD14 v3 family (ConvNext, SwinV2, Eva02) — now with a built-in Model Manager that downloads models with progress bars and auto-discovers ones you already have.

  • Smart Thresholding: Fine-tune confidence thresholds to control exactly how strict the AI is when applying tags.

  • Buttery-Smooth Inference: ONNX Runtime ships with the app and runs at Level-3 graph optimization on a background thread — the UI never stutters while tagging.
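As a rough illustration of what a confidence threshold does, here is a minimal Python sketch. It is not the app's Rust code: the `select_tags` helper, the default threshold, and the example scores are all hypothetical. It only shows the general idea of keeping tags whose score clears a cutoff.

```python
from typing import Dict, List


def select_tags(scores: Dict[str, float], threshold: float = 0.35) -> List[str]:
    """Keep tags scoring at or above the threshold, highest confidence first."""
    kept = [(tag, score) for tag, score in scores.items() if score >= threshold]
    # Sort by descending score; break ties alphabetically for stable output.
    kept.sort(key=lambda pair: (-pair[1], pair[0]))
    return [tag for tag, _ in kept]


# Hypothetical per-tag confidences, as a tagging model might return them.
scores = {"sky": 0.98, "outdoors": 0.62, "umbrella": 0.34, "rain": 0.41}

strict = select_tags(scores, threshold=0.6)  # fewer, more certain tags
loose = select_tags(scores, threshold=0.3)   # more tags, more noise
```

A higher threshold trades recall for precision: `strict` keeps only the two most confident tags, while `loose` keeps all four.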

🎨 AI Creative Suite (NEW)

  • In-App Image Generation: The app installs and manages its own ComfyUI backend, downloads GGUF-quantized Flux.1 and Z-Image Turbo models, and gives you full prompt/steps/guidance/seed controls with live logs — zero manual setup.

  • AI Background Removal: Right-click → Remove Background. BiRefNet computes a saliency matte and saves a transparent PNG next to the original.

  • Pixal3D — Image to 3D: Turn a single image into a textured GLB 3D model and inspect it in the built-in 3D viewer (orbit, zoom, PBR lighting).

  • Spatial Scene: An Apple-Photos-style depth parallax effect — Depth Anything V2 estimates depth, and your photo subtly shifts in 3D as you move the mouse.

📷 Pro Image Support (NEW)

  • Camera RAW: Full pure-Rust develop pipeline for DNG, Sony ARW, Canon CR2, and Nikon NEF — colors matched against the camera's embedded preview.

  • Radiance HDR with tone mapping, plus PNG, JPEG, GIF, BMP, WebP, ICO, TIFF, AVIF, and HEIC — all decoded in pure Rust, identical on every OS, no codec DLLs.

  • SD Metadata Done Right: A real parser for A1111, ComfyUI, and Civitai generation data — found in any container (PNG, JPEG, WebP, AVIF), not just PNG text chunks.
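To give a feel for what parsing generation data involves, here is a simplified Python sketch for the plain-text layout that A1111 commonly writes: prompt lines, an optional "Negative prompt:" line, and a final "Steps: ..." settings line. It is an assumption-laden illustration, not the app's parser. `parse_a1111` and the sample text are hypothetical, and real files have more edge cases than this handles.

```python
import re
from typing import Dict

# One "Key: value" pair from the settings line; values may be double-quoted.
_SETTING = re.compile(r'\s*([^:,]+):\s*("[^"]*"|[^,]*)(?:,|$)')


def parse_a1111(text: str) -> Dict[str, object]:
    """Split A1111-style text into prompt, negative prompt, and settings."""
    lines = text.strip().splitlines()
    settings: Dict[str, str] = {}
    # Assumption: the settings are on the last line and start with "Steps:".
    if lines and lines[-1].startswith("Steps:"):
        for key, value in _SETTING.findall(lines.pop()):
            settings[key.strip()] = value.strip().strip('"')
    prompt, negative, in_negative = [], [], False
    for line in lines:
        if line.startswith("Negative prompt:"):
            in_negative = True
            line = line[len("Negative prompt:"):]
        (negative if in_negative else prompt).append(line.strip())
    return {
        "prompt": "\n".join(prompt).strip(),
        "negative": "\n".join(negative).strip(),
        "settings": settings,
    }


sample = (
    "a lighthouse at dusk, oil painting\n"
    "Negative prompt: blurry, low quality\n"
    "Steps: 20, Sampler: Euler a, CFG scale: 7, Seed: 1234, Size: 512x512"
)
parsed = parse_a1111(sample)
```

The same text can live in different containers (a PNG text chunk, EXIF in a JPEG, and so on), so extracting the string and parsing it are best kept as separate steps.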

⚡ Accelerated Workflow

  • Batch Processing: Auto-tag entire folders of images in minutes. Choose to Append new tags or Overwrite existing ones completely.

  • Smart Autocomplete: Type faster with a context-aware autocomplete system that learns from your current dataset and standard tag libraries.

  • Sidecar Compatibility: Reads and writes standard .txt sidecar files, ensuring seamless compatibility with Kohya_ss, OneTrainer, and other major training tools.

  • Deep Scan ("Find Issues"): Decode-verify every image to catch corruption, and find exact duplicates via SHA-256 — with one-click cleanup.

  • Smarter Booru Downloader: The Gelbooru downloader now writes tag-role sidecars (artist/character/copyright/general), with blacklist support, dedup logging, and built-in good-citizen rate limiting.
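Exact-duplicate detection via SHA-256 comes down to hashing every file's bytes and grouping files that share a digest. The Python sketch below illustrates that general technique only. The helper names are hypothetical, and the app's own Rust implementation may differ (for example, by pre-filtering on file size).

```python
import hashlib
from collections import defaultdict
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large images never need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(folder: Path) -> List[List[Path]]:
    """Return groups of files in `folder` whose contents are byte-identical."""
    by_hash: Dict[str, List[Path]] = defaultdict(list)
    for path in sorted(folder.iterdir()):
        if path.is_file():
            by_hash[sha256_of(path)].append(path)
    # Only digests shared by two or more files are duplicates.
    return [group for group in by_hash.values() if len(group) > 1]
```

Calling `find_exact_duplicates(Path("my_dataset"))` (a hypothetical folder) returns one list per set of identical files. This only catches byte-for-byte copies, so a resized or re-encoded image will not match.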

🎨 Modern Experience

  • GPU-Rendered UI: The entire interface is drawn on the GPU, so scrolling a wall of 10,000 thumbnails is much smoother than before.

  • Gallery Layout: A gorgeous Pinterest-style masonry grid with lazy thumbnail loading, a click-to-open detail popup, and a floating draggable search pill.

  • 5 Themes, 3 Brand New: Including Space (animated starfield), Aurora (drifting pastel blobs), and Glass (frosted translucent panels).

  • Visual Feedback: The AI status orb got a full rewrite — a 3D particle sphere that breathes while thinking and morphs through shapes during long jobs.

  • Polish Everywhere: Color emoji, full CJK font fallback (no more tofu), movable popups that remember their position, HD thumbnails for high-DPI displays, live CPU/RAM graphs, and a crop tool.

🛡️ Control & Safety

  • Global Blacklist: Automatically filter out unwanted tags across your entire dataset.

  • Session Memory: Newly added AI tags are highlighted in Cyan, making it incredibly easy to review changes before committing to a save.

  • Hardened Backups: AES-256 encrypted zips with pre-flight corruption checks, dated filenames, and live progress.

  • Encrypted Secrets: Civitai API keys, Hugging Face tokens, and rate-limit counters are all stored DPAPI-encrypted on Windows.

  • Crash Containment: Background workers are panic-isolated — a failure shows a clean error message instead of killing the app.
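For readers who script around their datasets, here is a small Python sketch of how a blacklist can interact with the Append and Overwrite modes when tags end up in a .txt sidecar. It rests on assumptions: sidecars are treated as comma-separated tag lists (the convention used by common trainers), the blacklist is applied to existing tags as well as new ones, and all function names are hypothetical rather than taken from the app.

```python
from typing import Iterable, List


def parse_sidecar(text: str) -> List[str]:
    """Split a comma-separated sidecar string into clean tags."""
    return [tag.strip() for tag in text.split(",") if tag.strip()]


def merge_tags(existing: Iterable[str], new: Iterable[str],
               blacklist: Iterable[str], mode: str = "append") -> List[str]:
    """Combine tags, dropping blacklisted ones and case-insensitive repeats."""
    if mode not in ("append", "overwrite"):
        raise ValueError(f"unknown mode: {mode!r}")
    blocked = {tag.lower() for tag in blacklist}
    # Overwrite discards existing tags; append keeps them in front.
    candidates = list(new) if mode == "overwrite" else list(existing) + list(new)
    merged, seen = [], set()
    for tag in candidates:
        key = tag.lower()
        if key in blocked or key in seen:
            continue
        seen.add(key)
        merged.append(tag)
    return merged


def format_sidecar(tags: Iterable[str]) -> str:
    """Join tags back into the comma-separated sidecar form."""
    return ", ".join(tags)


existing = parse_sidecar("cat, outdoors, watermark")
ai_tags = ["outdoors", "sky", "signature"]
blacklist = ["watermark", "signature"]

appended = format_sidecar(merge_tags(existing, ai_tags, blacklist, "append"))
overwritten = format_sidecar(merge_tags(existing, ai_tags, blacklist, "overwrite"))
```

In append mode the existing order is preserved and only unseen tags are added. In overwrite mode the result contains nothing but the filtered new tags.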


🚀 What's New & Improved

  • Native Rust Core: No JVM, no GC pauses, no pre-allocated heap. Typical memory use is a few hundred MB with explicitly bounded caches.

  • Dual Image Cache: Separate browser and viewer caches with decode-permit gating, so opening a huge image never starves thumbnail loading.

  • Civitai Integration, Leveled Up: Resolve models, LoRAs, LyCORIS, VAEs, and embeddings by version ID, hash, or name — with preview cards, trigger words, and a live online indicator.

  • Cross-Platform CI Releases: Every release tag automatically builds a Windows installer, macOS .dmg, and Linux tarball.

  • Video Quality of Life: Off-thread poster frames, loop playback, pure-Rust MP4/MOV metadata reading, and fully optional VLC.
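The idea behind decode-permit gating can be sketched with a counting semaphore: each cache gets its own small pool of permits, so a slow decode in one cache cannot use up the other's capacity. The Python below is a conceptual sketch only. `DecodeGate`, the permit counts, and `fake_decode` are hypothetical, and the real Rust implementation is not shown in this description.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class DecodeGate:
    """Allow at most `permits` decodes to run at the same time."""

    def __init__(self, permits: int):
        self._permits = threading.BoundedSemaphore(permits)
        self._lock = threading.Lock()
        self._active = 0
        self.peak = 0  # highest concurrency observed, for illustration

    def run(self, decode, *args):
        with self._permits:  # blocks until a permit is free
            with self._lock:
                self._active += 1
                self.peak = max(self.peak, self._active)
            try:
                return decode(*args)
            finally:
                with self._lock:
                    self._active -= 1


def fake_decode(n: int) -> int:
    """Stand-in for an image decode; sleeps briefly, then returns a value."""
    time.sleep(0.01)
    return n * 2


# Separate gates: the viewer's single large decode never takes browser permits.
viewer_gate = DecodeGate(permits=1)
browser_gate = DecodeGate(permits=3)

with ThreadPoolExecutor(max_workers=8) as pool:
    thumbs = list(pool.map(lambda n: browser_gate.run(fake_decode, n), range(12)))
full_image = viewer_gate.run(fake_decode, 5)
```

Even with eight worker threads, no more than three thumbnail decodes run at once, and the viewer decode proceeds under its own permit.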


⚠️ Not Yet Ported from the Java Version

We're being honest: a few features from the Java version haven't made the jump yet. They're all on the porting list:

  • LLM chat / role-play assistant (Ollama / llama.cpp) and text-to-speech

  • SFTP/FTP remote browsing

  • Danbooru and Pexels downloaders (Gelbooru is in)

  • EXIF GPS / Geo location panel

  • Live folder watching (re-open the folder to refresh for now)

  • Browsing encrypted archives as a library

  • Animated WebP playback (animated GIFs work; WebP shows the first frame)


🔮 The Road Ahead

With the Rust foundation in place, these are the next frontiers from the Grand Blueprint:

  • Built-in Model & LoRA Training with a live visual training preview.

  • Native Video Generation alongside image generation.

  • Zero-Dependency Embedded LLM: Eliminating the need for external tools like Ollama or LM Studio.

  • In-App Civitai Browsing: Search and download Civitai resources without leaving the app.

  • Interactive VR Anime Companions: A built-in VR character for interactive roleplaying — dynamically controlled by the AI, expressing real emotions in real time.

  • External Engine Integration: Broadcasting the embedded LLM to external 3D applications and game engines.
