v2.0 - Major performance & multi-GPU update (nightly branch)
Key new features:
- Added torch.compile support (reduce-overhead + dynamic=True) for ~30–100% faster inference after initial compilation warmup
- Explicit SDPA (Scaled Dot-Product Attention) backend for better speed and memory efficiency on Ampere/Ada GPUs
- Multi-GPU support via device_map="auto" – new toggle input "use_multi_gpu" (default True). Turn it off for single-GPU setups (e.g. users with only cuda:0 visible)
- Modern dtype options (bf16 default, fp16, fp32, auto)
- Better logging, error handling, and model unloading when keep_loaded=False
This version focuses on stability, speed, and compatibility with multi-GPU setups while keeping single-GPU rock-solid.
Repo: link
Tested on RTX 3090 (single GPU) with PyTorch 2.7–2.9.
Feedback/PRs welcome for multi-GPU testing!