Fantasy Card Diffusion
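Model description
Originally posted on HuggingFace by volrath50
A comprehensive Stable Diffusion model for generating fantasy trading card style art, trained on all currently available Magic: the Gathering card art (about 35,000 unique pieces of art) to 140,000 steps, using Stable Diffusion v1.5 as the base model. The model has a strong understanding of MTG artists, planes, sets, colors, card types, creature types, and more. Scroll down for a guide to using the model.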
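Features
- Incorporate the styles of your favorite Magic: the Gathering artists
- Generate art that looks like it came from a specific MTG plane, set, or year
- Create fantasy creatures in the styles found in Magic: the Gathering
- Create fantasy creature types unique to MTG (such as Eldrazi)
- Use well-known MTG characters (such as planeswalkers)
- Depict real-world or non-MTG characters in the style of MTG art
- Mix and match all of the above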
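Using the model
The model was trained on MTG card information, not on descriptions of the art. This preserved most of what the base model had learned outside of MTG, so you can combine MTG card terms with a description of the art for a high degree of customization.
Each card's training caption was built from Scryfall's card data, in the following format:
MTG card art, [card name], by [artist], [year], [colors (words)], [colors (letters)], [card type], [rarity], [set name], [set code], [plane], [set type], [watermark], [mana cost], [security stamp], [power/toughness], [keywords], [promo type], [story spotlight]
Some examples of actual card data in this format: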
MTG card art, Ayula, Queen Among Bears, by Jesper Ejsing, 2019, Green, G, Legendary Creature - Bear, rare, Modern Horizons, mh1, draft_innovation, 1G, None, 2/2, Fight,
MTG card art, Force of Will, by Terese Nielsen, 1996, Blue, U, Instant, uncommon, Alliances, all, Dominaria, Terisiare, Ice Age, expansion, 3UU,
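The caption format above can be sketched as a small function. This is a minimal illustration and not the author's uploaded script: the function name `build_caption` and the cleanup rules are hypothetical, the key names assume Scryfall's card object layout, and the plane is passed in separately because it is not a Scryfall field. The sketch skips empty fields, so it will not reproduce the examples above byte for byte.

```python
# Hypothetical helper: the field order follows the format line above, but the
# name and the cleanup rules are assumptions, not the author's actual script.
COLOR_WORDS = {"W": "White", "U": "Blue", "B": "Black", "R": "Red", "G": "Green"}


def build_caption(card, plane=None):
    """Turn one Scryfall-style card dict into a comma-separated caption."""
    colors = card.get("colors") or []
    parts = [
        "MTG card art",
        card["name"],
        "by " + card["artist"],
        card["released_at"][:4],                    # "2019-06-14" -> "2019"
        ", ".join(COLOR_WORDS[c] for c in colors) or "Colorless",
        "".join(colors),
        card["type_line"].replace("\u2014", "-"),   # long dash -> plain hyphen
        card["rarity"],
        card["set_name"],
        card["set"],
        plane,                                      # supplied by the caller
        card["set_type"],
        card.get("watermark"),
        card.get("mana_cost", "").replace("{", "").replace("}", ""),
        card.get("security_stamp"),
    ]
    if card.get("power") is not None and card.get("toughness") is not None:
        parts.append(card["power"] + "/" + card["toughness"])
    parts.extend(card.get("keywords", []))
    parts.extend(card.get("promo_types", []))
    if card.get("story_spotlight"):
        parts.append("story spotlight")
    # Drop fields that are missing or empty for this card.
    return ", ".join(p for p in parts if p)


ayula = {
    "name": "Ayula, Queen Among Bears",
    "artist": "Jesper Ejsing",
    "released_at": "2019-06-14",
    "colors": ["G"],
    "type_line": "Legendary Creature \u2014 Bear",
    "rarity": "rare",
    "set_name": "Modern Horizons",
    "set": "mh1",
    "set_type": "draft_innovation",
    "mana_cost": "{1}{G}",
    "power": "2",
    "toughness": "2",
    "keywords": ["Fight"],
}
print(build_caption(ayula))
```

The same function can be used in reverse as a prompt template: fill in only the fields you care about and leave the rest out.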
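A brief explanation of some of the fields: every card's caption starts with the tag "MTG card art". Using it is generally recommended, but it makes the image somewhat more generic. Try prompts with and without it to see the difference. For example, if you are struggling to get an image to look distinctly like "Tarkir", removing the tag can sometimes reduce that effect. Similarly, the more general a tag is (rarity, words like "legendary", and so on), the more it pushes the image toward a generic look. Experiment to find the combination that best suits your needs.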
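Artist: every artist name in the training data is prefixed with "by", as in "by Mark Tedin". The model has a very strong understanding of MTG artists' styles, which is the reason this project was started in the first place. Frankly, most of my exposure to art has come through Magic: the Gathering. Back in August, I found that the base Stable Diffusion model had a poor grasp of many of the artists whose styles I tried to emulate; only a few (such as Greg Rutkowski and Rebecca Guay) were already well represented. Even if you are not trying to make MTG-style art, the model is well suited to using MTG artists' styles, and to blending several of them. See the "Innistrad Moon Goddess" example below, where I combined the styles of six different artists at different weights to get the effect I wanted.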
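Blending artists at different weights relies on the `(text:weight)` emphasis syntax of Automatic1111's UI, which the example prompts below use throughout. A minimal sketch of building such a fragment follows; the helper names `weighted` and `blend_artists` are hypothetical and exist only for illustration.

```python
def weighted(tag, weight=1.0):
    """Wrap a tag in the (text:weight) emphasis syntax; leave weight 1.0 bare."""
    if weight == 1.0:
        return tag
    return f"({tag}:{weight:g})"


def blend_artists(artists):
    """Join (artist, weight) pairs into a prompt fragment, each prefixed with 'by'."""
    return ", ".join(weighted("by " + name, w) for name, w in artists)


fragment = blend_artists([
    ("howard lyon", 1.0),
    ("chris rahn", 1.1),
    ("terese nielsen", 0.8),
])
print(fragment)
```

Weights above 1 strengthen an artist's influence and weights below 1 weaken it, so small steps such as 0.8 to 1.3 are a reasonable range to experiment with.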
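Set type: usually "expansion". Other possibilities include "core", "funny", and so on. See the Scryfall API documentation for more information.
Security stamp: I translated some of the stamps to make them easier to use. The two important ones are "acorn" and "universes beyond". There are also some rare stamps, such as the one unique to the My Little Pony cards.
Story spotlight: cards that are a story spotlight are tagged as such. This information is of limited value, and I may remove it in a future version.
Nearly all of the usual Stable Diffusion tags still work as expected (for example, extremely detailed, intricate details). I have found that adding "beautiful composition" usually makes images look better, but everyone has their own preferred tags, and those should work as well.
I like to write my prompts in the form of a description of the art, as you can see in the examples below.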
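Example images and prompts
The model was trained on a huge range of concepts, and I have only just begun to explore what it can do. I thought it would be helpful to show some of what I have made with it.
Full generation parameters, seeds, and so on should be embedded in the images. All examples were made with Automatic1111's UI, the fantasycarddiffusion-140000.ckpt model, and the "DPM++ 2S a Karras" sampler. CFG scale varies; I have found that about 11 is a good baseline. Most images were generated with 40-50 steps, which is probably overkill.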
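For readers who drive Automatic1111 through its HTTP API instead of the UI, the settings above could be collected into a request body along these lines. This is a sketch under assumptions: the examples in this card were made in the UI, the 512x512 size is assumed from the training resolution, the helper name is hypothetical, and sampler naming has varied across web UI versions, so check the names your installation reports. The payload is only built here, not sent.

```python
import json


def txt2img_payload(prompt, negative_prompt="", steps=40, cfg_scale=11, seed=-1):
    """Build a request body for the web UI's txt2img endpoint (assumed layout)."""
    return {
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "steps": steps,                       # the card reports 40-50 steps
        "cfg_scale": cfg_scale,               # about 11 is the suggested baseline
        "sampler_name": "DPM++ 2S a Karras",  # name as reported in the card
        "width": 512,                         # assumption: training resolution
        "height": 512,
        "seed": seed,                         # -1 asks for a random seed
    }


payload = txt2img_payload(
    "mtg card art, (goblin flamethrower:1.1), red, r, instant, sorcery",
    negative_prompt="(ugly:1.5), lowres",
)
print(json.dumps(payload, indent=2))
```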
Ascended Eldrazi
(An Eldrazi that somehow ended up on Theros, mellowed out, and ascended to godhood)
MTG card art, ascended eldrazi, (by eric deschamps:1.1), (legendary enchantment creature - god:1.2) (eldrazi:1.2), colorless, theros, ths, jou, bng, thb, mythic, indestructible, annihilator, trample, a wise eldrazi titan emerging from the horizon, ascended to godhood, now looking serene, calm, divine, powerful, beautiful composition, emrakul, kozilek, ulamog, (sense of scale:1.2), sense of wonder, overwhelming, extremely detailed, intricate details
Negative prompt: weak, angry, scary, underwhelming, powerless
Speedy Sliver
(A Mardu sliver with dash, on Tarkir)
MTG card art, speedy sliver, by John avon, Creature - (sliver:1.3), white, black, red, wbr, (Mardu:1.1), Khans of tarkir, ktk, dash, a fast sliver is speeding through the Mardu (steppe:1.1) landscape, beautiful composition
Negative prompt: human, humanoid, m14
Taylor Swift, Wandering Bard
(Self-explanatory: Taylor Swift as a bard, on Eldraine. A future Secret Lair?)
mtg card art, (Taylor Swift:1.2), wandering bard, legendary creature - human (bard:1.2), white, red, green, wrg, throne of eldraine, eld, by chris rahn, by volkan baga, by zoltan boros, armored bard taylor swift holding her weapons and instruments, beautiful composition, detailed, realistic fantasy painting, masterpiece, best quality,
Negative prompt: guitar, lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry
Emrakul, Compleated Doom
(The Phyrexians freed Emrakul from Innistrad's moon, compleated her, and are now attacking Strixhaven. A rough day to be at school.)
mtg card art, (emrakul:1.2), (compleated:1.1) doom, (by seb mckinnon:1.1), legendary creature - (phyrexian:1.1) (eldrazi:1.2) (horror:1.1), black, (strixhaven, arcivos:1.2), annihilator, (infect:1.2), 15/15, a (phyrexianized:1.1), compleated Emrakul, attacking (strixhaven school, university campus:1.2), stx, beautiful composition, detailed painting, (sense of scale:1.2), horror, dark, terrifying, eldritch horror, new phyrexia, nph, rise of the eldrazi, roe, extremely detailed, intricate details, masterpiece, best quality, emrakul, the aeons torn, emrakul, the promised end
Negative prompt: zendikar, water, ocean, funny, happy, optimistic, bright, tentacles, lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry, octopus, spikes, urchin, tentacles, arms, hands, legs
Ayula, Ascended Bear
(Ayula, Queen Among Bears, is now a planeswalker and has settled on Kaladesh)
mtg card art, ayula, ascended (bear,:1.1) (by jesper ejsing,:1.1) green, g, legendary planeswalker - (bear:1.1), kaladesh, aether revolt, kld, aer, mythic, beautiful composition, a powerful bear planeswalker riding in a kaladesh (vehicle:1.1), looking very serious, intricate details, ayula, queen among bears, mh1, 2/2, 1g, masterpiece, best quality
Negative prompt: silly, human, humanoid, breasts, anthropomorphic, bipedal, funny, lowres, text, error, cropped, worst quality, low quality, normal quality, jpeg artifacts, watermark, blurry
Neltharion, Deathwing
(My attempt to imagine Deathwing as a classic Elder Dragon from Legends, inspired by the World of Warcraft: Cataclysm cinematic)
mtg card art, neltharion, (deathwing:1.2), (by edward beard, jr:1.1), 1994, legendary creature - (elder dragon:1.1), black, red, br, legends, leg, flying, trample, (world of warcraft cataclysm:1.2), large Firey flaming black dragon perched on stormwind castle rampart, roaring, breathing fire, flames, destruction, beautiful composition, extremely detailed, intricate details, masterpiece, best quality, terrifying, epic, cinematic
Negative prompt: lowres, text, error, cropped, worst quality, low quality, normal quality, jpeg artifacts, watermark, blurry, human, humanoid, deformed, mutant, (ugly:1.3)
Harambe, Simian Champion of Tarkir
(Harambe didn't die; his planeswalker spark ignited.)
(harambe:1.1), simian champion of tarkir, by magali villeneuve, legendary planeswalker - ape (monk:1.2), white, blue, red, wur, (jeskai:1.2), khans of tarkir, ktk, planeswalker harambe training with the jeskai, in a (monastery:1.2), in the mountains, wearing robes, martial arts, beautiful composition, extremely detailed, intricate details, masterpiece, best quality,
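Negative prompt: lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry
Gabe Newell, Techno-Wizard
(Apologies to Gabe for the prompt: I wanted him to look the way he does today, but the model kept trying to make him look the way he did years ago.)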
mtg card art, (gabe newell:1.3), techno-wizard, by zezhou chen, legendary creature - human wizard, blue, red, ur, izzet, ravnica, beautiful composition, (grey beard:1.1), (gray hair:1.1), elderly izzet techno wizard gabe newell is casting a spell, powerful, intelligent, epic composition, cinematic, dramatic, masterpiece, best quality, extremely detailed, intricate details
Negative prompt: lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry, young, silly, goofy, funny
Luna, Blind Lunar Goddess of Innistrad's Moon
(Or maybe it is just Emrakul in disguise?)
mtg card art, luna, blind lunar goddess of innistrad's moon, legendary enchantment creature - (god:1.1), by howard lyon, (by chris rahn:1.1), (by seb mckinnon:1.1), (by terese nielsen:0.8), (by rebecca guay:0.8), (by richard kane ferguson:1.1), (innistrad:1.3), dark ascension, shadows over innistrad, inn, dka, soi, white, blue, black, wub, mythic, (blindfolded cute young woman:1.2) as smug (moon goddess:1.1), sitting on throne, dark lighting, full moon night, long white hair, pale skin, (silver blindfold:1.1), opalescent robes, ethereal, celestial, mysterious, beautiful composition
Negative prompt: orange
Goblin Flamethrower
(The model can also generate instants and sorceries)
mtg card art, (goblin flamethrower:1.1), red, r, instant, sorcery, onslaught, legions, scourge, ons, lgn, scg, a crazed, intense, happy goblin is shooting fire from a flamethrower, dangerous, reckless, beautiful composition
Negative prompt: (ugly:1.5), lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry
Mox Topaz, Mirage
(What if Mirage had included a Mox Topaz, in the style of Volkan Baga's Vintage Masters Mox art)
Mtg card art, two african hands cupped together holding a (mox topaz:1.1) on a gold chain, in the middle of the palm, in front of the (African savannah:1.1), by Terese Nielsen, (by Volkan baga:1.1), by Dan Frazier, artifact, beautiful composition, jamuraa, mirage, mir, vma
Negative prompt: deformed, bad anatomy
Mox Topaz, Alpha
(Likewise, what if the original Alpha set had included a sixth color, orange)
(mox topaz:1.1) ( by dan frazier:1.2), artifact, rare, (limited edition alpha, lea:1.1), (1993,:1.1) a mox topaz on a chain
Negative prompt: lowres, cropped, worst quality, low quality, normal quality, jpeg artifacts, watermark, blurry
Island (Phyrexian Toronto)
(The Phyrexians invaded and compleated Toronto)
mtg card art, (toronto:1.2), (basic land - island:1.1), new phyrexia, nph, by adam paquette, (toronto skyline:1.2), (phyrexian:1.1), dark, horror, cn tower, rogers centre, extremely detailed, intricate details, masterpiece, best quality
Ariel, the Little Mermaid
(Give it time; I'm sure there will be a Secret Lair.)
mtg card art, (ariel, the little mermaid:1.2), legendary creature - (merfolk:1.1), blue, white, red, uwr, (theros:1.1), by Greg Staples, beautiful composition, ariel sitting on a rock with waves, theros temple in background, masterpiece, best quality,
Negative prompt: green skin, blue skin, red tail, lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry
Batman, the Dark Knight
(Likewise, a Secret Lair is only a matter of time.)
mtg card art, batman, the dark knight, by justine cruz, by zoltan boros, legendary creature - human ninja, white, blue, black, (ub:1.1), (dimir,:1.1), (ravnica:1.1), (kamigawa:0.9), neon dynasty, neo, innistrad, investigate, ninjutsu, (at night:1.3), on roof, dark lighting, masterpiece, best quality,
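Negative prompt: lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry
Training and dataset
Training was done on a set of MTG card art cropped to 512x512 (about 35,000 images in total), with captions generated by a custom Python script that pulls the data from Scryfall. The model was trained with the Dreambooth extension for Automatic1111 on my 4090 for a few days, to 140,000 steps. I adjusted the settings several times during training, generally increasing the batch size and lowering the learning rate as I went. The current settings are batch size 10, gradient accumulation 5, and learning rate 4e-7, which are working well.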
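As a quick piece of arithmetic on the final settings, batch size and gradient accumulation multiply into an effective batch size. The sketch below assumes the usual meaning of gradient accumulation (one weight update per accumulated group of batches) and uses the approximate dataset size quoted above. Because the settings changed during training, it says nothing about how many passes over the data the full 140,000 steps represent.

```python
import math

batch_size = 10          # final setting reported above
grad_accumulation = 5    # final setting reported above
dataset_size = 35_000    # approximate image count reported above

# Images contributing to each weight update under the final settings.
effective_batch = batch_size * grad_accumulation

# Weight updates needed to see every image once at that effective batch size.
updates_per_pass = math.ceil(dataset_size / effective_batch)

print(effective_batch, updates_per_pass)
```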
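The result is a comprehensive model with a good understanding of MTG artists, sets, planes, card types, creature types, years, colors, and more. If you have ever wondered what a Merfolk painted by Ron Spencer would look like as a member of the Mardu clan on Tarkir, with dash, first strike, and trample, this model can show you.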
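I have uploaded the Python script used to generate the training dataset. Together with the "unique artwork" JSON file from https://scryfall.com/docs/api/bulk-data, it can produce the uncropped images and the same (or nearly the same) text files.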
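To give a sense of what working with that bulk file involves, here is a minimal sketch that walks a list of Scryfall card objects and collects one art URL per illustration. It is not the uploaded script. It assumes the card object keys `image_uris`, `art_crop`, `card_faces`, and `illustration_id` as I understand Scryfall's layout, and the function names and file path handling are hypothetical.

```python
import json


def iter_art_urls(cards):
    """Yield (illustration_id, art_crop URL) pairs, one per distinct illustration."""
    seen = set()
    for card in cards:
        # Single-faced cards carry image_uris at the top level; multi-faced
        # cards may carry them on each entry of card_faces instead.
        faces = [card] if "image_uris" in card else card.get("card_faces", [])
        for face in faces:
            url = (face.get("image_uris") or {}).get("art_crop")
            art_id = face.get("illustration_id") or card.get("illustration_id")
            if not url or art_id in seen:
                continue
            seen.add(art_id)
            yield art_id, url


def load_bulk_file(path):
    """Load a downloaded bulk-data JSON file (a list of card objects)."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

A typical use would be `for art_id, url in iter_art_urls(load_bulk_file(path)):` followed by a download step, with a pause between requests to stay polite to the image server.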
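The script is simple, and there is still room for improvement. Before this, I had not written code in twenty years, since I was a teenager, and I had never used Python. I put the script together from scraps of Perl I remembered from 2000-2001, leaning heavily on GitHub Copilot and Google searches.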
Cropping was done with ImageMagick (see the "Issues" section below).
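The geometry of a square crop is easy to get wrong by anchoring it at one edge. The sketch below computes a centered square box in Python; it is an illustration of the arithmetic, not the ImageMagick command that was actually used, and the function name is hypothetical. The returned tuple uses the (left, upper, right, lower) order that Pillow's `Image.crop` expects, so it could be passed there before resizing to 512x512.

```python
def center_square_box(width, height):
    """Return (left, upper, right, lower) for the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2    # equal margin trimmed from both sides
    upper = (height - side) // 2  # zero for landscape-shaped art
    return left, upper, left + side, upper + side


# A landscape image 626 wide and 457 tall loses 84-85 pixels on each side.
print(center_square_box(626, 457))
```

A crop anchored at the left edge would instead be `(0, 0, side, side)`, which discards everything beyond the square on the right.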
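Issues
This was meant to be the second test run on the full dataset (the first one failed), so I cut some corners to get the test started quickly. The model turned out far better than I expected, so I decided to release it as it is, in the hope that others will enjoy it as much as I have. That said, I am aware of several issues, which I plan to address in a future version: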
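- Cropping: MTG art is usually rectangular. I first tried a trainer that supports varying aspect ratios, but after a few failures I used ImageMagick to batch-crop everything to 512x512 so that I could start training quickly. I don't remember exactly what I did, but the crop seems to have been anchored on the left side of the art, so the right side is generally cut off. You will see this in many generated images: elements on the right are truncated.
- Plane information: plane information was only added to the training at around step 70,000, so it may be less well trained than the other fields. Essentially, I wanted to group sets by plane, because I found that how well the model recognized the look of a set depended on whether WotC had put the plane's name in the set name. For example, "Theros" alone would only match "Theros" and "Theros Beyond Death", and not "Born of the Gods" or "Journey into Nyx".
- Special characters: some artist names contain special characters. I tried to strip all accents but missed one: Tom Wänerstrand, whose name is still stored in the model with the diaeresis.
- Greg Rutkowski: this is not really an issue, but Greg Rutkowski, the poster child of AI art, is also an MTG artist. He is credited on MTG cards under his Polish name, Grzegorz Rutkowski, so that is the name the model was trained on. As a result, "by Greg Rutkowski" and "by Grzegorz Rutkowski" give different results.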
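The accent stripping mentioned under special characters can be sketched with Python's standard `unicodedata` module. This is one common approach, assumed here for illustration; the card does not say how the original script did it, and the function name is hypothetical.

```python
import unicodedata


def strip_accents(text):
    """Decompose characters and drop the combining marks (accents, diaereses)."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_accents("Tom Wänerstrand"))
```

Letters that have no decomposition, such as "ø", pass through unchanged with this method, so a name list is still worth checking by eye after normalizing.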











