LTX 2.3 Audio-Driven Reference Character Pseudo-Replacement Workflow

This workflow is designed for audio-driven reference character pseudo-replacement video generation with LTX 2.3. It takes a source video (or reference motion structure), combines it with a target character reference, and generates a new video in which the subject follows the target character's look while preserving the source's motion, audio timing, depth structure, and cinematic continuity.

Unlike a simple face-swap or one-click character replacement workflow, this setup is closer to a controlled “pseudo-replacement” production pipeline. It does not simply paste a new face onto an existing video. Instead, it regenerates the video using LTX 2.3 video generation, image-to-video conditioning, audio latent routing, motion / structure guidance, depth information, edge guidance, and multi-stage sampling, which tends to give a more coherent result. This makes it useful when creators want the final output to feel like a newly generated AI video rather than a visible cut-and-paste edit.

The workflow uses LTX 2.3 as the main video model route, with LTX video VAE, LTX audio VAE, Gemma-style text conditioning, LTXVPreprocess, EmptyLTXVLatentVideo, LTXVImgToVideoConditionOnly, LTXVConcatAVLatent, LTXVSeparateAVLatent, SamplerCustomAdvanced, tiled VAE decoding, audio decoding, CreateVideo, and final video export. This gives the graph a full audio-video generation structure: prepare the visual input, build the latent video, connect the audio latent, sample the result, separate audio / video latents, decode them, and export a complete video.
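The stage order described above can be written down as a small ordered list. This is only a reading aid, not the graph itself: the node names come from the description, while the stage labels and the grouping of nodes into stages are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str     # descriptive label (hypothetical, not a node name)
    nodes: tuple  # node names as listed in the workflow description


# Order follows the description: prepare, build latent, attach audio,
# sample, separate, decode, export. The grouping is an assumption.
PIPELINE = (
    Stage("prepare visual input", ("LTXVPreprocess",)),
    Stage("build video latent", ("EmptyLTXVLatentVideo", "LTXVImgToVideoConditionOnly")),
    Stage("attach audio latent", ("LTXVConcatAVLatent",)),
    Stage("sample", ("SamplerCustomAdvanced",)),
    Stage("separate audio / video latents", ("LTXVSeparateAVLatent",)),
    Stage("decode", ("tiled VAE decode", "audio decode")),
    Stage("export", ("CreateVideo",)),
)


def describe(pipeline):
    """Return one numbered, human-readable line per stage."""
    return [f"{i}. {s.name}: {', '.join(s.nodes)}" for i, s in enumerate(pipeline, 1)]


for line in describe(PIPELINE):
    print(line)
```

Reading the list top to bottom mirrors the data flow: the audio latent is joined to the video latent before sampling and split off again afterwards, so both streams pass through the same sampler call.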

A key part of the workflow is its reference-control section. The graph includes video information reading, frame rate handling, duration / frame count calculation, image preprocessing, Canny edge preprocessing, and DepthCrafter depth estimation. These modules help extract useful motion and structure signals from the source material. For character pseudo-replacement, this matters because the new character should not move randomly. The output needs to inherit the original performance rhythm, camera structure, body movement, and spatial depth while still changing the visual identity direction.
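The duration / frame count step mentioned above can be sketched as follows. Earlier LTX-Video releases expect frame counts of the form 8n + 1; whether LTX 2.3 and this particular graph enforce the same rule is an assumption here, and the function names are hypothetical.

```python
def snap_frame_count(duration_s: float, fps: float, multiple: int = 8) -> int:
    """Return the frame count nearest to duration_s * fps of the form multiple * n + 1.

    The 8n + 1 constraint is assumed from earlier LTX-Video releases.
    """
    if duration_s <= 0 or fps <= 0:
        raise ValueError("duration_s and fps must be positive")
    raw = duration_s * fps
    # At least one block of `multiple` frames plus the leading frame.
    n = max(1, round((raw - 1) / multiple))
    return multiple * n + 1


def clip_duration(frames: int, fps: float) -> float:
    """Duration in seconds that the generated clip will actually cover."""
    return frames / fps


frames = snap_frame_count(5.0, 24.0)
print(frames, clip_duration(frames, 24.0))
```

Because the snapped frame count rarely matches the source duration exactly, the audio track may need to be trimmed or padded to `clip_duration(...)` so that sound and picture end together.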

The audio route is also central. The workflow uses LTX audio VAE logic to preserve or generate audio-aware video output. This makes the pipeline suitable for speaking characters, performance clips, music-driven characters, AI presenters, virtual idols, dialogue scenes, and short-form character videos where sound and motion should stay connected.

The workflow also appears to use multiple sampling / refinement stages. The first stage builds the base LTX 2.3 result from the conditioned latent structure, while later stages can refine the generated video with additional sampler passes and tiled decoding. This helps improve texture, reduce rough latent artifacts, preserve motion coherence, and bring the final result closer to publishable quality.
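One common way to arrange a base pass followed by a refinement pass is to split a single noise schedule in two, so the second sampler continues from where the first one stopped. The description does not say whether this graph does that or instead runs a separate low-denoise pass, so the sketch below is purely illustrative. The linear schedule and both function names are assumptions, not the schedule LTX 2.3 actually uses.

```python
def linear_sigmas(steps: int, sigma_max: float = 1.0, sigma_min: float = 0.0) -> list:
    """Illustrative linear noise schedule with steps + 1 values."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    return [sigma_max + (sigma_min - sigma_max) * i / steps for i in range(steps + 1)]


def split_sigmas(sigmas: list, step: int) -> tuple:
    """Split a schedule so the base pass ends at the sigma where refinement starts."""
    if not 0 < step < len(sigmas) - 1:
        raise ValueError("step must leave at least one step in each stage")
    # The boundary value is shared: base stops at it, refine resumes from it.
    return sigmas[: step + 1], sigmas[step:]


base, refine = split_sigmas(linear_sigmas(8), 5)
print("base stage:  ", base)
print("refine stage:", refine)
```

Sharing the boundary sigma keeps the two passes consistent: the refinement stage receives a latent at exactly the noise level it expects, rather than re-noising a finished result.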

This workflow is ideal for creators who want to test LTX 2.3 in reference-character video transformation, audio-driven character performance, motion-preserving AI role replacement, virtual human content, short drama clips, stylized character remakes, and Civitai / RunningHub workflow demonstrations. The best use case is authorized or self-created material where the goal is not to impersonate a real person, but to use a reference motion or performance structure to generate a new AI character video.

If you want to see how the reference video, audio route, DepthCrafter structure, Canny guidance, LTX 2.3 latent generation, and final pseudo-replacement output are connected, watch the full tutorial from the YouTube link above.

⚙️ Try the Workflow Online

👉 Workflow: https://www.runninghub.ai/post/2040718940661354497?inviteCode=rh-v1111

Open the link above to run the workflow directly online and view the generation results in real time.

If the results meet your expectations, you can also deploy it locally for further customization.

🎁 Fan Benefits: Register now to get 1000 points, plus 100 points for each daily login. Enjoy RTX 4090-class performance with 48 GB of GPU memory!

📺 Bilibili Updates (Mainland China & Asia-Pacific)

If you are in Mainland China or the Asia-Pacific region, you can watch the video below for workflow demos and a detailed creative breakdown.

📺 Bilibili Video: https://www.bilibili.com/video/BV1QrDzBsELY/

I will continue updating model resources on Quark Drive:

👉 https://pan.quark.cn/s/20c6f6f8d87b

These resources are mainly prepared for local users, making creation and learning more convenient.


Model type	Workflow
Base model	LTXV 2.3
Published	2026-05-13
