ChronoEdit 14B

Details

Model description


💚 ChronoEdit | 🖥️ GitHub | 🤗 Hugging Face | 🤖 Gradio Demo | 📑 Paper

ChronoEdit: Towards Temporal Reasoning for Image Editing and World Simulation
ChronoEdit-14B enables physics-aware image editing and action-conditioned world simulation through temporal reasoning. It distills priors from a 14B-parameter pretrained video generative model and separates inference into (i) a video reasoning stage for latent trajectory denoising, and (ii) an in-context editing stage for pruning trajectory tokens. ChronoEdit-14B was developed by NVIDIA as part of the ChronoEdit family of multimodal foundation models. This model is ready for commercial use.

[Figure: ChronoEdit Method Overview]
Overview of the ChronoEdit pipeline. From right to left, the denoising process begins in the temporal reasoning stage, where the model imagines and denoises a short trajectory of intermediate frames. These intermediate frames act as reasoning tokens, guiding how the edit should unfold in a physically consistent manner. For efficiency, the reasoning tokens are discarded in the subsequent editing frame generation stage, where the target frame is further refined into the final edited image.
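The two-stage schedule described above can be illustrated with a toy sketch. This is not ChronoEdit's code or API: `toy_denoise_step` and `two_stage_denoise` are hypothetical names, each latent is a plain list of floats, and the "denoiser" simply halves every value, whereas a real video model would denoise all frames jointly. The frame layout (reference frame first, target frame last, reasoning frames in between) is also an assumption made for illustration. The sketch only shows the control flow: denoise the full trajectory for a few steps, drop the intermediate frames, then keep refining the target.

```python
from typing import Callable, List, Tuple

Latent = List[float]  # toy stand-in for one frame's latent


def toy_denoise_step(frames: List[Latent]) -> List[Latent]:
    """Hypothetical denoiser: halves every value in every frame."""
    return [[v * 0.5 for v in frame] for frame in frames]


def two_stage_denoise(
    frames: List[Latent],
    total_steps: int,
    reasoning_steps: int,
    denoise_step: Callable[[List[Latent]], List[Latent]] = toy_denoise_step,
) -> Tuple[Latent, int]:
    """Return the refined target latent and the number of frame updates.

    Assumed layout: frames[0] is the reference, frames[-1] is the target,
    and everything in between plays the role of reasoning tokens.
    """
    if len(frames) < 2:
        raise ValueError("need at least a reference and a target frame")
    if not 0 <= reasoning_steps <= total_steps:
        raise ValueError("reasoning_steps must lie in [0, total_steps]")

    frame_updates = 0

    # Stage 1 (temporal reasoning): denoise the whole trajectory.
    for _ in range(reasoning_steps):
        frames = denoise_step(frames)
        frame_updates += len(frames)

    # Discard the intermediate reasoning frames for efficiency.
    frames = [frames[0], frames[-1]]

    # Stage 2 (editing frame generation): refine only what remains.
    for _ in range(total_steps - reasoning_steps):
        frames = denoise_step(frames)
        frame_updates += len(frames)

    return frames[-1], frame_updates


if __name__ == "__main__":
    trajectory = [[0.0], [8.0], [8.0], [8.0], [8.0]]
    target, updates = two_stage_denoise(trajectory, total_steps=3, reasoning_steps=1)
    print(target, updates)
```

In this toy run the five-frame trajectory is denoised for one step and the remaining two steps touch only two frames, so the loop performs 9 frame updates instead of the 15 that denoising every frame at every step would need.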
