In the realm of video generation, diffusion models have showcased remarkable advancements. However, a lingering challenge persists—the unsatisfactory temporal consistency and unnatural dynamics in inference results. The study explores the intricacies of noise initialization in video diffusion models, uncovering a crucial training-inference gap.
The study addresses challenges in diffusion-based video generation, identifying a training-inference gap…
