Versatile Editing of Video Content, Actions, and Dynamics without Training

1 Google DeepMind | 2 Technion - Israel Institute of Technology | 3 The Weizmann Institute of Science
* Work done during an internship at Google DeepMind.

Abstract

Controlled video generation has seen drastic improvements in recent years. However, editing actions and dynamic events, or inserting content that should affect the behavior of other objects in real-world videos, remains a major challenge. Existing trained models struggle with complex edits, likely due to the difficulty of collecting relevant training data. Similarly, existing training-free methods are inherently restricted to structure- and motion-preserving edits and do not support modifying motion or interactions. Here, we introduce DynaEdit, a training-free editing method that unlocks versatile video editing capabilities with pretrained text-to-video flow models. Our method relies on the recently introduced inversion-free approach, which does not intervene in the model internals and is thus model-agnostic. We show that naively adapting this approach to general unconstrained editing results in severe low-frequency misalignment and high-frequency jitter. We explain the sources of these phenomena and introduce novel mechanisms for overcoming them. Through extensive experiments, we show that DynaEdit achieves state-of-the-art results on complex text-based video editing tasks, including modifying actions, inserting objects that interact with the scene, and introducing global effects.

Method

Current inversion-free approaches struggle to perform general non-structure-preserving edits. In particular, when their hyperparameters are tuned to allow significant modifications, they generate videos whose low frequencies are unnecessarily misaligned with the source video and whose high frequencies suffer from jitter. We introduce two novel mechanisms to counter these phenomena: Similarity Guided Aggregation (SGA) and Annealed Noise Correlation (ANC). SGA improves low-frequency global alignment to the source video, so that edited dynamics stay consistent with the scene layout. ANC suppresses high-frequency jitter and stabilizes local appearance across frames.
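The page does not include pseudocode, so the following is a minimal NumPy sketch of what the two mechanisms could look like. It assumes SGA operates on flattened feature vectors via similarity-weighted averaging of source features, and that ANC blends a noise map shared across frames with independent per-frame noise, with a correlation weight `rho` that would be annealed over the denoising steps. The function names, signatures, and the cosine-similarity/softmax formulation are illustrative assumptions, not the DynaEdit implementation.

```python
import numpy as np


def anc_noise(num_frames, shape, rho, rng):
    """Annealed Noise Correlation (sketch): blend a noise map shared
    across frames with per-frame noise. Higher rho means more temporal
    correlation, which suppresses frame-to-frame jitter."""
    shared = rng.standard_normal(shape)
    frames = rng.standard_normal((num_frames, *shape))
    # Combining in sqrt-variance keeps each frame's noise unit-variance.
    return np.sqrt(rho) * shared + np.sqrt(1.0 - rho) * frames


def sga_aggregate(edit_feats, src_feats, tau=0.1):
    """Similarity Guided Aggregation (sketch): replace each edited
    feature with a similarity-weighted average of source features,
    pulling the low-frequency layout back toward the source video."""
    # Cosine similarity between edited and source feature vectors.
    e = edit_feats / np.linalg.norm(edit_feats, axis=-1, keepdims=True)
    s = src_feats / np.linalg.norm(src_feats, axis=-1, keepdims=True)
    sim = e @ s.T                         # (N_edit, N_src)
    w = np.exp(sim / tau)
    w /= w.sum(axis=-1, keepdims=True)    # softmax over source positions
    return w @ src_feats
```

In this sketch, `rho` would follow an annealing schedule across sampling steps (e.g. starting near 1 for strong correlation and decaying), and `sga_aggregate` would be applied only to coarse, low-frequency feature maps so that the edit itself is not suppressed.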

[Ablation comparison videos: Input Video · FlowEdit [1] (no SGA or ANC) · + SGA · + ANC · DynaEdit (with SGA and ANC)]

BibTeX

@inproceedings{yourkey2026,
  title={Your Paper Title},
  author={First Author and Second Author and Third Author},
  booktitle={Conference Name},
  year={2026}
}

Competing Methods References