Removing an object and its interactions can require rewriting the entire scene. On the left, when the middle three blocks are removed, VOID correctly models the domino effect halting so that the yellow block never falls. On the right, when the hands are removed, VOID correctly models the spinning tops continuing without interruption. Credit: arXiv (2026). DOI: 10.48550/arxiv.2604.02296

New AI video tool removes objects without breaking the laws of physics

07 Apr 2026, 15:20 by Paul Arnold, Phys.org · Tech Xplore

When movie and TV directors want to tinker with their footage in post-production, they have an array of tools at their disposal to perfect a scene if it wasn't shot exactly how they liked. That includes removing objects like stray equipment or unwanted background actors. But the tech has its limits when it comes to more complex physical interactions.

For example, if you want to remove an object that was bumping into or supporting something else, traditional tools often leave the remaining objects behaving in ways that defy the laws of physics, like a character hovering mid-air if the chair they were sitting on is deleted.

A physics-aware approach

Researchers from Netflix and collaborators have developed VOID (Video Object and Interaction Deletion), a new AI framework that doesn't just remove items from a scene, but also rewrites the physical consequences of their absence. In the example above, it could model the character falling naturally to the floor once the chair is gone.

Details of the innovation are published in a paper available on the arXiv preprint server.


To ensure that removing an object looks realistic instead of physically impossible, VOID behaves like a physics-aware editing system.

It follows a three-step process to help it reason through a scene. First, it identifies any area that might be affected by the change and records it in a specialized map, called a quadmask, that marks where shadows should disappear or where objects may move differently. Next, it generates a new version of the video that accounts for all these changes. Finally, it makes a second pass over the scene to refine motion and ensure that objects do not warp or lose their shape as they move along different paths.
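To make the idea of a quadmask more concrete, here is a minimal sketch in Python. The article does not spell out what the map stores, so the four labels below (keep, remove, affected, and both) are an assumption suggested only by the name, and `Region` and `build_quadmask` are hypothetical helpers rather than anything taken from VOID's code.

```python
from enum import IntEnum


class Region(IntEnum):
    # Hypothetical labels; the encoding used in the paper may differ.
    KEEP = 0                 # pixel is untouched by the edit
    REMOVE = 1               # pixel belongs to the object being deleted
    AFFECTED = 2             # pixel may change (shadow, altered motion)
    REMOVE_AND_AFFECTED = 3  # pixel falls in both regions


def build_quadmask(remove_mask, affected_mask):
    """Combine two binary masks (rows of 0/1) into one four-valued mask."""
    quadmask = []
    for remove_row, affected_row in zip(remove_mask, affected_mask):
        # r contributes 1 and a contributes 2, giving the four labels above.
        row = [Region(r + 2 * a) for r, a in zip(remove_row, affected_row)]
        quadmask.append(row)
    return quadmask


# One row of four pixels: the object covers the middle two,
# and its influence (for example a shadow) covers the last two.
mask = build_quadmask([[0, 1, 1, 0]], [[0, 0, 1, 1]])
print([region.name for region in mask[0]])
# ['KEEP', 'REMOVE', 'REMOVE_AND_AFFECTED', 'AFFECTED']
```

A single labeled map like this would let a later generation step treat "erase this" and "re-simulate this" differently, which is the distinction the article describes.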

Learning cause and effect

The system was trained on thousands of pairs of digital sequences to help it understand cause-and-effect relationships. By watching these simulated actions, it learned how objects typically respond when supports, collisions or obstacles disappear.

To illustrate how VOID could handle a chain reaction, the team describes a sequence of falling dominoes where the middle tiles are deleted. The AI has to "realize" that without those pieces to pass the energy along, the rest of the line should stay upright. As they noted in their paper, "VOID correctly models the domino effect halting so that the yellow block never falls."

This is possible because the AI is doing so much more than merely copying pixels: "VOID does not just recall simple visual cues from its training data, but applies high-level reasoning and world knowledge..."
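The outcome VOID is expected to produce in the domino example can be written down as a toy rule in Python. This is only an illustration of the cause-and-effect logic: the rule is hand-written, `fallen_blocks` is a hypothetical function, and nothing here reflects how VOID itself learns or represents physics.

```python
def fallen_blocks(num_blocks, removed, pushed=0):
    """Return the indices of blocks that fall when block `pushed` is tipped.

    Toy rule (an assumption, not VOID's model): a falling block knocks over
    only its immediate right-hand neighbor, and a removed block leaves a gap
    that stops the chain.
    """
    removed = set(removed)
    fallen = []
    i = pushed
    # Walk right until we run out of blocks or hit a gap.
    while i < num_blocks and i not in removed:
        fallen.append(i)
        i += 1
    return fallen


# Original scene: seven blocks, all present, so the whole line falls.
print(fallen_blocks(7, removed=[]))         # [0, 1, 2, 3, 4, 5, 6]

# Edited scene: the middle three blocks are deleted, so the chain halts
# and the last block (index 6) stays upright.
print(fallen_blocks(7, removed=[2, 3, 4]))  # [0, 1]
```

A tool that only filled in the missing pixels would still show the last block toppling; the edited scene is physically consistent only if the chain stops at the gap.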

The researchers now hope to expand VOID's capabilities to handle more complex scenarios.

Written by Paul Arnold, edited by Gaby Clark, and fact-checked and reviewed by Robert Egan.

Publication details
Saman Motamed et al, VOID: Video Object and Interaction Deletion, arXiv (2026). DOI: 10.48550/arxiv.2604.02296
Journal information: arXiv