New AI video tool removes objects without breaking the laws of physics
by Paul Arnold, Phys.org · Tech Xplore
When movie and TV directors want to tinker with their footage in post-production, they have an array of tools at their disposal to perfect a scene if it wasn't shot exactly how they liked. That includes removing objects like stray equipment or unwanted background actors. But the tech has its limits when it comes to more complex physical interactions.
For example, if you want to remove an object that was bumping into or supporting something else, traditional tools often leave the remaining objects behaving in ways that defy the laws of physics, like a character hovering mid-air if the chair they were sitting on is deleted.
A physics-aware approach
Researchers from Netflix and collaborators have developed VOID (video object and interaction deletion), a new AI framework that doesn't just remove items from a scene, but also rewrites the physical consequences of their absence. In the example above, it could model the character falling naturally to the floor once the chair is gone.
Details of the innovation are published in a paper available on the arXiv preprint server.
To ensure that removing an object looks realistic instead of physically impossible, VOID behaves like a physics-aware editing system.
It follows a three-step process to help it reason through a scene. First, it identifies any area that might be affected by the change and records it in a specialized map called a quadmask, which marks where shadows should disappear or where objects may move differently. Second, it generates a new version of the video that accounts for all these changes. Finally, it makes a second pass of the scene to refine motion and ensure that objects do not warp or lose their shape as they move along different paths.
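To make the idea of a quadmask more concrete, here is a minimal sketch of how several per-pixel masks could be merged into one map for a single frame. It assumes the name refers to four region labels; the article does not spell out the actual encoding, so the `Region` labels, the `build_quadmask` function and the priority order are all hypothetical and are not taken from the paper.

```python
from enum import IntEnum


class Region(IntEnum):
    # Hypothetical labels: the real quadmask encoding is not described in the article.
    UNCHANGED = 0          # pixels the edit should leave alone
    REMOVED_OBJECT = 1     # pixels covered by the object being deleted
    SHADOW_TO_CLEAR = 2    # pixels where the object's shadow should disappear
    MOTION_MAY_CHANGE = 3  # pixels where other objects may move differently


def build_quadmask(object_mask, shadow_mask, affected_mask):
    """Merge three binary masks (lists of rows of 0/1) into one four-valued mask.

    Assumed priority when masks overlap: object, then shadow, then affected area.
    """
    height, width = len(object_mask), len(object_mask[0])
    quadmask = [[Region.UNCHANGED] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if object_mask[y][x]:
                quadmask[y][x] = Region.REMOVED_OBJECT
            elif shadow_mask[y][x]:
                quadmask[y][x] = Region.SHADOW_TO_CLEAR
            elif affected_mask[y][x]:
                quadmask[y][x] = Region.MOTION_MAY_CHANGE
    return quadmask


# Tiny 2x3 frame: a chair in the middle column, its shadow to the right,
# and a region (say, the seated character) that may move once the chair is gone.
chair = [[0, 1, 0], [0, 1, 0]]
shadow = [[0, 0, 1], [0, 0, 0]]
affected = [[1, 1, 0], [0, 0, 0]]
frame_mask = build_quadmask(chair, shadow, affected)
```

In this toy layout, a downstream video generator would know not only which pixels to erase but also which surrounding pixels it is allowed to change, which is the role the article attributes to the quadmask.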
Learning cause and effect
The system was trained on thousands of pairs of digital sequences to help it understand cause-and-effect relationships. By watching these simulated actions, it learned how objects typically respond when supports, collisions or obstacles disappear.
To illustrate how VOID could handle a chain reaction, the team describes a sequence of falling dominoes where the middle tiles are deleted. The AI has to "realize" that without those pieces to pass the energy along, the rest of the line should stay upright. As they noted in their paper, "VOID correctly models the domino effect halting so that the yellow block never falls."
This is possible because the AI is doing so much more than merely copying pixels: "VOID does not just recall simple visual cues from its training data, but applies high-level reasoning and world knowledge..."
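The cause-and-effect logic of the domino example can be shown with a toy simulation. This is only an illustration of the reasoning the article describes, not VOID's method: the `simulate_dominoes` function and its rule (a tile falls only if the tile directly before it is present and has fallen) are assumptions made for the sketch.

```python
def simulate_dominoes(num_tiles, removed=frozenset()):
    """Return the indices of tiles that fall when tile 0 is pushed.

    Toy rule (an assumption for illustration): the push travels down the
    line one tile at a time and stops at the first missing tile.
    """
    fallen = set()
    for index in range(num_tiles):
        if index in removed:
            break  # gap in the line: nothing passes the push along
        fallen.add(index)
    return fallen


# Full line of eight tiles: every tile falls.
full_line = simulate_dominoes(8)

# Delete the middle tiles (3 and 4): the chain halts and the last tile stays upright.
edited_line = simulate_dominoes(8, removed={3, 4})
last_tile_falls = 7 in edited_line
```

With the middle tiles removed, only the first three tiles fall and the final one never does, mirroring the outcome the researchers report for the yellow block.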
The researchers now hope to expand VOID's capabilities to handle more complex scenarios.
Written by Paul Arnold, edited by Gaby Clark, and fact-checked and reviewed by Robert Egan.
Publication details: Saman Motamed et al, VOID: Video Object and Interaction Deletion, arXiv (2026). DOI: 10.48550/arxiv.2604.02296
Journal information: arXiv