Netflix - yes Netflix - jumps on the AI bandwagon with video editor

Photo: The Register
In a user study, 64.8% of evaluators rated VOID above the popular Runway, positioning Netflix as an unexpected leader in video editing technology. Unveiled on April 3, 2026, VOID (Video Object and Interaction Deletion) is an advanced Vision-Language Model that goes well beyond simple object removal: it can intelligently "rewrite" the physics of a scene after a key element is removed. Erase a car involved in a collision, for example, and VOID corrects the motion of the second vehicle and removes every interaction effect, such as smoke, debris, and skid marks, leaving a physically plausible shot of an empty road.

For the global creative community, this promises a drastic reduction in costly reshoots and in hours spent on manual retouching in post-production. The model handles extremely difficult scenarios, such as removing a person jumping into a pool while simultaneously erasing the splashes and restoring the water to a state of rest. By releasing VOID on the Hugging Face platform, Netflix is democratizing technology previously reserved for major Hollywood studios, and sending a clear signal that the boundary between pure image recording and generative modification is blurring, handing editors a tool with almost unlimited narrative agency.
Imagine the final scene of the high-budget blockbuster "Car Crash III: Suddenest Impact." The main character, played by superstar Cruz Control, crashes head-on into an oncoming truck. The explosion is spectacular, car debris litters the highway, and the hero's career ends in clouds of smoke. Then the producer decides: "What if Cruz survives after all and drives off into the sunset?" In a traditional production model, that would mean expensive reshoots or months of work by CGI specialists. With Netflix's new technology, the scenario can be changed in a few clicks.
The streaming giant has officially joined the artificial-intelligence arms race by unveiling the VOID model. The name is an acronym for Video Object and Interaction Deletion, and as it suggests, the tool is capable of far more than simple object removal from a frame. It is an advanced Vision-Language Model (VLM) that redefines how we understand video editing and physics in the digital world.
Physics instead of empty spots
The key innovation of VOID is its ability to perform so-called physically plausible inpainting (image completion). Most object-removal tools on the market handle backgrounds well but fail when the removed element affected the rest of the scene. Remove a person jumping into a pool, and traditional algorithms may leave unnatural splashes suspended in mid-air or a blurred surface texture.
VOID understands interactions. In the pool-jump example, the model not only erases the figure but generates a video in which the water surface remains undisturbed and no splashes appear at all. In a car-accident scene, the model can remove one of the vehicles, erase the fire, smoke, and debris, and then generate smooth motion for the second car, which drives on as if the collision never occurred. This is a shift from simple retouching to full simulation of an alternative reality.
Technological foundation and creators
Behind the project is a team of researchers from Netflix and Sofia University: Saman Motamed, William Harvey, Benjamin Klein, Luc Van Gool, Zhuoning Yuan, and Ta-Ying Cheng. In their publication, they describe VOID as a framework designed to model the complex dynamics that follow object removal, which lets it handle scenarios previously considered too complicated for automated editing systems.
Significantly for the creative industry, Netflix has opted for openness. The model has been made available on the Hugging Face platform, meaning it is accessible not only to Hollywood studios but to any user with the appropriate hardware infrastructure. This is a strategic move that could accelerate the adoption of AI tools in independent film production and post-production.
VOID against the competition
The generative video tool market is becoming increasingly saturated. Netflix had to face established players and new models such as:
- Runway
- Generative Omnimatte
- DiffuEraser
- ROSE
- MiniMax-Remover
- ProPainter
According to evaluations conducted by VOID's creators, the model outperforms the competition. In a user study with 25 participants who judged various scenarios, VOID was rated best in 64.8 percent of cases; the popular Runway took second place with just 18.4 percent. This wide margin reflects the Netflix model's stronger temporal and logical consistency in edited segments.
"Through extensive evaluations against baseline inpainting and text-driven video models on synthetic and real-world data, we show that VOID excels at modeling complex dynamics," the project authors declare.
An end to expensive reshoots?
The implementation of VOID could bring massive savings to production budgets. Changing key elements of a scene without calling the crew back to the set is every producer's dream. The tool allows the correction of errors that were previously impossible to fix in post-production, from unwanted pedestrians in the background to fundamental changes in fight or stunt choreography.
However, questions of ethics and authenticity arise. Amid a rising wave of disinformation, a tool capable of such convincing manipulation of reality raises understandable concerns. If VOID can make a car accident disappear without a trace, the line between what was filmed and what was generated becomes almost invisible. Netflix's technology is a powerful weapon in creators' hands, but also another step toward a digital fluidity of visual truth that will redefine our relationship with visual media in the coming years.