OpenAI Unveils Shocking Video-Generating AI
Have you seen the videos created by OpenAI's Sora? Type in a simple prompt and you get a woman walking through a colorful Tokyo city center, or a mammoth charging through a snowy meadow. Here are three of the most impressive features of Sora, a video-generating AI whose quality has shocked many.
Consistency
The first and foremost feature is consistency. In the three frames below, taken from a Sora video, a woman walks past a sign. She briefly blocks the text on the sign, but when the sign reappears after she has passed, the text is unchanged. This is because Sora can accurately tell the sign and the woman apart as separate objects and restore the background she covered.
This is also why Sora can maintain 3D consistency in dynamic footage, even while the camera moves and rotates. That is remarkable, considering that other AI video generators tend to stick to fixed, straight-ahead camera shots to minimize jagged objects and backgrounds and visual noise.
High expressiveness
When I first saw Sora's footage, I thought, "This is real." The crisp high definition, the intricate backgrounds, and the lifelike reflections in the water and on glass windows made it hard to believe it was fake. Take a look at the still below and you'll see why.
Versatility
Sora is also incredibly versatile. It can generate videos from text, like the examples described above, but it can also take images or videos as input and turn them into natural-looking videos. You can even extend an existing video, or create a video that connects two different videos.
What's amazing about Sora is that its output is not limited to a live-action look: it can also produce videos in the style of digital art, illustrations, and watercolors, and even videos set in the digital world of a game. The description accompanying the Minecraft video also suggests that a video-generating AI like Sora has the potential to serve as a simulator of sorts.
Limitations and implications
While Sora's videos have shocked people with a quality that is hard to believe for AI-generated footage, some awkward spots remain. It does not model basic interactions well, such as glass breaking, and in scenes with many objects it sometimes places them unnaturally.
Generating the videos also requires a lot of computing resources, and training the model requires a lot of data and a complex process, so it will be a while before Sora is ready for public use.
But even considering these shortcomings, it's clear that Sora is a tremendous achievement for AI video creation. Its significance for the AI industry and beyond is that it gives us confidence in the direction we're headed. Sora suggests that what we were doing was not wrong, and that earlier AI videos fell short because they lacked scale. So far, "Scale is all you need" has not been proven wrong.
When will Sora be available?
There's no specific information about Sora's release date yet. OpenAI says it is currently available only to select artists and others, who are giving feedback, and that the goal for now is to share early results and progress. With the recent rise in fake news, crime, and ethical problems involving AI, Sora will need safeguards against these issues before any wider release.