On Thursday, February 15, 2024, OpenAI launched its text-to-video model, Sora.
The generative AI model can turn written prompts into videos up to one minute long. It can also generate videos out of still images or build upon videos by adding frames and making them longer.
OpenAI CEO Sam Altman has been using Sora to transform prompts from X users into videos. One example depicts a wizard shooting lightning out of his hands, which mostly have the correct number of fingers, apart from a few frames.
In a blog post, OpenAI said that Sora “may struggle with accurately simulating the physics of a complex scene, and may not understand specific instances of cause and effect. For example, a person might take a bite out of a cookie, but afterward, the cookie may not have a bite mark.”
Last month, OpenAI announced that it’s adding watermarks to DALL-E 3. While not mentioning watermarking specifically, OpenAI’s blog post said “we’re leveraging the existing safety methods that we built for our products that use DALL·E 3, which are applicable to Sora, as well.”
Sora will also eventually reject prompts that “request extreme violence, sexual content, hateful imagery, celebrity likeness or the IP of others,” OpenAI said. Image classifiers will review the frames of every video generated to check that they adhere to usage policies.
But Sora may still exacerbate the difficulty consumers already have in telling whether an image is AI-generated. For example, many viewers thought that this year’s Super Bowl ad from He Gets Us, which didn’t rely on AI, was AI-generated.
According to OpenAI’s blog post, it’s “building tools to help detect misleading content, such as a detection classifier that can tell when a video was generated by Sora.”
For now, Sora is only available to a select few experts in misinformation, hateful content and bias, who are testing it for potential risks, as well as some visual artists, designers and filmmakers. There’s no word yet on when it may get a wider release.
However, Sora has the ad industry hypothesising about how the tool could revolutionise the way ads are created.
We asked AI ad experts about the creative potential behind Sora and the challenges it could pose from a copyright and quality-of-work perspective. Below are their thoughts on the new tech:
Jason Zada, founder of Secret Level
When OpenAI announced Sora, the entire AI and creative community gasped in unison. The quality of Sora is beyond anything we've seen.
All of the existing text-to-video and image-to-video models struggle with basics like frame rate and making people walk realistically. Sora appears to have solved all of that while adding a level of realism we've never seen before.
Currently, most of the models produce semi-realistic video output, but also have a slightly Uncanny Valley vibe as well. If anyone and everyone can simply generate very realistic video footage with nothing more than an elaborate string of text, we will see a whole new world of content creation instantly pop up.
We've been preaching ethically conscious AI, which includes using actors for performance capture with voice and image. Once we get to a point where we can prompt two generative AI actors to act in a scene, we will move into a dangerous area that threatens the entire filmmaking process.
Henry Daubrez, CEO and chief creative officer at Dogstudio/Dept
OpenAI just dropped a big bomb on all the text-to-video companies out there hogging the headlines — namely Runway, Pika Labs and Stable Video.
Before Sora, the text-to-video ecosystem was divided between slow-motion videos, produced when trying to avoid excessive hallucination, and visually wild videos that are difficult to use commercially.
With Sora, text-to-video just made huge progress, instantly moving from infancy to teen years, while the other models are still toddlers.
Sora is a Midjourney moment for AI video. [We now have] up to 30fps AI videos that are one minute long (the competition offers four seconds), with a high degree of movement and realism (apart from the extra fingers or hands, which will likely be fixed soon) and an understanding of extremely complex requests.
If Sora continues to move at this speed, we could see a revolution in the film industry in the next 24 months that opens the way for a whole new generation of cineasts who don’t have the access or the money to break into the industry but have something to say.
There are also a lot of questions — an entire world of questions regarding misinformation warfare. OpenAI can monitor for this, but if the technology is around, how long before it is replicated? How long before we have AI-generated videos that relay hatred and surgically sway elections?
McDonald Predelus, VP and creative director of technology at VML Commerce
Sora has the potential to revolutionise how we interact and communicate in social settings.
It can generate content quickly and efficiently, making it easier for users to express themselves creatively and engage with others on various platforms. Whether it's generating digital assets, text, images or even music, Sora opens up new possibilities for content creation and collaboration.
AI and blockchain could play a significant role in addressing concerns, particularly in verifying the authenticity and ownership of content generated by Sora. By leveraging blockchain, it becomes possible to create immutable records of content creation and ownership, providing a transparent and decentralised way to verify its integrity. This could help mitigate the risk of copyright infringement and ensure that creators receive proper credit and compensation for their work.
Craig Elimeliah, chief creative officer at Lippe Taylor
Sora is a completely new chapter for creatives, allowing us to dream up and deliver dynamic narratives from our imaginations to the screen! It's like having a production studio at our fingertips; this pushes the boundaries of storytelling.
Still, we tread a fine line, having to navigate copyrights and ensuring our wild creations don't just dazzle, but also respect originality. As we start to use this tool, our challenge will be to blend AI's potential with our human experience without compromising the art of the craft or its legal standing.
Sora's not just about making videos; it's about envisioning the future of creativity. But we must also anchor ourselves in the ethics of today's very complicated digital ecosystem.
Henry Cowling, chief innovation officer at MediaMonks
The creative industry needs to come to terms with the fact that our roles are radically shifting because of this technology. We're seeing the democratisation of creativity.
[Sora] has the ability to take creativity in video to a new level, both for people who are already experts in their fields and for a whole new group of people that now have the power to create their wildest dreams. In terms of quality, what you see now is the worst it will ever be. This is exciting.
It shows, again, that the speed of development is unprecedented and is exactly the reason why brands and corporations need strong partners in order to navigate this evolution properly.
When it comes to copyright, we have copyright-protected models to address these challenges. It’s true that how brands choose to engage with technology is an important expression of their values. But the genie is out of the bottle. There's no going back to a world before LLMs because of copyright alone.
(This article first appeared on PRWeek.)