Artificial intelligence (AI) is developing at breakneck speed, and this weekend OpenAI unveiled one of its biggest updates in a while. Sora is OpenAI's latest AI model, which lets users create realistic and imaginative scenes from text instructions alone.
Sora enables industry professionals to create realistic and complex videos without leaving their seats.
This is more important than ever. According to a report conducted by AI Munk, an automation platform for social media, today's consumers are watching more videos and demand for short-form content is growing rapidly, with 66% finding this type of content the most appealing.
According to the report, video content is no longer an option but a necessity for businesses and brands looking to succeed, with 42% of businesses preferring to post such videos on Instagram and 26% preferring Facebook. TikTok does not rank in the top three platform choices for marketers.
Given the importance of short-form video content in your marketing efforts, we'll take a closer look at what you need to know about Sora and how it can help industry professionals in this space.
What is Sora?
Sora is part of OpenAI's effort to teach AI to understand and simulate the physical world in motion, with the goal of training models that help people solve problems requiring real-world interaction, the company said in a statement.
As such, Sora is a text-to-video model that can generate videos up to a minute long while maintaining visual quality and adherence to the user's prompt.
Sora can generate complex scenes with multiple characters, specific types of motion, and precise subject and background details. The model understands not only what the user asks for in a prompt, but also how those things exist in the physical world.
The model's deep understanding of language allows it to interpret prompts accurately and generate engaging characters that express vivid emotions. Sora can also create multiple shots within a single generated video while keeping characters and visual style consistent.
“Specifically, we jointly train a text-conditional diffusion model on videos and images of varying lengths, resolutions, and aspect ratios. We leverage a transformer architecture that operates on spatiotemporal patches of video and image latent codes,” OpenAI said.
How exactly does it work?
This part is a bit technical, but OpenAI says it takes inspiration from large language models (LLMs), which gain generalist capabilities by training on internet-scale data.
“The success of the LLM paradigm is enabled in part by the use of tokens that elegantly unify diverse forms of text, including code, mathematics, and various natural languages. In this study, we consider how generative models of visual data can inherit such advantages.”
OpenAI explained in a technical report that whereas LLMs have text tokens, Sora has visual patches. Patches have previously been shown to be an effective representation for models of visual data.
“We found that patches are a highly scalable and effective representation for training generative models on a wide variety of video and image types,” the company said.
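To make the idea of patches more concrete, here is a minimal sketch of cutting a tiny video into spacetime patches, the visual counterpart of text tokens. It is an illustration only and not OpenAI's implementation: the function name `patchify`, the patch sizes, and the toy video are all assumptions, and the sketch works on raw values whereas OpenAI describes operating on compressed latent codes.

```python
def patchify(video, pt, ph, pw):
    """Split a video into flattened spacetime patches.

    `video` is a nested list indexed as video[frame][row][col].
    Each patch spans pt frames, ph rows and pw columns.
    (Hypothetical helper for illustration only.)
    """
    frames, height, width = len(video), len(video[0]), len(video[0][0])
    if frames % pt or height % ph or width % pw:
        raise ValueError("video dimensions must be divisible by the patch size")

    patches = []
    for t0 in range(0, frames, pt):
        for y0 in range(0, height, ph):
            for x0 in range(0, width, pw):
                # Flatten one pt x ph x pw block into a single "token".
                patch = [
                    video[t][y][x]
                    for t in range(t0, t0 + pt)
                    for y in range(y0, y0 + ph)
                    for x in range(x0, x0 + pw)
                ]
                patches.append(patch)
    return patches


# Toy video: 4 frames of 4x4 "pixels", each value encoding its own position.
video = [[[t * 16 + y * 4 + x for x in range(4)] for y in range(4)] for t in range(4)]
tokens = patchify(video, pt=2, ph=2, pw=2)
print(len(tokens), len(tokens[0]))
```

Because every patch has the same length regardless of the clip's duration or resolution, a sequence of such patches can be fed to a transformer in much the same way as a sequence of text tokens.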
Sora is essentially a diffusion model: it generates a video by starting with what looks like static noise and gradually transforming it by removing the noise over many steps.
It can generate an entire video at once or extend a generated video to make it longer.
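The step-by-step denoising can be sketched with a toy example. This is not how Sora itself is implemented: a real diffusion model uses a trained neural network and a noise schedule to predict the noise, while the sketch below assumes an "oracle" that already knows the clean signal, purely to show the loop of starting from noise and removing a little of it at each step. The function name `denoise` and its parameters are hypothetical.

```python
import random


def denoise(target, steps=50, seed=0):
    """Toy denoising loop: start from random noise and move toward `target`.

    ASSUMPTION: the noise prediction below is an oracle that knows the clean
    signal. In a real diffusion model a learned network makes this prediction.
    """
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in target]  # pure static noise
    history = [x]
    for step in range(steps):
        remaining = steps - step
        # Stand-in for the model: estimate the noise still present in x.
        predicted_noise = [xi - ti for xi, ti in zip(x, target)]
        # Remove an equal share of the remaining noise at each step.
        x = [xi - ni / remaining for xi, ni in zip(x, predicted_noise)]
        history.append(x)
    return x, history


clean = [0.0, 0.5, 1.0, 0.5, 0.0]  # stand-in for one row of pixel values
result, history = denoise(clean)


def error(sample):
    return max(abs(s - c) for s, c in zip(sample, clean))


print(round(error(history[0]), 3), round(error(result), 3))
```

The distance from the clean signal shrinks at every step, which mirrors the "static noise to finished video" progression described above.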
The model also builds on previous work on the DALL·E and GPT models. It uses the recaptioning technique from DALL·E 3, which generates highly descriptive captions for the visual training data. As a result, the model can follow the user's text instructions more closely in the generated video.
In addition to being able to generate videos from text instructions alone, this model can take existing still images and generate videos from them, and animate image content with precision and detail.
What are some of its weaknesses?
As with all AI models, weaknesses, biases, and misinformation can occur from time to time. As OpenAI acknowledges, Sora is no exception.
Currently, Sora struggles to accurately simulate the physics of complex scenes and may not understand certain instances of cause and effect. For example, even if a person bites into a cookie, there may not be a bite mark left on the cookie afterward, OpenAI said.
The model may also confuse the spatial details of a prompt (for example, mixing up left and right). It may also struggle with precise descriptions of events that take place over time, such as following a specific camera trajectory.
OpenAI announced that it will work with experts in areas such as misinformation, hateful content, and bias to adversarially test the model prior to its public release.
“We are also building tools to help detect misleading content, such as a detection classifier that can tell when a video was generated by Sora,” the company said, adding that it will include C2PA metadata. It added that it will also leverage the existing safety methods it has built for products that use DALL·E 3.
For example, once Sora is integrated into an OpenAI product, a text classifier will check and reject text input prompts that violate the usage policy. These include prompts requesting extreme violence, sexual content, hateful imagery, celebrity likenesses, or the IP of others.
“We have also developed a robust image classifier that is used to review frames of every video produced to ensure they comply with usage policies before being displayed to users,” the company said.
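The two checks described above, one on the prompt before generation and one on the frames afterwards, can be sketched as a simple pipeline. Everything in this sketch is hypothetical: OpenAI has not published its classifiers, so the keyword list, the function names, and the stand-in generator and frame check are placeholders that only show the order of the steps.

```python
# Hypothetical placeholder policy; a real system would use trained classifiers.
BLOCKED_TERMS = {"extreme violence", "celebrity likeness"}


def prompt_allowed(prompt):
    """Stand-in text check: reject prompts containing a blocked term."""
    text = prompt.lower()
    return not any(term in text for term in BLOCKED_TERMS)


def generate_with_checks(prompt, generate, frame_ok):
    """Run the prompt check, generate frames, then review every frame.

    `generate` and `frame_ok` are caller-supplied stand-ins for the video
    model and the image classifier. Returns the frames, or None if either
    check fails.
    """
    if not prompt_allowed(prompt):
        return None  # rejected before any video is generated
    frames = generate(prompt)
    if not all(frame_ok(frame) for frame in frames):
        return None  # withheld before being shown to the user
    return frames


# Toy stand-ins so the sketch runs end to end.
def fake_generate(prompt):
    return [f"{prompt} / frame {i}" for i in range(3)]


def fake_frame_ok(frame):
    return "forbidden" not in frame


print(generate_with_checks("a corgi surfing at sunset", fake_generate, fake_frame_ok))
print(generate_with_checks("a scene of extreme violence", fake_generate, fake_frame_ok))
```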
OpenAI also plans to work with policymakers, educators, and artists around the world to understand their concerns and identify positive use cases for this new technology.
Introducing technology in marketing
That said, the main question is how marketers and industry professionals can implement this technology in their daily work. According to the industry professionals MARKETING-INTERACTIVE spoke to, the technology has “amazing” potential.
Pramodth Rai, co-founder of Cyber Sierra, said Sora's features and low barrier to entry mean there is a “very high likelihood” that consumers, including creative teams such as marketers and advertisers, will start experimenting with it.
“Content creation is currently being revolutionized in a very exciting way. The ability to quickly prototype and create high-quality videos and dynamic visuals significantly reduces the time-to-market and the resources required for traditional advertising campaigns,” Rai said. He added that the ability to automate routine editing tasks and tailor content for different platforms and audiences frees up creative teams to focus on the strategic and innovative aspects of their campaigns.
“As the lines between reality and AI blur thanks to advances like Sora, personalized advertising through custom content will proliferate. It can enhance collaboration and enable tighter feedback loops between team members, and it is likely to be done cheaply as well, sparking a new level of experimentation across society,” he said.
Milind, an AI scientist at Mercedes who was sharing his personal opinion, agreed, saying that from what has been shared so far, the model's capabilities look “pretty amazing.”
“The consistency and quality of video over time is truly revolutionary. It is no exaggeration to say that it will be extremely useful for use cases such as hyper-personalized video content creation. We are confident that it will continue to improve sound production and provide fine-grained control in the future,” he said.
Being cautious about the technology
That said, marketers should not get too excited about the technology just yet. Edwin Yeo, general manager of Strategic Public Relations Group, said marketers need to be “cautious” about deploying Sora quickly, or they would be doing so at “huge risk.” He said:
If there's one thing we've learned from advances in technology, it's that technology tends to outpace regulatory and safety concerns.
He added that questions around usage and copyright remain big challenges for marketers and content creators when it comes to Sora and generative AI in general.
Apart from copyright and safety concerns, quality remains questionable, Yeo added. “As with AI art, it still does not handle hands well, and I still have doubts about the computing power required to output video in 4K or 8K format.”
He added that he personally uses tools such as Midjourney for concept presentations but, once a concept is approved, still goes back to photography and DI. “I think Sora will be useful in the same way in the near future. It is already having a big impact on marketing workflows, but we are a long way from Sora being able to replace video production,” Yeo said.
Adding to Yeo's point, Rai said there are also still a number of potential brand safety concerns that marketers need to be aware of.
“First, deepfakes and misinformation constitute a new and unprecedented level of risk that can impact brand safety,” Rai said. Additionally, brands may face the problem of inappropriate content being generated that is inconsistent with brand values or that may be offensive or insensitive. He said:
As the world begins to rely on AI-generated content rather than human oversight, brand credibility could take a hit.
Apart from these issues, marketers should also be aware of what is lost without human input, which remains important because AI models like Sora can misinterpret creative briefs, raise data privacy and security issues, and lead to copyright infringement lawsuits.
“Marketers need to use Sora to generate content that resonates with individual tastes and behaviors while keeping humans in the loop, where a combination of creativity, strategy, analysis, and a unique personal touch shines,” Rai said.