OpenAI announced a new artificial intelligence tool that can take a text prompt and turn it into a video.
Sora is the newest tool developed by the company behind ChatGPT.
Sora can take a text prompt and create a video based on it. One example was based on the prompt, “New York City submerged like Atlantis. Fish, whales, sea turtles and sharks swim through the streets of New York.”
Other companies, including Google, have developed similar text-to-video tools.
So what does this mean for the ever-advancing artificial intelligence industry?
“This one is more in-depth, more unique, I guess if you will. And it generates very realistic video, moving images. And it does that with what we call a text prompt, so you can type something in and ask it to generate a scene based on text,” said Steve Beaty, a professor of computer science at MSU Denver.
Beaty said Sora works much like ChatGPT does as a large language model: it takes a frame and decides what the next logical frame should be.
Experts say tools like this could make AI more difficult to detect.
“It is going to be possible, I think moving into the future, to not be able to easily recognize some of these images, etcetera. I think it's already happened to a certain degree,” Beaty said.
There may, however, be ways for our computers to differentiate what’s real from what’s AI.
“It could be, moving forward, that certain companies like OpenAI are going to embed things inside of the video that, for example, our browsers will be able to immediately recognize as generated AI. Now there will be other companies who might not choose to do that,” he explained.
Another factor experts are considering is how the AI was trained.
In an email statement to Scripps News, Abe Davis, an assistant professor of computer science at Cornell University, wrote, "The results are definitely impressive, but understanding their real significance will require a bit of a shift in how we think about video."
Davis went on to explain how a complex video made by a human reflects deliberate choices on the part of the creator behind every detail. In contrast, he explained, "When AI produces something like this, we get a video with way more detail than the short text prompt used to generate it. That detail must come from somewhere, and in this case, it comes from other content that the AI has been trained on. So, we should probably think of the result as a mash-up of other content made by different creators with different intentions."
OpenAI’s new text-to-video tool is not perfect. On its website, the company wrote, “The current model has weaknesses,” noting, “For example, a person might take a bite out of a cookie, but afterward, the cookie may not have a bite mark.”
OpenAI wrote on its website that it is taking several safety steps before making the tool publicly available.
“We can expect a certain amount of concern, and I think that's completely appropriate. We might be able to see a certain amount of legislation come out,” Beaty said.