OpenAI – “Creating video from text. Sora is an AI model that can create realistic and imaginative scenes from text instructions. Read the Technical Report – Video generation models as world simulators. We explore large-scale training of generative models on video data. Specifically, we train text-conditional diffusion models jointly on videos and images of variable durations, resolutions and aspect ratios. We leverage a transformer architecture that operates on spacetime patches of video and image latent codes. Our largest model, Sora, is capable of generating a minute of high fidelity video. Our results suggest that scaling video generation models is a promising path towards building general purpose simulators of the physical world.”
Washington Post [read free] – “The new tool, called ‘Sora,’ further raises concerns about deep fakes as AI shows up in elections around the world. Artificial intelligence company OpenAI showed off a new AI tool that can generate highly realistic 60-second videos based off a simple text prompt, a jump forward in quality for AI videos and “deepfakes” that have already been used to deceive voters. The new tool, called “Sora,” will initially only be available to a small group of artists and filmmakers as well as “red teamers,” or researchers who try to find ways that an AI tool can be used for malicious purposes, OpenAI said in an announcement Thursday. Sora builds on the tech behind OpenAI’s image-generating DALL-E tool. It interprets a user’s prompt, expanding it into a more detailed set of instructions, and then uses an AI model trained on video and images to create the new video. The quality of AI-generated images, audio and video has rapidly increased over the past year, with companies like OpenAI, Google, Meta and Stable Diffusion racing to make more capable tools and find ways to sell them. At the same time, democracy advocates and AI researchers have warned that the tools are already being used to trick and lie to voters. This isn’t the first time such videos or audio have been created and other companies have built their own text-to-video AI generators. Google is testing one called Lumiere, Meta has a model called Emu, and AI start-up Runway has already been building products to help filmmakers create videos. But AI experts and analysts said the length and quality of the Sora videos went beyond what has been seen up to now…”