VentureBeat: “Today at its Spring Updates event, OpenAI chief technology officer Mira Murati announced a powerful new multimodal foundation large language model (LLM), GPT-4o (short for GPT-4 Omni), which will be made available to all free ChatGPT users in the coming weeks, and a ChatGPT desktop app for MacOS (later for Windows) that will allow users access outside the web and mobile apps. “GPT-4o reasons across voice, text, and vision,” Murati said. That includes accepting and analyzing realtime video captured by users on their ChatGPT smartphone apps, though this capability is not yet publicly available…. The new model responds in realtime audio, can detect a user’s emotional state from audio and video, and can adjust its voice to convey different emotions, similar to rival AI startup Hume.”
See also Gizmodo: “OpenAI Unveils GPT-4 Omni’s Voice Capabilities and They’re Literally Unbelievable. ChatGPT sounds more human than ever with OpenAI’s release of GPT-4 Omni, capable of processing text, audio, and vision with little to no latency.”