The business is also putting out a ChatGPT PC app.
GPT-4o is a completely new artificial intelligence model that was introduced by OpenAI on Monday. According to the business, this model brings the company one step closer to “much more natural human-computer interaction.” The new model is capable of producing output in all three modes, including text, audio, and visuals, and it is able to receive any combination of these modalities as input. Additionally, it is able to identify feelings, allows you to interrupt it in the middle of a sentence, and answers almost as quickly as a human person would when they are having a conversation.
Mira Murati, the Chief Technology Officer of OpenAI, stated during a live-streamed presentation that “the special thing about GPT-4o is that it beings GPT-4 level intelligence to everyone, including what we call our free users.” “This is the very first time that we are making a significant advancement in terms of how easy it is to use everything.”
During the presentation, OpenAI demonstrated GPT-4o’s ability to translate live between English and Italian, assist a researcher in solving a linear equation in real time on paper, and provide instruction on deep breathing to another OpenAI executive merely by listening to his breaths. All of these capabilities were demonstrated.
Say hello to GPT-4o, our new flagship model which can reason across audio, vision, and text in real time: https://t.co/MYHZB79UqN
— OpenAI (@OpenAI) May 13, 2024
Text and image input rolling out today in API and ChatGPT with voice and video in the coming weeks. pic.twitter.com/uuthKZyzYx
The letter “o” in GPT-4o is an abbreviation for “omni,” which is a reference to the multimodal capabilities of the model. According to OpenAI, GPT-4o was trained on text, vision, and audio, which indicates that the same neural network is responsible for processing all of the inputs and outputs. The GPT-3.5 and GPT-4, the company’s earlier models, did allow users to ask inquiries merely by speaking, but they transcribed the speech into text afterward. This is a significant departure from the previous configuration. The tone and emotion were removed, and the interactions became more sluggish as a result.
Over the next few weeks, OpenAI will make the new model accessible to all users, including those who use ChatGPT for free. Additionally, the company will release a desktop version of ChatGPT, initially for the Mac, which will be accessible to premium users beginning today.
The news made by OpenAI comes one day before Google I/O, which is the annual developer conference held by the firm. Google teased a version of Gemini, its own artificial intelligence chatbot, with capabilities comparable to those of GPT-4o, which was disclosed by OpenAI not long after.