The business is also putting out a ChatGPT PC app.

GPT-4o is a completely new artificial intelligence model that was introduced by OpenAI on Monday. According to the business, this model brings the company one step closer to “much more natural human-computer interaction.” The new model is capable of producing output in all three modes, including text, audio, and visuals, and it is able to receive any combination of these modalities as input. Additionally, it is able to identify feelings, allows you to interrupt it in the middle of a sentence, and answers almost as quickly as a human person would when they are having a conversation.

Mira Murati, the Chief Technology Officer of OpenAI, stated during a live-streamed presentation that “the special thing about GPT-4o is that it beings GPT-4 level intelligence to everyone, including what we call our free users.” “This is the very first time that we are making a significant advancement in terms of how easy it is to use everything.”

During the presentation, OpenAI demonstrated GPT-4o’s ability to translate live between English and Italian, assist a researcher in solving a linear equation in real time on paper, and provide instruction on deep breathing to another OpenAI executive merely by listening to his breaths. All of these capabilities were demonstrated.

Say hello to GPT-4o, our new flagship model which can reason across audio, vision, and text in real time: https://t.co/MYHZB79UqN

Text and image input rolling out today in API and ChatGPT with voice and video in the coming weeks. pic.twitter.com/uuthKZyzYx
— OpenAI (@OpenAI) May 13, 2024

The letter “o” in GPT-4o is an abbreviation for “omni,” which is a reference to the multimodal capabilities of the model. According to OpenAI, GPT-4o was trained on text, vision, and audio, which indicates that the same neural network is responsible for processing all of the inputs and outputs. The GPT-3.5 and GPT-4, the company’s earlier models, did allow users to ask inquiries merely by speaking, but they transcribed the speech into text afterward. This is a significant departure from the previous configuration. The tone and emotion were removed, and the interactions became more sluggish as a result.

Over the next few weeks, OpenAI will make the new model accessible to all users, including those who use ChatGPT for free. Additionally, the company will release a desktop version of ChatGPT, initially for the Mac, which will be accessible to premium users beginning today.

The news made by OpenAI comes one day before Google I/O, which is the annual developer conference held by the firm. Google teased a version of Gemini, its own artificial intelligence chatbot, with capabilities comparable to those of GPT-4o, which was disclosed by OpenAI not long after.

Login

OpenAI says its GPT-4o model is free and can see, talk, laugh, and sing like a person – technology

Sam Altman Says Mission Driven AI Talent Will Outperform Meta’s

Skypeaklimits 2024: Your Digital Success Elevate Your Presence

OpenAI partners with Palmer Luckey’s Anduril to build military AI

MS assures Windows 11 TPM security requirement won’t change