For developers who use Google’s models to build AI applications, the new model will mean cost savings.
Google held its annual developer conference, I/O, on Tuesday, where it introduced improvements to its Gemini family of artificial intelligence models. Among them is a new model called Gemini 1.5 Flash, which the company says is built for speed and efficiency.
Demis Hassabis, CEO of Google DeepMind, wrote in a blog post that “[Gemini] 1.5 Flash excels at summarization, chat applications, image and video captioning, data extraction from long documents and tables, and more.” Hassabis added that Google built Gemini 1.5 Flash because developers needed a model that was lighter and more affordable than the Pro version, which Google released in February. Gemini 1.5 Pro is a more powerful and efficient version of the company’s original Gemini model, introduced late last year.
Gemini 1.5 Flash sits in a convenient middle ground between Gemini 1.5 Pro and Gemini Nano, Google’s smallest model, which runs locally on smartphones. Despite its lighter weight, it is nearly as capable as 1.5 Pro. According to Google, this was accomplished through a process called “distillation,” in which the most essential knowledge and skills from Gemini 1.5 Pro were transferred to the smaller model. That means Gemini 1.5 Flash gets the same multimodal capabilities as Pro, as well as its long context window of one million tokens — the amount of data an AI model can take in at once. According to Google, that is enough for Gemini 1.5 Flash to analyze a 1,500-page document or a codebase with more than 30,000 lines of code in one go.
Gemini 1.5 Flash, like the rest of these models, isn’t really designed with the end user in mind. Instead, it is a tool that lets developers build their own AI products and services faster and more cheaply on technology designed by Google.
In addition to launching Gemini 1.5 Flash, Google is also improving Gemini 1.5 Pro. The company said it has “enhanced” the model’s abilities to write code, reason, and analyze audio and images. The most significant change, however, is still to come: Google says it will expand the model’s context window to two million tokens by the end of the year. That would let it process more than 1.4 million words, more than 60,000 lines of code, two hours of video, or 22 hours of audio at once.
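Those capacity figures can be sanity-checked with a quick back-of-envelope calculation. The sketch below assumes roughly 0.75 English words per token and about 500 words per printed page — common rules of thumb, not figures published by Google.

```python
# Back-of-envelope check of the stated context-window capacities.
# Assumptions (rules of thumb, not Google's own figures):
#   ~0.75 English words per token, ~500 words per printed page.
WORDS_PER_TOKEN = 0.75
WORDS_PER_PAGE = 500

def tokens_to_words(tokens: int) -> int:
    """Rough word-count equivalent of a token budget."""
    return int(tokens * WORDS_PER_TOKEN)

def tokens_to_pages(tokens: int) -> int:
    """Rough page-count equivalent of a token budget."""
    return tokens_to_words(tokens) // WORDS_PER_PAGE

# Gemini 1.5 Flash and Pro today: a 1-million-token window.
print(tokens_to_words(1_000_000))  # 750000 words
print(tokens_to_pages(1_000_000))  # 1500 pages -- matches the ~1,500-page claim

# Planned expansion for 1.5 Pro: 2 million tokens.
print(tokens_to_words(2_000_000))  # 1500000 words -- "more than 1.4 million"
```

Under these assumptions the arithmetic lines up with both the 1,500-page figure for the current window and the 1.4-million-word figure for the planned two-million-token window.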
Gemini 1.5 Flash and Gemini 1.5 Pro are now available in public preview through Google’s AI Studio and Vertex AI. The company also announced a new version of its Gemma open model, called Gemma 2. But unless you’re a developer or someone who enjoys tinkering with AI applications and services, these updates aren’t really aimed at the typical consumer.