Gemini was the star of Google I/O 2024, with many announcements centered on the LLM. Among them was the arrival of Gemini 1.5 Flash, an ultra-fast version of the large language model. Why this LLM, and what will it be used for?

Gemini was the topic of the day at Google I/O 2024, Google's annual conference, which brought plenty of new features. Gemini is coming to Google Photos and to Google Search, and its most powerful version is arriving in Gemini Advanced. Alongside all these announcements, there is also a very fast version of the LLM: Gemini 1.5 Flash.

What is Gemini 1.5 Flash?

Google has unveiled yet another version of Gemini, called Gemini 1.5 Flash. It is a lighter language model than Gemini 1.5 Pro, "designed to be fast and efficient at scale," Google says in its blog post. The idea is not so much to offer an LLM that the general public can use directly, but rather a model to be integrated into applications (via the Google API). This LLM is "optimized for high-volume, high-frequency tasks," while being more cost-effective for companies that would like to use it.

Gemini 1.5 Flash // Source: Google

It is nonetheless multimodal, like its big brother Gemini 1.5 Pro, which means it can work from text, audio, or even images, and in large quantities. It has a processing capacity of one million tokens, enough to handle one hour of video, eleven hours of audio, 30,000 lines of code, or 700,000 words in one go.
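To get a feel for what the one-million-token capacity means in practice, here is a rough sketch that turns the figures quoted above into a back-of-the-envelope estimate. It assumes that each quoted amount fills the whole capacity on its own and that token usage scales linearly with the amount of content, which is a simplification rather than anything Google states. The function names are hypothetical and not part of any Google API.

```python
# Back-of-the-envelope estimate based on the figures quoted in the article.
# Assumption: each quantity below fills the 1,000,000-token capacity by itself,
# and token usage scales linearly with the amount of content.
CONTEXT_WINDOW_TOKENS = 1_000_000

FULL_WINDOW_EQUIVALENTS = {
    "video_hours": 1,
    "audio_hours": 11,
    "code_lines": 30_000,
    "words": 700_000,
}


def estimated_tokens(**amounts):
    """Estimate the token cost of a mix of inputs (hypothetical helper)."""
    total = 0.0
    for unit, amount in amounts.items():
        share_of_window = amount / FULL_WINDOW_EQUIVALENTS[unit]
        total += share_of_window * CONTEXT_WINDOW_TOKENS
    return round(total)


def fits_in_context(**amounts):
    """Return True if the estimated total stays within one million tokens."""
    return estimated_tokens(**amounts) <= CONTEXT_WINDOW_TOKENS


if __name__ == "__main__":
    # Half the quoted word budget plus half the quoted audio budget.
    print(estimated_tokens(words=350_000, audio_hours=5.5))
    print(fits_in_context(words=350_000, audio_hours=5.5))
```

Under these assumptions, a mix such as 350,000 words plus five and a half hours of audio would land right at the limit. Real token counts depend on the content and on the model's tokenizer, so this is only an order-of-magnitude guide.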

Although Gemini 1.5 Flash trails 1.5 Pro and 1.0 Ultra, it outperforms Gemini 1.0 Pro, presented last December, in all the benchmarks Google has put forward. What raises questions is that, in its communication, Google mentions only a latency of about one second, without any detailed statistics. That is rather strange for an LLM that claims to be specialized.

You will not (directly) use this version of Gemini

Google pitches Gemini 1.5 Flash by saying that it "excels at summaries, chat applications, image and video captioning, extracting data from long documents and tables, and more." As for response speed, Google says latency is under one second on average in the vast majority of situations.

Google Gemini // Source: Frandroid

Gemini 1.5 Flash is currently available in public preview, in its one-million-token version, only in Google AI Studio and Vertex AI (on Google Cloud). For Google, the goal is therefore above all to sell this LLM to companies and developers.


