Just a couple of months ago, many were sceptical about Google’s future in AI. The company has since redeemed itself with the Gemini 2.0 Flash and 2.5 Pro models, both equipped with reasoning and deep research capabilities.
While these models led the competition on speed, performance, and cost efficiency, Google showed at its I/O 2025 event that it is far from slowing down.
The Gemini 2.5 Pro and 2.5 Flash models have received massive upgrades. While the former was expected to maintain its benchmark dominance, the 2.5 Flash—designed for speed and cost efficiency—has now surpassed its rivals. It competes with models like OpenAI’s o3 and o4-mini and even comes close to its more powerful counterpart, the 2.5 Pro.
Both models excel at coding, mathematics, graduate-level science questions, and virtually every benchmark imaginable.
Source: Artificial Analysis
The company also introduced Deep Think for the Gemini 2.5 Pro model, further enhancing its reasoning capabilities. With Deep Think, the model scored 49.4% on the United States of America Mathematical Olympiad (USAMO), outperforming Gemini 2.5 Pro without Deep Think (34.5%), as well as OpenAI’s o3 (21.7%) and o4-mini (19.2%).
Of course, Deep Think comes at a cost. The company has unveiled a new Ultra plan, priced at $249 per month, which also includes the Veo 3 video generation model.
However, the improved Gemini models were not the only surprise at Google’s event. The company unveiled a language model with a completely different architecture. On top of that, Google revealed applications for specific use cases that, until now, were mostly the territory of startups.
Google’s New Model Is Over 3x Faster Than the Fastest Model
Artificial Analysis, a platform that benchmarks AI models, reported that Gemini 2.5 Flash outputs 340 tokens per second. The next fastest was Gemini 2.5 Pro at 152 tokens per second, followed by OpenAI’s o4-mini at 151.
But with Gemini Diffusion, the company’s new model that generates text using techniques borrowed from image and video generation models, Google reported an impressive 1,479 tokens per second.
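A quick sanity check on those reported numbers (all figures from the Artificial Analysis leaderboard quoted above) reconciles the throughput claims:

```python
# Reported throughput figures (tokens per second), per Artificial Analysis.
speeds = {
    "Gemini Diffusion": 1479,
    "Gemini 2.5 Flash": 340,
    "Gemini 2.5 Pro": 152,
    "o4-mini": 151,
}

# Ratio of the diffusion model to the next-fastest model on the board.
ratio = speeds["Gemini Diffusion"] / speeds["Gemini 2.5 Flash"]
print(f"{ratio:.2f}")  # roughly 4.35x the throughput, i.e. over 3x faster
```

In other words, Gemini Diffusion delivers about 4.35 times the tokens per second of Gemini 2.5 Flash, which is more than a 3x speed increase over the previous fastest model.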
The architecture, commonly referred to as Diffusion Language Models (Diffusion-LM), works differently from traditional language models that generate one word or token at a time.
“This sequential process can be slow and limit the quality and coherence of the output,” Google said. “Diffusion models work differently. Instead of predicting text directly, they learn to generate outputs by refining noise, step by step. This means they can iterate on a solution very quickly and error correct during the generation process.”
This iterative refinement process can help the models excel at code and math-related tasks while coherently producing large chunks of text output.
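Google has not published Gemini Diffusion’s architecture, but the “refining noise, step by step” idea can be illustrated with a toy sketch. In the hypothetical example below, `denoise_step` stands in for the neural network a real diffusion LM would use to predict tokens; the target sentence and the linear confidence schedule are invented purely for illustration:

```python
import random

# Toy sketch of iterative denoising in a discrete diffusion language model.
# A real model replaces denoise_step with a network that predicts all
# positions in parallel; here a simple lookup plays that role.

MASK = "<mask>"
TARGET = "diffusion models refine noise into text".split()

def denoise_step(seq, confidence):
    """Predict every position in parallel; commit only confident guesses."""
    out = []
    for i, tok in enumerate(seq):
        if tok != MASK:
            out.append(tok)          # already committed in an earlier step
        elif random.random() < confidence:
            out.append(TARGET[i])    # "model" commits its prediction here
        else:
            out.append(MASK)         # stays noisy; refined in a later step
    return out

def generate(steps=8, seed=0):
    random.seed(seed)
    seq = [MASK] * len(TARGET)       # start from pure noise (all masked)
    for t in range(steps):
        confidence = (t + 1) / steps # commit more tokens as steps progress
        seq = denoise_step(seq, confidence)
    return " ".join(seq)

print(generate())
```

The key contrast with autoregressive decoding is that every masked position is predicted in parallel at each step, rather than one token at a time, which is where the throughput advantage of diffusion language models comes from.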
Gemini Diffusion is my fav GoogleIO announcement
vibe coding at 1000tok/s hits different
multi-turn looks good so far
(no video speedup or anything) insanely bullish on diffusionLM
— bycloud (@bycloudai) May 20, 2025
In February, Inception Labs, founded by professors from Stanford University, the University of California, Los Angeles (UCLA), and Cornell University, introduced Mercury, the first commercial-scale diffusion large language model. The model also achieved a similar output speed of over 1,000 tokens per second.
“A year ago, when we started Inception Labs, there was a lot of scepticism around anything non-autoregressive,” said co-founder Aditya Grover while congratulating Google. “We are excited to see the growing validation and ecosystem around diffusion language models from other providers,” he added.
‘Google I/O is Such a Bloodbath for Startups’
The debate over AI incumbents rendering startups obsolete is an old one. Analysts have long maintained that there will still be areas where founders can focus and provide valuable user experiences.
But considering Google’s recent releases, does that still hold?
founders waking up to realise their startup is now a bullet point in Google’s keynote.
— Charmie Kapoor (@charmiekapoor) May 21, 2025
Google also launched its Jules coding agent, a fully autonomous tool that integrates with users’ repositories. The tool can perform tasks like writing tests, building new features, fixing bugs, and providing audio changelogs. This will directly compete with OpenAI’s newly released Codex feature on ChatGPT.
This is another addition to Google’s suite of coding products, including the Canvas feature in Gemini, Gemini Code Assist, and Firebase Studio. The company has also turned Google Colab into an AI-first platform with agentic capabilities.
“This Google I/O is such a bloodbath for startups,” said Jane Manchun Wong, a software engineer. “Whatever you built gets sherlocked into a Google feature in the next quarter and backed by multitude of computation powers, engineering resources, data, and basically unlimited budget,” she added.
Google also introduced Stitch, an experimental tool that turns prompts, wireframes, and sketches into user-interface mock-up images. It brings AI to the UI/UX design layer, letting users refine designs with images and paste the results into Figma.
The tool also generates clean, functional front-end code based on the design, effectively offering functionality similar to platforms like Galileo, Uizard, and UXPilot.ai.
All of these features pose a direct challenge to the AI-enabled coding platforms that exist today.
Google’s new products were also aimed at a larger user base, beyond just developers.
The company also unveiled a virtual try-on mode for clothes shoppers to enhance the online shopping experience. Just a few days earlier, a startup called Doji had raised $14 million in seed funding for virtual try-ons.
I knew 2 startups who just raised money building this 💀
— Arnav Gupta (@championswimmer) May 20, 2025
Meanwhile, AI Mode on Google Search is now available in the US and has received a slew of updates. These features will compete directly with those offered by Perplexity.
Google has also added a video overview feature to NotebookLM, along with new image and video generation tools and a new filmmaking tool. The company’s previously announced Project Starline has been updated to Google Beam, which offers 3D video communication designed to feel like a real-life, in-person conversation.
“Every five minutes at the Google I/O event, I saw 10 startups getting killed right before my eyes,” said one user on X.