Google launches Gemini 2.5, its most advanced artificial intelligence model to date

The new version tops international benchmarks and reinforces Google’s push toward autonomous, reasoning-capable AI.

Google has officially unveiled Gemini 2.5, its most powerful and sophisticated artificial intelligence model to date, marking a new milestone in the development of advanced cognitive technologies. This evolution represents a qualitative leap compared to previous versions and reinforces the tech giant’s strategy in its race to lead the next generation of intelligent agents.

The Gemini 2.5 Pro Experimental model, the first iteration of this new generation, is already topping major evaluation rankings such as LMArena, a leaderboard based on human preferences. The results highlight its advanced reasoning capabilities, as well as outstanding performance in programming, mathematics, and science, achieved without resorting to majority-voting techniques that inflate computational cost.

AI that “thinks” before responding

Unlike other systems that limit themselves to classifying or predicting outcomes, Gemini 2.5 introduces the concept of thinking models, capable of interpreting context, analyzing information, drawing logical conclusions, and making informed decisions before generating a response. This evolution is made possible by a combination of a more robust base model and a refined post-training process, which substantially improves the quality of responses.

Leader in technical testing

The results of Gemini 2.5 Pro are not just theoretical. In tests like AIME 2025, focused on high-level mathematics, and GPQA, which evaluates complex scientific questions, the model has far surpassed its predecessors. Additionally, it has achieved 18.8% accuracy on Humanity’s Last Exam, a battery of tests developed by hundreds of experts to probe reasoning at the limits of human knowledge.

Notable improvements in programming

One of the areas where this new version stands out most is programming. On SWE-Bench Verified, a benchmark used to validate the capabilities of coding agents, Gemini 2.5 has achieved a 63.8% success rate, a significant improvement over Gemini 2.0. According to the published data, the model can create interactive web applications, modify existing code, and even develop video games from simple textual descriptions.

Multimodality and massive context

Gemini 2.5 continues to champion native multimodality, which means it can work simultaneously with text, images, video, audio, and code. This approach makes it a more versatile tool applicable to more complex real-world cases.

Another of its notable advancements is the expansion of the operational context: it can now handle up to 1 million tokens, allowing for the analysis of lengthy documents, multiple data sources, or complex programming structures. Google has announced that this capability will soon be expanded to 2 million tokens.
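To put the 1-million-token figure in perspective, a common rule of thumb for English text is roughly four characters per token. The sketch below uses that heuristic (an approximation, not Gemini’s actual tokenizer) to estimate whether a document fits inside such a context window:

```python
# Rough estimate of whether a document fits a model's context window.
# Uses the common ~4 characters-per-token heuristic for English text;
# this is an approximation, NOT Gemini's actual tokenizer.

def estimate_tokens(text: str) -> int:
    """Approximate token count using the ~4 chars/token rule of thumb."""
    return max(1, len(text) // 4)

def fits_in_context(text: str, context_limit: int = 1_000_000) -> bool:
    """Check the estimate against a context window (1M tokens here)."""
    return estimate_tokens(text) <= context_limit

# A ~500-page book at ~2,000 characters per page is ~1,000,000 characters,
# i.e. roughly 250,000 estimated tokens -- comfortably inside 1M.
book = "x" * 1_000_000
print(estimate_tokens(book))   # 250000
print(fits_in_context(book))   # True
```

By this estimate, even a multi-hundred-page document occupies only a fraction of the current window, which is why Google highlights whole-codebase and multi-document analysis as use cases.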

Availability and business environment

Gemini 2.5 is now available for testing through Google AI Studio and in the Gemini Advanced mobile and desktop application. Additionally, the model will be integrated in the coming weeks into Vertex AI, Google’s cloud-based artificial intelligence platform, adapted for business environments with flexible pricing plans and scalable limits.
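For developers experimenting through Google AI Studio, the model is also reachable over the public Generative Language REST API. The sketch below builds such a request with only the standard library; the model identifier is an assumption based on the experimental release name, so check Google AI Studio for the current ID, and set a `GEMINI_API_KEY` environment variable to actually send the call:

```python
# Minimal sketch of calling Gemini via the Generative Language REST API
# using only the standard library. The model ID below is an ASSUMPTION
# based on the experimental release naming; verify the current identifier
# in Google AI Studio before use.
import json
import os
import urllib.request

MODEL = "gemini-2.5-pro-exp-03-25"  # assumed experimental model ID
ENDPOINT = (
    "https://generativelanguage.googleapis.com/v1beta/"
    f"models/{MODEL}:generateContent"
)

def build_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Construct the JSON POST request the generateContent endpoint expects."""
    body = json.dumps({"contents": [{"parts": [{"text": prompt}]}]})
    return urllib.request.Request(
        f"{ENDPOINT}?key={api_key}",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

if __name__ == "__main__":
    api_key = os.environ.get("GEMINI_API_KEY")
    if api_key:  # only hit the network when a key is configured
        req = build_request("Explain what a 'thinking model' is.", api_key)
        with urllib.request.urlopen(req) as resp:
            data = json.load(resp)
        print(data["candidates"][0]["content"]["parts"][0]["text"])
    else:
        print("Set GEMINI_API_KEY to run this example.")
```

Enterprise deployments would instead go through Vertex AI once the announced integration lands, which adds its own authentication and quota model.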

A strong commitment to intelligent autonomy

According to statements from Koray Kavukcuoglu, chief technology officer of Google DeepMind, the future of artificial intelligence at the company involves developing more autonomous systems capable of deeply understanding context:

“We are building increasingly capable agents that are aware of their environment, aimed at helping to solve real and complex problems,” the executive stated.

With Gemini 2.5, Google solidifies its position in the race toward general artificial intelligence, setting a new standard in the industry and laying the groundwork for AI that is increasingly capable of reasoning, and more useful and reliable for developers, researchers, businesses, and advanced users.

Source: Artificial Intelligence News