Google’s Gemini 2.5 Pro sets new reasoning benchmark with 18.8% score

Google’s Gemini 2.5 marks a notable step up in AI reasoning and strengthens the company’s position in a competitive field. The model posts benchmark scores well above those of rival systems and continues Google’s run of frequent model updates. Its enhanced thinking capabilities point toward AI systems that can tackle more complex problems and support more context-aware applications.

The big picture: Google has unveiled Gemini 2.5 Pro Experimental, which it claims is its “most intelligent AI model” yet, featuring substantially improved reasoning capabilities.

  • The new model combines an enhanced base architecture with improved post-training to achieve what Google describes as “a new level of performance.”
  • Available immediately to Gemini Advanced subscribers, this release continues Google’s aggressive pace of AI development following recent launches of Gemini Deep Research and updates to NotebookLM.

Key details: Gemini 2.5’s enhanced thinking capabilities will be incorporated into all future Google AI models to tackle more complex problems.

  • Users can access the experimental model through Google’s AI Studio or, with a Gemini Advanced subscription, in the Gemini app.
  • Google plans to release additional 2.5 models in the future, with pricing for scaled production use to be announced in the coming weeks.

Impressive benchmarks: The new model scored 18.8% on Humanity’s Last Exam, significantly outperforming competitors’ AI systems on this challenging benchmark.

  • Google reports this as the highest score achieved on this demanding test without tool use, surpassing OpenAI’s o3-mini (14%) and DeepSeek R1 (8.6%).
  • Google characterizes Gemini 2.5 Pro’s reasoning capabilities as “state-of-the-art,” a claim supported by these benchmark results.
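
To put the reported gap in concrete terms, the short Python sketch below computes how far ahead the 18.8% score is of the other two figures quoted above, both in percentage points and as a ratio. The scores are taken from this article; the `compare` helper and its name are illustrative assumptions, not part of any benchmark tooling.

```python
# Scores quoted in the article (Humanity's Last Exam, no tool use), in percent.
scores = {"Gemini 2.5 Pro": 18.8, "o3-mini": 14.0, "DeepSeek R1": 8.6}


def compare(leader: str, scores: dict) -> dict:
    """Return (percentage-point gap, ratio) of the leader over each other model.

    Hypothetical helper for illustration only.
    """
    top = scores[leader]
    return {
        name: (round(top - score, 1), round(top / score, 2))
        for name, score in scores.items()
        if name != leader
    }


gaps = compare("Gemini 2.5 Pro", scores)
for name, (points, ratio) in gaps.items():
    print(f"vs {name}: +{points} points, {ratio}x")
```

On these figures the lead works out to 4.8 points (about 1.34x) over o3-mini and 10.2 points (about 2.19x) over DeepSeek R1. All three models still answer fewer than one in five questions correctly.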

Why this matters: Google’s rapid succession of AI updates demonstrates its commitment to maintaining a competitive advantage in the increasingly crowded AI development space.

  • The significant improvement in reasoning capabilities could translate to more capable AI assistants for both consumer and enterprise applications.
  • The company’s focus on reasoning suggests a shift toward AI systems that can handle more nuanced, complex tasks rather than simply generating content.
Gemini 2.5 is now available for Advanced users and it seriously improves Google’s AI reasoning