Google has announced the launch of Gemini 3, its new artificial intelligence model. The company claims that it is its most advanced reasoning model because “it has been designed to understand depth and nuance.”
Gemini 3 will be available by default in AI Mode in the revamped Google search engine, initially only in the U.S. This is the first time Google has brought a new AI model to its search engine on launch day. It will also be accessible through the Gemini app and, for developers, through AI Studio and Vertex AI.
Following the success of Gemini 2.5 Pro and Flash, this new version has expanded its reach, arriving in 30 new languages, including Catalan, Basque, and Galician. Users can start testing it today in the U.S.—or elsewhere with a VPN.
Gemini 3 Promises: Early Test Results
Google highlights that the model has performed strongly across a range of benchmarks. It leads the LMArena leaderboard with a score of 1,501, making it the first model to pass the 1,500-point mark. On Humanity’s Last Exam, where Google says Gemini 3 reasons at a “PhD level,” it scores 37.5% without tools, and it reaches 91.9% on GPQA Diamond.
Gemini 3 also shows a marked improvement in mathematics, scoring 23.4% on the MathArena Apex test, while competitors such as GPT-5.1 and Claude Sonnet 4.5 scored significantly lower. The model also aims to give more direct responses that favor useful information over clichés. Google sums this up with the line: “Tells you what you need to hear, not just what you want to hear.”
Addressing Complexity Simply
Gemini 3 features a context window of up to one million tokens, facilitating the analysis of extensive repositories of code and text. Its multimodal support allows comprehensive analysis of information, enabling users to decipher handwritten recipes or analyze athletic performance metrics for personalized training plans.
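To put the one-million-token figure in perspective, here is a minimal Python sketch that estimates whether a set of documents would fit in such a window. It assumes a rough rule of thumb of about four characters per token, which is not Gemini's actual tokenizer, and the function names are hypothetical, chosen only for illustration.

```python
import math

CONTEXT_WINDOW_TOKENS = 1_000_000  # the figure quoted in the article
CHARS_PER_TOKEN = 4.0  # ASSUMPTION: rough rule of thumb, not Gemini's real tokenizer


def estimate_tokens(text, chars_per_token=CHARS_PER_TOKEN):
    """Roughly estimate a token count from the number of characters."""
    return math.ceil(len(text) / chars_per_token)


def fits_in_context(documents, context_window=CONTEXT_WINDOW_TOKENS, reserved_for_output=0):
    """Return (fits, estimated_total) for a list of document strings.

    reserved_for_output is a hypothetical budget set aside for the model's reply.
    """
    total = sum(estimate_tokens(doc) for doc in documents)
    return total + reserved_for_output <= context_window, total


if __name__ == "__main__":
    sample_docs = ["def add(a, b):\n    return a + b\n" * 1000, "README text " * 500]
    fits, total = fits_in_context(sample_docs, reserved_for_output=8_000)
    print(f"Estimated tokens: {total}, fits in window: {fits}")
```

Under this assumption, one million tokens corresponds to roughly four million characters of text. Real counts vary with language and content, so an exact figure would require the model's own tokenizer.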
Notably, integrating Gemini 3 into Google Search will bring interactive visual elements, such as widgets, calculators, and simulations, intended to make search results more engaging and informative.
Programming and Agent Development
In programming, Gemini 3 tops the WebDev Arena leaderboard with an Elo score of 1,487. It can operate tools and manage complex tasks autonomously, as shown by its results on the Terminal-Bench 2.0 and SWE-bench Verified tests, which point to a leap in automation potential for software development.
These enhancements will be put to use in a new agent development platform called Google Antigravity, which gives developers a conventional AI-powered integrated development environment (IDE). Within it, agents can plan, execute tasks, and validate code autonomously, making it easier for human developers to audit the agents' work.
Will Users Notice a Significant Change?
On paper, Gemini 3 is positioned as a breakthrough model compared to its competitors, a claim its benchmark results support. However, the question remains whether users will genuinely notice the difference. In recent months, several companies have introduced new AI models, yet many users noticed only subtle improvements because the previous models already served them well.
Google’s best chance to demonstrate Gemini 3’s capabilities lies in programming, where experts can take advantage of its additional features. For general users, much will depend on how well AI Mode and the Gemini app showcase what is new. The promised interactive elements, such as dynamic graphics and widgets, might ultimately unlock new experiences and establish Gemini 3’s place in users’ daily lives.
As Google shifts its focus away from traditional assistants, the effective implementation of Gemini 3 could be critical in shaping future interactions between AI and end-users.

