Google has released Gemma 4, a new family of open artificial intelligence models designed to run directly on local devices, as the company expands its push into developer tools and reduces reliance on cloud-based AI services.
The models, announced April 2, are built for what Google describes as “advanced reasoning and agentic workflows,” signalling a shift beyond chatbots toward systems that can execute tasks and integrate into software environments. In its release, Google said “Gemma 4 delivers an unprecedented level of intelligence per parameter,” enabling developers to run capable AI systems without the heavy computing costs typically associated with large models.
Gemma 4 comes in four sizes, from lightweight 2B and 4B models for phones and edge devices to 26B and 31B models that fit on a single 80GB GPU. The smaller models offer responsive offline AI, while the larger versions deliver reasoning and memory rivaling much bigger proprietary systems. For everyday workflows, this can speed up real tasks like coding locally, processing long documents, or handling audio and visual inputs without relying on external servers.
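A rough way to sanity-check the single-GPU claim is to estimate how much memory the weights alone would occupy. The sketch below is illustrative only: it treats the nominal sizes (2B, 4B, 26B, 31B) as exact parameter counts and assumes 16-bit weights at 2 bytes per parameter, neither of which is confirmed by Google's release. It also ignores activations, context cache and runtime overhead, so real usage would be higher. The function names are hypothetical.

```python
# Back-of-the-envelope estimate of weight memory for each Gemma 4 size.
# Assumptions (not from Google's release): nominal sizes are exact
# parameter counts, and weights are stored at 2 bytes per parameter.

MODEL_SIZES_B = {"2B": 2, "4B": 4, "26B": 26, "31B": 31}  # billions of parameters


def weight_memory_gb(params_billion: float, bytes_per_param: float = 2.0) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes)."""
    # (params_billion * 1e9 parameters) * bytes_per_param / 1e9 bytes per GB
    return params_billion * bytes_per_param


def fits_on_gpu(params_billion: float, gpu_gb: float = 80.0,
                bytes_per_param: float = 2.0) -> bool:
    """True if the weights alone fit within the given GPU memory."""
    return weight_memory_gb(params_billion, bytes_per_param) <= gpu_gb


for name, size in MODEL_SIZES_B.items():
    gb = weight_memory_gb(size)
    print(f"{name}: ~{gb:.0f} GB of weights, fits on 80 GB GPU: {fits_on_gpu(size)}")
```

Under these assumptions the 31B model needs about 62 GB for its weights, which leaves some headroom on an 80GB card. Storing weights at lower precision would shrink the figure proportionally.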
1. It’s Google’s most capable open model yet
Google describes the Gemma 4 family as its “most intelligent open models to date,” built for advanced reasoning and agentic workflows. The company says the models deliver “an unprecedented level of intelligence-per-parameter,” meaning they can handle complex tasks without requiring massive compute, which positions them as a more efficient alternative to larger models. The release also builds on strong adoption: earlier Gemma versions have been downloaded more than 400 million times. For developers, this signals a maturing open AI ecosystem backed by Google.