xAI Releases Grok 4 Fast With Free Access and API Pricing Details
xAI’s new model focuses on speed, efficiency, and lower costs
When Grok 4 launched, it quickly drew attention for both its capabilities and the controversies that followed, including erratic and offensive antisemitic comments that raised questions about xAI’s approach to safety. That early reception left many observers curious about what direction the company would take next.

Grok 4 Fast Efficiency Gains
The answer came with Grok 4 Fast. The Elon Musk-owned AI startup announced this latest release, positioned as a step towards improving speed and efficiency rather than reshaping the field. It uses a unified architecture that can switch between reasoning and non-reasoning modes, allowing it to manage long conversations or complex codebases while still producing quick responses to simpler queries, much as other chatbots such as ChatGPT do. The 2 million token context window helps make that possible, allowing it to process the equivalent of several books in one go.
One of the major criticisms of earlier versions was efficiency. Grok 4 often consumed more tokens than necessary to reach the right answer, driving up costs and slowing performance. xAI's internal testing of Grok 4 Fast suggests it uses 40% fewer tokens and achieves comparable benchmark results more cheaply. For users, that could translate into faster responses and lower resource use across coding, research, or web tasks.
Availability and Pricing
xAI has made Grok 4 Fast available to all users, including those on the free tier, with access through web browsers, iOS, and Android. This broader rollout brings the model to a global audience, making experimentation easier for casual users as well as professionals.
For developers and businesses, Grok 4 Fast is also available through the xAI API. Two versions are offered: one tuned for reasoning-heavy work and another for lighter, faster use cases, both sharing the same 2 million token context window. Pricing is tiered by request size: input tokens cost $0.20 per million for smaller requests, rising to $0.40 per million once inputs exceed 128,000 tokens. Output tokens cost between $0.50 and $1.00 per million, depending on request size, while cached input tokens are priced at $0.05 per million.
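To make the tiers concrete, here is a rough cost estimator built only from the figures quoted above. It is a sketch, not xAI's billing logic, and it rests on several assumptions the article does not confirm: that the higher tier applies to the entire request once the input exceeds 128,000 tokens, that the output rate moves from $0.50 to $1.00 at that same threshold, and that cached input stays at $0.05 in both tiers. The names `Rates` and `estimate_cost` are hypothetical.

```python
from dataclasses import dataclass

TOKENS_PER_MILLION = 1_000_000
# Input size above which the higher tier applies (figure quoted in the article).
LONG_CONTEXT_THRESHOLD = 128_000


@dataclass(frozen=True)
class Rates:
    """USD prices per million tokens for one pricing tier."""
    input_per_m: float
    cached_input_per_m: float
    output_per_m: float


# Assumption: the output rate switches tier together with the input rate,
# and cached input costs the same in both tiers.
STANDARD = Rates(input_per_m=0.20, cached_input_per_m=0.05, output_per_m=0.50)
LONG_CONTEXT = Rates(input_per_m=0.40, cached_input_per_m=0.05, output_per_m=1.00)


def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the USD cost of one request.

    input_tokens is the total input size, including any cached_tokens.
    Assumption: the whole request is billed at the tier chosen by input size.
    """
    if min(input_tokens, output_tokens, cached_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")

    rates = LONG_CONTEXT if input_tokens > LONG_CONTEXT_THRESHOLD else STANDARD
    uncached_tokens = input_tokens - cached_tokens
    total = (
        uncached_tokens * rates.input_per_m
        + cached_tokens * rates.cached_input_per_m
        + output_tokens * rates.output_per_m
    )
    return total / TOKENS_PER_MILLION


if __name__ == "__main__":
    print(f"${estimate_cost(100_000, 10_000):.4f}")   # below the threshold
    print(f"${estimate_cost(200_000, 10_000):.4f}")   # above the threshold
```

Under these assumptions, a 100,000-token prompt with a 10,000-token reply comes to about $0.025, while a 200,000-token prompt with the same reply comes to about $0.09. Actual invoices may differ if xAI applies the tiers differently.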
How Grok 4 Fast Ranks Against Rivals
Compared with peers, Grok 4 Fast performs unevenly. On LMArena, the platform that pits models head-to-head, it ranks first for search-related tasks and eighth for text-based performance. That suggests it has an edge in information retrieval but remains less competitive for writing. The strength in search is a logical fit given its integration with X.
Its release comes as rivals push ahead. OpenAI is advancing GPT-5, Google is expected to update Gemini, and Anthropic has just rolled out Claude Opus 4.1. The big question now is whether this balance of speed and efficiency can hold as the broader market evolves.


