Nebius Launches Token Factory Platform to Challenge Microsoft and Amazon
Yandex spin-off Nebius unveils Token Factory, an enterprise AI platform supporting 60+ open-source models, directly competing with Microsoft Azure and Amazon's cloud AI services.
The artificial intelligence infrastructure landscape just got more competitive.
Yesterday, Nebius, a neocloud provider that split from Russian search engine Yandex in 2024, unveiled Token Factory, a comprehensive platform designed to give businesses flexible access to open-source AI models alongside the computing muscle to run them. The platform launches with support for more than 60 open-source models, covering all the major open models on the market, including DeepSeek, OpenAI's GPT-OSS, Meta's Llama, Nvidia's Nemotron, and Qwen.
Token Factory focuses specifically on inference workloads: the production phase of AI, where trained models do useful work such as answering questions, generating content, and making decisions. It brings together high-performance inference, post-training, and fine-grained access management in a single governed platform, with a promised 99.9% uptime. What makes this particularly appealing to enterprises is the freedom it offers. Businesses can experiment with different models, compare performance characteristics, and optimize their AI deployments for their specific needs rather than being locked into proprietary platforms that dictate their choices.
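In practice, multi-model inference platforms of this kind usually expose an OpenAI-compatible HTTP API, the de facto convention among open-model hosts. The sketch below assumes such an endpoint; the base URL and model names are illustrative placeholders, not confirmed Token Factory values. It shows how a team might send the same prompt to several open models to compare their answers, using only the Python standard library.

```python
import json
from urllib import request


def build_chat_request(base_url: str, api_key: str, model: str, prompt: str) -> request.Request:
    """Build a POST request for an OpenAI-compatible chat endpoint.

    base_url and model are illustrative assumptions, not confirmed
    Token Factory values.
    """
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return request.Request(
        f"{base_url}/v1/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def compare_models(base_url: str, api_key: str, models: list[str], prompt: str) -> dict[str, str]:
    """Send one prompt to several models and collect each model's answer."""
    answers = {}
    for model in models:
        req = build_chat_request(base_url, api_key, model, prompt)
        with request.urlopen(req) as resp:  # network call; needs real credentials
            data = json.load(resp)
        answers[model] = data["choices"][0]["message"]["content"]
    return answers
```

Keeping request construction separate from the network call makes the comparison logic easy to test offline, and swapping models is just a change of string, which is the "no lock-in" pitch in miniature.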
The timing of this launch is particularly interesting given Nebius's existing relationship with Microsoft. Just two months ago, Nebius signed a $19.4 billion multi-year deal with Microsoft to deliver AI infrastructure from its new data center in Vineland, New Jersey, starting later this year. Nebius is thus simultaneously supplying infrastructure to Microsoft while launching a platform that competes directly with Microsoft's own Azure AI Foundry.
Token Factory enters a crowded arena dominated by tech giants: Microsoft's Azure AI Foundry and Amazon's Bedrock. Both platforms have enormous advantages—global infrastructure, deep enterprise relationships, and integration with vast ecosystems of cloud services. But they have complex pricing structures, long lead times, and the very vendor lock-in that's driving customers to explore alternatives.
Nebius isn't alone in challenging the hyperscalers. Specialist startups such as Fireworks, Baseten, Together AI, and Replicate compete in the open-source model hosting space, offering developer-friendly deployment, autoscaling, and low-latency inference for open models. Their value proposition is compelling: they can deploy high-density GPU infrastructure within months rather than the multi-year builds hyperscale data centers require, often at significant cost savings.
Similar competitive dynamics have played out before in the cloud wars, but the AI infrastructure battle has its own unique characteristics. When AWS pioneered cloud computing, it took years for Microsoft and Google to mount serious challenges. This time around, the technology is moving faster, the stakes are higher, and the barriers to entry, while still substantial, are different.

