Subscribe to Our Newsletter

Success! Now Check Your Email

To complete Subscribe, click the confirmation link in your inbox. If it doesn’t arrive within 3 minutes, check your spam folder.

Ok, Thanks
OpenAI Admits Prompt Injection Isn’t Going Away as It Hardens Security for Atlas
Photo by Zulfugar Karimov / Unsplash

OpenAI Admits Prompt Injection Isn’t Going Away as It Hardens Security for Atlas

Prompt injection has become a permanent risk for AI agents on the open web, and even OpenAI now admits the goal is damage control, not total prevention.

Ogbonda Chivumnovu profile image
by Ogbonda Chivumnovu

AI agents are getting better at browsing the web on our behalf. They can read emails, scan documents, and take action faster than any human could. But OpenAI is now openly admitting something uncomfortable: the same systems that make AI agents useful also make them easy to trick.

That admission sits at the heart of a blog post OpenAI published on Monday, where the company outlined new security measures for Atlas, its AI browser, while conceding a hard truth. “Prompt injection, much like scams and social engineering on the web, is unlikely to ever be fully ‘solved,’” the company wrote. In the same post, OpenAI acknowledged that Atlas’ “agent mode” inevitably “expands the security threat surface.”

Prompt injections work by slipping hidden instructions into places AI agents are designed to trust, web pages, emails, documents. When the agent reads that content, it may follow those instructions instead of the user’s original request. The risk became obvious on the day Altas launched in October, when researchers demonstrated how indirect prompt injections could override Atlas’ behaviour. That same day, Brave, a privacy-focused internet browser, warned that the issue wasn’t unique to OpenAI, calling it a “systematic challenge” for AI-powered browsers, including Perplexity’s Comet.

Comet AI vs ChatGPT Atlas
Which AI browser is the best?

By early December, the concern had reached governments. The UK’s National Cyber Security Centre warned that prompt injection attacks against generative AI tools “may never be totally mitigated,” urging organisations to focus on limiting damage rather than expecting full prevention.

Google and Anthropic have taken a similar stance, framing prompt injection as a structural risk that requires layered defences, tighter permissions, and architectural controls, not a one-time fix. Where OpenAI diverges is in how it stress-tests those defences. Instead of relying only on human red teams, the company built an automated attacker powered by a large language model. Trained with reinforcement learning, the bot plays the role of a hacker, repeatedly probing Atlas in simulated environments. It tests an attack, observes how the AI reasons through it, adjusts the strategy, and tries again.

That internal visibility matters. Because OpenAI can see how Atlas “thinks,” it can identify attack paths faster than outsiders. In testing, the automated attacker uncovered strategies that human testers hadn’t found. One demo showed a malicious email slipping into an inbox and instructing the agent to send a resignation message. After the update, Atlas flagged the attempt instead of acting on it.

Still, stronger defences don’t erase the trade-offs. As Wiz security researcher Rami McCarthy puts it, risk in AI systems comes down to autonomy multiplied by access. Agentic browsers sit in an uncomfortable middle ground, with enough freedom to act and enough access to cause real harm.

OpenAI now recommends limiting what agents can access, requiring confirmation before sending messages or payments, and avoiding vague instructions like “take whatever action is needed.” As the company put it, “wide latitude makes it easier for hidden or malicious content to influence the agent.”

The bigger question is whether today’s agentic browsers justify their risk. For many everyday users, the value still feels theoretical, while the consequences of failure are very real. That balance will likely shift as defences mature, but for now, AI agents browsing the web remain powerful, promising, and fundamentally hard to secure.

OpenAI Rolls Out Atlas Browser to Rival Google Chrome
Atlas could turn everyday browsing into an AI-powered experience that understands, remembers, and acts for you.

Ogbonda Chivumnovu profile image
by Ogbonda Chivumnovu

Subscribe to Techloy.com

Get the latest information about companies, products, careers, and funding in the technology industry across emerging markets globally.

Success! Now Check Your Email

To complete Subscribe, click the confirmation link in your inbox. If it doesn’t arrive within 3 minutes, check your spam folder.

Ok, Thanks

Read More