Cybersecurity researchers have flagged a new security issue in agentic web browsers like OpenAI's ChatGPT Atlas that exposes the underlying artificial intelligence (AI) models to context poisoning attacks.

In the attack devised by AI security company SPLX, a bad actor can set up websites that serve one version of a page to regular browsers and another to the AI crawlers operated by ChatGPT and Perplexity. The technique has been codenamed AI-targeted cloaking.

The approach is a variation of search engine cloaking, the practice of presenting one version of a web page to human users and a different version to search engine crawlers with the goal of manipulating search rankings.


The difference in this case is that attackers optimize for AI crawlers from various providers using a trivial user agent check, which determines what content is delivered to each visitor.

"Because these systems rely on direct retrieval, whatever content is served to them becomes ground truth in AI Overviews, summaries, or autonomous reasoning," security researchers Ivan Vlahov and Bastien Eymery said. "That means a single conditional rule, 'if user agent = ChatGPT, serve this page instead,' can shape what millions of users see as authoritative output."

SPLX said AI-targeted cloaking, while deceptively simple, can be turned into a powerful misinformation weapon that undermines trust in AI tools. By serving AI crawlers fabricated content in place of the real page, attackers can also introduce bias and influence the output of any system that relies on those signals.

"AI crawlers can be deceived just as easily as early search engines, but with far greater downstream impact," the company said. "As SEO [search engine optimization] increasingly incorporates AIO [artificial intelligence optimization], it manipulates reality."

The disclosure comes as the hCaptcha Threat Analysis Group (hTAG) published an analysis of browser agents against 20 of the most common abuse scenarios, ranging from multi-accounting to card testing and support impersonation, which found that the products attempted nearly every malicious request without the need for any jailbreaking.

Furthermore, the study found that when an action was "blocked," it was usually because the tool lacked a technical capability rather than because of built-in safeguards. ChatGPT Atlas, hTAG noted, has been found to carry out risky tasks when they are framed as debugging exercises.


Claude Computer Use and Gemini Computer Use, on the other hand, were found capable of executing dangerous account operations like password resets without any constraints, with the latter also aggressively brute-forcing coupon codes on e-commerce sites.

hTAG also tested the safety measures of Manus AI, finding that it executes account takeovers and session hijacking without issue, while Perplexity Comet runs unprompted SQL injection to exfiltrate hidden data.

"Agents often went above and beyond, attempting SQL injection without a user request, injecting JavaScript on-page to attempt to circumvent paywalls, and more," it said. "The near-total lack of safeguards we observed makes it very likely that these same agents will also be rapidly used by attackers against any legitimate users who happen to download them."
