#1 Trusted Cybersecurity News Platform
Followed by 5.20+ million
The Hacker News Logo
Subscribe – Get Latest News
AWS EKS Security Best Practices

AI Safety | Breaking Cybersecurity News | The Hacker News

Category — AI Safety
New TokenBreak Attack Bypasses AI Moderation with Single-Character Text Changes

New TokenBreak Attack Bypasses AI Moderation with Single-Character Text Changes

Jun 12, 2025 AI Jailbreaking / Prompt Injection
Cybersecurity researchers have discovered a novel attack technique called TokenBreak that can be used to bypass a large language model's (LLM) safety and content moderation guardrails with just a single character change. "The TokenBreak attack targets a text classification model's tokenization strategy to induce false negatives, leaving end targets vulnerable to attacks that the implemented protection model was put in place to prevent," Kieran Evans, Kasimir Schulz, and Kenneth Yeung said in a report shared with The Hacker News. Tokenization is a fundamental step that LLMs use to break down raw text into their atomic units – i.e., tokens – which are common sequences of characters found in a set of text. To that end, the text input is converted into their numerical representation and fed to the model.  LLMs work by understanding the statistical relationships between these tokens, and produce the next token in a sequence of tokens. The output tokens are detokeni...
New Reports Uncover Jailbreaks, Unsafe Code, and Data Theft Risks in Leading AI Systems

New Reports Uncover Jailbreaks, Unsafe Code, and Data Theft Risks in Leading AI Systems

Apr 29, 2025 Vulnerability / Artificial Intelligence
Various generative artificial intelligence (GenAI) services have been found vulnerable to two types of jailbreak attacks that make it possible to produce illicit or dangerous content. The first of the two techniques, codenamed Inception, instructs an AI tool to imagine a fictitious scenario, which can then be adapted into a second scenario within the first one where there exists no safety guardrails . "Continued prompting to the AI within the second scenarios context can result in bypass of safety guardrails and allow the generation of malicious content," the CERT Coordination Center (CERT/CC) said in an advisory released last week. The second jailbreak is realized by prompting the AI for information on how not to reply to a specific request.  "The AI can then be further prompted with requests to respond as normal, and the attacker can then pivot back and forth between illicit questions that bypass safety guardrails and normal prompts," CERT/CC added. Success...
Researchers Reveal 'Deceptive Delight' Method to Jailbreak AI Models

Researchers Reveal 'Deceptive Delight' Method to Jailbreak AI Models

Oct 23, 2024 Artificial Intelligence / Vulnerability
Cybersecurity researchers have shed light on a new adversarial technique that could be used to jailbreak large language models (LLMs) during the course of an interactive conversation by sneaking in an undesirable instruction between benign ones. The approach has been codenamed Deceptive Delight by Palo Alto Networks Unit 42, which described it as both simple and effective, achieving an average attack success rate (ASR) of 64.6% within three interaction turns. "Deceptive Delight is a multi-turn technique that engages large language models (LLM) in an interactive conversation, gradually bypassing their safety guardrails and eliciting them to generate unsafe or harmful content," Unit 42's Jay Chen and Royce Lu said. It's also a little different from multi-turn jailbreak (aka many-shot jailbreak) methods like Crescendo , wherein unsafe or restricted topics are sandwiched between innocuous instructions, as opposed to gradually leading the model to produce harmful outpu...
cyber security

SANS Institute Complimentary Training Bundle ($3240 Value) at Network Security 2025

websiteSANS InstituteCyber Security Training
Register to attend in-person training at Network Security 2025 in Las Vegas, NV and claim a complimentary cyber-pro pass that includes an OnDemand bundle, AND a free pass to compete in NetWars!
cyber security

Key Essentials to Modern SaaS Data Resilience

websiteVeeamSaaS Security / Data Resilience
Learn how to modernize your SaaS data protection strategy and strengthen security to avoid risks of data loss.
Expert Insights Articles Videos
Cybersecurity Resources