Meta Launches LlamaFirewall Framework to Stop AI Jailbreaks, Injections, and Insecure Code
Apr 30, 2025
Secure Coding / Vulnerability
Meta on Tuesday announced LlamaFirewall , an open-source framework designed to secure artificial intelligence (AI) systems against emerging cyber risks such as prompt injection, jailbreaks, and insecure code, among others. The framework , the company said, incorporates three guardrails, including PromptGuard 2, Agent Alignment Checks, and CodeShield. PromptGuard 2 is designed to detect direct jailbreak and prompt injection attempts in real-time, while Agent Alignment Checks is capable of inspecting agent reasoning for possible goal hijacking and indirect prompt injection scenarios. CodeShield refers to an online static analysis engine that seeks to prevent the generation of insecure or dangerous code by AI agents. "LlamaFirewall is built to serve as a flexible, real-time guardrail framework for securing LLM-powered applications," the company said in a GitHub description of the project. "Its architecture is modular, enabling security teams and developers to com...