Agentic AI Security
Agent architectures & threat surface, tool/action security, delegation & permission escalation, memory & context poisoning, multi-agent system security.
What is Agentic AI Security?
Agentic AI security addresses the unique threat landscape that emerges when AI systems operate autonomously — making decisions, calling tools, delegating to sub-agents, and taking actions in the real world with minimal human oversight. Unlike traditional LLM chatbots that generate text responses, AI agents can execute code, browse the web, send emails, modify databases, manage infrastructure, and chain together multi-step workflows, dramatically expanding the blast radius of any vulnerability.
Agent architectures introduce novel attack surfaces beyond prompt injection. Tool security is critical — if an agent can call APIs, execute shell commands, or access file systems, then compromising the agent's decision-making grants the attacker the agent's full permissions. Delegation chains create transitive trust risks, where a compromised sub-agent can influence parent agent behavior. Memory poisoning attacks inject malicious instructions into an agent's persistent memory or context, planting time-delayed payloads that activate in future sessions.
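One common defensive pattern against context and memory poisoning is to tag everything an agent reads from tools or retrieval with provenance markers, and to screen it for instruction-like payloads before it enters the context window. The sketch below is illustrative only — the helper names and regex patterns are assumptions, not from any particular framework, and a production system would use far more robust detection.

```python
import re

# Illustrative (hypothetical) screen for common injection phrasings.
# Real deployments need classifier-based detection; regexes are trivially evaded.
INJECTION_PATTERNS = re.compile(
    r"(ignore (all )?previous instructions|disregard the above|"
    r"you are now|reveal your system prompt)",
    re.IGNORECASE,
)

def wrap_untrusted(source: str, text: str) -> str:
    """Tag tool/retrieval output with provenance so downstream prompting
    can instruct the model to treat it as data, never as instructions."""
    return f"<untrusted source='{source}'>\n{text}\n</untrusted>"

def flag_suspicious(text: str) -> bool:
    """Cheap heuristic pre-filter for instruction-like payloads
    in untrusted content before it enters the agent's context."""
    return bool(INJECTION_PATTERNS.search(text))
```

The key design point is the trust boundary: content from the web, documents, or other agents is wrapped and screened on the way in, so a poisoned snippet cannot silently masquerade as part of the system prompt.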
Securing agentic systems requires rethinking traditional security models. Least-privilege tool access, sandboxed execution environments, human-in-the-loop approval for high-risk actions, cryptographic verification of delegation chains, and adversarial testing of agent decision-making are all essential. The field is nascent but rapidly becoming critical as organizations deploy AI agents for customer service, code generation, security operations, and business process automation.
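Two of these controls — least-privilege tool access and human-in-the-loop approval — can be combined in a single gating layer in front of every tool call. The sketch below is a minimal illustration under assumed names (the tool registry and risk labels are hypothetical); it shows the default-deny shape of the pattern, not any specific framework's API.

```python
from enum import Enum
from typing import Callable

class Risk(Enum):
    LOW = "low"
    HIGH = "high"

# Hypothetical registry: each tool declares a risk level at registration time.
TOOL_RISK = {
    "search_docs": Risk.LOW,   # read-only, low blast radius
    "send_email": Risk.HIGH,   # acts on the outside world
    "run_shell": Risk.HIGH,    # arbitrary code execution
}

def execute_tool(name: str,
                 run: Callable[[], str],
                 approve: Callable[[str], bool]) -> str:
    """Gate a tool call: high-risk actions pause for human approval.
    Unknown tools default to HIGH (default-deny, least privilege)."""
    risk = TOOL_RISK.get(name, Risk.HIGH)
    if risk is Risk.HIGH and not approve(name):
        return f"BLOCKED: '{name}' requires human approval"
    return run()
```

A usage example: `execute_tool("send_email", lambda: "sent", approve=lambda n: False)` returns the blocked message, while a low-risk lookup runs without interruption. The design choice worth noting is the default: tools absent from the registry are treated as high risk, so forgetting to classify a new tool fails closed rather than open.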
Why it matters
AI agents have real-world authority to take actions, not just generate text. A compromised agent is not just an information leak — it's an active threat with the permissions and capabilities of the systems it can access.
Agentic AI security is the frontier of AI security, extending concepts from LLM security, AI infrastructure security, and AI safety into autonomous systems that act in the world. As AI agents become the primary interface between AI models and enterprise systems, securing them becomes existential.
Control Access & Trust
Decide who or what can do what, enforce it cryptographically, constrain AI behaviour.
Curated resources
Authoritative sources we ground Agentic AI Security questions in — frameworks, research, guides, and tools.
OWASP Agentic AI Security
OWASP guidance on securing agentic AI systems — tool use, delegation chains, memory poisoning, and multi-agent architectures.
LangChain Security Best Practices
Security documentation for LangChain agent framework — sandboxing, tool permissions, prompt injection defenses, and deployment hardening.
OpenAI — "Practices for Governing Agentic AI Systems" (2024)
Framework for agentic AI governance: scope control, human oversight, auditability, containment. Defines key properties agents should have and failure modes to prevent.
Ruan et al. — "Identifying the Risks of LM Agents with an LM-Emulated Sandbox" (2024)
ToolEmu framework for evaluating agent risks in sandboxed environments. 36 risk categories across tool use failures. Practical methodology for agent security testing questions.
Mialon et al. — "Augmented Language Models: A Survey"
Survey of tool-using, retrieval-augmented, and reasoning LMs. The architectural foundation for understanding agent capabilities and their security implications.
OWASP — "Top 10 for LLM Applications: Agentic Applications" (2025 supplement)
Extension of the LLM Top 10 specifically for agentic patterns. Covers excessive agency, insecure plugin/tool design, and multi-agent trust boundaries.
Gartner — Top Strategic Technology Trends
Annual trends report. AI trust, risk, and security management (AI TRiSM) has been featured prominently. Good for strategic-level questions about where the industry is heading.
Anthropic — "Challenges in Deploying Machine Learning Agents" research
Analysis of risks specific to AI agents: tool use, chain-of-thought exploitation, multi-step task failures, delegation risks. Key for understanding why agents create new attack surfaces beyond single-turn interactions.
Model Context Protocol (MCP) Specification
Anthropic's open protocol for connecting AI models to external tools and data sources. Critical reading for agentic AI security.
Certifications that signal this domain
Credentials whose blueprint meaningfully covers this domain. Core means centrally covered; also touched means present in the blueprint but not the primary focus.
Core coverage
OffSec AI Security Practitioner
Offensive AI security — adversarial ML, LLM attacks, agent abuse.
See how your Agentic AI Security skills stack up
311 questions available. Compete head-to-head or run a quick speed quiz to benchmark yourself.