Pillar C: Cybersecurity of AI SystemsC11

Agentic AI Security

Agent architectures & threat surface, tool/action security, delegation & permission escalation, memory & context poisoning, multi-agent system security.

Part of Pillar C: Cybersecurity of AI Systems · Cybersecurity of AI Systems groups the disciplines that share methods, tools, and threat models with Agentic AI Security.

What is Agentic AI Security?

Agentic AI security addresses the unique threat landscape that emerges when AI systems operate autonomously — making decisions, calling tools, delegating to sub-agents, and taking actions in the real world with minimal human oversight. Unlike traditional LLM chatbots that generate text responses, AI agents can execute code, browse the web, send emails, modify databases, manage infrastructure, and chain together multi-step workflows, dramatically expanding the blast radius of any vulnerability.

Agent architectures introduce novel attack surfaces beyond prompt injection. Tool security is critical — if an agent can call APIs, execute shell commands, or access file systems, then compromising the agent's decision-making grants the attacker the agent's full permissions. Delegation chains create transitive trust risks where a compromised sub-agent can influence parent agent behavior. Memory poisoning attacks inject malicious instructions into an agent's persistent memory or context, creating time-delayed attacks that activate in future sessions.

Securing agentic systems requires rethinking traditional security models. Least-privilege tool access, sandboxed execution environments, human-in-the-loop approval for high-risk actions, cryptographic verification of delegation chains, and adversarial testing of agent decision-making are all essential. The field is nascent but rapidly becoming critical as organizations deploy AI agents for customer service, code generation, security operations, and business process automation.

Why it matters

AI agents have real-world authority to take actions, not just generate text. A compromised agent is not just an information leak — it's an active threat with the permissions and capabilities of the systems it can access.

Agentic AI security is the frontier of AI security, extending concepts from LLM security, AI infrastructure security, and AI safety into autonomous systems that act in the world. As AI agents become the primary interface between AI models and enterprise systems, securing them becomes existential.

Decide who or what can do what, enforce it cryptographically, constrain AI behaviour.

Other domains in this layer

See how this layer connects to the rest of the domain map →

Standards and frameworks

Curated resources

Authoritative sources we ground Agentic AI Security questions in — frameworks, research, guides, and tools.

OWASPguide

OWASP Agentic AI Security

OWASP guidance on securing agentic AI systems — tool use, delegation chains, memory poisoning, and multi-agent architectures.

LangChainguide

LangChain Security Best Practices

Security documentation for LangChain agent framework — sandboxing, tool permissions, prompt injection defenses, and deployment hardening.

OpenAIresearch

OpenAI — "Practices for Governing Agentic AI Systems" (2024)

Framework for agentic AI governance: scope control, human oversight, auditability, containment. Defines key properties agents should have and failure modes to prevent.

Unknownresearch

Ruan et al. — "Identifying the Risks of LM Agents with an LM-Emulated Sandbox" (2024)

ToolEmu framework for evaluating agent risks in sandboxed environments. 36 risk categories across tool use failures. Practical methodology for agent security testing questions.

Unknownresearch

Mialon et al. — "Augmented Language Models: A Survey"

Survey of tool-using, retrieval-augmented, and reasoning LMs. The architectural foundation for understanding agent capabilities and their security implications.

OWASPtool

OWASP — "Top 10 for LLM Applications: Agentic Applications" (2025 supplement)

Extension of the LLM Top 10 specifically for agentic patterns. Covers excessive agency, insecure plugin/tool design, and multi-agent trust boundaries.

Gartnerresearch

Gartner — Top Strategic Technology Trends

Annual trends report. AI trust, risk, and security management (AI TRiSM) has been featured prominently. Good for strategic-level questions about where the industry is heading.

Anthropicresearch

Anthropic — "Challenges in Deploying Machine Learning Agents" research

Analysis of risks specific to AI agents: tool use, chain-of-thought exploitation, multi-step task failures, delegation risks. Key for understanding why agents create new attack surfaces beyond single-turn interactions.

Anthropicframework

Model Context Protocol (MCP) Specification

Anthropic's open protocol for connecting AI models to external tools and data sources. Critical reading for agentic AI security.

Certifications that signal this domain

Credentials whose blueprint meaningfully covers this domain. Core means centrally covered; also touched means present in the blueprint but not the primary focus.

Core coverage

OSAIProfessional·OffSecOfficial page →

OffSec AI Security Practitioner

Offensive AI security — adversarial ML, LLM attacks, agent abuse.

Browse all certifications → — pick a cert on the interactive map to highlight every domain it covers.

Education and certifications

More in Cybersecurity of AI Systems

See how your Agentic AI Security skills stack up

311 questions available. Compete head-to-head or run a quick speed quiz to benchmark yourself.