Where every claim in SecProve
comes from.
A dense reading catalog. Every claim is footnoted. Sort by source, filter by pillar, type, or recency. Built for analysts who want to see what we are standing on.
The definitive security risk list for LLM-powered applications. Covers prompt injection, insecure output handling, training data poisoning, and more.
Test your knowledge · C2Comprehensive taxonomy of adversarial ML attacks and mitigations. Covers evasion, poisoning, extraction, and inference attacks with standardized terminology.
Test your knowledge · C1Adversarial Threat Landscape for AI Systems. ATT&CK-style knowledge base of adversarial ML techniques, tactics, and real-world case studies.
Comprehensive guide to AI red teaming from Microsoft's dedicated AI security team. Covers methodology, tools, and findings.
Test your knowledge · C5The authoritative framework for managing AI risks. Defines four core functions: Govern, Map, Measure, Manage. Essential reading for anyone building or deploying AI systems.
Test your knowledge · C7Updated cybersecurity framework with six core functions: Govern, Identify, Protect, Detect, Respond, Recover.
Test your knowledge · C7Introduced DP-SGD for training neural networks with formal differential privacy guarantees. Foundation for private ML.
Test your knowledge · C4First practical membership inference attack against ML models. Showed that ML APIs leak information about their training data.
Test your knowledge · C4Introduced PGD-based adversarial training, currently the most reliable defense against adversarial examples. Established the robustness-accuracy tradeoff.
Test your knowledge · C1International standard for establishing and maintaining an AI management system. Includes 39 controls across 10 areas.
Test your knowledge · C7Seminal backdoor attack paper. Demonstrated trojaned models in transfer learning scenarios. Foundational for AI supply chain security questions.
Test your knowledge · C3Demonstrated that adversarial examples transfer between models, enabling black-box attacks via surrogate models. Key work on transferability.
Test your knowledge · C1Introduced the C&W attack, demonstrating that defensive distillation and other defenses could be reliably bypassed. Changed how robustness is evaluated.
Test your knowledge · C1Collection of Anthropic's published research on AI safety, alignment, interpretability, and security.
Test your knowledge · C8The European Union's comprehensive AI regulation. Classifies AI systems by risk level and sets requirements for high-risk systems.
Test your knowledge · C7Python Risk Identification Toolkit for generative AI. Automated red teaming framework for testing LLM applications.
Test your knowledge · C5Voluntary framework for improving privacy through enterprise risk management. Complements the Cybersecurity Framework.
Test your knowledge · C4The seminal paper introducing FGSM (Fast Gradient Sign Method). Established that adversarial examples are a fundamental property of neural networks, not a bug.
Test your knowledge · C1Demonstrated that LLMs memorize and can be prompted to regurgitate training data verbatim, including PII. Foundational work on LLM privacy risks.
Test your knowledge · C2Coalition for Content Provenance and Authenticity. Technical standard for digital content provenance and integrity.
Test your knowledge · C9Hugging Face's safe serialization format for ML models. Prevents arbitrary code execution from pickle-based attacks.
Test your knowledge · C3Showed that gradually escalating benign conversations can bypass safety filters over multiple turns. Defeats per-message safety checks.
Test your knowledge · C2Demonstrated indirect prompt injection attacks through RAG documents, emails, and web content. Essential reading for RAG security.
Test your knowledge · C2The GCG attack paper. Showed that adversarial suffixes can bypass safety alignment in LLMs, transferring across models.
Test your knowledge · C2CISA guidance on understanding, detecting, and defending against deepfake threats in organizational contexts.
Test your knowledge · C9Five practical safety problems: avoiding side effects, reward hacking, scalable oversight, safe exploration, distributional shift. Still the canonical taxonomy for AI safety research questions.
Test your knowledge · C8The largest model hub. Security features: malware scanning, pickle scanning, safetensors format. Questions on model provenance, serialization risks (pickle exploits), and model marketplace trust.
Test your knowledge · C3Security documentation for LangChain agent framework — sandboxing, tool permissions, prompt injection defenses, and deployment hardening.
Test your knowledge · C11Application container security guide covering image, registry, orchestrator, container, and host OS security.
Test your knowledge · C6NVIDIA's open-source LLM vulnerability scanner. Tests for prompt injection, jailbreaking, data leakage, and more.
Test your knowledge · C5Reports on state-affiliated actors using AI for influence operations. Documents actual observed misuse, not theoretical risks. Key for questions about real-world AI-enabled disinformation.
Test your knowledge · C10Research on propaganda techniques, cognitive security, and information warfare. The "firehose of falsehood" model explains high-volume, multi-channel disinformation. Good for strategic questions.
Test your knowledge · C10Security docs for major ML platforms. Covers authentication, authorization, experiment tracking security, model registry access controls. Practical infrastructure security questions.
Test your knowledge · C6Introduced SISA training for efficient machine unlearning — enabling models to "forget" specific training data without full retraining.
Test your knowledge · C4Standardized benchmark for evaluating adversarial robustness of ML models. Leaderboard of most robust models.
Test your knowledge · C1Benchmark measuring whether language models generate truthful answers. Tests for common misconceptions and falsehoods.
Test your knowledge · C8Industry coalition implementing C2PA. Open-source tools for content credentials. Practical implementation questions about provenance at scale.
Test your knowledge · C9Largest public AI red teaming event. 2,200+ participants testing multiple foundation models. Established community norms for responsible AI red teaming. Good for questions on practical red team methodology.
Test your knowledge · C5Analysis of risks specific to AI agents: tool use, chain-of-thought exploitation, multi-step task failures, delegation risks. Key for understanding why agents create new attack surfaces beyond single-turn interactions.
Test your knowledge · C11Crowdsourced red teaming methodology with 38,961 attacks across multiple models. Taxonomy of harmful outputs and effectiveness of different red teaming strategies. Key reference for structured AI red teaming.
Test your knowledge · C5Anthropic's framework for responsible AI development. Defines AI Safety Levels (ASL) and capability thresholds.
Test your knowledge · C8Anthropic's approach to AI alignment using a set of principles (a "constitution") to train helpful and harmless AI. Foundation of modern RLHF alternatives.
Test your knowledge · C8Demonstrated that long-context LLMs can be jailbroken by providing many examples of the desired behavior. Scales with context window size.
Test your knowledge · C2Anthropic's open protocol for connecting AI models to external tools and data sources. Critical reading for agentic AI security.
Test your knowledge · C11Technical standard for content provenance. Cryptographic binding of creation metadata to content. The leading technical approach to synthetic media authentication. Questions on architecture, limitations, and adoption challenges.
Test your knowledge · C9Comprehensive taxonomy of AI risks: weaponization, misinformation, power concentration, value lock-in, rogue AI. Good for strategic-level safety questions beyond technical alignment.
Test your knowledge · C8Official Kubernetes documentation on securing clusters, pods, and workloads. Essential for ML infrastructure security.
Test your knowledge · C6Framework for analyzing and countering disinformation. Provides a structured approach to information manipulation threats.
Test your knowledge · C10(See cross-cutting.md.) For C7 specifically: conformity assessments, technical documentation requirements, post-market monitoring, fundamental rights impact assessments. Detailed compliance questions.
Test your knowledge · C7Law enforcement perspective on deepfake threats: evidence tampering, identity fraud, CEO fraud, CSAM. Policy and response frameworks.
Test your knowledge · C9Annual trends report. AI trust, risk, and security management (AI TRiSM) has been featured prominently. Good for strategic-level questions about where the industry is heading.
Test your knowledge · C11Positions AI security technologies on the hype cycle. Useful for questions about technology maturity, adoption timelines, and distinguishing hype from operational readiness.
Test your knowledge · C7Analysis of how LLMs can amplify influence operations: cost reduction, scalability, personalization, multilingual content. Framework for assessing disinformation risk from generative AI.
Test your knowledge · C10Open-source DP libraries and practical guides. Bridges theory to implementation. Good for questions on real-world DP deployment challenges and privacy budget management.
Test your knowledge · C4Google's conceptual framework for securing AI systems. Covers supply chain, data governance, and deployment security.
Test your knowledge · C7Research on reward modeling, debate, recursive reward modeling, and interpretability. Provides an alternative perspective to Anthropic/OpenAI approaches.
Test your knowledge · C8Framework for evaluating dangerous capabilities: persuasion, deception, cyber operations, self-replication. Defines evaluation methodology for frontier model safety. Questions on what to test and how to interpret results.
Test your knowledge · C5Google DeepMind's watermarking technology for AI-generated content. Embeds imperceptible watermarks in images, audio, and text.
Test your knowledge · C9Extracted training data from ChatGPT (production model) using a divergence attack. Showed alignment doesn't prevent memorization. Questions on the gap between safety fine-tuning and data protection.
Test your knowledge · C4Security best practices for using Hugging Face Hub — model scanning, SafeTensors, access controls, and supply chain considerations.
Test your knowledge · C3Comprehensive library for adversarial ML. Supports attacks, defenses, and robustness evaluation across multiple ML frameworks.
Test your knowledge · C1Discovered 100+ malicious models on Hugging Face exploiting pickle deserialization for code execution. Real-world evidence of AI supply chain attacks. Good for scenario-based questions.
Test your knowledge · C3Microsoft's tool for assessing the security of ML models. Supports evasion, extraction, and inversion attacks.
Test your knowledge · C1Practical lessons from large-scale LLM red teaming across real products. Covers failure modes, testing methodologies, and organizational patterns. Rare insight into enterprise-scale AI security.
Test your knowledge · C2The theoretical foundation for differential privacy. Essential for questions on privacy-preserving ML training (DP-SGD) and the epsilon-delta framework.
Test your knowledge · C4Landmark study: false news spreads farther, faster, deeper than true news on social media. Not AI-specific but foundational for understanding why AI-generated disinformation is dangerous.
Test your knowledge · C10Companion to AI RMF 1.0 specifically for generative AI. Maps 12 GenAI risks to RMF actions. Covers CBRN, CSAM, confabulation, data privacy, environmental, human-AI interaction, information integrity, IP, obscenity, toxicity, value chain.
(See cross-cutting.md for details.) The primary AI governance framework for US context. Questions should test practical application of Govern/Map/Measure/Manage, not just recall.
Test your knowledge · C7Extending software bill of materials concepts to AI: model cards, data cards, training provenance. Emerging standard for AI supply chain transparency.
Test your knowledge · C3GPU cluster security, multi-tenant GPU isolation, model serving infrastructure hardening. Vendor-specific but covers unique infrastructure challenges (GPU memory isolation, CUDA vulnerabilities) not covered elsewhere.
Test your knowledge · C6Framework for agentic AI governance: scope control, human oversight, auditability, containment. Defines key properties agents should have and failure modes to prevent.
Test your knowledge · C11Description of external red teaming program and findings from GPT-4 pre-deployment testing. The system card details risk categories, testing methodology, and residual risks.
Test your knowledge · C5Research on the core alignment challenge: can weaker systems supervise stronger ones? Showed partial generalization is possible. Key for superalignment and scalable oversight questions.
Test your knowledge · C8Framework for ensuring the integrity of software artifacts throughout the supply chain. Applicable to ML model pipelines.
Test your knowledge · C3Extension of the LLM Top 10 specifically for agentic patterns. Covers excessive agency, insecure plugin/tool design, and multi-agent trust boundaries.
Test your knowledge · C11OWASP guidance on securing agentic AI systems — tool use, delegation chains, memory poisoning, and multi-agent architectures.
Test your knowledge · C11Top 10 security risks specific to machine learning systems, including supply chain attacks, data poisoning, and model theft.
Test your knowledge · C1Certification program for responsible AI. Assessment criteria across fairness, explainability, accountability, robustness. Emerging industry certification.
Test your knowledge · C7Research group studying abuse in information technologies, including AI-enabled disinformation, platform manipulation, and election interference.
Test your knowledge · C10Comprehensive annual data on AI progress: research output, investment, policy, public opinion, technical performance. The best source for quantitative AI landscape questions.
Test your knowledge · C7Security audit firm with deep AI/ML expertise. Published research on pickle deserialization attacks, model file format security, and ML pipeline vulnerabilities. Technical depth from a security-first perspective.
Test your knowledge · C6Large-scale benchmark dataset and tools for detecting facial manipulation in images and video. Used for deepfake detection research.
Test your knowledge · C9Historical survey tracing adversarial ML from 2004 spam filters through deep learning. Essential for questions on the evolution and taxonomy of adversarial attacks (evasion, poisoning, model extraction).
Test your knowledge · C1Extended training data extraction to image models. Showed Stable Diffusion memorizes and regurgitates training images. Important for multimodal AI data security questions.
Test your knowledge · C4The RLHF paper that enabled ChatGPT-style alignment. Reward model from human preferences + PPO. Foundational for understanding modern alignment approaches and their limitations.
Test your knowledge · C8Survey of tool-using, retrieval-augmented, and reasoning LMs. The architectural foundation for understanding agent capabilities and their security implications.
Test your knowledge · C11Comprehensive survey covering generation techniques (autoencoders, GANs, diffusion), detection approaches (visual artifacts, frequency analysis, physiological signals), and the arms race dynamic.
Test your knowledge · C9Largest prompt injection competition dataset. Taxonomy of prompt injection techniques: context ignoring, fake completion, payload splitting, obfuscation. Empirical data on attack success rates across models.
Test your knowledge · C2Benchmark dataset and detection methods for facial manipulation. Covers DeepFakes, Face2Face, FaceSwap, NeuralTextures. Standard reference for deepfake detection evaluation.
Test your knowledge · C9ToolEmu framework for evaluating agent risks in sandboxed environments. 36 risk categories across tool use failures. Practical methodology for agent security testing questions.
Test your knowledge · C11Systematic analysis of jailbreak techniques: competing objectives and mismatched generalization. Framework for understanding why safety training is inherently incomplete. Essential for nuanced jailbreak questions.
Test your knowledge · C2Ready to test what you've learned?
Our questions are built directly from these resources. Take a quiz and see how your knowledge stacks up.