An AI agent that can place trades and reads the open internet to decide what to trade is a genuinely new kind of target. The good news: the attack surface is small and knowable, and every attack class maps to a guardrail you already control. Here’s the map, so the rest of this section reads as defense, not paranoia.
The one thing every attack has in common
A trading agent makes decisions from text it reads — news, tickers, social posts, the output of its own tools. Any attack on a trading agent is some version of getting hostile content into that input stream and having the agent treat it as a command instead of data. That’s prompt injection, and on a money-agent it has a dollar value. (The fundamentals.)
The four attack classes
- Poisoned news — instructions or fake signals planted in the articles your agent reads to gauge a move. → Poisoned headlines
- Manufactured hype — coordinated social sentiment engineered to make a momentum- or sentiment-driven agent chase a spike. → Pump-and-dump vs. your agent
- Symbol confusion — look-alike or wrong tickers that send an order to the wrong, often manipulated, security. → Fake tickers
- Manipulated data — a tool, feed, or third-party MCP returning false values your agent acts on. → When the data lies
The defenses that cover all four
You don’t need a different control per attack — a handful of guardrails close most of the surface at once:
- News and signals are advisory only. They inform a decision you or the approval gate make; they never fire an order. This single boundary defuses most poisoned-news and hype attacks. (On by default in the Safety Kit.)
- An allowlist means a confused or planted symbol simply can’t be traded. (Set it up.)
- Fixed caps mean even a successful manipulation can only move what you pre-authorized — your per-trade, daily, and concentration limits don’t flex for a "great" story.
- The approval gate puts a human in front of anything large or unusual.
- The injection-refusal rule tells the agent to treat rule-changing input as hostile and flag it.
The mindset
Assume everything your agent reads is untrusted, keep decisions gated and caps fixed, and the attack surface shrinks to something you chose. The rest of this section walks each attack and its specific defense.
Spotting a crafted attack in the wild is a learnable, measurable skill — exactly what we rate at secprove.com. Test yours.