Context Manipulation Attacks: Why Web Agents Need Secure Memory
Published on
July 14, 2025
Read time:
6 mins
Context Manipulation Attacks: Why Web Agents Need Secure Memory
  • We show that stateless web agents rely on external memory systems that can be corrupted, creating a critical new attack surface
  • In our previous work on Web3 agents, we introduced context manipulation and CrAIBench, demonstrating cross‐platform memory injection on ElizaOS (Figure 1)
  • Here we formalize “plan injection,” in which adversaries insert malicious steps into an agent’s task plan, achieving up to 3× higher success than prompt attacks
  • We evaluate both the Plan Injection Benchmark and the WebVoyager-Privacy Benchmark to show how semantic alignment drives attack efficacy
  • Standard prompt defenses cut prompt injection rates but leave plan injection largely unaffected, with 46 percent success on Agent-E and 63 percent on Browser-use

Figure 1. Information flow and context manipulation attack vectors for web agents.

Why agent memory matters

Autonomous web agents translate natural‐language instructions into browser actions but are inherently stateless. To maintain context they depend on memory stored client-side or by third parties, outside the secure boundary of centralized chat systems. This creates a vulnerability: malicious actors can tamper with stored context rather than just prompts or retrieved data.

Our previous work in Web3

In “Real AI Agents with Fake Memories,” we revealed how context manipulation attacks on Web3 agents can lead to unauthorized crypto transfers. We introduced CrAIBench, a 150+ task benchmark for financial agents, and demonstrated cross‐platform memory injection on ElizaOS (Discord → Twitter), resulting in irreversible fund theft.

 Figure 2. Plan injection attack to leak private data. Malicious steps (red) blend with legitimate subtasks.

Benchmarking plan injection and WebVoyager-Privacy

Plan Injection Benchmark (Agent-E)

  • 15 samples × 5 runs across four categories: factual, opinion, advertisement, privacy
  • Opinion tasks: 94.7 percent success for task-aligned vs 70.7 percent for non-contextual
  • Factual tasks: 18.7 percent for task-aligned vs 0 percent for non-contextual

Figure 3. Attack success rates across task categories and injection strategies for Agent-E.
Figure 4. Embedding-space analysis: context-chained injections achieve optimal balance between user and attacker similarity.

WebVoyager-Privacy Benchmark

  • 45 privacy tasks drawn from 9 domains of the WebVoyager dataset
  • Dynamic LLM-crafted injections ensure realistic scenarios
Figure 5. ASR for weak/strong prompt injection and single plan injection on Agent-E under different defenses.

Figure 6. Comparison of injection sophistication levels on Agent-E and Browser-use. Context-chained injections clearly outperform other methods.

Defenses fall short

We implemented two prompt defenses, explicit security guidelines and sandwiching retrieved content. Both reduced prompt injection ASR from > 80 percent to < 20 percent. However, a single plan injection at planning time still achieved 46 percent on Agent-E and 63 percent on Browser-use.

Securing agent memory

Our findings show that prompt‐only defenses are insufficient. We recommend:

  • Semantic integrity checks to detect and remove malicious plan steps
  • Strict memory isolation and verifiable context modules in agent architectures

As web agents take on sensitive tasks, securing their memory layer is critical to prevent context manipulation attacks.