Radware disclosed ZombieAgent, a zero-click prompt injection that persisted in ChatGPT agents

Radware security researcher Zvika Babo disclosed ZombieAgent, a set of indirect prompt injection vulnerabilities in ChatGPT that enabled zero-click data exfiltration and persistent compromise. The attack exploited ChatGPT Connectors to read malicious emails containing hidden instructions, then exfiltrated sensitive data character by character via pre-built URLs that bypassed OpenAI guardrails. The vulnerability also allowed attackers to implant persistent malicious logic into ChatGPT Memory and self-propagate to new victims via harvested email addresses.

OpenAI · Incident Sep 26, 2025 · Indexed Jun 4, 2026 · 2 sources

Pre-built URLs for every character let ChatGPT exfiltrate data one letter at a time, sidestepping guardrails that only blocked dynamic URL construction.
What
Radware security researcher Zvika Babo disclosed ZombieAgent, a set of indirect prompt injection vulnerabilities in ChatGPT that enabled zero-click data exfiltration and persistent compromise.
Incident date
Sep 26, 2025
Who
OpenAI
Failure mode
Prompt Injection
AI surface
Agentic Workflow
Severity
High

What happened

Radware researcher Zvika Babo reported ZombieAgent to OpenAI via BugCrowd on September 26, 2025, demonstrating that an attacker could send a malicious email to a ChatGPT user and, when the user later asked ChatGPT to perform a routine task involving their inbox, the agent would automatically read the malicious email and execute hidden prompt injection instructions. The attack extracted sensitive data from connected services (Gmail, Google Drive, GitHub, Slack, Teams), user chat history, and personal memories character by character via pre-built URLs. The vulnerability also allowed persistent compromise by injecting malicious instructions into ChatGPT Memory, ensuring data exfiltration continued across sessions, and self-propagation by harvesting email addresses and sending malicious payloads to new targets. OpenAI fixed the vulnerability on December 16, 2025, and Radware publicly disclosed it on January 8, 2026.

What broke inside the model

Failure path · mode profile · Prompt Injection
  1. 01 · TriggerThe model reads retrieved or user-supplied text.
  2. 02 · Model stepThat text carries hidden instructions.
  3. 03 · Control gapNothing separates untrusted data from trusted commands.
  4. 04 · FailureThe injected instruction overrides the operator's.
  5. 05 · ConsequenceThe system acts on an outsider's intent.

At the injection point, retrieved text overrides the operator's instruction.

OpenAI had implemented guardrails to prevent URL manipulation by blocking dynamic URL construction, but ZombieAgent bypassed this by supplying a complete pre-built URL for every possible character, so ChatGPT only selected from the list instead of constructing URLs. OpenAI also attempted to separate Memory and Connector contexts, but the protection only blocked one direction, allowing ChatGPT to access Memory first and then use Connectors to exfiltrate data. These incomplete guardrails left the agent open to persistent compromise across sessions.

Public visibilityHigh
Regulatory exposurePossible
Customer impactClass-wide
Financial impactUnknown
Time to disclosureMonths
  1. PrimaryZombieAgent: New ChatGPT Vulnerabilities Let Data Theft Continue ...radware.com
  2. Press'ZombieAgent' Attack Let Researchers Take Over ChatGPTsecurityweek.com
Permalinkhttps://failureindex.ai/failures/radware-disclosed-zombieagent-zero-click
CitationAI Failure Index. "Radware disclosed ZombieAgent, a zero-click prompt injection that persisted in ChatGPT agents" (FI-0182). Realm Labs. https://failureindex.ai/failures/radware-disclosed-zombieagent-zero-click (indexed Jun 4, 2026).
Share cardA branded image of this record for posts and slides.

Data fields CC-BY 4.0, prose citation permitted. Incident ID FI-0182. Full dataset at /data.

Note from Realm Labs, the Index steward

How Realm would have caught this

Controls for this failure mode
  • Prism
  • OmniGuard

Realm inspects the model's internal state for the signature of instructions arriving through the data channel, so an injected command can be flagged and blocked inline before the model acts on it, instead of trusting a classifier that scores the input as safe.