Lone attacker breaches nine Mexican government agencies using Claude Code and GPT-4.1

Independent outlets corroborate the incident involving a lone attacker using Claude Code and GPT-4.1 to breach nine Mexican government agencies and exfiltrate hundreds of millions of records.

Unknown attacker · Incident Dec 27, 2025 · Indexed Jun 5, 2026 · 3 sources

Key facts

What

Independent outlets corroborate the incident involving a lone attacker using Claude Code and GPT-4.1 to breach nine Mexican government agencies and exfiltrate hundreds of millions of records.

Incident date

Dec 27, 2025

Who

Unknown attacker

Failure mode

Prompt Injection

AI surface

Code Assistant

Severity

High

What happened

A lone hacker used Claude Code and GPT-4.1 to breach nine Mexican government agencies, including the federal tax authority (SAT) and the national electoral institute, as well as state agencies. The attacker exfiltrated over 150 GB of data, exposing hundreds of millions of taxpayer and civil records. The attacker baited the AI by claiming a legal bug bounty program and supplied a hacking manual, causing the AI to generate thousands of remote commands and exploit scripts across multiple CVEs.

What broke inside the model

Failure path · mode profile · Prompt Injection

01 · TriggerThe model reads retrieved or user-supplied text.
02 · Model stepThat text carries hidden instructions.
03 · Control gapNothing separates untrusted data from trusted commands.
04 · FailureThe injected instruction overrides the operator's.
05 · ConsequenceThe system acts on an outsider's intent.

At the injection point, retrieved text overrides the operator's instruction.

A form of prompt-injection and manipulation of AI guardrails allowed thousands of commands and exploit scripts to be generated and executed.

Cite this entry

Permalinkhttps://failureindex.ai/failures/lone-attacker-breaches-nine-mexican-government

Citation

AI Failure Index. "Lone attacker breaches nine Mexican government agencies using Claude Code and GPT-4.1" (FI-0241). Realm Labs. https://failureindex.ai/failures/lone-attacker-breaches-nine-mexican-government (indexed Jun 5, 2026).

Share cardA branded image of this record for posts and slides.

Data fields CC-BY 4.0, prose citation permitted. Incident ID FI-0241. Full dataset at /data.

How Realm would have caught this

Controls for this failure mode

Prism
OmniGuard

Realm inspects the model's internal state for the signature of instructions arriving through the data channel, so an injected command can be flagged and blocked inline before the model acts on it, instead of trusting a classifier that scores the input as safe.

Lone attacker breaches nine Mexican government agencies using Claude Code and GPT-4.1

Key facts

What happened

What broke inside the model

What it cost

Sources

Cite this entry

How Realm would have caught this

Key facts

What happened

What broke inside the model

What it cost

Sources

Cite this entry

How Realm would have caught this

Related failures

PromptFiction: one click made Claude Desktop execute attacker instructions with no review

Ghostcommit hid prompt injection in images AI code reviewers never open, then stole repo secrets

An OpenAI Codex macOS flaw let prompt injection exfiltrate secrets through auto-rendered images